…or how to run multiple commands in parallel
You can first try s3cmd, and if it doesn’t work, go for the more advanced solution below, which supports millions of files.
s3cmd restore \
--recursive s3://bucket.raw.rifiniti.com \
--restore-days=10
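The restore is asynchronous either way, so you may want to check whether an object has already come back before downloading it. A minimal sketch with the AWS CLI (the key here is just a placeholder); once a restore has been requested, the response includes a Restore field showing whether the request is still ongoing:

aws s3api head-object \
  --bucket bucket.raw.rifiniti.com \
  --key some/prefix/example-object.csv
# Look for: "Restore": "ongoing-request=\"false\", expiry-date=\"...\""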
To bulk-request files to be restored from Glacier, I use the script below. I hope it will be useful to you as well.
#!/bin/bash
#
# Request restore of S3 objects stored in Glacier, optionally filtered by prefix.
# The prefix is optional!
#
# How to use:
# ./export-prefix.sh bucketName 30 2019-04-30
# ./export-prefix.sh bucketName 30
#
export bucket=$1
# How many days to keep the restored objects available
export day=$2
export prefix=$3

# aws2 is the AWS CLI v2 preview command; use plain "aws" if that is what you have installed.
if [ -z "$prefix" ]
then
    cmd="aws2 s3api list-objects --bucket $bucket"
else
    cmd="aws2 s3api list-objects --bucket $bucket --prefix $prefix"
fi

# Keep only the objects that are not in STANDARD storage.
# jq prints each key with its surrounding double quotes, so keys containing spaces survive intact.
readarray -t KEYS < <($cmd | jq '.Contents[] | select( .StorageClass != "STANDARD" ) | ."Key"')

# Start from an empty file so re-running the script does not append duplicate commands.
> /tmp/commands.sh
for key in "${KEYS[@]}"; do
    echo "aws s3api restore-object --bucket $bucket --key ${key} --restore-request '{\"Days\":$day,\"GlacierJobParameters\":{\"Tier\":\"Standard\"}}'" >> /tmp/commands.sh
done

echo "Generated file /tmp/commands.sh"
echo "Splitting the huge file into small files: /tmp/sub-commands*"
split -l 1000 /tmp/commands.sh /tmp/sub-commands.sh.
chmod a+x /tmp/sub-commands*
The script generates a /tmp/commands.sh file with all the commands you need to run.
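For illustration, each generated line looks roughly like this (the key is made up, and the bucket and days come from the script arguments):

aws s3api restore-object --bucket bucketName --key "2019-04-30/some file.csv" --restore-request '{"Days":30,"GlacierJobParameters":{"Tier":"Standard"}}'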
When you have a lot of files, running one huge bash script may not be possible, because it can get killed at some point. To avoid this, we split /tmp/commands.sh into parts, which is what the last part of the script does.
Now use this snippet to run the commands file by file.
for x in /tmp/sub-commands*; do
    echo "working on $x"
    bash "$x"
done
Or, if you have GNU parallel installed, you can run them much faster with:
for x in /tmp/sub-commands*; do
    echo "working on $x"
    parallel -j 10 < "$x"
done
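If you prefer, you can also skip the outer loop; assuming GNU parallel, feeding all the split files to it at once should behave the same way, running 10 restore requests at a time (tune -j to your taste):

cat /tmp/sub-commands* | parallel -j 10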
Update: made the script work with keys containing spaces
Update 2: made it work with a lot of files and added a parallel example