…or how to run multiple commands in parallel

You can try s3cmd first, and if it doesn’t work for you, go for the more advanced solution below, which scales to millions of files.

s3cmd restore \
    --recursive \
    --restore-days=10 \
    s3://bucket.raw.rifiniti.com
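
Once a restore has been requested, you can check whether a particular object is back by calling head-object and looking at the Restore field in the response (the key below is just a placeholder):

# ongoing-request="true"  -> the object is still being restored
# ongoing-request="false" -> the object is readable until the listed expiry date
aws s3api head-object \
    --bucket bucket.raw.rifiniti.com \
    --key path/to/your/object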

To bulk request files to be restored from Glacier, I use the script below. I hope it will be useful to you as well.

#!/bin/bash
#
# Request restore of S3 objects stored in Glacier, by prefix.
# The prefix is optional!
#
# How to use:
#  ./export-prefix.sh bucketName 30 2019-04-30
#  ./export-prefix.sh bucketName 30
#
export bucket=$1

# How many days to keep the restored objects
export day=$2
export prefix=$3

# aws2 is the AWS CLI v2 preview binary; plain "aws" works here as well
if [ -z "$prefix" ]
then
  cmd="aws2 s3api list-objects --bucket $bucket"
else
  cmd="aws2 s3api list-objects --bucket $bucket --prefix $prefix"
fi

# Keep only objects that are not in STANDARD storage.
# jq is used without -r on purpose: the keys stay wrapped in double quotes,
# which is what makes the generated commands work for keys containing spaces.
readarray -t KEYS < <($cmd | jq '.Contents[] |  select( .StorageClass != "STANDARD" ) | ."Key"')

# Start from a clean file, the loop below only appends to it
> /tmp/commands.sh
for key in "${KEYS[@]}"; do
  echo "aws s3api restore-object --bucket $bucket --key ${key} --restore-request '{\"Days\":$day,\"GlacierJobParameters\":{\"Tier\":\"Standard\"}}'" >> /tmp/commands.sh
done

echo "Generated file /tmp/commands.sh"

echo "Splitting the huge file into small files: /tmp/sub-commands*"
split -l 1000 /tmp/commands.sh /tmp/sub-commands.sh.
chmod a+x /tmp/sub-commands*


The script generates a /tmp/commands.sh file with all the commands you need to run.

When you have a lot of files, running /tmp/commands.sh as one huge bash script is not practical, because the process can get killed at some point. To avoid this, we split /tmp/commands.sh into parts, which is what the last part of the script above does.
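
For reference, every line in /tmp/commands.sh is a standalone restore request. With the example arguments from above (bucketName and 30 days), a generated line looks roughly like this; the key is only a placeholder:

aws s3api restore-object --bucket bucketName --key "2019-04-30/some file.csv" --restore-request '{"Days":30,"GlacierJobParameters":{"Tier":"Standard"}}'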

Now use this snippet to run the command files one by one.

for x in /tmp/sub-commands*; do
  echo "working on $x"
  bash "$x"
done

Or, if you have GNU parallel installed, you can run them much faster with:

for x in /tmp/sub-commands*; do
  echo "working on $x"
  parallel -j 10 < "$x"
done
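
GNU parallel reads one command per line, so you can also skip the loop and pipe all the generated files into a single invocation; adjust -j to whatever concurrency you are comfortable with:

cat /tmp/sub-commands* | parallel -j 10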

Update: Make the script work with keys containing spaces

Update2: Make it work with a lot of files and add parallel example