Linux Directory Backup to AWS S3

cpcwood | Last updated: March 1st, 2021

#! /bin/sh
# Directory Backup to AWS S3

echo "Starting Directory Backup..."

# Ensure all required environment variables are present
if [ -z "$GPG_KEY" ] || \
    [ -z "$GPG_KEY_ID" ] || \
    [ -z "$DIR_PATH" ] || \
    [ -z "$BACKUP_NAME" ] || \
    [ -z "$AWS_ACCESS_KEY_ID" ] || \
    [ -z "$AWS_SECRET_ACCESS_KEY" ] || \
    [ -z "$AWS_DEFAULT_REGION" ] || \
    [ -z "$S3_BUCKET" ]; then
    >&2 echo 'Required variable unset, backup failed'
    exit 1
fi

# Make sure required binaries are in path (YMMV)
export PATH=/snap/bin:/usr/local/bin:$PATH

# Import gpg public key from env
echo "$GPG_KEY" | gpg --batch --import

# Create backup params
backup_dir="$(mktemp -d)"
backup_file_name="$BACKUP_NAME--$(date +%d'-'%m'-'%Y'--'%H'-'%M'-'%S).tar.bz2.gpg"
backup_path="$backup_dir/$backup_file_name"

# Create, compress, and encrypt the backup
cp -R "$DIR_PATH" "$backup_dir/$BACKUP_NAME"
tar -cf - -C "$backup_dir" "./$BACKUP_NAME" | bzip2 | gpg --batch --recipient "$GPG_KEY_ID" --trust-model always --encrypt --output "$backup_path"

# Check backup created; clean up the temp directory if it was not
if [ ! -e "$backup_path" ]; then
    >&2 echo 'Backup file not found'
    rm -rf "$backup_dir"
    exit 1
fi

# Push backup to S3
aws s3 cp "$backup_path" "s3://$S3_BUCKET"
status=$?

# Remove tmp backup path
rm -rf "$backup_dir"

# Indicate if backup was successful
if [ $status -eq 0 ]; then
    echo "$BACKUP_NAME backup completed to '$S3_BUCKET'"

    # Remove expired backups from S3
    if [ -n "$ROTATION_PERIOD" ]; then
        # Cutoff timestamp: objects last modified before this are expired
        olderthan=$(date -d"-$ROTATION_PERIOD days" +%s)
        aws s3 ls "s3://$S3_BUCKET" --recursive | while read -r line; do
            stringdate=$(echo "$line" | awk '{print $1" "$2}')
            filedate=$(date -d"$stringdate" +%s)
            if [ "$filedate" -lt "$olderthan" ]; then
                filetoremove=$(echo "$line" | awk '{$1=$2=$3=""; print $0}' | sed 's/^[ \t]*//')
                if [ "$filetoremove" != "" ]; then
                    aws s3 rm "s3://$S3_BUCKET/$filetoremove"
                fi
            fi
        done
    fi
else
    echo "$BACKUP_NAME backup failed"
    exit 1
fi

Compress, encrypt, and back up a Linux directory to AWS S3 on a schedule using cron.

Create an AWS S3 Bucket

Create a private AWS S3 bucket to store your directory backups: see the AWS 'create bucket' guide.

Create IAM User

Create an IAM user in your AWS account with access to the S3 bucket created above: see the AWS 'create user' guide.

The script requires list, put, and delete access on the S3 bucket: it lists the bucket to find expired backups, uploads new backups, and deletes expired ones. The S3 policy JSON attached to the IAM user might look like:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::<bucket-name>/*",
                "arn:aws:s3:::<bucket-name>"
            ]
        }
    ]
}
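
The bucket name appears twice in the policy, so it can be convenient to generate the JSON rather than edit it by hand. The following is a minimal sketch, not part of the original script: render_policy is a hypothetical helper name and my-backups is a placeholder bucket name.

```shell
#!/bin/sh
# Hypothetical helper: print the policy shown above for the bucket in $1.
render_policy() {
    bucket="$1"
    cat <<EOF
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:ListBucket",
                "s3:DeleteObject"
            ],
            "Resource": [
                "arn:aws:s3:::${bucket}/*",
                "arn:aws:s3:::${bucket}"
            ]
        }
    ]
}
EOF
}

# Example: print the policy for a bucket called my-backups (placeholder)
render_policy "my-backups"
```

The output can be redirected to a file and pasted into the policy editor when creating the IAM user.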

Make sure to download or keep hold of the new user's security credentials (the access key ID and secret access key) so you can add them to the backup script environment later.

Create PGP Keys

On a separate (ideally air-gapped) machine, install GPG:

apt install gnupg

Then create a pair of public and private encryption keys. Only the public key is placed on the server, so the backup can be encrypted there but not decrypted. This helps prevent the backup from being compromised if the server's environment variables are leaked.

Generate a keypair using your email as the ID: gpg --gen-key
Export the public key: gpg --armor --export <your-email>
Export the secret key and move it to secure storage: gpg --armor --export-secret-keys <your-email>
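
The exported public key is multi-line text, and the backup script reads it from the GPG_KEY environment variable. One way to turn the export into a config line is sketched below. format_key_line is a hypothetical helper, and the sketch assumes the armoured text contains no single quotes (true for the base64 body, though a custom comment header could break it).

```shell
#!/bin/sh
# Hypothetical helper: read an armoured public key on stdin and print a
# shell line that assigns it, newlines included, to GPG_KEY.
# Assumption: the key text contains no single-quote characters.
format_key_line() {
    printf "export GPG_KEY='%s'\n" "$(cat)"
}

# Example usage on the key-generation machine (the email is a placeholder):
#   gpg --armor --export you@example.com | format_key_line
```

The printed line can then be copied into the config file on the server.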

Install Dependencies

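Install the script dependencies on the server:

GPG - Install GPG to encrypt backup files: apt install gnupg
AWS-CLI - Install the AWS CLI tool to transfer the backup to AWS S3: see the AWS guide
date - Ensure date is the GNU core utilities version, since the script relies on its relative -d expressions. Debian and Ubuntu provide it in coreutils (apt install coreutils). Alpine Linux ships BusyBox date by default, so install GNU date there with: apk add coreutils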
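
Before deploying, it may help to confirm that the dependencies are actually on the PATH. The following is a minimal sketch. missing_deps and has_gnu_date are hypothetical helper names, and the GNU check assumes that GNU date identifies itself as "GNU coreutils" in its --version output.

```shell
#!/bin/sh
# Print the name of every command given as an argument that is not on PATH.
missing_deps() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Succeed only if `date --version` reports GNU coreutils.
# BusyBox date rejects --version, so the check fails there.
has_gnu_date() {
    date --version 2>/dev/null | grep -q 'GNU coreutils'
}

# Report anything the backup script needs but cannot find
for cmd in $(missing_deps gpg aws tar bzip2); do
    echo "Missing dependency: $cmd" >&2
done
has_gnu_date || echo "date does not look like GNU coreutils date" >&2
```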

Deploy to Server

Add Script

Add the backup script (the code snippet above) to a suitable directory:

mkdir ~/directory-backup
vim ~/directory-backup/directory-backup-s3.sh

Set the file permissions so that only the owner can modify or execute the script:

chmod 744 ~/directory-backup/directory-backup-s3.sh

Add Config

Add a config file to hold the settings and credentials for the backup script, and restrict its permissions so that only the owner can read the sensitive credentials. The file is only sourced, never executed, so it does not need the execute bit:

touch ~/directory-backup/config.env
chmod 600 ~/directory-backup/config.env
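
Between the touch and the chmod, the file briefly exists with default permissions. If that matters, one alternative is to create it under a restrictive umask. This is a minimal sketch: create_private_file is a hypothetical helper name, and mode 600 (owner read and write only) is assumed to be enough because the file is only sourced.

```shell
#!/bin/sh
# Hypothetical helper: create the file in $1 readable and writable by its
# owner only. The subshell keeps the stricter umask from leaking into the
# caller; the chmod also tightens a file that already existed.
create_private_file() {
    ( umask 077 && touch "$1" ) && chmod 600 "$1"
}

# Example usage:
#   create_private_file "$HOME/directory-backup/config.env"
```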

Add config:

vim ~/directory-backup/config.env

Sample config:

export ROTATION_PERIOD=         # days to keep backups (omit to keep backups indefinitely)
export BACKUP_NAME=             # name of backup file
export DIR_PATH=                # path to the directory to back up
export AWS_ACCESS_KEY_ID=       # AWS IAM user access key ID
export AWS_SECRET_ACCESS_KEY=   # AWS IAM user secret access key
export AWS_DEFAULT_REGION=      # AWS S3 bucket region
export S3_BUCKET=               # AWS S3 bucket name
export GPG_KEY_ID=              # ID (email) used in GPG key generation
export GPG_KEY=                 # exported armoured GPG public key
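
To make the effect of ROTATION_PERIOD concrete, the expiry comparison the script performs can be sketched in isolation. is_expired is a hypothetical helper name, the dates are illustrative, and GNU date is required, as in the script itself.

```shell
#!/bin/sh
# Succeed (return 0) if the timestamp in $1, in the "YYYY-MM-DD HH:MM:SS"
# form printed by `aws s3 ls`, is more than $2 days in the past.
# Returns 2 if date cannot parse its input. Requires GNU date.
is_expired() {
    filedate=$(date -d "$1" +%s) || return 2
    cutoff=$(date -d "-$2 days" +%s) || return 2
    [ "$filedate" -lt "$cutoff" ]
}

# Example with a 30 day rotation period
if is_expired "2021-01-01 03:30:00" 30; then
    echo "expired: the script would remove this object"
fi
```

Note that the comparison uses the object's last-modified time as listed by S3, not the timestamp embedded in the backup file name.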

Create cron Job

Add a new cron job using crontab. The job should load the environment variables from the config file and then run the backup script. For example, to run the backup daily at 3:30 am:

crontab -e

30 3 * * * . $HOME/directory-backup/config.env && $HOME/directory-backup/directory-backup-s3.sh 2>&1 | logger -t directory-backup-s3

For more information on setting up a job using crontab, see Ubuntu's cron guide. crontab guru can be helpful for defining schedules.

Note: When setting up, it can be useful to set the job schedule to a short interval, such as */2 * * * * (every two minutes), so you can check for misconfiguration or errors. If using the above example, follow /var/log/syslog to see the logger's output.
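
cron runs jobs with a much smaller environment than an interactive shell, which is a common reason a job works by hand but fails on schedule. A rough way to approximate this before scheduling is sketched below. run_like_cron is a hypothetical helper name, and the PATH value is an assumption, since cron's default PATH varies between systems.

```shell
#!/bin/sh
# Run the command line in $1 under /bin/sh with an almost empty
# environment, roughly as cron would. Only HOME and a short PATH are
# passed through (the PATH value here is an assumption).
run_like_cron() {
    env -i HOME="$HOME" PATH=/usr/bin:/bin /bin/sh -c "$1"
}

# Example: the job from the crontab entry, without the logger pipe
#   run_like_cron '. $HOME/directory-backup/config.env && $HOME/directory-backup/directory-backup-s3.sh'
```

The single quotes matter: they leave $HOME to be expanded by the inner shell, as it would be in the crontab entry.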

Restore

To restore a backup:

Download the encrypted backup from AWS S3
Copy it to the machine containing the private GPG key
Decrypt the downloaded file using gpg: gpg --output <decrypted file name>.tar.bz2 --decrypt <downloaded file name>.tar.bz2.gpg
Decompress the decrypted file using bzip2: bzip2 -d <decrypted file name>.tar.bz2
Extract the archive using tar: tar -xvf <decrypted file name>.tar
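
The decrypt, decompress, and extract steps can also be chained into a single pipeline, which avoids writing the intermediate files to disk. This is a minimal sketch rather than a hardened tool. restore_backup is a hypothetical helper name, and it assumes the private key is already in the local GPG keyring.

```shell
#!/bin/sh
# Hypothetical helper: decrypt, decompress, and unpack a backup in one go.
#   $1 = path to the downloaded .tar.bz2.gpg file
#   $2 = directory to extract into (created if missing)
restore_backup() {
    if [ ! -f "$1" ]; then
        echo "Backup file not found: $1" >&2
        return 1
    fi
    mkdir -p "$2" || return 1
    # The exit status of the pipeline is that of tar, the last command
    gpg --decrypt "$1" | bzip2 -dc | tar -xf - -C "$2"
}

# Example usage (file and directory names are placeholders):
#   restore_backup ./my-backup.tar.bz2.gpg ./restored
```

Because only tar's exit status is reported, it is worth checking the extracted files before relying on the result.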