Amazon S3 (Simple Storage Service) offers a flexible option for storing files, folders and website backups in the cloud. It is one of the web services Amazon offers, alongside EC2, CloudFront and others; these are collectively known as AWS (Amazon Web Services).
Creating an AWS account is free. New users get 5 GB of Amazon S3 standard storage, 20,000 GET requests, 2,000 PUT requests, and 15 GB of data transfer out free each month for the first year.
The way S3 works is that a “bucket” is created first, and data is then uploaded to and stored in that bucket.
While uploading files to S3 through its web console is simple, accessing S3 directly from the Linux Terminal can be a bit more complex.
Let’s take a look at how to back up an entire website (consisting of lots of files and folders) to Amazon S3 through the Linux Terminal. (This also applies to backing up specific files/folders from local Linux systems.)
This example assumes an existing AWS account and SSH access to a Linux server. Root access isn’t needed, so this works even on shared hosting for backing up websites to Amazon S3, as long as SSH is enabled.
Prerequisites :
- Access to the Linux server from which website content needs to be backed up. This system should have Python version 2.6.3 or greater installed (which most web hosts already have).
This can be checked with the “python --version” command.
- An Amazon S3 bucket should be created. This can be done by signing in to the AWS console, choosing Services > S3 and clicking “Create Bucket”.
Once the S3 bucket is created, right-click on the bucket name and choose “Properties”.
From the pane on the right side, note down the information displayed, especially “Region”. Also, from the “Permissions” tab, make sure that your username has all the permissions on the created S3 bucket.
- Access to the AWS access keys (Access Key ID and Secret Access Key). These can be obtained by clicking on your username in AWS and choosing “Security and Credentials”.
Then click “Create New Access Key”.
Note down both keys and download the key file if needed. Keep this information somewhere secure.
So to sum up, the following components need to be available before trying to back up content from the Linux Terminal to Amazon S3 :
- Python version 2.6.3 or greater installed on the host system.
- A created Amazon S3 bucket.
- Details of the S3 bucket, such as its Region and the user ID that has access to it.
- Access Key ID and Secret Access Key details.
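The Python prerequisite in the list above can be checked from the shell. This is a minimal sketch that assumes the interpreter is installed under one of the common names “python” or “python3”; which name a given host uses is an assumption, not something the steps above specify.

```shell
# Report the version of every Python interpreter found under the common names.
found=0
for py in python python3; do
    if command -v "$py" >/dev/null 2>&1; then
        "$py" --version
        found=1
    fi
done
# Say so explicitly if neither name resolves.
[ "$found" -eq 1 ] || echo "No Python interpreter found in PATH"
```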
Now, to access the S3 bucket from the Linux Terminal, the AWS command line interface (AWS CLI) needs to be installed and configured. This is a tool for managing AWS products directly using commands.
Setting up environment for installing AWS CLI :
Install AWS CLI by downloading the bundled installer archive (awscli-bundle.zip), then unpacking it and running its install script :
unzip awscli-bundle.zip
./awscli-bundle/install -b ~/bin/aws
This will not need root access and the AWS CLI will be installed for the current user.
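The unzip and install lines above assume awscli-bundle.zip is already in the current directory. The sketch below puts the download in front of them. The URL is an assumption: it is the one AWS has documented for the bundled installer of AWS CLI version 1, and it should be checked against the current installation guide. The function name is hypothetical.

```shell
# Assumption: download location AWS documents for the v1 bundled installer.
BUNDLE_URL="https://s3.amazonaws.com/aws-cli/awscli-bundle.zip"

# Hypothetical helper: each step runs only if the previous one succeeded.
install_awscli_bundle() {
    curl "$BUNDLE_URL" -o awscli-bundle.zip &&   # fetch the archive
    unzip awscli-bundle.zip &&                   # unpack into ./awscli-bundle
    ./awscli-bundle/install -b ~/bin/aws         # install for the current user
}
```

Running install_awscli_bundle from the home directory would then perform all three steps in order.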
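Once installed, verify that ~/bin is in the PATH environment variable, since the installer places a symlink to the aws executable there. This can be checked with “echo $PATH”.
If it is not present, add it with the export command, for example “export PATH=~/bin:$PATH”.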
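The PATH check and the export can be combined so that ~/bin is only added when it is missing. This is a small sketch under the assumption that the symlink was created at ~/bin/aws as in the install step.

```shell
# Prepend ~/bin to PATH unless it is already listed there.
case ":$PATH:" in
    *":$HOME/bin:"*) echo "$HOME/bin is already in PATH" ;;
    *) export PATH="$HOME/bin:$PATH"
       echo "Added $HOME/bin to PATH for this session" ;;
esac

# Confirm the shell can now resolve the aws command.
command -v aws || echo "aws not found; check that the install step completed"
```

The export only lasts for the current session. To keep it across logins, the same export line can be added to the shell startup file (for example ~/.bashrc, assuming bash is the login shell).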
AWS CLI should now be all set for use.
To verify that it is working, run “aws help”.
This will bring up the AWS CLI help page.
Configuring AWS CLI to “see” S3 buckets :
The command for this is “aws configure”.
This is where the information noted earlier (access key ID, secret access key and S3 bucket region) is entered. Hit “Enter” after typing each of these details.
Note that the region name in this example is “us-west-2”, as that is the region code for “Oregon”, which is the S3 bucket region here. The default output format can be left blank.
An official list of all S3 regions is available in the AWS documentation.
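The same settings can also be written without the interactive prompts, which may be handy when scripting the setup. The sketch below uses the “aws configure set” subcommand as a swapped-in alternative to the interactive flow described above. The function name and its argument order are hypothetical.

```shell
# Hypothetical helper. Arguments: access key ID, secret access key, region.
configure_aws_cli() {
    aws configure set aws_access_key_id "$1" &&
    aws configure set aws_secret_access_key "$2" &&
    aws configure set region "$3"
}
```

For the bucket in this example the call would look like configure_aws_cli "$KEY_ID" "$SECRET_KEY" us-west-2, where KEY_ID and SECRET_KEY are placeholder variables holding the two keys noted earlier.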
Once this configuration is done, it is time to access S3.
This is done with the “aws s3 ls” command.
If correctly configured, the same S3 bucket that was created from the AWS S3 console should now be listed.
Finally, now the fun part  – actually backing up stuff to S3. 🙂
To test whether files can be uploaded to this bucket, create a test file if needed and use the “aws s3 cp” command with the local file as the source and the bucket as the destination.
In this example, a test file named test.txt is copied to the S3 bucket named “s3bkp” with “aws s3 cp test.txt s3://s3bkp/”.
The file should then show up in the AWS S3 console for the bucket.
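The upload test can also be checked from the terminal instead of the console. This is a minimal sketch that creates the test file, copies it and then lists the bucket. The function name is hypothetical and the bucket name is passed as an argument.

```shell
# Hypothetical helper: create a small test file, upload it, then list the bucket.
upload_test_file() {
    echo "S3 upload test" > test.txt &&
    aws s3 cp test.txt "s3://$1/" &&
    aws s3 ls "s3://$1/"    # the listing should now include test.txt
}
```

For the bucket used here, the call would be upload_test_file s3bkp.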
Now, to back up the entire directory structure from the host Linux server to the S3 bucket, the command to use is “sync” :
In this example, the entire “public_html” folder needs to be backed up to S3 bucket “s3bkp”. So to do this :
This will start the sync process and all the content will be uploaded to S3. It should now be visible in the S3 console for the destination bucket.
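A sketch of the sync step for this example follows. The general form is “aws s3 sync” with a source and a destination. Uploading into a prefix named after the local folder is an assumption made here for illustration, and the function name is hypothetical.

```shell
# Hypothetical helper. Arguments: local folder, bucket name.
backup_dir() {
    src="${1%/}"    # local folder with any trailing slash removed
    bucket="$2"
    # Mirror the folder into a same-named prefix inside the bucket.
    aws s3 sync "$src" "s3://$bucket/$(basename "$src")"
}
```

Calling backup_dir public_html s3bkp would upload the folder to s3://s3bkp/public_html. On later runs, sync only copies files that are new or have changed, and adding the --dryrun option to the sync command shows what would be transferred without uploading anything.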
All done!
Restoring from S3 :
This is exactly the reverse of backing up to S3. Simply switch the source and destination paths :
So to restore a folder named “logs” from the S3 bucket named “s3bkp” to a local folder named “restore”, the command will be “aws s3 sync s3://s3bkp/logs restore”.
[ For copying an individual file, simply use the “cp” command with source and destination paths as those of S3 bucket and local path.]
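The restore direction can be sketched the same way, with the bucket path as the source and the local folder as the destination. The function name is hypothetical. Creating the local folder first is a precaution added here, not a step the command requires.

```shell
# Hypothetical helper. Arguments: bucket name, folder in the bucket, local folder.
restore_dir() {
    mkdir -p "$3" &&                 # make sure the local target exists
    aws s3 sync "s3://$1/$2" "$3"    # source is the bucket, destination is local
}
```

For the example above, the call would be restore_dir s3bkp logs restore.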
So to sum it up, backing up files/folders to Amazon S3 through Linux Terminal consists of :
- AWS CLI installed (Python version 2.6.3 or greater should be installed on host system for this).
- Having the required S3 information at hand (S3 bucket region, access key ID and secret access key) for configuring and accessing S3 from the Linux Terminal.
- Copying files and folders to S3.
Important : Amazon S3 is a storage service that bills only for what you use. Use the S3 pricing calculator to get an idea of costs, which depend in part on the number of GET and PUT requests. The pricing details can be found on the Amazon S3 pricing page.
So to avoid a lot of individual GET and PUT requests for a large number of files, it can be more economical to back up a compressed archive and restore it when needed. This is simple to do using the tar command in Linux.
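The archive approach could look like the sketch below, which packs a folder into a single dated tar.gz file and uploads that one file. The function name and the archive naming scheme are hypothetical choices for illustration.

```shell
# Hypothetical helper. Arguments: local folder, bucket name.
backup_archive() {
    src="${1%/}"
    bucket="$2"
    # Name the archive after the folder plus today's date, e.g. public_html-20240131.tar.gz
    archive="$(basename "$src")-$(date +%Y%m%d).tar.gz"
    # Pack the folder into one compressed file, then upload that single file.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" &&
    aws s3 cp "$archive" "s3://$bucket/"
}
```

Calling backup_archive public_html s3bkp uploads one archive instead of every file individually. To restore, the archive is copied back with “aws s3 cp” and unpacked with “tar -xzf”.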
Official resources for further reference :
Installation and usage of AWS CLI
Update : There is a server backup monitoring solution – Backup Bird that can do this for S3 as well as for FTP and DropBox. Check out the article on how to set it up here.