I originally posted this on 15 October 2016. And I’m here to tell you, it was a little bit of a fight. I still remember going back and forth between my SSH session and the Amazon S3 bucket, looking for new files. I also remember the cheering excitement when a file finally appeared.
Last evening, I wrote about the real-life value of having backups. In this morning’s Part 2 post, I’ll take you through the shell scripts and CRON jobs I wrote, along with how I fixed the problem with uploading to Amazon S3 this morning in about five minutes.
I don’t know if last night’s post seemed abbreviated or felt like it cut short at the end where I discussed what I did to fix my lack of backups, but I can tell you for certain I did not intend it to be that short. I spent somewhere between four and five hours last night crafting a beautiful post (admittedly, my opinion) with code snippets and citation links where I obtained the knowledge necessary to do what I needed to do, and when I was all finished, I clicked Update…and saw a 403 Forbidden error message. The beautiful narrative I had just crafted was gone! *Poof* Lost to the aether of electronic oblivion!
Talk about Nerd Rage….
So, why did it happen? File permissions!
When I re-constituted all of my website files, I failed to verify the directories, sub-directories, files, etc. had the proper ownership and permissions for Apache to be able to work with them. I corrected that problem and wrote the ending of the post you may (or may not) have read.
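For anyone facing the same 403 error, here is a minimal sketch of the kind of permission reset I mean. The helper name fix_web_perms, the 755/644 modes, and the www-data user in the usage comment are assumptions for illustration, not the exact commands I ran; check what your own Apache setup expects.

```shell
# Hypothetical helper: reset permissions under a web root.
# Directories get 755 (rwxr-xr-x), regular files get 644 (rw-r--r--).
fix_web_perms() {
    local root="$1"
    [ -d "$root" ] || { echo "not a directory: $root" >&2; return 1; }
    find "$root" -type d -exec chmod 755 {} +
    find "$root" -type f -exec chmod 644 {} +
}

# Example usage (run as root; www-data is an assumed Apache user):
#   chown -R www-data:www-data /var/www
#   fix_web_perms /var/www
```

Ownership is left to a separate chown because it needs root, while the chmod part can be tried safely on a test directory first.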
When I woke up this morning, I must confess I did not immediately rush to my computer, intent on discovering how the backup jobs ran last night. That being said, I did check as soon as I sat down at my computer. 🙂
Lo and behold, I found this in my backup folder:
database-backup-manual.sql
web-backup-manual.tar.gz
database-backup-Oct-15-16.sql
web-backup-Oct-15-16.tar.gz
Success! I want to re-format the file name so that the date is YYYY-MM-DD, which on the files above would read *-backup-2016-10-15.*, because I like having files ordered first by Year then by Month and then by Day. Yes, I’m picky like that. Otherwise, though, this was an excellent result. If only I had the Amazon S3 upload working…
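The picky part has a practical payoff: with year-month-day names, a plain text sort is also a date sort. Here is a small sketch of the difference. The function name sort_names and the November file names are made up for the demonstration.

```shell
# Print the given names one per line, sorted as plain text.
sort_names() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# Month-name stamps: "Nov" sorts before "Oct", so the newer file lists first.
sort_names web-backup-Oct-15-16.tar.gz web-backup-Nov-01-16.tar.gz

# Year-month-day stamps: text order matches date order.
sort_names web-backup-2016-11-01.tar.gz web-backup-2016-10-15.tar.gz
```

The same ordering applies to a plain ls of the backup folder and to the object list in an S3 bucket, since both sort by name.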
It’s astonishing, sometimes, how much better a person thinks after a full-night’s sleep. I found the instructions for mounting an Amazon S3 bucket here, on Full Stack Notes, and I followed the instructions exactly with zero errors…but when I tested it last evening around 7pm or so, no upload occurred. I realized why this morning.
If you visit that page you’ll see a command in the “Tips” section explaining what to do if you want multiple processes to be able to upload to S3. It looks like this:
s3fs -o allow_other,use_cache=/tmp/cache mybucket /mnt/s3
What I realized this morning is that I should’ve typed this last evening:
sudo s3fs -o allow_other,use_cache=/tmp/cache mybucket /mnt/s3
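In hindsight, a quick check that the bucket is really mounted would have saved me an evening of confusion. If a mount is missing, files copied into the mount directory just land on the local disk and never reach S3. Below is a minimal sketch of such a guard. The function name copy_if_mounted is hypothetical and the paths in the usage comment are illustrative; mountpoint is the standard util-linux tool.

```shell
# Hypothetical guard: copy files only if the target is a live mount point.
# mountpoint(1) ships with util-linux on most Linux distributions.
copy_if_mounted() {
    local mount_dir="$1"
    shift
    if ! mountpoint -q "$mount_dir"; then
        echo "$mount_dir is not mounted; refusing to copy" >&2
        return 1
    fi
    cp -- "$@" "$mount_dir"/
}

# Example usage (paths are illustrative):
#   copy_if_mounted /mnt/s3 /backup/*
```

Failing loudly here is the point: a cron job that refuses to run is easier to notice than one that quietly fills the local disk.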
Copying files into the folder last evening hadn’t yielded any upload to Amazon AWS, so I didn’t have my hopes up as I typed the command to copy all of my backups in /backup to /mnt/s3:
sudo cp /backup/* /mnt/s3
I was rewarded with a much-longer wait time on the copy process than a same-disk transfer should take, so almost as excited as a child on Christmas Morning, I logged into S3 to examine the bucket I created to store my website backups. This is what I found:
Now that I have my Amazon S3 upload working, it’s time to focus on the filenames created by the shell scripts and the CRON jobs.
sudo nano /web-backup.sh
This takes me to the nano text editor and shows me this:
#!/bin/bash
#Purpose = Backup of Website Files
#Created on 14-10-2016
#Author = Rob Kerns
#Version 1.0
#START
TIME=`date +%b-%d-%y` # This Command will prepare the date to be added to the file name.
FILENAME=web-backup-$TIME.tar.gz # Here I define the file name format.
SRCDIR=/var/www # Location of Website Files (Source of backup).
DESDIR=/backup # Where the backup will be stored.
tar -cpzf $DESDIR/$FILENAME $SRCDIR
#END
Let’s go through this line by line.
First of all, the pound sign (#) is the character used to create a comment. Comments are for the reader/writer of the script or code and are not processed as instructions for the computer. The very first line is a special case: “#!/bin/bash” (often called the “shebang”) tells the system to run this script with the bash shell, which is one of the most common shells for *nix…if not the most common.
The rest of the lines down to “#START” are house-keeping entries. They explain the purpose of the script, when it was created, who created it, and what version it is, and “#START” itself marks the beginning of the portion to be processed.
To change the date format, I looked at the “date” command in the “TIME=” line; running “man date” displays its manual page. From my reading, I found that %Y will produce the 4-digit year and %m will produce the 2-digit month. As long as I was in there, I decided to write directly to Amazon S3 as well, so I modified my script as follows:
#!/bin/bash
#Purpose = Backup of Website Files
#Created on 14-10-2016
#Author = Rob Kerns
#Version 1.1
#START
TIME=`date +%Y-%m-%d` #This Command will prepare the date to be added to the file name.
FILENAME=web-backup-$TIME.tar.gz # Here I define the file name format.
SRCDIR=/var/www # Location of Website Directory (Source of backup).
DESDIR=/mnt/s3 # Destination of backup file.
tar -cpzf $DESDIR/$FILENAME $SRCDIR
#END
I made similar changes to my database backup script, and I should be all set. I’ll check it tomorrow to see how I did!
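If I were to tidy the script up further, it might look something like the sketch below. This is not what is running on my server; the function name make_backup, the quoting, the directory checks, and the use of tar’s -C option are all additions for illustration.

```shell
# Hypothetical variant of the backup script, written as a function.
# Usage: make_backup SRCDIR DESDIR PREFIX
make_backup() {
    local srcdir="$1" desdir="$2" prefix="$3"
    local stamp filename
    stamp=$(date +%Y-%m-%d)              # year-month-day, e.g. 2016-10-15
    filename="$prefix-$stamp.tar.gz"

    [ -d "$srcdir" ] || { echo "missing source: $srcdir" >&2; return 1; }
    [ -d "$desdir" ] || { echo "missing destination: $desdir" >&2; return 1; }

    # -C changes into the parent directory first, so the archive holds
    # "www/..." rather than the full absolute path.
    tar -cpzf "$desdir/$filename" -C "$(dirname "$srcdir")" "$(basename "$srcdir")" || return 1
    echo "$desdir/$filename"
}

# Example usage, with the paths from this post:
#   make_backup /var/www /mnt/s3 web-backup
```

Quoting the variables protects against paths with spaces, and the early checks stop tar from running when a directory is missing.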
All right, then! Let’s talk about CRON!
CRON is the “Scheduled Tasks” of the *nix world, and which user a job runs as depends entirely on what command you use to create it. To create/edit a crontab for your own user, type the following command:
crontab -e
To create a crontab that will run as root, add “sudo” to it:
sudo crontab -e
For my Amazon S3 backups, I need the job to run as root, so I used the second command. When I did so the first time, I saw this:
no crontab for root - using an empty one
Select an editor. To change later, run 'select-editor'.
1. /bin/ed
2. /bin/nano <---- easiest
3. /usr/bin/vim.basic
4. /usr/bin/vim.tiny
Choose 1-4 : 2
crontab: installing new crontab
I had never created a crontab on my web server until last night, and the system wanted to know what text editor I wanted to use. I’ve used both vim and nano, and honestly, I like nano far more. I’m sure there are some diehard vim fans out there, but I’m not one of them. Anywho….after choosing my text editor, I saw this:
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
The explanation provided isn’t all that intuitive, so I’m going to go through it. “m” is for minute and can be 0 to 59. “h” is for the hour of the day in 24-hour (or military) time, having a value between 0 and 23. “dom” means day of month, and it can have a value from 1 to 31. “mon” specifies which month with a value of 1 – 12, and “dow” controls which day of the week a command will run, having a value of 0 to 7, where both 0 and 7 mean Sunday.
If you use the asterisk (*) in any of the fields, the system will interpret that as every possible value of that field: every minute, every hour, every day, and so on. In the example in the code above, the system would make a tarball of the /home/ directory every Monday morning at 5am, because a “dow” value of 1 means Monday.
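The day-of-week numbering is the field that trips people up most, since 0 and 7 both mean Sunday in the usual cron numbering and 1 is Monday. Here is a tiny sketch that spells out the mapping. The function name cron_dow_name is made up for illustration.

```shell
# Map a cron day-of-week number to its name.
# Both 0 and 7 mean Sunday in common cron implementations.
cron_dow_name() {
    case "$1" in
        0|7) echo Sunday ;;
        1)   echo Monday ;;
        2)   echo Tuesday ;;
        3)   echo Wednesday ;;
        4)   echo Thursday ;;
        5)   echo Friday ;;
        6)   echo Saturday ;;
        *)   echo "invalid day-of-week: $1" >&2; return 1 ;;
    esac
}

cron_dow_name 1   # the "dow" value in the template's example line
```

Most cron implementations also accept three-letter names such as mon or sun in this field, which avoids the numbering question entirely.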
Now then…with all this in mind, I configured my crontab last night as so:
# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
00 01 * * * /bin/bash /web-backup.sh
30 01 * * * /bin/bash /database-backup.sh
As shown at the start of this post, the crontab works perfectly. That being said, I’m not comfortable with the jobs being only thirty minutes apart now that they’re writing to S3. The copy-time this morning wasn’t horrible, but you need to be a bit forward-thinking about this kind of stuff. So, I’ve modified the second line to read:
00 04 * * * /bin/bash /database-backup.sh
Now, if I ever have a website backup job that needs…oh, say…two hours to complete, I’ll still be good.
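Spacing the jobs apart is the simple fix. An alternative, which I have not set up, is to run both scripts one after the other from a single wrapper, so the second job starts only when the first one finishes. A minimal sketch follows. The function name run_in_order and the wrapper file name /run-backups.sh are hypothetical.

```shell
# Hypothetical wrapper: run each script in order, stopping at the first failure.
run_in_order() {
    local script
    for script in "$@"; do
        /bin/bash "$script" || {
            echo "failed: $script" >&2
            return 1
        }
    done
}

# Example usage from a single crontab entry (file name is illustrative):
#   00 01 * * * /bin/bash /run-backups.sh
# where /run-backups.sh contains this function and the call:
#   run_in_order /web-backup.sh /database-backup.sh
```

The trade-off is that a failed website backup also skips the database backup, which may or may not be what you want.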
While I was editing my crontab, I also removed the delete_extra_backups.sh script that I wrote last night. Somewhere down the line, I may need to start worrying about how much of S3’s storage I’m using, but that’s going to take a very long time, given that my website backup is around 200 megabytes and my database backup is around 50.
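When that day does come, pruning could be as simple as deleting backups past a certain age. The sketch below is not the delete_extra_backups.sh script I removed; the function name prune_backups, the “*-backup-*” name pattern, and the 30-day figure in the usage comment are assumptions for illustration.

```shell
# Hypothetical pruning helper (not the removed delete_extra_backups.sh).
# Deletes files directly inside DIR whose names match "*-backup-*" and
# whose modification time is more than DAYS days old.
prune_backups() {
    local dir="$1" days="$2"
    [ -d "$dir" ] || { echo "not a directory: $dir" >&2; return 1; }
    find "$dir" -maxdepth 1 -type f -name '*-backup-*' -mtime +"$days" -print -delete
}

# Example usage: keep roughly the last 30 days of backups
#   prune_backups /mnt/s3 30
```

Because -print comes before -delete, each removed file name is echoed, and cron will mail that list to the crontab’s owner unless the output is redirected.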