How I Back Up My Server for Less Than a Penny per Year
Backup is one of those administrative tasks that you have to take care of if you run your own server. There are a lot of options, ranging from just copying all your files to another server yourself, to setting up a backup package, to installing a backup service from your hosting provider.
The easiest backup solution is to not do anything at all. I’m sure a lot of amateur admins use that “solution”. It works, as long as you never run into some sort of crisis that requires backups.
The other end of the spectrum is a full paid backup service. My hosting provider Linode, for example, offers backups for the low price of $2/month for a small server like mine. That’s very reasonable, and if you don’t want to think about handling backups it’s an attractive option.
There are two problems with a solution like that, though. First, while $24/year isn’t a lot of money, it does add up. Second, a backup service like this is fairly coarse. What I mean is, Linode itself doesn’t know how I’ve structured the data on my drives. All it can do is back up the entire drive: my operating system, my cache files, everything. That seems wasteful to me. It’s like moving to a new home and packing everything into boxes, including your garbage and the contents of your refrigerator.
Ideally, I’d like to back up only the important data. By important, I mean the essential data I would need to recreate my server from scratch. I can always re-download the operating system, but I’ve made a lot of changes and configurations that I need to preserve. And of course, the database files and source code are critical.
Linode’s backup service creates a brute-force copy. I don’t mean to disparage it; it looks like a decent service, and it’s priced well for what it does. But if we’re smart about what we copy, then instead of copying the entire filesystem we can selectively back up only the files we need to restore our configuration.
A successful restore operation should be able to recreate the most recent version of the entire server from scratch. What is the minimum set of files we need to back up in order to accomplish that?
There are three categories of files that I need to back up. If I have all these files, I can rebuild my entire server from scratch.
The first set of files is the server config files. Most of these live in /etc, although there are a few others scattered in various other places. I know where all these files are because, as I built my server in the first place, I kept a record of everything I was doing.
I’ve assembled a “playbook” which is basically just a long recipe for how to build my server. I recommend anyone else setting up a server
do the same thing. All you have to do is keep notes. Every time you change something on your server, make a note of it. If there’s some
new config file, make a note of that.
If you’ve already set up your server, and you’re not quite sure you can remember how you did everything, then you’re kind of stuck. I would recommend starting over. You don’t have to delete your old server, but you can spin up a new one alongside it. See if you can duplicate your existing server, but take notes as you do it. When you’re done, you’ll have a much more complete understanding of all the various bits of software that you’re running. Trust me, it’s well worth it.
The second set of files that need to be backed up are the databases. There’s one main Postgres database that serves as a backend for the Reiterate API, and a small SQLite database I just added that handles comments on the blog.
Finally, there’s the source code. This is mainly the source for the Rails app, but I also consider Authorio part of the whole Reiterate ecosystem.
Each of these filesets is backed up in a different way. The source code is all under Git, so I consider that already backed up. The main git repositories for Reiterate and Authorio are held on my laptop, which itself gets backed up with a separate system.
The config files are tracked in a small text file. Every time I add something new to the server, or change some config, I note any new config files in my list. There actually aren’t too many of them: the file is only 23 lines long. Here’s a sample:
For the databases, each database gets dumped, and then that dumpfile is what gets backed up.
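The dump step might look something like the Ruby sketch below. It is only an illustration: it assumes the standard `pg_dump` and `sqlite3` command-line tools are installed, it assumes a tar-format Postgres dump, and the database name and file paths in the usage note are hypothetical placeholders, not the ones on my server. The functions only build the command lines; `run!` is what would actually execute them.

```ruby
# Sketch only: database names and paths passed in are hypothetical.

# Build the pg_dump command that writes a tar-format dump of one database.
def postgres_dump_command(database, outfile)
  ["pg_dump", "--format=tar", "--file=#{outfile}", database]
end

# Build the sqlite3 command that writes a consistent copy of a SQLite file
# using the shell's built-in .backup command.
def sqlite_dump_command(db_path, outfile)
  ["sqlite3", db_path, ".backup '#{outfile}'"]
end

# Run a command without going through a shell; raise if it fails.
def run!(command)
  system(*command) or raise "command failed: #{command.join(' ')}"
end
```

A nightly script could then call, for example, `run!(postgres_dump_command("reiterate_production", "/tmp/reiterate-db.tar"))`, where both arguments are made-up names for illustration.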
Testing the Restore
They say if you’ve never actually tested your backup, you haven’t really backed up. My test is to completely rebuild the server from scratch, using only the backup files. And I’ve done this, most recently when I upgraded my server to Debian Bullseye. It’s a comforting feeling, knowing that I could lose my entire server and still have everything back up and running in a day or so.
The other benefit of a system like this is the upgrades. It’s much cleaner, when doing a full OS upgrade, to do it from scratch instead of trying to upgrade in place.
I can’t praise Tarsnap highly enough. I’ve mentioned it before, but I’ll go into greater detail here. Once each fileset is created, it gets stored on Tarsnap in a rotating fashion. My script saves a backup each night for a week, then keeps each weekly backup for two months, and then each monthly backup forever.
Here is a snippet from my tarsnap account showing how much this costs me:
The last column there shows the amount I was billed, in US$. For the first day of this year, Tarsnap charged me 3 ten-thousandths of a cent for incoming bandwidth. The total amount, for all bandwidth and for storage, came to $0.000015575. If we multiply that daily cost by 365, we get $0.005684875. So my expected annual cost for Tarsnap will be a little over half a cent.
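The projection is just the daily charge times the number of days in a year. Here is a small Ruby check of that arithmetic, using `BigDecimal` so the tiny amounts aren’t distorted by floating-point rounding. It assumes every day costs the same as the one quoted, which is a simplification.

```ruby
require "bigdecimal"

# Project a yearly cost from one day's charge, assuming every day costs the same.
def annual_cost(daily_cost, days = 365)
  daily_cost * days
end

daily  = BigDecimal("0.000015575")  # daily charge in US dollars, from the text
yearly = annual_cost(daily)

puts "yearly cost in dollars: #{yearly.to_s('F')}"
puts "yearly cost in cents:   #{(yearly * 100).round(2).to_s('F')}"
```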
Why is Tarsnap so cheap? Again, I’m being very smart about the files I back up, so I’m not using a lot of space. Tarsnap doesn’t sacrifice security for cost, either. All communications are encrypted, and even Tarsnap itself can’t access my files.
The one area where tarsnap falls short is in UI. You basically have to write your own scripts in order to use it effectively. Some people wouldn’t consider this a disadvantage though.
The first set of scripts I wrote is a Ruby app called tarsnapctl. Tarsnapctl takes a tar file and sends it to Tarsnap, and handles the backup rotations (daily for a week, etc). It’s my main wrapper around Tarsnap.
meckler@reiterate-03:~$ tarsnapctl --help
tarsnapctl contents --name=NAME # List contents of archive
tarsnapctl help [COMMAND] # Describe available commands or one specific command
tarsnapctl list # List archives
tarsnapctl prune --name=NAME # Prune expired backups
tarsnapctl snapshot # Create a new snapshot
tarsnapctl version # Print version and exit
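To make the rotation idea concrete, here is a minimal Ruby sketch of a retention rule in the spirit of “daily for a week, weekly for two months, monthly forever”. This is not tarsnapctl’s actual code. The conventions are assumptions chosen for illustration: a backup taken on a Sunday counts as the weekly one, a backup taken on the first of the month counts as the monthly one, and “two months” is approximated as 60 days.

```ruby
require "date"

# Sketch only: the Sunday / first-of-month conventions and the 60-day
# window are assumptions, not tarsnapctl's actual rules.
DAILY_DAYS  = 7
WEEKLY_DAYS = 60

# Decide whether a backup taken on `taken` should still exist on `today`.
def keep?(taken, today)
  age = (today - taken).to_i
  return true if taken.day == 1                      # monthly: keep forever
  return true if taken.sunday? && age < WEEKLY_DAYS  # weekly: about two months
  age < DAILY_DAYS                                   # daily: one week
end

# Split backup dates into [to_keep, to_prune].
def partition_backups(dates, today)
  dates.partition { |date| keep?(date, today) }
end
```

A prune step would walk the list of archives, work out each one’s date, and delete whatever ends up in the second array.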
Before tarsnapctl can be run, I first use my list of config files to construct a tar file. I also dump my databases into tar files of their own. Then these tar files can be passed into tarsnapctl for backup.
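Turning the list of config files into a tar file could be sketched in Ruby as follows. The list format here is an assumption (one path per line, blank lines and `#` comments ignored), and the file names in the usage note are hypothetical. The real scripts may well do this differently.

```ruby
# Sketch only: the list format (one path per line, "#" comments) is an assumption.

# Parse the text of the config-file list into an array of paths.
def config_paths(list_text)
  list_text.each_line
           .map(&:strip)
           .reject { |line| line.empty? || line.start_with?("#") }
end

# Build the tar command that bundles those paths into a single archive.
def tar_command(archive, paths)
  ["tar", "-cf", archive, *paths]
end
```

With a hypothetical list file named `config-files.txt`, the archive would be created with `system(*tar_command("/tmp/server-config.tar", config_paths(File.read("config-files.txt"))))`, and the resulting tar file is what gets handed to tarsnapctl.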
Lastly, since this is a Debian system, there’s a set of systemd unit files which trigger each night to run the backups.
Here is an overall picture of all the pieces of my backup system.
The systemd unit files simply run the start scripts:
Description=Backup Reiterate Server Config
Description=Run server backup
Description=Run database backup
Description=Backup Reiterate Postgres DB
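For readers who haven’t paired a systemd service with a timer before, the general shape is roughly as follows. This is a sketch and not a copy of my unit files: the unit file names, the script path, and the `daily` schedule are all hypothetical. A timer activates the service with the same base name by default.

```ini
# backup-config.service (hypothetical file name)
[Unit]
Description=Backup Reiterate Server Config

[Service]
Type=oneshot
# Hypothetical path; point this at wherever the start script lives.
ExecStart=/usr/local/bin/backup-config.sh

# backup-config.timer (hypothetical file name, kept in a separate file)
[Unit]
Description=Run server backup

[Timer]
# Fire once a day; Persistent=true runs a missed backup after downtime.
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```

Only the timer needs to be enabled, for example with `systemctl enable --now backup-config.timer`; the database backup would use a second service and timer pair of the same shape.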
The scripts that create the tar files, including backup-db.sh, are available on GitHub, as is the source code for tarsnapctl.
In the end, I’m pretty pleased with my backup system. It satisfies all the feature points you want in a backup solution.
- It’s complete. With this backup, I can restore the entire server from scratch. If I get hacked or whatever, I can start with a fresh server and restore everything from the previous day.
- It’s automatic. I don’t have to think about it. It just runs each day doing its thing.
- It’s tested. I’ve actually used this backup to do a full server rebuild. So I know it works. I know it will be there if I need it.
- It’s secure. Tarsnap has rigorous encryption. Since I’m backing up my databases, I don’t want to put that data where anyone would be able to read it. I’m the only one who has access to the backup files.
- It’s cheap. Like, pennies cheap.
Obviously, a system like this has to be customized and tweaked for each install. That’s part of why it works, though. Instead of a one-size-fits-all approach, doing it this way encourages you to get down and dirty and truly understand every piece that makes your server tick. I’ve put enough here to get someone started if they’re interested in following this approach. If you get stuck, let me know in a comment below.