Friday, March 14, 2014

Plans for the new steevie (and a short personal update)

steevie
Replacement & encryption

So a week or so ago I ordered a super nice hard drive - a 120 GB Intel SSD (note: that links to the 80 GB version but I did get the 120 GB version). Unfortunately, there have been snail-mail problems and the drive hasn't come yet, but I'll make sure I get it eventually. I've been working on the new server in a VM, so when I do get the drive, I can basically just image it with CloneZilla, expand a couple partitions (because I didn't want to take up a 120 GB slice of my hard drive space for the virtual hard drive image), and voila! New server.
A lot of things aren't worked out yet, due to time constraints and the fact that I could never really get VirtualBox's networking to work properly - I could never initiate a connection from the host to the guest, and screwing with the settings sometimes messed up the guest's outbound networking, too. However, the basic system is installed and functioning. The new steevie is architected from the ground up to resist the NSA. Every partition (except for /boot) is encrypted at the block level. My current plan is to have the partitions automatically unlocked with a GPG key in /boot, but I'm considering requiring a passphrase to unlock that key (yes, this does work, even with system encryption) - both of these are currently unimplemented; in the VM I just type the password manually. In addition, backups (I'm getting to my plan for these) will be encrypted with a separate GPG key that I'll keep in a secure physical storage location (password-protected, of course). Whenever possible, I will endeavour to encrypt your data server-side at the application level (as opposed to the block level) - this is because even though block-level encryption is absolutely essential for resisting data compromise attacks while the system is off, it does nothing (repeat: nothing!) when the system is on.
It's worth taking a sidebar right now to explain: why encrypt at all if you're just going to unlock it automatically? The reason is that it makes the data easier to destroy. For example, if I have an unencrypted server system and I need to get rid of all the data on it to protect my users, I have to overwrite every sector of the drive with random data - that means that the amount of time it takes for data removal is directly proportional to the size of the drive. With an encrypted system, the amount of data that you need to overwrite is fixed and small: just the LUKS header that holds the key material, because without it the rest of the drive can't be decrypted. (And, as mentioned above, it adds the possibility of unlocking using a GPG key, which can then be password-protected and used over the network, getting rid of the automatic aspect.)
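To put rough numbers on that, here's a back-of-the-envelope sketch. The write speed and the header size are assumptions I picked for illustration (roughly 100 MB/s sustained writes and a header of about 2 MiB), not measurements from steevie.

```python
def seconds_to_overwrite(num_bytes, write_speed_bytes_per_sec):
    """Time needed to overwrite num_bytes at a constant write speed."""
    return num_bytes / write_speed_bytes_per_sec

DRIVE_BYTES = 120 * 10**9    # the 120 GB SSD
HEADER_BYTES = 2 * 1024**2   # ASSUMPTION: encryption header of about 2 MiB
WRITE_SPEED = 100 * 10**6    # ASSUMPTION: 100 MB/s sustained write speed

# Unencrypted drive: every sector has to be overwritten.
full_wipe_seconds = seconds_to_overwrite(DRIVE_BYTES, WRITE_SPEED)

# Encrypted drive: only the header holding the key material has to go.
header_wipe_seconds = seconds_to_overwrite(HEADER_BYTES, WRITE_SPEED)

print(f"full wipe:   {full_wipe_seconds / 60:.0f} minutes")
print(f"header wipe: {header_wipe_seconds:.3f} seconds")
```

With those assumed numbers, the full wipe takes 20 minutes and the header wipe takes a fraction of a second. Double the drive size and the first number doubles; the second doesn't move.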
In addition, the contents of the /boot partition will be checksummed and compared with known good values every boot. I'm planning to deploy HTTPS (or an equivalent, like SSH tunneling) for all services on steevie, major or minor, and specifically for web-facing services, I will turn on Strict Transport Security, which means that once you've visited that service once, your browser will remember to only ever connect to it over HTTPS, so someone trying to MitM you can't quietly downgrade you to plain HTTP. This is because I'm a strong believer in the philosophy that anything that can be encrypted should be encrypted. It's not exactly expensive, either: you can get basic SSL/TLS certs for free, and the encryption itself isn't computationally expensive.
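Going back to the /boot check for a second, here's a minimal sketch of what "checksummed and compared with known good values" could look like. Nothing here is implemented on steevie yet; the function names are made up for illustration, and I'm assuming SHA-256 and that the known good values live somewhere an attacker can't edit (for example, on one of the encrypted partitions).

```python
import hashlib
from pathlib import Path

def hash_file(path):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot_checksums(root):
    """Map each file under root (by relative path) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): hash_file(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def find_tampering(root, known_good):
    """Compare root against known good checksums; return a list of problems."""
    current = snapshot_checksums(root)
    problems = []
    for name in sorted(set(current) | set(known_good)):
        if name not in known_good:
            problems.append((name, "unexpected file"))
        elif name not in current:
            problems.append((name, "missing file"))
        elif current[name] != known_good[name]:
            problems.append((name, "checksum mismatch"))
    return problems
```

The idea would be to run snapshot_checksums("/boot") once after every legitimate kernel or bootloader update and save the result, then call find_tampering("/boot", saved) on each boot and complain loudly if the list isn't empty.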

Filesystems
Stepping away from all the crypto, btrfs is used as the filesystem inside the LUKS containers instead of LVM (because I hated managing LVM compared to managing btrfs). (/boot is ext4.) /home and / are separate partitions, and there's something like 2 GB of swap (I forget). The drive is GPT. Honestly, there's not much else to say about filesystems.
Edit April 9: I forgot to mention that /usr is on a separate, unencrypted partition. This is because /usr is basically data managed by the package manager, so there's no need to encrypt it - and by not encrypting it, we can achieve a performance benefit.

Backups
And that brings us to our final topic - backups. I have a much better/more defined backup plan this time (i.e. it actually exists). Here it is:
First off, the internal drive will be regularly snapshotted with btrfs snapshots. What I'm thinking of doing is keeping yearly, permanent snapshots (i.e. they never go away, ever), and then keeping monthly snapshots that I rotate to make room.
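Here's a rough sketch of that rotation rule in Python. It only decides which snapshots would get deleted; it doesn't touch btrfs at all. The function name and the keep_monthly parameter are made up, and I'm assuming "yearly" means the first snapshot taken in each calendar year.

```python
from datetime import date

def snapshots_to_delete(snapshot_dates, keep_monthly=12):
    """Return the snapshot dates that rotation would remove, oldest first."""
    ordered = sorted(snapshot_dates)

    # ASSUMPTION: the first snapshot of each calendar year is the
    # permanent yearly one and is never deleted.
    first_of_year = {}
    for d in ordered:
        first_of_year.setdefault(d.year, d)
    permanent = set(first_of_year.values())

    # The most recent keep_monthly snapshots are kept as the rotating set.
    recent = set(ordered[-keep_monthly:]) if keep_monthly > 0 else set()

    return [d for d in ordered if d not in permanent and d not in recent]

# Example: one snapshot per month from January 2014 through June 2015.
monthly = [date(2014, m, 1) for m in range(1, 13)]
monthly += [date(2015, m, 1) for m in range(1, 7)]
doomed = snapshots_to_delete(monthly, keep_monthly=3)
```

In that example the two January snapshots survive forever, the three newest monthlies survive for now, and everything else is up for deletion.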
Of course, that doesn't protect against the physical drive failing. For that, I'm going to do monthly backups to an external drive with a nice tool (I'm currently looking at rdiff-backup as the probable solution). Said tool would perform incremental backups - ideally, it would produce diffs between files, but I'd also be okay with copying the entire file. (However, copying the entire directory structure even if it hasn't changed is unacceptable, and it will become clear why in a moment.)
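To make "incremental" concrete, here's a minimal sketch of the whole-file fallback I said I'd be okay with. This is not how rdiff-backup works (it stores diffs between versions); it's just an illustration. The function name and the manifest format are made up, and I'm assuming that a file whose size and modification time haven't changed hasn't changed. It also ignores deleted files.

```python
import shutil
from pathlib import Path

def incremental_backup(source, destination, manifest):
    """Copy only new or changed files from source to destination.

    manifest maps relative path -> (size, mtime_ns) as seen on the last
    run, and is updated in place. Returns the relative paths copied.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        rel = str(path.relative_to(source))
        stat = path.stat()
        fingerprint = (stat.st_size, stat.st_mtime_ns)
        if manifest.get(rel) == fingerprint:
            continue  # unchanged since the last run, so skip it
        target = destination / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target)  # copies the whole file plus metadata
        manifest[rel] = fingerprint
        copied.append(rel)
    return copied
```

The first run with an empty manifest copies everything. Every run after that only copies what's new or different, which is the property I care about.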
As a second layer of protection, I plan to back up the entire directory hierarchy (no diffs) at specific intervals. Currently I'm thinking 6 months to a year, but I may change that to 1 month if I decide to take rotatable btrfs snapshots more often (like every day or something). I'm still deciding whether I will manually cp files for this, or use a tool (possibly the same tool used to make incremental backups).
As stated above, backups will be encrypted and stored in a secure, off-site location (i.e. my mom's house - the server is at my dad's). And since the new steevie is designed from the ground up to be NSA-resistant, I will never use something like Amazon S3 to store backups.

Miscellaneous
I'm thinking of getting a UPS for steevie, just in case. I'm still pondering this; they're very expensive.
This was basically an unwritten rule with the way I administered the first steevie, but I'm making it written this time: with the possible exception of firmware, I will never put closed-source applications or services on steevie.
I'm also going to offer steevie's services to friends this time around. Maybe I can get them to ditch Facebook and Twitter if I provide a nice enough alternative.
If you have any questions, let me know in the comments or tweet me, either @strugee2 or @strugee_dot_net.

Small update on my life
I've been super busy with robotics and homework. At the beginning of the robotics season, I was like, "oh man, I gotta blog about the start of robotics" and then I pushed it off until the end of the season. It's now the off-season, and we're doing some work but not at as intense a pace as during the season, which is nice. We placed fifth at states.
Also, I don't go on IRC a lot anymore due to not having a bouncer. Instead, I go on the Stack Exchange network chat. I like to hang out in the Unix & Linux main room, The DMZ (main room for IT Security), and The Comms Room (main room for Server Fault). Even though they don't receive a lot of noise, I'm also usually in the Bitcoin Lounge (main room for Bitcoin - duh), and finally, The Exit Node (main room for Tor).
Speaking of which, I've been running a Tor relay on Amazon EC2 for a while now, named strugees. I'm having some problems with it exhausting the bandwidth quota in like 2 days, then sitting there idling for 5 days, but I'm hoping to work it out eventually.
I've started using an online hosted instance of ownCloud as a placeholder until I get ownCloud set up on steevie. It's super nice - the main thing I use it for right now is the built-in cloud RSS reader. It's awesome.
Finally, I'm going to LinuxFest Northwest again this year, and I will actually be giving a talk on Arch GNU/Linux. I'm super, super pumped for that.

Have some extra money?
No, it's not for me. If you have some money that you want to donate, the Pitivi project, which is trying to make an entirely free and open source video editing suite, is running a fundraiser, and they deserve your support. So does the MediaGoblin fundraiser - MediaGoblin is trying to make an entirely federated replacement for media-sharing platforms like YouTube and Flickr. Again, go check them out, they're awesome.