
Friday, March 14, 2014

Plans for the new steevie (and a short personal update)

steevie
Replacement & encryption

So a week or so ago I ordered a super nice hard drive - a 120 GB Intel SSD (note: that links to the 80 GB version, but I did get the 120 GB version). Unfortunately, there have been snail-mail problems and the drive hasn't arrived yet, but I'll make sure I get it eventually. I've been working on the new server in a VM, so when the drive does arrive, I can basically just image it with CloneZilla, expand a couple of partitions (because I didn't want to give up a 120 GB slice of my hard drive for the virtual disk image), and voila! New server.
A lot of things aren't worked out yet, due to time constraints and the fact that I could never really get VirtualBox's networking to work properly - I could never initiate a connection from the host to the guest, and messing with the settings sometimes broke the guest's outbound networking, too. However, the basic system is installed and functioning. The new steevie is architected from the ground up to resist the NSA. Every partition (except for /boot) is encrypted at the block level. My current plan is to have the partitions automatically unlocked with a GPG key stored in /boot, though I'm considering requiring a passphrase to unlock that key (yes, this does work, even with full-system encryption) - both of these are currently unimplemented; in the VM I just type the password manually. In addition, backups (I'm getting to my plan for those) will be encrypted with a separate GPG key that I'll keep, password-protected, in a secure physical storage location. Whenever possible, I will endeavor to encrypt your data server-side at the application level (as opposed to the block level) - this is because even though block-level encryption is absolutely essential for resisting data-compromise attacks while the system is off, it does nothing (repeat: nothing!) while the system is on.
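For reference, the plain-keyfile version of that plumbing looks roughly like this (a minimal sketch - the device names and paths are made up, and the GPG-wrapping of the key is left out):
# generate a random keyfile on the unencrypted /boot partition
mkdir -p /boot/keys
dd if=/dev/urandom of=/boot/keys/home.key bs=512 count=4
chmod 0400 /boot/keys/home.key
# enroll it as an additional LUKS key slot for the /home partition
cryptsetup luksAddKey /dev/sda3 /boot/keys/home.key
# /etc/crypttab entry so the partition unlocks automatically at boot:
#   crypthome  UUID=<home-partition-uuid>  /boot/keys/home.key  luks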
It's worth taking a sidebar right now to explain: why encrypt at all if you're just going to unlock it automatically? The reason is that it makes the data easier to destroy. For example, if I have an unencrypted server system and I need to get rid of all the data on it to protect my users, I have to overwrite every sector of the drive with random data - that means that the amount of time it takes for data removal is directly proportional to the size of the drive. With an encrypted system, the amount of data that you need to overwrite is fixed and small. (And, as mentioned above, it adds the possibility of unlocking using a GPG key, which can then be password-protected and used over the network, getting rid of the automatic aspect.)
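Concretely, "destroying" an encrypted drive boils down to destroying the LUKS header and key slots, which live in the first couple of megabytes of the partition - something like this (the device name is illustrative):
# recent cryptsetup can wipe every key slot in one go
cryptsetup luksErase /dev/sda3
# or, more bluntly, overwrite the header and key-slot area directly
dd if=/dev/urandom of=/dev/sda3 bs=1M count=16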
In addition, the contents of the /boot partition will be checksummed and compared with known-good values on every boot. I'm planning to deploy HTTPS (or an equivalent, like SSH tunneling) for all services on steevie, major or minor. For web-facing services specifically, I will turn on Strict Transport Security, which means that once you've visited a service over HTTPS, your browser will remember that and refuse to connect to it over plain HTTP, so a man-in-the-middle can't silently downgrade your connection. This is because I'm a strong believer in the philosophy that anything that can be encrypted should be encrypted. It's not expensive, either: you can get basic SSL/TLS certs for free, and the computational cost is negligible.
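The /boot check is nothing fancy - roughly this (the paths and the exact trigger are still undecided):
# record known-good checksums of everything in /boot once...
find /boot -type f -exec sha256sum {} + > /root/boot.sha256
# ...and verify them on every boot, complaining loudly if anything changed
sha256sum -c --quiet /root/boot.sha256 || echo "WARNING: /boot has changed!" | wall
And the HSTS part is literally a single response header, along the lines of Strict-Transport-Security: max-age=31536000.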

Filesystems
Stepping away from all the crypto: btrfs is used as the filesystem inside the LUKS containers instead of LVM (because compared to managing btrfs, I hated managing LVM). (/boot is ext4.) /home and / are separate partitions, and there's something like 2 GB of swap (I forget). The drive uses a GPT partition table. Honestly, there's not much else to say about filesystems.
Edit April 9: I forgot to mention that /usr is on a separate, unencrypted partition. This is because /usr is basically data managed by the package manager, so there's no need to encrypt it - and by not encrypting it, we can achieve a performance benefit.

Backups
And that brings us to our final topic - backups. I have a much better/more defined backup plan this time (i.e. it actually exists). Here it is:
First off, the internal drive will be regularly snapshotted with btrfs snapshots. What I'm thinking of doing is keeping yearly, permanent snapshots (i.e. they never go away, ever), and then keeping monthly snapshots that I rotate to make room.
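In btrfs terms that's just a couple of commands - something like this (the names and dates are made up; the real rotation would live in a cron job):
# permanent yearly snapshot - never deleted
btrfs subvolume snapshot -r / /snapshots/yearly-2014
# rotating monthly snapshot
btrfs subvolume snapshot -r / /snapshots/monthly-2014-03
# drop the oldest monthly snapshot to make room
btrfs subvolume delete /snapshots/monthly-2013-03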
Of course, that doesn't protect against the physical drive failing. For that, I'm going to do monthly backups to an external drive with a nice tool (I'm currently looking at rdiff-backup as the probable solution). Said tool would perform incremental backups - ideally, it would produce diffs between files, but I'd also be okay with copying entire changed files. (However, copying the entire directory structure even if it hasn't changed is unacceptable, and it will become clear why in a moment.)
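If rdiff-backup wins out, the monthly run would look roughly like this (the paths are illustrative):
# incremental backup of the whole system to the external drive
rdiff-backup --exclude /mnt --exclude /proc --exclude /sys / /mnt/backup/steevie
# prune increments older than a year
rdiff-backup --remove-older-than 1Y /mnt/backup/steevie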
As a second layer of protection, I plan to backup the entire directory hierarchy (no diffs) at specific intervals. Currently I'm thinking 6 months to a year, but I may change that to 1 month if I decide to take rotatable btrfs snapshots more often (like every day or something). I'm still deciding whether I will manually cp files for this, or use a tool (possibly the same tool used to make incremental backups).
As stated above, backups will be encrypted and stored in a secure, off-site location (i.e. my mom's house - the server is at my dad's). As stated earlier, the new steevie is designed from the ground up to be NSA-resistant, and because of that, I will never use something like Amazon S3 to store backups.

Miscellaneous
I'm thinking of getting a UPS for steevie, just in case. I'm still pondering this; they're very expensive.
This was basically an unwritten rule with the way I administered the first steevie, but I'm making it written this time: with the possible exception of firmware, I will never put closed-source applications or services on steevie.
I'm also going to offer steevie's services to friends this time around. Maybe I can get them to ditch Facebook and Twitter if I provide a nice enough alternative.
If you have any questions, let me know in the comments or tweet me, either @strugee2 or @strugee_dot_net.

Small update on my life
I've been super busy with robotics and homework. At the beginning of the robotics season, I was like, "oh man, I gotta blog about the start of robotics" and then I pushed it off until the end of the season. It's now the off-season, and we're doing some work but not at as intense a pace as during the season, which is nice. We placed fifth at states.
Also, I don't go on IRC a lot anymore due to not having a bouncer. Instead, I go on the Stack Exchange network chat. I like to hang out in the Unix & Linux main room, The DMZ (main room for IT Security), and The Comms Room (main room for Server Fault). Even though they don't receive a lot of traffic, I'm also usually in the Bitcoin Lounge (main room for Bitcoin - duh) and, finally, The Exit Node (main room for Tor).
Speaking of which, I've been running a Tor relay on Amazon EC2 for a while now, named strugees. I'm having some problems with it exhausting the bandwidth quota in like 2 days, then sitting there idling for 5 days, but I'm hoping to work it out eventually.
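The likely fix is to just tell Tor about the quota up front instead of letting it find out the hard way - a hypothetical torrc snippet (the numbers are placeholders, not my actual limits):
# spread the monthly allowance across the whole month
AccountingStart month 1 00:00
AccountingMax 60 GBytes
# cap the steady rate so the relay doesn't burn through it in two days
RelayBandwidthRate 250 KBytes
RelayBandwidthBurst 500 KBytes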
I've started using an online hosted instance of OwnCloud as a placeholder until I get OwnCloud set up on steevie. It's super nice - the main thing I use it for right now is the built-in cloud RSS reader. It's awesome.
Finally, I'm going to LinuxFest Northwest again this year, and I will actually be giving a talk on Arch GNU/Linux. I'm super, super pumped for that.

Have some extra money?
No, it's not for me. If you have some money that you want to donate, the Pitivi project, which is trying to make an entirely free and open source video editing suite, is running a fundraiser, and they deserve your support. So does the MediaGoblin fundraiser - MediaGoblin is building an entirely federated replacement for media-sharing platforms like YouTube and Flickr. Again, go check them out; they're awesome.

Wednesday, October 16, 2013

Update on steevie's downtime

So you all probably deserve an update on steevie. That update is this update.
steevie has been down for approximately a month. Here's what happened:
  1. I upgraded steevie.
  2. I rebooted steevie, due to systemd cgroup hierarchy changes.
  3. steevie refused to boot (he failed to mount the root partition).
So basically, here's what's supposed to happen on a normal boot:
  1. GRUB loads.
  2. GRUB loads the Linux kernel.
  3. GRUB loads the initial ramdisk.
  4. LVM, in the initial ramdisk, in userspace, searches for Volume Groups.
  5. LVM creates the device nodes that represent the LVM Logical Volumes in /dev.
  6. systemd mounts (or swapons) the created devices as filesystems: one as /home, one as /, and one as swap.
  7. The initial ramdisk exits, the Linux kernel changes all the mounts to be mounted on the real root, and the system boots.
The problem is that somehow, the system cannot properly complete step 5. This means that the boot process "completes" like this:
  1. Steps 1-4 above complete normally.
  2. LVM tries to create the device nodes. For some reason, this hangs forever.
  3. Eventually, something (possibly systemd, I'm not sure) times out waiting for the device to be created, and kicks you back to a ramdisk shell (which means Busybox).
  4. The shell waits for you to do something to fix the boot attempt (what I end up trying from there is sketched below).
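From that emergency shell, the manual poking around looks more or less like this (assuming the initramfs includes the multi-call lvm binary, which is what runs in step 4 above):
# look for volume groups, then try to activate them by hand
lvm vgscan
lvm vgchange -ay
# check whether the logical volumes actually showed up
ls /dev/mapper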
This is extremely unfortunate. Right now, it's looking like the LVM problem is being caused by a hard drive failure.
You can read all the gory details at this Stack Exchange question, and then this followup, but the tl;dr is that there isn't much I can do. There's still a little more to try, but I don't hold out much hope.
Worst case, I have to completely wipe the drive. Any data in your home directory will be preserved, because there are no problems mounting the /home partition. But if you have any data anywhere else, it will probably be lost. I'll run data recovery tools, of course, but I don't hold out much hope. Unfortunately, this also means that my beautiful README will be lost. :(
I'm not sure what I'll end up doing once the drive is wiped. It's possible I'll use btrfs on the new root, since it seems to be pretty resistant to this kind of thing (and it operates at the filesystem level instead of the block level, so it will probably be more effective).
Sorry for the downtime! If you have any questions or any concerns, feel free to reach out to me in the comments or on Twitter (mention either @strugee2 or @strugee_dot_net).

Sunday, August 25, 2013

I'm back, y'all!

So I've been away for a while, doing things in Outside, aka Not The Internet. Scary.
Also, I haven't really had a lot of internet access when I haven't been outside, so I haven't been able to blog or do anything interesting.
But I got back a week ago... and then dived straight into robotics. We've been doing a lot of cool stuff (among other things, I2C bus programming and my favorite revision control system, Git), preparing for the season. I left yesterday for a robotics retreat and got back today, which was awesome.
However, I've had some free time and I've been doing some stuff.
First, I'm ditching Debian (in the words of my mom, "well, that was a short romance"). And here's why. Debian installs a lot of things by default for you. It is graphics-oriented: the default network connection daemon is NetworkManager running in GNOME. It installs a desktop environment at installation. And not only that, but it's way, way, way too liberal with dependencies. When I booted my Debian system, I found the xul-ext-adblock-plus package installed. And when I tried to uninstall it, it also removed the GNOME metapackage due to the AdBlock package being a dependency of GNOME. Not a suggests. Not a recommends. A required dependency. In other words: I couldn't remove the AdBlock Plus extension without removing all of GNOME. The way I eventually solved it? I created an empty package. Someone please explain to me why the hell I had to create a useless, empty package to keep my desktop environment but get rid of a XUL extension. And someone please explain to me what idiot decided that AdBlock Plus should be a part of GNOME and why.
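For the curious, the empty-package trick goes roughly like this with equivs (reconstructed from memory, so the details may be slightly off):
sudo apt-get install equivs
# generate a template control file for a fake package
equivs-control xul-ext-adblock-plus
# edit the file: set "Package: xul-ext-adblock-plus" and bump the version high enough
equivs-build xul-ext-adblock-plus
# install the empty stand-in so APT thinks the dependency is satisfied
sudo dpkg -i xul-ext-adblock-plus_*.deb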
That's ridiculous. Not only that, but I don't understand the composition of my system. Sometimes, my WiFi will disconnect and when I go to reconnect, it doesn't show anything until I turn the network card off and on again in the GNOME Control Center. But I can't figure out where to start diagnosing this issue, because I have no idea what's installed on my system and affecting the wireless. Not only that, but Debian patches things so. Freakin. Much. I hate that.
My GDM has a Debian background. I don't want a Debian background, but that's too bad because some Debian developer has helpfully added branding. I have a Debian menu in my Awesome menu (with a couple of screensaver options that don't work anymore, no less, due to GNOME Screensaver getting merged into gnome-shell or some shtick like that). I don't want a Debian menu in my Awesome menu, I just want Awesome. But ooooh noo, the Debian menu is "helpful", so someone added it. Even if I figured out the things that were affecting my wireless, I still wouldn't understand the whole picture, because the upstream documentation doesn't cut it. I'd also have to go look at Debian's documentation to see what ridiculous things they've added or changed.
Plus, despite the fact that I'm on Debian Sid - the unstable branch that's supposed to be more like a rolling distro because it's the development branch, where updated packages land first - I still get moldy packages. Even though Sid is where new things land, they're still developing for a non-rolling distro. So even though Emacs 24 is in the package pool, and has been for at least a year (I remember seeing it back when I used Ubuntu), I still get Emacs 23 when I install the Emacs package, because Debian isn't ready to move to 24 on stable, and unstable is ultimately going to become stable. And it's not just Emacs. The other program I use every day - my web browser - is also moldy, because it turns out that Debian ships the Firefox/Iceweasel ESR releases instead of regular releases. So I had the dubious pleasure of pulling a newer package from Debian Experimental. I mean, seriously. The mold is clear. In the words of the Linux Action Show, when I'm using Arch, I feel closer to upstream.
In the end, Debian is not KISS. So I'm leaving it for Arch.
Edit: Debian also uses SysV init, which is old and bugs me, especially since I've grown up on the relative speed and feeling of cleanness of Upstart (from back when I used Ubuntu), and now the awesomeness that is systemd on Arch. It's possible to install systemd (or Upstart) in Debian but it's impossible to effectively replace the init system, because the SysV init package is marked as essential, which means it gets automagically reinstalled when you do a system upgrade. Or you could patch the GRUB files, which I don't want to do. (In short, SysV init bugs me and it bugs me that I can half-switch to systemd, but not really).
Update about steevie: X11 forwarding has been theoretically turned on, but my cursory attempt to launch gedit failed. I think I had some client-side things configured wrong, so I'm not sure if it actually works.
Also, files will be served from ~/public_html automagically by Apache. They'll show up under people.strugee.net/~[your username]/ - just make sure that the folder is readable by the httpd user. Details in the README (although I think there are currently some half-written parts).
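For reference, on Arch's stock Apache the relevant bits look roughly like this (a sketch - the README has the real details):
# in /etc/httpd/conf/httpd.conf, make sure these lines aren't commented out:
#   LoadModule userdir_module modules/mod_userdir.so
#   Include conf/extra/httpd-userdir.conf
# then make your tree readable by the httpd user:
chmod 711 ~
chmod -R o+r ~/public_html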

Friday, August 2, 2013

Update on the new server

tl;dr, here is what's done:
  • SSH (kind of)
  • LVM
I haven't had a lot of time to do server stuff today and yesterday, because I've been hanging out with people IRL *gasp*
However, the new server lives, albeit weirdly. Yesterday I spent a lot of time trying to fix the filesystem on the server before finally giving up and just making a tarball - that alone took something like 6 hours of waiting. Ugh! Then, as I said, I made the tarball, backed it up, and proceeded to install Arch Linux. Funny story: I had to bring the server into my bathroom, because it is the only room in the house that a. provides grounded sockets and b. is reachable with an Ethernet cable from the router (since the new server doesn't have a WiFi card), which I needed because Arch is a netinst distro these days. Then I had to go to bed. However, since LVM is part of the install process, I got that done.
Today I had very little time as I've been packing for a trip tomorrow. Therefore, I wasn't able to get a perfect setup, but it is workable for remote administration (so I can get most stuff done while traveling). The major flaw that you will notice in the current configuration is that if you have an existing account, you will end up back in alex-ubuntu-server. This is because something is wrong with my router and it is still forwarding connections to alex-ubuntu-server (which is still plugged in via Ethernet to allow for remote file migration). Therefore, if you previously had an account on alex-ubuntu-server, you will need to ssh to 192.168.0.19 from the Ubuntu console. Then you'll end up at steevie (which is the new server's hostname, btw).
Note: if you have a new account, you don't have to worry about this. I've put together some hackery on alex-ubuntu-server to allow you to login to steevie automagically. The only difference is you will have to type a very bad, very weak password that doesn't matter before you type your real password.
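If typing two passwords gets old, something like this in your local ~/.ssh/config should make the hop transparent (the hostnames and username are placeholders):
Host steevie
    HostName 192.168.0.19
    User yourusername
    # tunnel through the old server, which is still the one the router forwards to
    ProxyCommand ssh -W %h:%p yourusername@alex-ubuntu-server.example.com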
Other things will be done or turned on in the coming days, e.g. X11 forwarding, mail, etc.
9P will not be turned on, because I will need physical access to install Plan 9 and to reconfigure the router again. Nothing external will be turned on properly because, again, I'll need to reconfigure the router. For example, internal mail will be turned on, but SMTP won't.
Anyway, I have to go pack.

Monday, July 29, 2013

Upcoming changes to alex-ubuntu-server

Recently a friend offered me a new server with much better specs than the 15+-year-old computer that I use now. It has 4 GB of RAM (compared with the 256 MB that the current server has), and it has a dual-core AMD processor running at 2800 MHz. I'm not sure what the processor specs are for the current server, but honestly, I'm sure they're just as crappy as the RAM.

Getting this new server will open up a lot of possibilities, so here are some important changes that are coming to the server, if you are the one person that uses it.
  • X11 forwarding will be installed and turned on for SSH connections
    • This means that if you have an account (i.e. are able to SSH into the current server), you will be able to remotely log in to a graphical environment. You could, for example, carry your graphical application settings around with you (or at least it will seem that way; in reality you'll be loading them from my server, which will require internet access).
    • I'm not sure yet whether I will offer GNOME. I'm open to any lightweight window manager - awesome, Openbox, Fluxbox, twm, etc. - without a second thought, but I will have to experiment with what system load looks like with GNOME installed. Therefore, I'll start with GNOME, but you should be aware that GNOME could eventually be removed again.
  • LVM will be turned on and partitions will be reconfigured
    • This won't affect you in any measurable way if you use the server. It just means that if there's ever a need for more storage, there won't have to be server downtime in order to install and use it. If you don't know what LVM is, read the Wikipedia article on it.
    • /home will become a separate partition. This is mostly to allow for easier backups (currently there is zero backup policy) and easier transitions in the event of another server move.
  • There will be a fresh installation. I will not just be dding or rsyncing files over to the new install.
    • There are several reasons for this. First and foremost, I installed and set up this server a couple of years ago, back when I was around 11 or 12 and didn't know exactly what I was doing, and I didn't have a very good idea of how to be a sysadmin. Because of this, I didn't really keep a record of the changes I'd made, so I don't know exactly how the system is structured and can't effectively perform changes or diagnostics (because I don't know how changes would affect the system).
    • I may or may not transition to Arch Linux as the distribution of choice for my server, and this requires a reinstall. At first blush this may seem like a bad idea, since Arch is rolling and you need stability for a server (this is why Debian and Debian derivatives are so good for servers - they're stable and don't change often). However, it's worth noting that with Arch, you can deal with problems as they come along, instead of all at once every 6 months. This is actually pretty useful, because you can tell exactly which package changes may have broken something, instead of 5-10 things potentially breaking all at once. In short, problems are isolated. Note that if I do run Arch on my server, I will of course do my utmost to maximize stability - for example, I'll use an LTS kernel instead of the latest. Another reason that I'm thinking of Arch is that it makes it easy for me to understand exactly what's going on. Ubuntu and Debian both come with batteries included, which is generally a Good Thing™ but can be unfortunate if you want to understand the exact composition of your system (which you should if you want to be a good sysadmin). In particular, Ubuntu and Debian are very generous when installing optional things (not helped by the fact that installing Recommends is turned on by default in the APT configuration). It gets to the point where the GNOME metapackage in Debian depends (not recommends - depends) on the AdBlock Plus XUL extension. What?? Finally, I just like Arch better than Ubuntu. pacman vs. apt-get, apt-cache, apt-mark, apt-cdrom, apt-<5 other things here>, anyone?
    •  LVM (see above) is much easier to set up with a fresh install.
    • Service operation will not be impacted. Anything that works on the server now will work on the new server. Primarily, this means mail and SSH access. I'll also ensure that a lot of currently-installed packages are still available (for example, Emacs). If you encounter something that you could do before and can't on the new server, I will consider it a configuration bug and fix it.
    • Note that the two exceptions to this are /home and /etc.
      • /home I will transfer over for obvious reasons: I don't want you to lose data. That being said, be cautious because configuration formats may change if I move to Arch.
      • /etc is version-controlled with etckeeper. Therefore I'll just add a remote and git push, but I may take the opportunity to do some pruning.
  • I will overwrite the current server setup with an installation of Plan 9 From Bell Labs, and I will set up that installation to be a private 9P server.
    • The new server will be set up to forward all incoming traffic directed towards 9p.strugee.net to the new Plan 9 server.
    • The Plan 9 server will run a Fossil filesystem backed by Venti, allowing rewinds, etc.
    • If you have an account on the main server you will have an account on the Plan 9 server (I'll either set up a script to make this happen or I'll just go into each server and create a new user twice).
  • Note: this means downtime.
    • Most likely this will happen in the coming weeks or even months. It won't take that long, especially because I'll basically need to swap out machines (I'll have configured the new server while the old server was running), but just in case of extended downtime, be aware.
In order to prepare, please rack your brains to figure out if you have any files not in your home folder. If you do, please either move them to your home folder or make backups.
If you lose data, I will be able to recover it, but I don't relish the thought, as I'll probably have to mess around with loops and mounts and stuff (see the second paragraph). Assume that there will be no backups.

Thursday, July 18, 2013

Goings-on

It's summer! Yay!
I've been to Ultimate Camp and the interwebs. And my room. Hmm.
Actually though, from last Wednesday to last Friday, a couple of people from the SAAS robotics team have been prepping for a camp that we're doing for middle schoolers next week. And this week, we get to actually be counselors for middle schoolers. It's very exciting and very fun!
Also, I've switched to Arch Linux. Ubuntu just makes me too angry these days, and I no longer recommend it for GNU/Linux newbies (I'm recommending Mint now). Canonical is making more and more proprietary decisions - for example, Unity cannot be used on any distribution besides Ubuntu without serious effort. Also, take Mir - Mir fragments the already little-used GNU/Linux desktop, and it doesn't even do anything new. Developers already put GNU/Linux behind Windows and Mac in priority - and now they potentially have to think about two display servers, which makes the platform look even less attractive. Not only that, but none of the concerns the Mir team had about Wayland hold up - in fact, a Mir developer showed that he knew essentially nothing about how Wayland worked. Canonical's insane - they want to take on the burden of porting all the upstream toolkits themselves (oh, except for old ones like GTK+2 - but as we all know, GTK+2 is still in wide use). IMHO, this is crazy. It's a waste of resources. Canonical cannot play with others, and that's extremely frustrating. For example, Canonical thought that their upstream Wayland contributions wouldn't be accepted. They even offered that as a justification for Mir. But they never even tried. That's simply ridiculous, and not only that, it's selfish. As the vendor of the most widely used GNU/Linux distribution on the planet, Canonical has a responsibility not to do things that screw over the ecosystem. But recently it seems like they're getting Not Invented Here syndrome more and more, and they're willing to do almost anything to indulge it, even at the cost of the rest of the ecosystem. It's saddening.
Anyway, I'm going to stop talking about that because it makes me angry. Other miscellaneous things that I'm doing: I'm planning to fully install and try Gentoo, NetBSD, Linux from Scratch, and finally, Plan 9 from Bell Labs (note that this is the only one that isn't a UNIX).
Yesterday (Tuesday) I attended a LibrePlanet Washington meeting, which was really fun. Among other things I am now into PGP/GPG and will be doing stuff with it soon.
Also, I am thinking of doing dev work on my favorite AUR wrapper, Yaourt. I'm also thinking I might work on grive, since Insync is no longer free (as in free beer).
I also attended GSLUG last Saturday, which was really cool.
I'm also getting into IRC again. I usually hang out in #archlinux, #plan9, #gnome, #gslug and (just recently - we only created it yesterday!) #libreplanet-wa, all on Freenode. Especially cool is the fact that I set up an irssi proxy on my server (which is now on the live internet, although strugee.net is still hosted on GitHub Pages). The only problem is that it interferes with byobu/screen.
Also, I set up Postfix, so mail between local system users is enabled on my server (but external mail @strugee.net is not).
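For the record, local-only Postfix boils down to a couple of lines in /etc/postfix/main.cf - roughly this (the hostname is a placeholder):
# only listen on loopback, so nothing from the outside can deliver mail
inet_interfaces = loopback-only
# accept mail addressed to local users on this box only
mydestination = steevie, localhost.localdomain, localhost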
Anyway, I have to go to bed. There's probably more that I want to talk about, but whatever.
Oh, one last thing: I'm using Emacs now. Yay!

Thursday, July 11, 2013

Firewall configuration on alex-ubuntu-server

So about a month or two ago, in preparation for putting my server live on the internet, I configured my firewall, which was an interesting process that I want to document.
I had previously searched for "firewall" in aptitude and installed the first result, which gave me a lovely error on service init telling me that I needed to edit /etc/apf-firewall/firewall.conf and set something-or-other to true. Naturally, I just ignored said error.
So I went looking for documentation, but it turns out that Ubuntu already comes with a firewall. Therefore, I got rid of apf-firewall. Then I ran sudo ufw enable.
Now, I've read The Six Dumbest Ideas in Computer Security. And of course, number one is default allow. Luckily, ufw was written by people smart enough to ship a default-deny policy for incoming traffic out of the box:
alex@alex-ubuntu-server:~$ sudo ufw status verbose
[sudo] password for alex:
Status: active
Logging: on (low)
Default: deny (incoming), allow (outgoing)
New profiles: skip
alex@alex-ubuntu-server:~$
So that was covered. I decided, however, to also institute a default-deny policy for outgoing traffic, on the basis of "why not" - meaning I might as well unless it became a huge issue. So far, though, it's actually been fine. An interesting thing that happened on my first pass was that while I had port 80 open, I didn't have port 53 open - so outgoing web connections were allowed, but DNS lookups weren't, and nothing actually resolved or loaded.
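The resulting outbound rules look roughly like this (a sketch, not my exact rule set):
sudo ufw default deny outgoing
sudo ufw allow out 53          # DNS - the one I forgot the first time around
sudo ufw allow out 80/tcp      # HTTP
sudo ufw allow out 443/tcp     # HTTPS
sudo ufw allow 22/tcp          # keep inbound SSH reachable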
Anyway, the last thing I have to do is figure out ping. It's supposed to work automagically, but it doesn't. So I'll look at that.