Holistic Engineering

A random assortment of shit with sprinkles.

Login Accounting Explained

I put a call out for blog ideas: Devin Austin came up with “Login Accounting”. So let’s talk about that for a bit.

First off, a little disclaimer

This is not security advice. I am not a security expert, or an expert at pretty much anything. Use your head, silly.

Let’s also be clear about something

Login accounting is pretty much a mess on unix. Programs that manage logins “opt in” to login accounting; the system does not inherently do this for you, because the records only exist when a program chooses to write them. Most tools can even be configured to write to the basic accounting systems or not, or provide the option as a runtime argument. This means that your login accounting system can lie. Additionally, the systems we’re going to look at are the first thing an intruder will mess with. We’ll look at a few techniques to mitigate the lack of information later, but rest assured there’s not much you can do to make this bulletproof.

All code examples in this article expect Ubuntu 11.10 to be the platform. You will see deviation between systems so be certain you’ve absorbed this article before trying anywhere else.

utmp, wtmp, lastlog

These are the core systems in unix login accounting; they are append-only databases, more or less, with a system-dependent structure. You can usually read about the structure by typing man utmp or reading the /usr/include/utmp.h file. Note this will be dramatically different between Linux, FreeBSD, Mac OS X, etc.

One can navigate the utmp structure pretty simply, or use the w, who and last commands to navigate them. They exist as three files:

  • /var/run/utmp is what’s currently going on.
  • /var/log/wtmp is what’s happened in the past.
  • /var/log/lastlog is the most recent record for each account (e.g., the last time a specific user logged in).

Anyhow, let’s have some fun. As for navigating the structure, while system-dependent that’s really easy. Here’s a small program that navigates utmp and sends the pty and username to figlet for output for all user-related information. apt-get install build-essential figlet and then gcc -std=c99 -o fig_utmp fig_utmp.c to use.

fig_utmp.c
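#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <utmp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  char buf[1024];
  struct utmp ut;

  /* open(2) takes O_* flags, not an fopen-style mode character */
  int fd = open("/run/utmp", O_RDONLY);

  if (fd < 0) {
    perror("could not open file");
    exit(1);
  }

  /* utmp is a flat array of fixed-size records; read them one at a time */
  while(read(fd, &ut, sizeof(struct utmp)) == sizeof(struct utmp)) {
    if(ut.ut_type == USER_PROCESS) {
      /* ut_line and ut_user are not guaranteed to be NUL-terminated,
         so bound each %s by the size of its field */
      snprintf(buf, sizeof(buf), "figlet %.*s - %.*s",
               (int) sizeof(ut.ut_line), ut.ut_line,
               (int) sizeof(ut.ut_user), ut.ut_user);
      system(buf);
    }
  }

  close(fd);
  return(0);
}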

It outputs something like this (I’m holding two logins to the box):

       _          ___                     _ _    _     
 _ __ | |_ ___   / / |           ___ _ __(_) | _| |__  
| '_ \| __/ __| / /| |  _____   / _ \ '__| | |/ / '_ \ 
| |_) | |_\__ \/ / | | |_____| |  __/ |  | |   <| | | |
| .__/ \__|___/_/  |_|          \___|_|  |_|_|\_\_| |_|
|_|                                                    
       _          _____                      _ _    _     
 _ __ | |_ ___   / / _ \            ___ _ __(_) | _| |__  
| '_ \| __/ __| / / | | |  _____   / _ \ '__| | |/ / '_ \ 
| |_) | |_\__ \/ /| |_| | |_____| |  __/ |  | |   <| | | |
| .__/ \__|___/_/  \___/           \___|_|  |_|_|\_\_| |_|
|_|                

Try modifying it to output the hostname as well! (Hint: the struct member is called ut_host)

utmp carries a lot more than just user logins, though; it’s responsible for recording most of the events that happen at a system level. For example, here’s some last output (last reads wtmp, which stores the same kind of records):

erikh@utmptest:~$ last
erikh    pts/0        speyside.local   Sun Mar 25 10:18 - 10:18  (00:00)    
erikh    pts/0        speyside.local   Sun Mar 25 09:17 - 10:06  (00:48)    
reboot   system boot  3.0.0-16-server  Sun Mar 25 09:13 - 10:22  (01:08)    
erikh    pts/1        speyside.local   Sun Mar 25 09:13 - crash  (00:00)    
erikh    pts/0        speyside.local   Sun Mar 25 09:12 - down   (00:00)    
reboot   system boot  3.0.0-16-server  Sun Mar 25 09:11 - 09:13  (00:01)    
reboot   system boot  3.0.0-16-server  Thu Mar 22 00:35 - 00:53  (00:18)    
reboot   system boot  3.0.0-16-server  Thu Mar 22 00:32 - 00:33  (00:01)    
erikh    pts/0        speyside.local   Thu Mar 22 00:31 - down   (00:00)    
erikh    tty1                          Thu Mar 22 00:27 - down   (00:04)    
erikh    tty1                          Thu Mar 22 00:27 - 00:27  (00:00)    
reboot   system boot  3.0.0-16-server  Thu Mar 22 00:20 - 00:31  (00:11)    
erikh    tty1                          Thu Mar 22 00:16 - down   (00:03)    
erikh    tty1                          Thu Mar 22 00:16 - 00:16  (00:00)    
reboot   system boot  3.0.0-12-server  Thu Mar 22 00:16 - 00:20  (00:03)    

Notice all the reboots in there? This is why we filter on USER_PROCESS above. The ut_type contains a lot more information than what we care about. Anyhow, this is explained better in man utmp, so go read that. There is also the POSIX utmpx which isn’t really any more consistent than utmp is across different unices.
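To make the ut_type point a bit more concrete, here is a rough Ruby sketch that decodes raw records and names their type. It assumes the 384-byte glibc struct utmp layout found on x86-64 Linux; other platforms will differ, so treat the format string as an assumption and check your own utmp.h. The names parse_utmp, UTMP_FORMAT and UT_TYPES are made up for this example, and the demo runs on two synthetic records rather than a real file.

```ruby
# Sketch only: decode raw utmp/wtmp records and name their ut_type.
# ASSUMPTION: the 384-byte glibc `struct utmp` layout used on x86-64 Linux.

UTMP_RECORD_SIZE = 384

# ut_type, padding, ut_pid, ut_line, ut_id, ut_user, ut_host,
# ut_exit (2 shorts), ut_session, ut_tv (sec, usec), ut_addr_v6, reserved
UTMP_FORMAT = "s x2 l Z32 Z4 Z32 Z256 s2 l l2 l4 x20"

UT_TYPES = %w[
  EMPTY RUN_LVL BOOT_TIME NEW_TIME OLD_TIME
  INIT_PROCESS LOGIN_PROCESS USER_PROCESS DEAD_PROCESS ACCOUNTING
].freeze

# Split a binary blob into fixed-size records and return one hash per record.
def parse_utmp(data)
  records = []
  offset = 0
  while offset + UTMP_RECORD_SIZE <= data.bytesize
    fields = data.byteslice(offset, UTMP_RECORD_SIZE).unpack(UTMP_FORMAT)
    type, pid, line, _id, user, host, _term, _exit, _session, sec = fields
    known = type.between?(0, UT_TYPES.size - 1)
    records << {
      type: known ? UT_TYPES[type] : "UNKNOWN(#{type})",
      pid:  pid,
      line: line,
      user: user,
      host: host,
      time: Time.at(sec)
    }
    offset += UTMP_RECORD_SIZE
  end
  records
end

# Two synthetic records (a boot and a login), so this runs anywhere.
boot  = [2, 0, "~", "~~", "reboot", "3.0.0-16-server", 0, 0, 0, 1332666780, 0, 0, 0, 0, 0]
login = [7, 4242, "pts/0", "ts/0", "erikh", "speyside.local", 0, 0, 0, 1332670680, 0, 0, 0, 0, 0]
blob  = [boot, login].map { |r| r.pack(UTMP_FORMAT) }.join

parse_utmp(blob).each do |rec|
  puts "%-13s %-8s %-8s %s" % [rec[:type], rec[:user], rec[:line], rec[:host]]
end
```

On a box where the layout assumption holds, parse_utmp(File.binread("/var/run/utmp")) should give you the live records, and filtering on USER_PROCESS gets you back to what fig_utmp.c prints.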

So, about this lossy login accounting issue…

What to do about it? There are really two options:

  • Make sure your things are logging utmp entries.
  • Use something else, like log scanning.

In reality only the second of these is the serious choice – there are other things like auditd and PAM controls that can assist here, but not by much. Log scanning and having tight control over how users get into your systems is the way to go. Since log scanning is such a deep topic, we’ll cover it in a separate article. Stay tuned.

Conclusion

The utmp system is typically relied on for a lot more than it should be; it’s inconclusive and generally flawed especially for non-interactive … interaction.

Rsync: The Swiss Army Chainsaw of Backup Utilities

Update: after writing this, Phil Hollenback told me about rsnapshot which looks to be a better solution for most use cases. Additionally, I have added chef recipes that implement some of the things seen in this article.

Time Machine is a pretty neat little thing, but it’s not the mother of all backup utilities; that title belongs to rsync. This article goes into automating your backup process across Unix derivatives with the versatile tool, and some orchestration to feed remote backups to a home network that exists on a possibly-dynamic IP address.

Likely if you’re here, you’re familiar with both tools and more-or-less what they do. However, before I go into detail, some history here is required…

So, the late great Steve Jobs was of the mind that the I/O bus Thunderbolt should be the new standard for high performance external storage. While that’s fine and dandy and all, and largely remains to be seen, tell that to my USB3 and SATA capable home RAID enclosure I like to store backups on. At some point I flipped out at seeing Time Machine take days to finish a backup when I knew just copying files over the network (or even faster methods, we’ll talk about that below) would be several orders of magnitude faster, and I don’t even have anything fancy network-wise at home. To add insult to injury, anyone who’s built a hackintosh and actually tried that eSATA or USB3 port… knows it works. :)

I had two real options. I could drop another $1200+ on one of these (I actually have one of these at work and they are wonderful. They’re just not realistic or cost-effective for what I want to do at home.), or I could get old school and go back to rsync, since Time Machine isn’t really how I restore machines anyhow; usually by the time I’m ready to rebuild a machine I want to eliminate the cruft on my machine, not restore it. rsync gives me the best of both worlds by giving me a full machine copy AND the ease of use to pull individual files/trees out when I need that.

While we’re here singing the praises of rsync, let’s sing the praises of hard links as well. rsync + hard links is a battle-tested backup method used on sites with a lot of real data and gives you incremental backups with a minimal amount of work.

So let’s start with the basics:

What’s a hard link?

The reason I’m asking is that this is a surprisingly oft-missed question in interviews, and a frequent source of confusion when working with co-workers. Comprehending it properly means that you need to have a deeper understanding of filesystems than the file/directory high level.

Let’s get some basic axioms down before we go into discussion:

  • Hard links (provided by ln with no options) are not symbolic links (ln -s).
  • Hard links cannot span filesystems. Symbolic links can.
  • Hard links cannot reference directories. Symbolic links can.
  • Symbolic links are separate files on disk. Hard links are not.

So fundamentally a filesystem is a hash table of references, or pointers if you want to be more precise. The key is the name of the file and the value is the pointer to the head of the data. In the event of a directory, the directory key is the name and the value is…. another hash table.

(For you advanced readers: yes, I am glossing over a lot of shit; I only have so many characters to share with the world in a narcissistic attempt to show how awesome I am.)

To use hash syntax from perl and/or ruby, it looks a bit like this:

filesystem_hash.pl
/ => {
    file1 => 0xDEADBEEF,
    file2 => 0xDEADFACE,
    dir1  => {
        file3 => 0xCAFEBEAD,
        file4 => 0xBEADCAFE
    }
}

What a hard link does is create another entry, in a directory of your choosing, that references the same pointer (and therefore the same data) as an existing file.

To use language garbage-collection terms: the crux of the issue here is that every file name is a hard link, and a freshly created file has a reference count of one. Creating another hard link with ln increases the reference count; removing a name decreases it. Files with a reference count of zero that are not held open by running processes are reclaimed as free storage.

These pointers are called inodes, and you’ll see them with tools like df -i and mkfs. They have a minimum size and lots of other important properties and data that I’m glossing over. You can read about them here.
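If you’d rather poke at the reference count than take my word for it, here’s a small Ruby sketch. It works in a throwaway temp directory; the file names and the hard_link_demo function are just for illustration.

```ruby
require "tmpdir"

# Sketch: create a file, hard-link it, and watch the link count and inode.
# Everything happens inside a temp directory that is removed afterwards.
def hard_link_demo
  Dir.mktmpdir do |dir|
    file1 = File.join(dir, "file1")
    file3 = File.join(dir, "file3")

    File.write(file1, "some data\n")
    before = File.stat(file1).nlink        # a fresh file has exactly one name

    File.link(file1, file3)                # same effect as: ln file1 file3
    after      = File.stat(file1).nlink    # both names count as references
    same_inode = File.stat(file1).ino == File.stat(file3).ino

    File.unlink(file1)                     # drop one name...
    survivor  = File.read(file3)           # ...the data is still reachable
    remaining = File.stat(file3).nlink

    { before: before, after: after, same_inode: same_inode,
      survivor: survivor, remaining: remaining }
  end
end

p hard_link_demo
```

The interesting part is the last step: deleting file1 does not delete the data, it only removes one of the two names pointing at the inode.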

So, doing this:

ln /file1 /file3

Creates this:

filesystem_hash_with_hard_link.pl
/ => {
    file1 => 0xDEADBEEF,
    file2 => 0xDEADFACE,
    file3 => 0xDEADBEEF, # right here, guys
    dir1  => {
        file3 => 0xCAFEBEAD,
        file4 => 0xBEADCAFE
    }
}

Easy, right? Let’s get to why I’m telling you about all this.

The magic of rsync’s --link-dest

First off, let’s RTFM. From man rsync:

--link-dest=DIR         hardlink to files in DIR when unchanged

This means: if you specify a directory here, it’ll be consulted for differential purposes. Unless the file is different, it will be hard linked into the target directory (specified at the end of your command line).
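To see the idea in isolation, here is a toy Ruby version of what --link-dest buys you. This is a sketch of the concept, not rsync’s real algorithm: it compares file contents, whereas rsync’s default quick check looks at size and modification time, and it ignores symlinks, permissions and everything else rsync handles. The snapshot function and the directory names are invented for the example.

```ruby
require "fileutils"
require "find"
require "tmpdir"

# Sketch of the idea behind --link-dest, NOT rsync's actual implementation:
# files identical to the previous snapshot become hard links into it,
# changed or new files are copied.
def snapshot(source, previous, target)
  Find.find(source) do |path|
    next if path == source
    rel  = path.sub(/\A#{Regexp.escape(source)}\/?/, "")
    dest = File.join(target, rel)
    prev = File.join(previous, rel)

    if File.directory?(path)
      FileUtils.mkdir_p(dest)
    elsif File.file?(prev) && FileUtils.compare_file(path, prev)
      FileUtils.mkdir_p(File.dirname(dest))
      File.link(prev, dest)       # unchanged: costs a directory entry only
    else
      FileUtils.mkdir_p(File.dirname(dest))
      FileUtils.cp(path, dest)    # changed or new: costs real space
    end
  end
end

Dir.mktmpdir do |root|
  src, day1, day2 = %w[src 2012-03-02 2012-03-03].map { |d| File.join(root, d) }
  FileUtils.mkdir_p([src, day1, day2])
  File.write(File.join(src, "same.txt"), "unchanged\n")
  File.write(File.join(src, "diff.txt"), "version 1\n")

  snapshot(src, File.join(root, "none"), day1)   # first run: nothing to link against
  File.write(File.join(src, "diff.txt"), "version 2\n")
  snapshot(src, day1, day2)                      # second run links what it can

  %w[same.txt diff.txt].each do |name|
    linked = File.stat(File.join(day1, name)).ino == File.stat(File.join(day2, name)).ino
    puts "#{name}: #{linked ? 'hard linked' : 'copied'}"
  end
end
```

Both dated directories look like complete copies, but only diff.txt takes up new space in the second one.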

Here is some orchestration that exploits this property:

backup.pl
#!/usr/bin/env perl

use strict;
use warnings;

use DateTime;
use DateTime::Duration;
use DateTime::Format::Strptime;

our $HOSTNAME    = `hostname -s`;
chomp $HOSTNAME;
our $BACKUP_DIR  = "/backups/$HOSTNAME";
our $BACKUP_USER = "erikh";

# the machine holding the backups; 'localhost' means a local copy
my $host = $ARGV[0] or die "usage: $0 <backup-host|localhost>\n";

my $today = DateTime->now;
my $yesterday = $today - DateTime::Duration->new(days => 1);

my $yesterday_ymd = $yesterday->ymd;
my $today_ymd = $today->ymd;

my $user_string = "$BACKUP_USER\@$host:";

if ($host eq 'localhost')
{
    $user_string = "";
}

# --link-dest is given as an absolute path: rsync resolves a relative DIR
# against the destination directory, not against $BACKUP_DIR.
system("sudo rsync -a --numeric-ids --exclude /backups --exclude /dev --exclude /proc --exclude /sys --rsync-path='sudo rsync' --link-dest=$BACKUP_DIR/$yesterday_ymd / $user_string$BACKUP_DIR/$today_ymd");

This is a small backup tool I wrote which exploits --link-dest. Backup directories are organized by date, and the previous date is used for --link-dest when performing the new rsync. Files are hard-linked in if they’re equivalent, reducing transfer times and by side effect space used. On the local network (1 Gbps), this takes about 4 minutes for a daily backup of my main mac workstation.

Directories look like this:

[2] erikh@chef10 /backups% ls
./  ../  chef10/  coffee/  extra/  lost+found/  speyside/

And inside one of them:

[4] erikh@chef10 /backups/speyside% ls
./  ../  2012-02-25/  2012-02-26/  2012-02-27/  2012-02-28/  2012-02-29/  2012-03-01/  2012-03-02/  2012-03-03/

Each directory has a full backup, but only the files that are changed take space. Each machine has roughly 250GB of stored space save the server which has around 30GB. The mount point is using 1.2T and the extra directory is the exception taking around 800G of that. Not bad for a week’s worth of backups across 3 machines.

Over a month, it’ll be even more amazing. Speaking of which, here’s a small script which prunes the backups older than 30 days (space saving or not, you’ll eventually run out of space if you leave these unattended):

prune_backups.pl
#!/usr/bin/env perl

use strict;
use warnings;

use DateTime;
use DateTime::Duration;
use DateTime::Format::Strptime;

use constant BACKUP_DIR => "/backups";
use constant DAYS => 30;

my $today       = DateTime->now;
my $last_month  = $today - DateTime::Duration->new(days => DAYS);
my $dt_format   = DateTime::Format::Strptime->new(pattern => "%Y-%m-%d");

opendir(HOST_DIRS, BACKUP_DIR) or die "cannot open " . BACKUP_DIR . ": $!";

for my $host_dir (readdir(HOST_DIRS)) {

    my $full_host_dir = BACKUP_DIR . "/$host_dir";
    next unless -d $full_host_dir;
    next if $host_dir =~ /^\./;
    next if $host_dir eq 'lost+found';

    opendir(DIR, $full_host_dir) or next;

    for my $dir (readdir(DIR)) {
        my $full_dir = "$full_host_dir/$dir";

        next unless -d $full_dir;
        next if $dir =~ /^\./;

        # skip anything that is not named like a dated backup (YYYY-MM-DD)
        next unless $dir =~ /^\d{4}-\d{2}-\d{2}$/;
        my $date = $dt_format->parse_datetime($dir);
        next unless $date;

        # list form of system() keeps the path away from the shell
        system("rm", "-rf", $full_dir) if ($date < $last_month);
    }

    closedir(DIR);
}

closedir(HOST_DIRS);

It just walks each tree and prunes any dirs which are earlier than the last 30 days.
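The rule itself is easy to check in isolation. Here is a hedged Ruby sketch of just the “which directories are too old” decision, with no deletion involved. The prunable function and the sample directory names are invented for the example; names that don’t parse as a date are ignored rather than treated as old.

```ruby
require "date"

# Sketch of the pruning rule only (nothing is deleted): given directory
# names and today's date, return the names older than `days` days.
def prunable(dir_names, today, days = 30)
  cutoff = today - days
  dir_names.select do |name|
    next false unless name =~ /\A\d{4}-\d{2}-\d{2}\z/
    begin
      Date.strptime(name, "%Y-%m-%d") < cutoff
    rescue ArgumentError
      false   # shaped like a date but not a real one, e.g. 2012-02-31
    end
  end
end

dirs = %w[2012-01-15 2012-02-25 2012-03-03 extra lost+found 2012-02-31]
puts prunable(dirs, Date.new(2012, 3, 3))   # prints 2012-01-15
```

Keeping the decision separate from the rm -rf makes it cheap to dry-run against a real directory listing before letting anything delete data.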

Off-site Backups

Now that we have on-site backups, what happens when we have machines that are outside of the firewall’s reach that we want to bring to the same backup repository?

The great thing about rsync is that it can be used in conjunction with ssh allowing for lots of tunneling options and increased security. However, we’re going to do something simpler: open a port.

Airport Configuration

Now that’s established, we need to be able to find out where we’re sending the data. We’re on a dynamic IP here at home, so some accounting for that is also required, and we can’t just take ifconfig’s values since we’re behind a NAT.

So the route here is effectively:

coffee.hollensbe.org -> home.hollensbe.org (NAT) -> chef10.local

Let me introduce you to my little friend: jsonip. A really straightforward, just-give-me-an-ip solution that doesn’t require scraping or any other business. I decided to implement this myself since, you know, the internet is fleeting and lots of things tend to disappear. Sinatra is a great utility for such small services. Here’s the code:

jsonip.rb
require 'rubygems'
require 'sinatra'
require 'yajl'

get '/' do
  headers["Content-Type"] = "application/json"
  return Yajl.dump({:ip => request.ip })
end

This runs at jsonip.hollensbe.org. If you’d like an example of how to set this up with unicorn, look here.

Now we have a way to get the external IP address of our network, but we need a way to communicate any IP changes to our remote machines.

Enter BIND. BIND is the swiss-army-chainsaw of DNS servers and if you don’t know it, you should. BIND has a variety of ways to update an IP address, but we’re going to use the nsupdate strategy here, since it’s the simplest.

nsupdate uses a simple pre-shared key approach (TSIG). The secret itself is never sent over the wire; each update is signed with an HMAC derived from it, and the server checks the signature against its own copy of the key. The solution here config-wise is pretty simple:

key "hollensbe.org" {
    algorithm hmac-md5;
    secret "my-secret";
};

zone "hollensbe.org" {
        type master;
        file "/etc/bind/domains/hollensbe.org";
        allow-update { key hollensbe.org; };
};

What this allows us to do is send a message to our BIND server that says ‘let me publish updates to the zone file’. There are a few caveats mainly revolving around rndc freeze and rndc thaw to this so I suggest reading the manual before continuing with this approach.

Here is a small script that I wrote that:

  • Retrieves credentials from a YAML file in /etc.
  • Pulls the IP from jsonip.hollensbe.org.
  • Resolves DNS for home.hollensbe.org.
  • If they differ, it sends an nsupdate to the BIND server with the jsonip address.
dynamic_nsupdate.rb
#!/usr/bin/env ruby

require 'yaml'
require 'resolv'
require 'json'
require 'open-uri'

HOSTNAME = "home.hollensbe.org"
JSONIP_SERVICE = "http://jsonip.hollensbe.org"

DNS_INFO = YAML.load_file(ENV["TEST"] ? "dns_info.yaml" : "/etc/dns_info.yaml")

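# URI#read comes from open-uri; Kernel#open no longer accepts URLs on current Rubies
ip = JSON.parse(URI.parse(JSONIP_SERVICE).read)["ip"] rescue nil
resolved_ip = Resolv.getaddress(HOSTNAME) rescue nil

# if the lookup failed there is no address to publish, so bail out
abort "could not fetch external ip from #{JSONIP_SERVICE}" if ip.nil?

if resolved_ip.nil? or ip != resolved_ip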
  puts "Updating #{HOSTNAME} to be ip '#{ip}' (previously '#{resolved_ip}')"

  IO.popen("nsupdate -y #{DNS_INFO["key"]}:#{DNS_INFO["secret"]} -v", 'r+') do |f|
    f << <<-EOF
      server #{DNS_INFO["server"]}
      zone #{DNS_INFO["key"]}
      update delete #{HOSTNAME} A
      update add #{HOSTNAME} 60 A #{ip}
      show
      send
    EOF

    f.close_write
    puts f.read
  end
end
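One way to make this easier to test is to pull the decision and the nsupdate input out into plain functions. The sketch below is my own rearrangement, not part of the script above: update_needed?, valid_ipv4? and nsupdate_commands are hypothetical names, the example.org hosts and addresses are placeholders, and the IPv4 check is an extra safety step so that whatever the IP service returns never reaches nsupdate unvalidated.

```ruby
require "ipaddr"

# Sketch: the "should we update?" decision and the text fed to nsupdate,
# kept as pure functions so they can be checked without touching DNS.
# All function names here are illustrative, not from the original script.

# True only when we know our external address and DNS disagrees with it.
def update_needed?(external_ip, resolved_ip)
  return false if external_ip.nil?
  external_ip != resolved_ip
end

# Accept nothing but a plain dotted-quad IPv4 address.
def valid_ipv4?(text)
  return false unless text.is_a?(String) && text =~ /\A\d{1,3}(\.\d{1,3}){3}\z/
  IPAddr.new(text).ipv4?
rescue ArgumentError
  false
end

# Build the command text that would be written to nsupdate's stdin.
def nsupdate_commands(server:, zone:, hostname:, ip:, ttl: 60)
  raise ArgumentError, "not an IPv4 address: #{ip.inspect}" unless valid_ipv4?(ip)
  [
    "server #{server}",
    "zone #{zone}",
    "update delete #{hostname} A",
    "update add #{hostname} #{ttl} A #{ip}",
    "send"
  ].join("\n") + "\n"
end

# Placeholder values; nothing is sent anywhere.
if update_needed?("203.0.113.7", "198.51.100.20")
  puts nsupdate_commands(server: "ns.example.org", zone: "example.org",
                         hostname: "home.example.org", ip: "203.0.113.7")
end
```

The returned string can be written to the nsupdate pipe exactly as the heredoc is in the script.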

Now our backup.pl can point at home.hollensbe.org over SSH and perform backups normally, all without having to worry or care about IP changes.

That’s all I’ve got! I hope you found this informative.

How Tmux and Lion Full Screen Make My Life Better

I had a “huzzah” moment about a week ago where I really tried to drink the OS X kool-aid, and was marginally successful. Here’s how I did it.

My goal was to minimize distractions… which are pretty commonplace when you’re an ops guy supporting developers, watching a pager, and lots of other things. I will freely admit to not handling distractions as well as I should, but I do work best when I can just focus in on something and get it done, so I seek to improve that situation when a solution presents itself.

My previous solution was pretty simple. I came back to OS X from Linux with (good) tiling window managers such as wmii and configurability suites like rumai. (If you’re on Linux and are interested in these tools, I strongly suggest checking out subtle instead… they have gone a long way towards improving a lot of the warts wmii has.) Anyhow, I started using a tool called SizeUp, which is more-or-less like a manual tiling window manager for OS X. There are others, like divvy and tyler, that perform similar functions.

About 3 weeks ago iTerm2 released tmux “integration”. This requires a separate build of tmux provided by the iTerm guys (located here), and provides gui-level integration with tmux window splitting and friends, along with all the features of attachment & detachment that screen, dtach, and tmux fans revel in.

This, and some comments by a co-worker (hi, Michael!) praising tmux’s worth got me intrigued. I also wanted to try the full-screen support in Lion and start using virtual desktops again as my inability to keep on task with SizeUp was starting to become a serious productivity drain. I don’t know about anyone reading this, but I get immensely frustrated when I’ve been sitting in a chair for hours and have nothing to show for it. :)

So I tried tmux by itself; it turns out that iTerm stuff is mostly fluff and not particularly useful (to me, at least) and with all the effort required to get a separate build going it’s much easier to walk away with brew install tmux and forget about it.

So let’s get to the point. Here’s my vim as screencapped earlier when I started writing this thing:

Vim in Full Screen

This is MacVim. Notice the distinct lack of anything but the meat of the matter; the code?

Note that you will have trouble starting macvim with -r (or --remote) this way. You will want to create an alias like this:

alias ov='open -a MacVim'

and use it to start your MacVim process. If you don’t do this, mvim -r or mvim from inside tmux won’t integrate with the Mac clipboard properly. This is a known bug related to login shells.

My terms look a bit like this:

iTerm 2 and tmux in Full Screen

I use two bits here that are really important:

An alias:

alias t="tmux attach"

that saves me typing, and the below tmux configuration:

set-window-option -g mode-keys vi
set-option -g status off
set-option -g default-shell /bin/zsh
set-option -g prefix C-f
set-option -g prefix2 C-b
bind-key f last-pane
bind-key C-f select-pane -t :.+
new-session
split-window -h
split-window -v

That performs all the splits and window configuration on the first run, along with rebinding some keys whose default tmux bindings are really annoying. If tmux is already running, it just attaches me to the existing one.

To get mission control to work the way I wish, I make as many windows full screen as possible, and I use this configuration:

Mission Control Set Up

Then I navigate through windows with cmd+tab and ctrl+left/right for quicker switching and sub-application switching.

Additionally, I keep a “slurry” desktop for all those extra things that might not make sense to keep full screen or simply don’t play nicely with the paradigm. Adium, Twitter, etc. This actually turns out to be around 3-4 apps at any given time.

I don’t really use the mouse but this is an overview of what everything looks like at-a-glance, from the ctrl+up feature that Mission Control provides:

Desktop Overview

Abusing the Chef API for Fun and Profit

This post is about Chef, but perhaps not in the way you’ve experienced it before. Believe it or not, Chef has a beautiful API underneath all those tools, and very few of us in our day-to-day work exploit the opportunities available to us.

A Quick Start: Chef Search

Chef Search is one of the things that differentiates it from other tools of the same mind. There are a lot of opportunities in our daily environment to (ab)use chef search to automate tasks.

Here’s a small script that looks for all nodes that match a query, and prints their names:

chef_search_1.rb
#!/usr/bin/env ruby

require 'rubygems'
require 'chef/rest'
require 'chef/search/query'

Chef::Config.from_file(File.expand_path("~/.chef/knife.rb"))
query = Chef::Search::Query.new
nodes = query.search('node', ARGV.shift).first rescue []

puts nodes.map(&:name).join("\n")

You can try this out like so:

ruby chef_search_1.rb 'roles:my_role'

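And it will print all your nodes that have the my_role role applied. (Searching on roles matches the expanded role list, so roles pulled in by other roles count too; role:my_role matches only nodes with “role[my_role]” directly in their run_list.)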

This isn’t very useful by itself. How about a script that does something with those nodes… maybe like printing out their ec2.instance_id’s?

chef_search_2.rb
#!/usr/bin/env ruby

require 'rubygems'
require 'chef/rest'
require 'chef/node'
require 'chef/search/query'

Chef::Config.from_file(File.expand_path("~/.chef/knife.rb"))
query = Chef::Search::Query.new
nodes = query.search('node', ARGV.shift).first rescue []

nodes.each do |node|
  puts node.name
  puts "\tinstance_id: " + node.ec2.instance_id
end

Still not very useful. Are you using EC2 tagging for your cloud machines? If not, you should. Our tags from ec2-describe-instances look something like this (and a lot more stuff I can’t show):

TAG     instance        i-f3220280      Name    backend.test.example.com
TAG     instance        i-f3220280      environment     test

Now, we can extract this information with a variety of tools, like Fog and RightAws. The EC2 API Tools are arguably the simplest, but shelling out to all that java is a time sink. RightAws is my favorite not because it’s pretty code or because it’s especially fancy or elaborate, but because it’s documented.

ri RightAws::Ec2
ri RightAws::Ec2.describe_tags

Gets us what we want.

Anyhow, the below script takes a search, finds out its instance id, and then proceeds to extract and print the tags with right_aws’s describe_tags method. It uses the same environment variables you’d use for the EC2 API Tools, so if you use EC2, you’re probably good to go.

chef_search_3.rb
#!/usr/bin/env ruby

require 'rubygems'
require 'right_aws'
require 'chef/rest'
require 'chef/node'
require 'chef/search/query'

Chef::Config.from_file(File.expand_path("~/.chef/knife.rb"))
query = Chef::Search::Query.new
nodes = query.search('node', ARGV.shift).first rescue []
right_aws = RightAws::Ec2.new(ENV["AWS_ACCESS_KEY_ID"], ENV["AWS_SECRET_ACCESS_KEY"] )

nodes.each do |node|
  puts node.name

  unless (node.ec2.instance_id rescue nil)
    puts "\tNot an EC2 machine"
    next
  end

  puts "\tinstance_id: " + node.ec2.instance_id

  right_aws.describe_tags(:filters => { 'resource-id' => node.ec2.instance_id }).each do |tag|
    puts "\t\t%s:\t%s" % [tag[:key], tag[:value]]
  end
end

So now we have something marginally useful. I’m sure you can come up with all sorts of applications for this! Additionally, check out the code for these classes:

  • Chef::Node
  • Chef::Role
  • Chef::DataBagItem

For a lot more things to explore!

Blogorama

Trying octopress. I’m really happy with this platform so far; being able to get started quickly is a big, important feature and with a site compiler this has always been a hard thing to do.

Anyhow, just testing the updates feature.