UPDATE: I wrote this charm and blog post before I saw the unfortunate news that UbuntuForums.org and Apple's Developer website had both recently been compromised and their user databases stolen. Given that such events have sadly become commonplace, the instructions below can actually be used by proactive administrators to identify and disable weak user passwords, on the assumption that the bad guys are already doing the same.
It's been about 2 years since I've written a
Juju charm. And even those that I wrote were not scale-out applications.
I've been back at
Canonical for two weeks now, and I've been spending some time bringing myself up to speed on the
cloud projects that form the basis for the
Cloud Solution products, for which I'm responsible. First, I deployed
MAAS, and then brought up a small
Ubuntu OpenStack cluster. Finally, I decided to tackle
Juju and rather than deploying one of the existing charms, I wanted to write my own.
Installing Juju
Juju was originally written in
Python, but has since been ported to
Golang over the last 2+ years. My previous experience was exclusively with the Python version of Juju, but all new development is now focused on the Golang version of Juju, also known as
juju-core. So at this point, I decided to install
juju-core from the 13.04 (raring) archive.
sudo apt-get install juju-core
I immediately hit a couple of bugs in the version of
juju-core in 13.04 (
1.10.0.1-0ubuntu1~ubuntu13.04.1), particularly
Bug #1172973. Life is more fun on the edge anyway, so I upgraded to a daily snapshot from the
PPA.
sudo apt-add-repository ppa:juju/devel
sudo apt-get update
sudo apt-get install juju-core
Now I'm running
juju-core 1.11.2-3~1414~raring1, and it's currently working.
Configuring Juju
Juju can be configured to use a number of different cloud backends as "providers", notably,
Amazon EC2,
OpenStack,
MAAS, and
HP Cloud.
For my development, I'm using Canonical's internal deployment of OpenStack, and so I configured my environment accordingly in
~/.juju/environments.yaml:
default: openstack
environments:
  openstack:
    type: openstack
    admin-secret: any-secret-you-choose-randomly
    control-bucket: any-bucket-name-you-choose-randomly
    default-series: precise
    auth-mode: userpass
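The admin-secret and control-bucket values just need to be unique, random strings of your choosing; one easy way to generate them is with uuidgen, for example:
uuidgen    # paste the output in as your admin-secret
uuidgen    # and another as your control-bucket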
Using OpenStack (or even AWS for that matter) also requires defining a number of environment variables in an
rc-file. Basically, you need to be able to launch instances using
euca2ools or
ec2-api-tools. That's outside of the scope of this post, and expected as a prerequisite.
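For reference, the OpenStack provider can also pick up the usual OS_* variables from your shell environment; a minimal rc-file might look something like this, where every value below is a placeholder for your own cloud's endpoint and credentials:
export OS_AUTH_URL=http://your-keystone-endpoint:5000/v2.0/
export OS_USERNAME=your-username
export OS_PASSWORD=your-password
export OS_TENANT_NAME=your-tenant
export OS_REGION_NAME=your-region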
The official documentation for configuring your Juju environment
can be found here.
Choosing a Charm-able Application
I have previously charmed two small (but useful!) webapps that I've written and continue to maintain --
Pictor and
Musica. These are both standalone web applications that allow you to organize, serve, share, and stream your picture archive and music collection. But neither of these "scale out", really. They certainly could, perhaps, use a caching proxy on the front end, and shared storage on the back end. But, as I originally wrote them, they do not. Maybe I'll update that, but I don't know of anyone using either of those charms.
In any case, for this experiment, I wanted to write a charm that would "scale out", with Juju's
add-unit command. I wanted to ensure that adding more units to a deployment would result in a bigger and better application.
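In other words, once the service is deployed, growing the cluster should be as simple as something like:
juju add-unit -n 4 john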
For these reasons, I chose the program known as
John-the-Ripper, or just
john. You can trivially install it on any Ubuntu system, with:
sudo apt-get install john
John has been used by Linux system administrators for over a decade to test the quality of their users' passwords. A root user can view the hashes that protect user passwords in files like
/etc/shadow, or even application-level password hashes in a database. Effectively, it can be used to "crack" weak passwords. There are almost certainly evil people using programs like
john to do malicious things. But as long as the good guys have access to a program like
john too, they can ensure that their own passwords are impossible to crack.
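For the curious, a typical local run against your own system's accounts looks something like this (the unshadow helper ships with the john package):
sudo unshadow /etc/passwd /etc/shadow > mypasswd
john mypasswd          # start cracking
john -show mypasswd    # show any passwords cracked so far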
John can work in a number of different "modes". It can use a dictionary of words, and simply hash each of those words looking for a match. The
john-data package ships a word list in
/usr/share/john/password.lst that contains 3,000+ words. You can find much bigger wordlists online as well, such as
this one, which contains over 2 million words.
John can also generate "twists" on these words according to some rules (like changing E's to 3's, and so on). And it can also work in a complete brute force mode, generating every possible password from various character sets. This, of course, will take exponentially longer run times, depending on the length of the password.
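On the command line, those modes look roughly like this (reusing the mypasswd file from the example above):
john -wordlist:/usr/share/john/password.lst mypasswd            # dictionary mode
john -wordlist:/usr/share/john/password.lst -rules mypasswd     # dictionary plus mangling rules
john -incremental mypasswd                                      # brute force mode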
Fortunately, John can run in parallel, with as many workers as you have at your disposal. You can run multiple processes on the same system, or you can scale it out across many systems. There are many
different approaches to parallelizing John, using
OpenMP,
MPI, and others.
I took a very simple approach, explained in the
manpage and configuration file called "External". Basically, in the
/etc/john/john.conf configuration file, you tell each node how many total nodes exist, and which particular node it is. Each node uses the same wordlist or sequential generation algorithm, and keeps a running index of the candidate passwords. Each node takes that index modulo the total number of nodes, and only tries the candidates that match its own id. Dead simple :-) I like it.
# Trivial parallel processing example
[List.External:Parallel]
/*
 * This word filter makes John process some of the words only, for running
 * multiple instances on different CPUs. It can be used with any cracking
 * mode except for "single crack". Note: this is not a good solution, but
 * is just an example of what can be done with word filters.
 */
int node, total;        // This node's number, and node count
int number;             // Current word number

void init()
{
    node = 1; total = 2;    // Node 1 of 2, change as appropriate
    number = node - 1;      // Speedup the filter a bit
}

void filter()
{
    if (number++ % total)   // Word for a different node?
        word = 0;           // Yes, skip it
}
This does, however, require some way of sharing the inputs, logs, and results across all nodes. Basically, I need a shared filesystem. The Juju charm collection has a number of shared filesystem charms already implemented. I chose to use NFS in my deployment, though I could have just as easily used Ceph, Hadoop, or others.
Writing a Charm
The official documentation on writing charms
can be found here. That's certainly a good starting point, and I read all of that before I set out. I also spent considerable time in the
#juju IRC channel on irc.freenode.net, talking to
Jorge and
Marco. Thanks, guys!
The base template of the charm is pretty simple. The convention is to create a charm directory like this, and put it under revision control.
mkdir -p precise/john
bzr init .
I first needed to create the metadata that will describe my charm to Juju. My charm is named
john, after the application known as "John the Ripper", which tests the quality of your passwords. I list myself as the maintainer. This charm requires a shared filesystem that implements the mount interface, as my charm's hooks will make use of that interface. Most importantly, this charm may have other peers, which I arbitrarily called
workers. They have a dummy interface (not used) called
john. Here's the
metadata.yaml:
name: john
summary: "john the ripper"
description: |
  John the Ripper tests the quality of system passwords
maintainer: "Dustin Kirkland"
requires:
  shared-fs:
    interface: mount
peers:
  workers:
    interface: john
I also have one optional configuration parameter, called
target_hashes. This configuration string holds the input data that john will try to break, and can contain one or more password hashes to crack. If it isn't specified, the charm generates some random hashes and tries to break those instead. I thought that would be nice, so that it's immediately useful out of the box. Here's
config.yaml:
options:
  target_hashes:
    type: string
    description: input password hashes
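Once the service is running, an administrator can inspect or change this setting at any time with Juju's get and set commands; for example (using mkpasswd from the whois package to generate an MD5 hash, just as the charm itself does):
juju get john                                                   # show the current configuration
juju set john target_hashes="user0:$(mkpasswd -m md5 hunter2)"  # hand the cluster a new hash to crack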
There are a couple of other simple files to create, such as copyright:
Format: http://dep.debian.net/deps/dep5/
Files: *
Copyright: Copyright 2013, Dustin Kirkland, All Rights Reserved.
License: GPL-3
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation, either version 3 of the License, or
 (at your option) any later version.
 .
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 GNU General Public License for more details.
 .
 You should have received a copy of the GNU General Public License
 along with this program. If not, see <http://www.gnu.org/licenses/>.
README and revision are also required.
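Both are trivial: a sentence or two of description in README, and a simple integer in revision. Something like this is enough:
echo "John the Ripper, deployed as a scalable Juju service" > README
echo 1 > revision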
And the Magic -- Hooks!
The real magic happens in a set of very specifically named hooks. These are specially named executables, which can be written in any language. For my purposes, shell scripts are more than sufficient.
The Install Hook
The install hook is what is run at installation time on each worker node. I need to install the
john and
john-data packages, as well as the
nfs-common client binaries. I also make use of the
mkpasswd utility provided by the
whois package. And I will also use the
keep-one-running tool provided by the
run-one package. Finally, I need to tweak the configuration file,
/etc/john/john.conf, on each node, to use all of the CPU, save results every 10 seconds (instead of every 10 minutes), and to use the much bigger wordlist that we're going to fetch. Here's
hooks/install:
#!/bin/bash
set -eu
juju-log "Installing all components"
apt-get update
apt-get install -qqy nfs-common john john-data whois run-one
DIR=/var/lib/john
mkdir -p $DIR
ln -sf $DIR /root/.john
sed -i -e "s/^Idle = .*/Idle = N/" /etc/john/john.conf
sed -i -e "s/^Save = .*/Save = 10/" /etc/john/john.conf
sed -i -e "s:^Wordlist = .*:Wordlist = $DIR\/passwords.txt:" /etc/john/john.conf
juju-log "Installed packages"
The Start Hook
The start hook defines how to start this application. Ideally, the
john package would provide an init script or upstart job that cleanly daemonizes its workers, but it currently doesn't. But for a poor-man's daemonizer, I love the
keep-one-running utility (
written by yours truly). I'm going to start two copies of the
john utility: one that runs in wordlist mode, trying every one of the 2 million words in my wordlist, and a second that tries every combination of characters in an incremental, brute-force mode. These processes operate entirely in the shared
/var/lib/john NFS mount point. Each copy on each worker node will need to have their own session file. Here's
hooks/start:
#!/bin/bash
set -eu
juju-log "Starting john"
DIR=/var/lib/john
keep-one-running john -incremental -session:$DIR/session-incremental-$(hostname) -external:Parallel $DIR/target_hashes &
keep-one-running john -wordlist:$DIR/passwords.txt -session:$DIR/session-wordlist-$(hostname) -external:Parallel $DIR/target_hashes &
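To verify on a unit that both workers (and their keep-one-running wrappers) are alive, pgrep does the trick:
pgrep -lf keep-one-running
pgrep -lf john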
The Stop Hook
The stop hook defines how to stop the application. Here, I'll need to kill the
keep-one-running processes which wrap
john, since we don't have an upstart job or init script. This is perhaps a little sloppy, but perfectly functional. Here's
hooks/stop:
#!/bin/bash
set -eu
juju-log "Stopping john"
killall keep-one-running || true
The Workers Relation Changed Hook
This hook defines the actions that need to be taken each time another
john worker unit is added to the service. Basically, each worker needs to recount how many total workers there are (using the
relation-list command), determine their own id (from $JUJU_UNIT_NAME), update their
/etc/john/john.conf (using
sed), and then restart their
john worker processes. The last part is easy since we're using
keep-one-running; we simply need to
killall john processes, and
keep-one-running will automatically respawn new processes that will read the updated configuration file. This is
hooks/workers-relation-changed:
#!/bin/bash
set -eu
DIR="/var/lib/john"

update_unit_count() {
    node=$(echo $JUJU_UNIT_NAME | awk -F/ '{print $2}')
    node=$((node+1))
    total=$(relation-list | wc -l)
    total=$((total+1))
    sed -i -e "s/^\s\+node = .*; total = .*;.*$/ node = $node; total = $total;/" /etc/john/john.conf
}

restart_john() {
    killall john || true
    # It'll restart itself via keep-one-running, if we kill it
}

update_unit_count
restart_john
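For example, on the fourth unit (john/3) of an 18-unit deployment, the hook should rewrite the Parallel filter's line in /etc/john/john.conf to read roughly:
node = 4; total = 18;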
The Configuration Changed Hook
All
john worker nodes will operate on a file in the shared filesystem called
/var/lib/john/target_hashes. I'd like the administrator who deployed this service to be able to dynamically update that file and signal all of her worker nodes to restart their
john processes. Here, I used the
config-get juju command, and again restart by simply killing the
john processes and letting
keep-one-running sort out the restart. This is handled here in
hooks/config-changed:
#!/bin/bash
set -e
DIR=/var/lib/john
target_hashes=$(config-get target_hashes)
if [ -n "$target_hashes" ]; then
    # Install the user's supplied hashes
    echo "$target_hashes" > $DIR/target_hashes
    # Restart john
    killall john || true
fi
The Shared Filesystem Relation Changed Hook
By far, the most complicated logic is in
hooks/shared-fs-relation-changed. There's quite a bit of work we need to do here, as soon as we can be assured that this node has successfully mounted its shared filesystem. There's a bit of boilerplate mount work that I borrowed from the
owncloud charm. Beyond that, there's a bit of
john-specific work. I'm downloading the aforementioned larger wordlist. I install the target hash, if specified in the configuration; otherwise, we just generate 10 random target passwords to try and crack. We also symlink a bunch of
john's runtime shared data into the NFS directory. For no good reason,
john expects a bunch of stuff to be in the same directory. Of course, this code could really use some cleanup. Here it is, imperfect but functional, in
hooks/shared-fs-relation-changed:
#!/bin/bash
set -eu

remote_host=`relation-get private-address`
export_path=`relation-get mountpoint`
mount_options=`relation-get options`
fstype=`relation-get fstype`
DIR="/var/lib/john"

if [ -z "${export_path}" ]; then
    juju-log "remote host not ready"
    exit 0
fi

local_mountpoint="$DIR"

create_local_mountpoint() {
    juju-log "creating local mountpoint"
    umask 022
    mkdir -p $local_mountpoint
    chown -R ubuntu:ubuntu $local_mountpoint
}
[ -d "${local_mountpoint}" ] || create_local_mountpoint

share_already_mounted() {
    `mount | grep -q $local_mountpoint`
}

mount_share() {
    for try in {1..3}; do
        juju-log "mounting share"
        [ ! -z "${mount_options}" ] && options="-o ${mount_options}" || options=""
        mount -t $fstype $options $remote_host:$export_path $local_mountpoint \
            && break
        juju-log "mount failed: ${local_mountpoint}"
        sleep 10
    done
}

download_passwords() {
    if [ ! -s $DIR/passwords.txt ]; then
        # Grab a giant dictionary of passwords, 20MB, 2M passwords
        juju-log "Downloading password dictionary"
        cd $DIR
        # http://www.breakthesecurity.com/2011/12/large-password-list-free-download.html
        wget http://dazzlepod.com/site_media/txt/passwords.txt
        juju-log "Done downloading password dictionary"
    fi
}

install_target_hashes() {
    if [ ! -s $DIR/target_hashes ]; then
        target_hashes=$(config-get target_hashes)
        if [ -n "$target_hashes" ]; then
            # Install the user's supplied hashes
            echo "$target_hashes" > $DIR/target_hashes
        else
            # Otherwise, grab some random ones
            i=0
            for p in $(shuf -n 10 $DIR/passwords.txt); do
                # http://openwall.info/wiki/john/Generating-test-hashes
                printf "user${i}:%s\n" $(mkpasswd -m md5 $p) >> $DIR/target_hashes
                i=$((i+1))
            done
        fi
    fi
    for i in /usr/share/john/*; do
        ln -sf $i /var/lib/john
    done
}

apt-get -qqy install rpcbind nfs-common
share_already_mounted || mount_share
download_passwords
install_target_hashes
Deploying the Service
If you're still with me, we're ready to deploy this service and try cracking some passwords! We need to bootstrap our environment, and deploy the stock nfs charm. Next, branch my charm's source code, and deploy it. I deployed it here across a whopping 18 units! I currently have a quota of 20 small instances I can run on our private OpenStack. Two of those instances are used by the Juju bootstrap node and by the NFS server. So the other 18 will be NFS clients running
john processes.
juju bootstrap
juju deploy nfs
bzr branch lp:~kirkland/+junk/john precise
juju deploy -n 18 --repository=precise local:precise/john
juju add-relation john nfs
juju status
Once everything is up and ready, running and functional, my status looks like this:
machines:
  "0":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.230
    instance-id: 98090098-2e08-4326-bc73-22c7c6879b95
    series: precise
  "1":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.7
    instance-id: 449c6c8c-b503-487b-b370-bb9ac7800225
    series: precise
  "2":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.193
    instance-id: 576ffd6f-ddfa-4507-960f-3ac2e11ea669
    series: precise
  "3":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.215
    instance-id: 70bfe985-9e3f-4159-8923-60ab6d9f7d43
    series: precise
  "4":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.221
    instance-id: f48364a9-03c0-496f-9287-0fb294bfaf24
    series: precise
  "5":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.223
    instance-id: 62cc52c4-df7e-448a-81b1-5a3a06af6324
    series: precise
  "6":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.231
    instance-id: f20dee5d-762f-4462-a9ef-96f3c7ab864f
    series: precise
  "7":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.239
    instance-id: 27c6c45d-18cb-4b64-8c6d-b046e6e01f61
    series: precise
  "8":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.240
    instance-id: 63cb9c91-a394-4c23-81bd-c400c8ec4f93
    series: precise
  "9":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.242
    instance-id: b2239923-b642-442d-9008-7d7e725a4c32
    series: precise
  "10":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.249
    instance-id: 90ab019c-a22c-41d3-acd2-d5d7c507c445
    series: precise
  "11":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.252
    instance-id: e7abe8e1-1cdf-4e08-8771-4b816f680048
    series: precise
  "12":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.254
    instance-id: ff2b6ba5-3405-4c80-ae9b-b087bedef882
    series: precise
  "13":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.255
    instance-id: 2b019616-75bc-4227-8b8b-78fd23d6b8fd
    series: precise
  "14":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.1
    instance-id: ecac6e11-c89e-4371-a4c0-5afee41da353
    series: precise
  "15":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.3
    instance-id: 969f3d1c-abfb-4142-8cc6-fc5c45d6cb2c
    series: precise
  "16":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.4
    instance-id: 6bb24a01-d346-4de5-ab0b-03f51271e8bb
    series: precise
  "17":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.5
    instance-id: 924804d6-0893-4e56-aef2-64e089cda1be
    series: precise
  "18":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.11
    instance-id: 5c96faca-c6c0-4be4-903e-a6233325caec
    series: precise
  "19":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.15
    instance-id: 62b48da2-60ea-4c75-b5ed-ffbb2f8982b5
    series: precise
services:
  john:
    charm: local:precise/john-3
    exposed: false
    relations:
      shared-fs:
      - nfs
      workers:
      - john
    units:
      john/0:
        agent-state: started
        agent-version: 1.11.0
        machine: "2"
        public-address: 10.99.60.193
      john/1:
        agent-state: started
        agent-version: 1.11.0
        machine: "3"
        public-address: 10.99.60.215
      john/2:
        agent-state: started
        agent-version: 1.11.0
        machine: "4"
        public-address: 10.99.60.221
      john/3:
        agent-state: started
        agent-version: 1.11.0
        machine: "5"
        public-address: 10.99.60.223
      john/4:
        agent-state: started
        agent-version: 1.11.0
        machine: "6"
        public-address: 10.99.60.231
      john/5:
        agent-state: started
        agent-version: 1.11.0
        machine: "7"
        public-address: 10.99.60.239
      john/6:
        agent-state: started
        agent-version: 1.11.0
        machine: "8"
        public-address: 10.99.60.240
      john/7:
        agent-state: started
        agent-version: 1.11.0
        machine: "9"
        public-address: 10.99.60.242
      john/8:
        agent-state: started
        agent-version: 1.11.0
        machine: "10"
        public-address: 10.99.60.249
      john/9:
        agent-state: started
        agent-version: 1.11.0
        machine: "11"
        public-address: 10.99.60.252
      john/10:
        agent-state: started
        agent-version: 1.11.0
        machine: "12"
        public-address: 10.99.60.254
      john/11:
        agent-state: started
        agent-version: 1.11.0
        machine: "13"
        public-address: 10.99.60.255
      john/12:
        agent-state: started
        agent-version: 1.11.0
        machine: "14"
        public-address: 10.99.61.1
      john/13:
        agent-state: started
        agent-version: 1.11.0
        machine: "15"
        public-address: 10.99.61.3
      john/14:
        agent-state: started
        agent-version: 1.11.0
        machine: "16"
        public-address: 10.99.61.4
      john/15:
        agent-state: started
        agent-version: 1.11.0
        machine: "17"
        public-address: 10.99.61.5
      john/16:
        agent-state: started
        agent-version: 1.11.0
        machine: "18"
        public-address: 10.99.61.11
      john/17:
        agent-state: started
        agent-version: 1.11.0
        machine: "19"
        public-address: 10.99.61.15
  nfs:
    charm: cs:precise/nfs-3
    exposed: false
    relations:
      nfs:
      - john
    units:
      nfs/0:
        agent-state: started
        agent-version: 1.11.0
        machine: "1"
        public-address: 10.99.60.7
Obtaining the Results
And now, let's monitor the results. To do this, I'll
ssh to any of the
john worker nodes, move over to the shared NFS directory, and use the
john -show command in a watch loop.
keep-one-running juju ssh john/0
sudo su -
cd /var/lib/john
watch john -show target_hashes
And the results...
Every 2.0s: john -show target_hashes
user:260775
user1:73832100
user2:829171kzh
user3:pf1vd4nb
user4:7788521312229
user5:saksak
user6:rongjun2010
user7:2312010
user8:davied
user9:elektrohobbi
10 password hashes cracked, 0 left
Within a few seconds, this 18-node cluster has cracked all 10 of the randomly chosen passwords from the dictionary. That's only mildly interesting, as my laptop can do the same in a few minutes, if the passwords are already in the wordlist. What's far more interesting is randomly generating a password, passing it as a new configuration to our running cluster, and letting it crack that instead.
Modifying the Configuration Target Hash
Let's generate a random password using
apg. We'll then need to hash this and create a string in the form of
username:pwhash that
john can understand. Finally, we'll pass this to our cluster using Juju's
set action.
passwd=$(apg -a 0 -n 1 -m 6 -x 6)
target=$(printf "user0:%s\n" $(mkpasswd -m md5 $passwd))
juju set john target_hashes="$target"
This was a 6-character password, drawn randomly from a 52-character set (a-z, A-Z), and almost certainly not in our dictionary. That's 52^6 = 19,770,609,664, or about 19 billion letter combinations we need to test. According to the
john -test command, a single one of my instances can test about 12,500 MD5 hashes per second. So with a single instance, this would take a maximum of 52^6 / 12,500 / 60 / 60 ≈ 439 hours, or about 18 days :-) Well, I happen to have exactly 18 instances, so we should be able to test the entire wordspace in about 24 hours.
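A quick back-of-the-envelope check of that arithmetic in the shell:
echo $((52**6))                      # 19770609664 candidate passwords
echo $((52**6 / 12500 / 3600))       # ~439 hours for a single instance
echo $((52**6 / 12500 / 3600 / 18))  # ~24 hours across 18 instances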
So I threw all 18 instances at this very problem and let it run over a weekend. And voila, we got a little lucky, and cracked the password,
Uvneow, in 16 hours!
In Conclusion
I don't know if this charm will ever land in the official charm store. That really wasn't the goal of this exercise for me. I simply wanted to bring myself back up to speed on Juju, play with the port to Golang, experiment with OpenStack as a provider for Juju, and most importantly, write a scalable Juju charm.
This particular application,
john, is actually just one of a huge class of MPI-compatible parallelizable applications that could be charmed for Juju. The general design, I think, should be very reusable by you, if you're interested. Between the shared file system and the keep-one-running approach, I bet you could charm any one of a number of scalable applications. While I'm not eligible, perhaps you might consider competing for cash prizes in the
Juju Charm Championship.
Happy charming,
:-Dustin