From the Canyon Edge -- :-Dustin

Saturday, December 21, 2013

What you need to know about Intel AMT and the Intel NUC with Ubuntu


A couple of weeks ago, I waxed glowingly about Ubuntu running on a handful of Intel NUCs that I picked up on Amazon, replacing some aging PCs serving various purposes around the house.  I have since returned all three of those, and upgraded to the i5-3427u version, since it supports Intel AMT.  Why would I do that?  Read on...
When my shiny new NUCs arrived, I was quite excited to try out this fancy new AMT feature.  In fact, I had already enabled it and experimented with it on a couple of my development i7 Thinkpads, so I more or less knew what to expect.

But what followed was 6 straight hours of complete and utter frustration :-(  Like slam your fist into the keyboard and shout obscenities into cheese.
Actually, on that last point, I find it useful, when I'm mad, to open up cheese on my desktop and get visibly angry.  Once I realize how dumb I look when I'm angry, it's a bit easier to stop being angry.  Seriously, try it sometime.
Okay, so I posted a couple of support requests on Intel's community forums.

Basically, I found it nearly impossible (like a 1 in 100 chance) to actually get into the AMT configuration menu using the required Ctrl-P.  And the 2 or 3 times I did get in there, the default password, "admin", did not work.

After putting the kids to bed, downing a few pints of homebrewed beer, and attempting sleep (with a 2-week-old in the house), I lay in bed, awake in the middle of the night and it crossed my mind that...
No, no.  No way.  That couldn't be it.  Surely not.  That's really, really dumb.  Is it possible that the NUC's BIOS...  Nah.  Maybe, though.  It's worth a try at this point?  Maybe, just maybe, the NumLock key is enabled at boot???  It can't be.  The NumLock key is effin retarded, and almost as dumb as its braindead cousin, the CapsLock key.  OMFG!!!
Yep, that was it.  Unbelievable.  The system boots with the NumLock key toggled on.  My keyboard doesn't have an LED indicator that tells me such inane nonsense is the case.  And the BIOS doesn't expose a setting to toggle this behavior.  The "P" key is one of the keys that is NumLocked to "*".


So there must be some incredibly unlikely race condition that I could win 1 in 100 times where me pressing Ctrl-P frantically enough actually sneaks me into the AMT configuration.  Seriously, Intel peeps, please make this an F-key, like the rest of the BIOS and early boot options...

And once I was there, the default password, "admin", includes two more keys that are NumLocked.  For security reasons, these look like "*****" no matter what I'm typing.  When I thought I was typing "admin", I was actually typing "ad05n".  And of course, there's no scratch pad where I can test my keyboard and see that this is the case.  In fact, I'm not the only person hitting similar issues.  It seems that most people using keyboards other than US-English are quite confused when they type "admin" over and over and over again, to their frustration.
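
Since there's no scratch pad in the firmware, here's a tiny sketch that replays the remapping in a terminal instead.  It's only an illustration: the function name numlocked is made up, and it covers just the three keys mentioned in this post (P, M, and I), not a full embedded keypad, whose layout varies by keyboard.

```shell
# Replay what the firmware "sees" when NumLock is on.
# Only the three keys named in the post are mapped (p -> *, m -> 0, i -> 5);
# a real embedded keypad remaps more keys, and the layout varies by keyboard.
numlocked() {
    printf '%s\n' "$1" | tr 'pmi' '*05'
}

numlocked admin    # prints: ad05n
```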

Okay, rant over.  I posted my solution back to my own questions on the forum.  And finally started playing with AMT!

The synopsis: AMT is really, really impressive!

First, you need to enter the BIOS and ensure that AMT is enabled.  Then, you need to do whatever it takes to enter Intel's MEBx interface, using Ctrl-P (NumLock notwithstanding).  You'll be prompted for a password, and on your first login, this should be "admin" (NumLock notwithstanding).  Then you'll need to choose your own strong password.  Once in there, you'll need to enable a couple of settings, including networking/DHCP auto setup.  You can, at your option, also install some TLS certificates and secure your communications with your device.

AMT has a very simple, intuitive web interface.  Here is a comprehensive set of screenshots of all of the individual pages.

Once AMT is enabled on the target system, point a browser to port 16992, and click "Log On..."

The username is always "admin".  You'll set this password in the MEBx interface, using Ctrl-P just after BIOS POST.

Here's the basic system status/overview.

The System Information page contains basic information about the system itself, including some of its capabilities.

The processor information page gives you the low down on your CPU.  Search ark.intel.com for your Intel CPU type to see all of its capabilities.

Check your memory capacity, type, speed, etc.

And your disk type, size, and serial number.

NUCs don't have battery information, but my Thinkpad does.

An event log has some interesting early boot and debug information here.

Arguably the most useful page, here you can power a system on, off, or hard reboot it.

If you have wireless capability, you choose whether you want that enabled/disabled when the system is off, suspended, or hibernated.

Here you can configure the network settings.  Unlike a BMC (Baseboard Management Controller) on most server class hardware, which has its own dedicated interface, Intel AMT actually shares the network interface with the Operating System.

AMT actually supports IPv6 networking as well, though I haven't played with it yet.

Configure the hostname and Dynamic DNS here.

You can set up independent user accounts, if necessary.

And with a BIOS update, you can actually use Intel AMT over a wireless connection (if you have an Intel wireless card).

So this pointy/clicky web interface is nice, but not terribly scriptable (without some nasty screen scraping).  What about the command line interface?

The amttool command (provided by the amtterm package in Ubuntu) offers a nice command line interface into some of the functionality exposed by AMT.  You need to export an environment variable, AMT_PASSWORD, and then you can get some remote information about the system:

kirkland@x230:~⟫ amttool 10.0.0.14 info
### AMT info on machine '10.0.0.14' ###
AMT version:  7.1.20
Hostname:     nuc1.
Powerstate:   S0
Remote Control Capabilities:
    IanaOemNumber                   0
    OemDefinedCapabilities          IDER SOL BiosSetup BiosPause
    SpecialCommandsSupported        PXE-boot HD-boot cd-boot
    SystemCapabilitiesSupported     powercycle powerdown powerup reset
    SystemFirmwareCapabilities      f800

You can also retrieve the networking information:

kirkland@x230:~⟫ amttool 10.0.0.14 netinfo
Network Interface 0:
    DhcpEnabled                     true
    HardwareAddressDescription      Wired0
    InterfaceMode                   SHARED_MAC_ADDRESS
    LinkPolicy                      31
    MACAddress                      00-aa-bb-cc-dd-ee
        DefaultGatewayAddress       10.0.0.1
        LocalAddress                10.0.0.14
        PrimaryDnsAddress           10.0.0.1
        SecondaryDnsAddress         0.0.0.0
        SubnetMask                  255.255.255.0
Network Interface 1:
    DhcpEnabled                     true
    HardwareAddressDescription      Wireless1
    InterfaceMode                   SHARED_MAC_ADDRESS
    LinkPolicy                      0
    MACAddress                      ee-ff-aa-bb-cc-dd
        DefaultGatewayAddress       0.0.0.0
        LocalAddress                0.0.0.0
        PrimaryDnsAddress           0.0.0.0
        SecondaryDnsAddress         0.0.0.0
        SubnetMask                  0.0.0.0
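
With more than one NUC on the network, the same read-only queries can be looped over a list of hosts.  Here's a minimal sketch, assuming amttool is installed and AMT_PASSWORD is exported as described above.  The amt_each function and the AMTTOOL variable are hypothetical names of my own for this example, not part of amtterm.

```shell
# Run one read-only amttool subcommand (e.g. info, netinfo) against many hosts.
# AMTTOOL is a hypothetical override, not an amttool feature: set AMTTOOL=echo
# to print the arguments instead of contacting any machine.
amt_each() {
    subcommand="$1"
    shift
    for host in "$@"; do
        "${AMTTOOL:-amttool}" "$host" "$subcommand"
    done
}

# Dry run: shows the arguments of the two invocations, no network involved.
AMTTOOL=echo amt_each info 10.0.0.14 10.0.0.37
```

For a real run, drop the AMTTOOL=echo prefix.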

Far more handy than WoL alone, you can power up, power down, and power cycle the system.

kirkland@x230:~⟫ amttool 10.0.0.14 powerdown
host x220., powerdown [y/N] ? y
execute: powerdown
result: pt_status: success

kirkland@x230:~⟫ amttool 10.0.0.14 powerup
host x220., powerup [y/N] ? y
execute: powerup
result: pt_status: success

kirkland@x230:~⟫ amttool 10.0.0.14 powercycle
host x220., powercycle [y/N] ? y
execute: powercycle
result: pt_status: success

I was a little disappointed that amttool's info command didn't provide nearly as much information as the web interface.  However, I did find a fork of Gerd Hoffmann's original Perl script on SourceForge here.  I don't know the upstream-ability of this code, but it worked very well for my part, and I'm considering sponsoring/merging it into Ubuntu for 14.04.  Anyone have further experience with these enhancements?

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data BIOS
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'BIOS' (1 item):
  (data struct.ver. 1.0)
   Vendor:       'Intel Corp.'
   Version:      'RKPPT10H.86A.0028.2013.1016.1429'
   Release date: '10/16/2013'
   BIOS characteristics: 'PCI' 'BIOS upgradeable' 'BIOS shadowing
allowed' 'Boot from CD' 'Selectable boot' 'EDD spec' 'int13h 5.25 in
1.2 mb floppy' 'int13h 3.5 in 720 kb floppy' 'int13h 3.5 in 2.88 mb
floppy' 'int5h print screen services' 'int14h serial services'
'int17h printer services'

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data ComputerSystem
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'ComputerSystem' (1 item):
  (data struct.ver. 1.0)
   Manufacturer: '                                 '
   Product:      '                                 '
   Version:      '                                 '
   Serial numb.: '                                 '
   UUID:         7ae34e30-44ab-41b7-988f-d98c74ab383d

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data Baseboard
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'Baseboard' (1 item):
  (data struct.ver. 1.0)
   Manufacturer: 'Intel Corporation'
   Product:      'D53427RKE'
   Version:      'G87971-403'
   Serial numb.: '27XC63723G4'
   Asset tag:    'To be filled by O.E.M.'
   Replaceable:  yes

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data Processor
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'Processor' (1 item):
  (data struct.ver. 1.0)
   ID:                  0x4529f9eaac0f
   Max Socket Speed:    2800 MHz
   Current Speed:       1800 MHz
   Processor Status:    Enabled
   Processor Type:      Central
   Socket Populated:    yes
   Processor family:    'Intel(R) Core(TM) i5 processor'
   Upgrade Information: [0x22]
   Socket Designation:  'CPU 1'
   Manufacturer:        'Intel(R) Corporation'
   Version:             'Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz'

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data MemoryModule
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'MemoryModule' (2 items):
  (* No memory device in the socket *)
  (data struct.ver. 1.0)
   Size:         8192 Mb
   Form Factor:  'SODIMM'
   Memory Type:  'DDR3'
   Memory Type Details:, 'Synchronous'
   Speed:        1333 MHz
   Manufacturer: '029E'
   Serial numb.: '123456789'
   Asset Tag:    '9876543210'
   Part Number:  'GE86sTBF5emdppj '

kirkland@x230:/tmp⟫ ./amttool 10.0.0.37 hwasset data VproVerificationTable
## '10.0.0.37' :: AMT Hardware Asset
 Data for the asset 'VproVerificationTable' (1 item):
  (data struct.ver. 1.0)
   CPU: VMX=Enabled SMX=Enabled LT/TXT=Enabled VT-x=Enabled
   MCH: PCI Bus 0x00 / Dev 0x08 / Func 0x00
        Dev Identification Number (DID): 0x0000
        Capabilities: VT-d=NOT_Capable TXT=NOT_Capable Bit_50=Enabled
Bit_52=Enabled Bit_56=Enabled
   ICH: PCI Bus 0x00 / Dev 0xf8 / Func 0x00
        Dev Identification Number (DID): 0x1e56
   ME:  Enabled
        Intel_QST_FW=NOT_Supported Intel_ASF_FW=NOT_Supported
Intel_AMT_FW=Supported Bit_13=Enabled Bit_14=Enabled Bit_15=Enabled
        ME FW ver. 8.1 hotfix 40 build 1416
   TPM: Disabled
        TPM on board = NOT_Supported
   Network Devices:
        Wired NIC - PCI Bus 0x00 / Dev 0xc8 / Func 0x00 / DID 0x1502
   BIOS supports setup screen for (can be editable): VT-d TXT
        supports VA extensions (ACPI Op region) with maximum ver. 2.6
        SPI Flash has Platform Data region reserved.

On a different note, I recently sponsored a package, wsmancli, into Ubuntu Universe for Trusty, at the request of Kent Baxley (Canonical) and Jared Dominguez (Dell), which provides the wsman command.  Jared writes more about it here in this Dell technical post.  With Kent's help, I did manage to get wsman to remotely power on a system.  I must say that it's a bit less user friendly than the equivalent amttool functionality above...

kirkland@x230:~⟫  wsman invoke -a RequestPowerStateChange -J request.xml http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_PowerManagementService?SystemCreationClassName="CIM_ComputerSystem",SystemName="Intel(r)AMT",CreationClassName="CIM_PowerManagementService",Name="Intel(r) AMT Power Management Service" --port 16992 -h 10.0.0.14 --username admin -p "ABC123abc123#" -V -v

I'm really enjoying the ability to remotely administer these systems.  And I'm really, really looking forward to the day when I can use MAAS to provision these systems!

:-Dustin

Why I returned all of my i3 Intel NUCs...

and bought 3 more with the i5-3427u CPU!


A couple of weeks ago, I waxed glowingly about Ubuntu running on a handful of Intel NUCs that I picked up on Amazon, replacing some aging PCs serving various purposes around the house.  I have since returned all three of those...and upgraded to the i5 version!!!  Read on to find out why...
Whenever I publish an article here, the Blogger/G+ integration immediately posts a link to my G+ feed.  In that thread, Mark Shuttleworth asked if these NUCs supported IPMI or a similar technology, such that they could be enabled in MAAS.  I responded in kind, that, sadly, no, they only support tried-and-trusty-but-dumb-old-Wake-on-LAN.

Alas, an old friend, fellow homebrewer, and new Canonicaler, Ryan Harper, noted that the i5-3427u version of the NUC (performance specs here) actually supports Intel AMT, which is similar to IPMI.  Actually, it's an implementation of WBEM, which itself is fundamentally an implementation of the CIM standard.

That's a healthy dose of alphabet soup for you.  MAAS, NUC, AMT, IPMI, WBEM, CIM.  What does all of this mean?

Let's do a quick round of introductions for the uninitiated!
  • NUC - Intel's Next Unit of Computing.  It's a palm sized computer, probably intended to be a desktop, but actually functions quite well as a Linux server too.  Drawing about 10W, it has roughly the same power as an AWS m1.xlarge, and costs about as much as 45 days of an m1.xlarge's EC2 bill.
  • MAAS - Metal as a Service.  Installing Ubuntu servers (or desktops, for that matter), one by one, with a CD/DVD/USB-key is so 2004.  MAAS is your PXE/DHCP/TFTP/DNS (shit, more alphabet soup...) solution, all-in-one, ready to install Ubuntu onto lots of systems at scale!  Oh, and good news...  Juju supports MAAS as one of its environments, which is cool, in that you can deploy any charmed Juju workload to bare metal, in addition to AWS and OpenStack clouds.
  • AMT - Intel's Active Management Technology.  This is a feature found on some Intel platforms (specifically, those whose CPU and motherboard support vPro technology), which enables remote management of the system.  Specifically, if you can authenticate successfully to the system, you can retrieve detailed information about the hardware, power it on and off, and modify the boot sequence.  These are the essential functions that MAAS requires to support a system.
  • IPMI - Intelligent Platform Management Interface.  Also pioneered by Intel, this is a more server focused remote network management of systems, providing power on/off and other capabilities.
  • WBEM - Web-Based Enterprise Management.  Remote system management technology available through a web browser, based on some internet standards, including CIM.
  • CIM - Common Information Model.  An open standard that defines how systems in an IT environment are represented and managed.  Does that sound meta to you?  Well, yes, yes it is.
Okay, we have our vocabulary...now what?

So I actually returned all 3 of my Intel NUCs, which had the i3 processor, in favor of the more powerful (and slightly more expensive) i5 versions.  Note that I specifically bought the i5 Ivy Bridge versions, rather than the newer i5 Haswell, because only the Ivy Bridge actually supports AMT (for reasons that I cannot explain).  In fact, in comparison to Haswell, the Ivy Bridge systems:
  1. have AMT
  2. are less expensive
  3. have a higher maximum clock speed
  4. support a higher maximum memory
The only advantage I can see of the newer Haswells is a slightly lower energy footprint, and a slightly better video processor.

When 3 of my shiny new NUCs arrived, I was quite excited to try out this fancy new AMT feature.  In fact, I had already enabled it and experimented with it on a couple of my development i7 Thinkpads, so I more or less knew what to expect.

At this point, I split this post in two.  You're welcome to read on, to learn what you need to know about Intel AMT + Ubuntu + the i5-3427u NUC...

:-Dustin

Monday, July 22, 2013

JohnJohn -- A Scalable Juju Charm Tutorial

UPDATE: I wrote this charm and blog post before I saw the unfortunate news that UbuntuForums.org, along with Apple's Developer websites, had very recently been compromised and their user databases stolen.  Given that such events have sadly become commonplace, the instructions below can actually be used by proactive administrators to identify and disable weak user passwords, expecting that the bad guys are already doing the same.

It's been about 2 years since I've written a Juju charm.  And even those that I wrote were not scale-out applications.

I've been back at Canonical for two weeks now, and I've been spending some time bringing myself up to speed on the cloud projects that form the basis for the Cloud Solution products, for which I'm responsible. First, I deployed MAAS, and then brought up a small Ubuntu OpenStack cluster.  Finally, I decided to tackle Juju and rather than deploying one of the existing charms, I wanted to write my own.

Installing Juju

Juju was originally written in Python, but has since been ported to Golang over the last 2+ years.  My previous experience was exclusively with the Python version of Juju, but all new development is now focused on the Golang version of Juju, also known as juju-core.  So at this point, I decided to install juju-core from the 13.04 (raring) archive.

sudo apt-get install juju-core

I immediately hit a couple of bugs in the version of juju-core in 13.04 (1.10.0.1-0ubuntu1~ubuntu13.04.1), particularly Bug #1172973.  Life is more fun on the edge anyway, so I upgraded to a daily snapshot from the PPA.

sudo apt-add-repository ppa:juju/devel
sudo apt-get update
sudo apt-get install juju-core

Now I'm running juju-core 1.11.2-3~1414~raring1, and it's currently working.

Configuring Juju

Juju can be configured to use a number of different cloud backends as "providers", notably, Amazon EC2, OpenStack, MAAS, and HP Cloud.

For my development, I'm using Canonical's internal deployment of OpenStack, and so I configured my environment accordingly in ~/.juju/environments.yaml:

default: openstack
environments:
  openstack:
    type: openstack
    admin-secret: any-secret-you-choose-randomly
    control-bucket: any-bucket-name-you-choose-randomly
    default-series: precise
    auth-mode: userpass

Using OpenStack (or even AWS for that matter) also requires defining a number of environment variables in an rc-file.  Basically, you need to be able to launch instances using euca2ools or ec2-api-tools.  That's outside of the scope of this post, and expected as a prerequisite.

The official documentation for configuring your Juju environment can be found here.

Choosing a Charm-able Application

I have previously charmed two small (but useful!) webapps that I've written and continue to maintain -- Pictor and Musica.  These are both standalone web applications that allow you to organize, serve, share, and stream your picture archive and music collection.  But neither of these "scale out", really.  They certainly could, perhaps, use a caching proxy on the front end, and shared storage on the back end.  But, as I originally wrote them, they do not.  Maybe I'll update that, but I don't know of anyone using either of those charms.

In any case, for this experiment, I wanted to write a charm that would "scale out", with Juju's add-unit command.  I wanted to ensure that adding more units to a deployment would result in a bigger and better application.

For these reasons, I chose the program known as John-the-Ripper, or just john.  You can trivially install it on any Ubuntu system, with:

sudo apt-get install john

John has been used by Linux system administrators for over a decade to test the quality of their users' passwords.  A root user can view the hashes that protect user passwords in files like /etc/shadow or even application level password hashes in a database.  Effectively, john can be used to "crack" weak passwords.  There are almost certainly evil people using programs like john to do malicious things.  But as long as the good guys have access to a program like john too, they can ensure that their own passwords are much harder to crack.

John can work in a number of different "modes".  It can use a dictionary of words, and simply hash each of those words looking for a match.  The john-data package ships a word list in /usr/share/john/password.lst that contains 3,000+ words.  You can find much bigger wordlists online as well, such as this one, which contains over 2 million words.
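
To make the wordlist idea concrete, here's a minimal sketch of the loop: hash each candidate and compare it with the target.  This is not how john is implemented.  The function names are made up, and unsalted SHA-256 (via sha256sum from coreutils) is a stand-in I picked for brevity, whereas real password hashes are salted.

```shell
# Hash one candidate word. Unsalted SHA-256 is a stand-in chosen for brevity;
# real /etc/shadow entries use salted schemes such as md5crypt or sha512crypt.
hash_word() {
    printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Read candidate words on stdin; print the first one whose hash equals $1.
# Returns non-zero when no word in the list matches.
dictionary_attack() {
    target="$1"
    while IFS= read -r word; do
        if [ "$(hash_word "$word")" = "$target" ]; then
            printf '%s\n' "$word"
            return 0
        fi
    done
    return 1
}

# Usage: build a target from a known word, then recover it from a tiny list.
target=$(hash_word letmein)
printf '%s\n' password 123456 letmein qwerty | dictionary_attack "$target"
```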

John can also generate "twists" on these words according to some rules (like changing E's to 3's, and so on).  And it can also work in a complete brute force mode, generating every possible password from various character sets.  This, of course, takes far longer: the run time grows exponentially with the length of the password.
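
Two quick illustrations of that paragraph, as a rough sketch rather than anything john actually ships.  The twists function applies a single made-up rule (e becomes 3), and keyspace counts how many candidates a brute force run has to cover for a given alphabet size and password length.

```shell
# Print a word plus one rule-based variant (every e replaced by 3).
twists() {
    printf '%s\n' "$1"
    printf '%s\n' "$1" | tr 'e' '3'
}

# Number of candidates for passwords of exactly $2 characters drawn from an
# alphabet of $1 symbols: alphabet^length, computed by repeated multiplication.
keyspace() {
    count=1
    i=0
    while [ "$i" -lt "$2" ]; do
        count=$((count * $1))
        i=$((i + 1))
    done
    printf '%s\n' "$count"
}

twists secret    # prints: secret, then s3cr3t
keyspace 26 4    # prints: 456976
keyspace 26 8    # prints: 208827064576
```

Doubling the length from 4 to 8 lowercase letters squares the number of candidates, which is why incremental mode benefits so much from extra workers.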

Fortunately, John can run in parallel, with as many workers as you have at your disposal.  You can run multiple processes on the same system, or you can scale it out across many systems.  There are many different approaches to parallelizing John, using OpenMP, MPI, and others.

I took a very simple approach, explained in the manpage and configuration file, called "External" mode.  Basically, in the /etc/john/john.conf configuration file, you tell each node how many total nodes exist, and which particular node it is.  Each node uses the same wordlist or sequential generation algorithm, and indexes the candidates.  Each node takes the current index modulo the total number of nodes, and tries only the candidate passwords that match its own id.  Dead simple :-)  I like it.

# Trivial parallel processing example
[List.External:Parallel]
/*
 * This word filter makes John process some of the words only, for running
 * multiple instances on different CPUs.  It can be used with any cracking
 * mode except for "single crack".  Note: this is not a good solution, but
 * is just an example of what can be done with word filters.
 */
int node, total;                        // This node's number, and node count
int number;                             // Current word number
void init()
{
        node = 1; total = 2;            // Node 1 of 2, change as appropriate
        number = node - 1;              // Speedup the filter a bit
}
void filter()
{
        if (number++ % total)           // Word for a different node?
                word = 0;               // Yes, skip it
}
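
To see which words each node ends up with, the same arithmetic can be replayed on a short list.  This is just a sketch with a made-up function name, node_words, and it uses awk in place of john's external-mode language.

```shell
# Print the stdin lines that node $1 (1-based) of $2 total nodes would try.
# Mirrors the filter above: the counter starts at node-1, advances once per
# word, and a word is kept only when counter % total is zero.
node_words() {
    awk -v node="$1" -v total="$2" '((node - 1) + (NR - 1)) % total == 0'
}

# Six candidate words split across three nodes.
printf '%s\n' a b c d e f | node_words 1 3    # prints a and d
printf '%s\n' a b c d e f | node_words 2 3    # prints c and f
printf '%s\n' a b c d e f | node_words 3 3    # prints b and e
```

Every word lands on exactly one node, as long as the node numbers run from 1 to total with no gaps.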

This does, however, require some way of sharing the inputs, logs, and results across all nodes.  Basically, I need a shared filesystem.  The Juju charm collection has a number of shared filesystem charms already implemented.  I chose to use NFS in my deployment, though I could have just as easily used Ceph, Hadoop, or others.

Writing a Charm

The official documentation on writing charms can be found here.  That's certainly a good starting point, and I read all of that before I set out.  I also spent considerable time in the #juju IRC channel on irc.freenode.net, talking to Jorge and Marco.  Thanks, guys!

The base template of the charm is pretty simple.  The convention is to create a charm directory like this, and put it under revision control.

mkdir -p precise/john
bzr init .

I first needed to create the metadata that will describe my charm to Juju.  My charm is named john, which is an application known as "John the Ripper", which can test the quality of your passwords.  I list myself as the maintainer.  This charm requires a shared filesystem that implements the mount interface, as my charm will call some hooks that make use of that mount interface.  Most importantly, this charm may have other peers, which I arbitrarily called workers.  They have a dummy interface (not used) called john.  Here's the metadata.yaml:

name: john
summary: "john the ripper"
description: |
 John the Ripper tests the quality of system passwords
maintainer: "Dustin Kirkland"
requires:
  shared-fs:
    interface: mount
peers:
  workers:
    interface: john

I also have one optional configuration parameter, called target_hashes.  This configuration string will include the input data that john will work on, trying to break.  This can be one to many different password hashes to crack.  If this isn't specified, this charm actually generates some random ones, and then tries to break those.  I thought that would be nice, so that it's immediately useful out of the box.  Here's config.yaml:

options:
  target_hashes:
    type: string
    description: input password hashes

There are a couple of other simple files to create, such as copyright:

Format: http://dep.debian.net/deps/dep5/

Files: *
Copyright: Copyright 2013, Dustin Kirkland , All Rights Reserved.
License: GPL-3
 This program is free software: you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation, either version 3 of the License, or
 (at your option) any later version.
 .
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 .
 You should have received a copy of the GNU General Public License
 along with this program.  If not, see .

README and revision are also required.
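
Putting the pieces together, the layout so far can be stubbed out in one go.  This is a sketch under a few assumptions: the charm_skeleton function name is made up, the hook names are the ones covered in the sections that follow, and starting revision at 1 is my own choice.

```shell
# Create the file layout described in this post under directory $1.
# Existing files are left alone; new hooks start as empty executables.
charm_skeleton() {
    root="$1"
    mkdir -p "$root/hooks"
    for f in metadata.yaml config.yaml copyright README; do
        [ -e "$root/$f" ] || : > "$root/$f"
    done
    # revision holds a single integer; starting at 1 is an assumption here.
    [ -e "$root/revision" ] || echo 1 > "$root/revision"
    for hook in install start stop config-changed \
                workers-relation-changed shared-fs-relation-changed; do
        [ -e "$root/hooks/$hook" ] || : > "$root/hooks/$hook"
        chmod +x "$root/hooks/$hook"
    done
}

# Usage: build the skeleton in a scratch directory and list it.
scratch=$(mktemp -d)
charm_skeleton "$scratch/precise/john"
ls "$scratch/precise/john" "$scratch/precise/john/hooks"
```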

And the Magic -- Hooks!

The real magic happens in a set of very specifically named hooks.  These are specially named executables, which can be written in any language.  For my purposes, shell scripts are more than sufficient.

The Install Hook

The install hook is what is run at installation time on each worker node.  I need to install the john and john-data packages, as well as the nfs-common client binaries.  I also make use of the mkpasswd utility provided by the whois package.  And I will also use the keep-one-running tool provided by the run-one package.  Finally, I need to tweak the configuration file, /etc/john/john.conf, on each node, to use all of the CPU, save results every 10 seconds (instead of every 10 minutes), and to use the much bigger wordlist that we're going to fetch.  Here's hooks/install:

#!/bin/bash
set -eu
juju-log "Installing all components"
apt-get update
apt-get install -qqy nfs-common john john-data whois run-one
DIR=/var/lib/john
mkdir -p $DIR
ln -sf $DIR /root/.john
sed -i -e "s/^Idle = .*/Idle = N/" /etc/john/john.conf
sed -i -e "s/^Save = .*/Save = 10/" /etc/john/john.conf
sed -i -e "s:^Wordlist = .*:Wordlist = $DIR\/passwords.txt:" /etc/john/john.conf
juju-log "Installed packages"
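
Before letting sed loose on the real /etc/john/john.conf, the three substitutions can be tried on a scratch file.  Here's a minimal sketch, assuming GNU sed (for -i).  The tune_john_conf name and the three starting values in the sample file are made up for the demo.

```shell
# Apply the same three substitutions as the install hook to file $1,
# pointing Wordlist at $2/passwords.txt.
tune_john_conf() {
    conf="$1"
    dir="$2"
    sed -i -e "s/^Idle = .*/Idle = N/" \
           -e "s/^Save = .*/Save = 10/" \
           -e "s:^Wordlist = .*:Wordlist = $dir/passwords.txt:" "$conf"
}

# Usage on a scratch copy; the three starting values are made up for the demo.
sample=$(mktemp)
printf '%s\n' 'Idle = Y' 'Save = 600' 'Wordlist = /tmp/old.lst' > "$sample"
tune_john_conf "$sample" /var/lib/john
cat "$sample"
```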

The Start Hook

The start hook defines how to start this application.  Ideally, the john package would provide an init script or upstart job that cleanly daemonizes its workers, but it currently doesn't.  But for a poor-man's daemonizer, I love the keep-one-running utility (written by yours truly).  I'm going to start two copies of the john utility, one that runs in wordlist mode, trying every one of the 2 million words in my wordlist, as well as a second, which tries every combination of characters in an incremental, brute force mode.  These binaries are going to operate entirely in the shared /var/lib/john NFS mount point.  Each copy on each worker node will need to have its own session file.  Here's hooks/start:

#!/bin/bash
set -eu
juju-log "Starting john"
DIR=/var/lib/john
keep-one-running john -incremental -session:$DIR/session-incremental-$(hostname) -external:Parallel $DIR/target_hashes &
keep-one-running john -wordlist:$DIR/passwords.txt -session:$DIR/session-wordlist-$(hostname) -external:Parallel $DIR/target_hashes &
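
The respawn idea behind this hook boils down to a loop.  The sketch below is not keep-one-running's actual code, and it leaves out the locking that guarantees a single instance.  The respawn name and its run limit are made up so that the example terminates.

```shell
# Rerun a command each time it exits, up to $1 runs in total.
# The run limit is a hypothetical addition so the sketch terminates;
# keep-one-running itself has no such argument and keeps going.
respawn() {
    max_runs="$1"
    shift
    runs=0
    while [ "$runs" -lt "$max_runs" ]; do
        "$@" || true        # ignore the exit status, as a supervisor would
        runs=$((runs + 1))
    done
}

respawn 3 echo "worker exited, starting again"
```

This is why the later hooks can restart a worker just by killing john: the wrapper starts a fresh copy, which rereads the configuration file.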

The Stop Hook

The stop hook defines how to stop the application.  Here, I'll need to kill the keep-one-running processes which wrap john, since we don't have an upstart job or init script.  This is perhaps a little sloppy, but perfectly functional.  Here's hooks/stop:

#!/bin/bash
set -eu
juju-log "Stopping john"
killall keep-one-running || true

The Workers Relation Changed Hook

This hook defines the actions that need to be taken each time another john worker unit is added to the service.  Basically, each worker needs to recount how many total workers there are (using the relation-list command), determine their own id (from $JUJU_UNIT_NAME), update their /etc/john/john.conf (using sed), and then restart their john worker processes.  The last part is easy since we're using keep-one-running; we simply need to killall john processes, and keep-one-running will automatically respawn new processes that will read the updated configuration file.  This is hooks/workers-relation-changed:

#!/bin/bash
set -eu
DIR="/var/lib/john"
update_unit_count() {
        node=$(echo $JUJU_UNIT_NAME | awk -F/ '{print $2}')
        node=$((node+1))
        total=$(relation-list | wc -l)
        total=$((total+1))
        sed -i -e "s/^\s\+node = .*; total = .*;.*$/        node = $node; total = $total;/" /etc/john/john.conf
}
restart_john() {
        killall john || true
        # It'll restart itself via keep-one-running, if we kill it
}
update_unit_count
restart_john

The Configuration Changed Hook

All john worker nodes will operate on a file in the shared filesystem called /var/lib/john/target_hashes.  I'd like the administrator who deployed this service to be able to dynamically update that file and signal all of her worker nodes to restart their john processes.  Here, I used the config-get juju command, and again restart by simply killing the john processes and letting keep-one-running sort out the restart.  This is handled here in hooks/config-changed:

#!/bin/bash
set -e
DIR=/var/lib/john
target_hashes=$(config-get target_hashes)
if [ -n "$target_hashes" ]; then
        # Install the user's supplied hashes
        echo "$target_hashes" > $DIR/target_hashes
        # Restart john
        killall john || true
fi

The Shared Filesystem Relation Changed Hook

By far, the most complicated logic is in hooks/shared-fs-relation-changed.  There's quite a bit of work we need to do here, as soon as we can be assured that this node has successfully mounted its shared filesystem.  There's a bit of boilerplate mount work that I borrowed from the owncloud charm.  Beyond that, there's a bit of john-specific work.  I'm downloading the aforementioned larger wordlist.  I install the target hash, if specified in the configuration; otherwise, we just generate 10 random target passwords to try and crack.  We also symlink a bunch of john's runtime shared data into the NFS directory.  For no good reason, john expects a bunch of stuff to be in the same directory.  Of course, this code could really use some cleanup.  Here it is again, non-perfect, but functional hooks/shared-fs-relation-changed:

#!/bin/bash
set -eu

remote_host=`relation-get private-address`
export_path=`relation-get mountpoint`
mount_options=`relation-get options`
fstype=`relation-get fstype`
DIR="/var/lib/john"

if [ -z "${export_path}" ]; then
    juju-log "remote host not ready"
    exit 0
fi

local_mountpoint="$DIR"

create_local_mountpoint() {
  juju-log "creating local mountpoint"
  umask 022
  mkdir -p $local_mountpoint
  chown -R ubuntu:ubuntu $local_mountpoint
}
[ -d "${local_mountpoint}" ] || create_local_mountpoint

share_already_mounted() {
  # Succeeds if the mountpoint appears in the list of mounted filesystems
  mount | grep -q "$local_mountpoint"
}

mount_share() {
  for try in {1..3}; do
    juju-log "mounting share"
    [ ! -z "${mount_options}" ] && options="-o ${mount_options}" || options=""
    mount  -t $fstype $options $remote_host:$export_path $local_mountpoint \
      && break

    juju-log "mount failed: ${local_mountpoint}"
    sleep 10

  done
}

download_passwords() {
  if [ ! -s $DIR/passwords.txt ]; then
    # Grab a giant dictionary of passwords, 20MB, 2M passwords
    juju-log "Downloading password dictionary"
    cd $DIR
    # http://www.breakthesecurity.com/2011/12/large-password-list-free-download.html
    wget http://dazzlepod.com/site_media/txt/passwords.txt
    juju-log "Done downloading password dictionary"
  fi
}

install_target_hashes() {
  if [ ! -s $DIR/target_hashes ]; then
    target_hashes=$(config-get target_hashes)
    if [ -n "$target_hashes" ]; then
      # Install the user's supplied hashes
      echo "$target_hashes" > $DIR/target_hashes
    else
      # Otherwise, grab some random ones
      i=0
      for p in $(shuf -n 10 $DIR/passwords.txt); do
        # http://openwall.info/wiki/john/Generating-test-hashes
        printf "user${i}:%s\n" $(mkpasswd -m md5 $p) >> $DIR/target_hashes
        i=$((i+1))
      done
    fi
  fi
  for i in /usr/share/john/*; do
   ln -sf $i /var/lib/john
  done
}

apt-get -qqy install rpcbind nfs-common
share_already_mounted || mount_share
download_passwords
install_target_hashes
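
One thing worth noting about the retry loop in mount_share: it appears to return success even if all three attempts fail, since the last command run in the loop is sleep.  A minimal sketch of a generic retry helper that reports failure to its caller might look like this.  The name retry and its argument order are my own assumptions, not anything provided by Juju or the charm.

```shell
#!/bin/bash
# Sketch only: retry is a hypothetical helper, not part of the charm.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, or 1 if every attempt fails.
retry() {
    local attempts="$1" delay="$2" try=1
    shift 2
    while [ "$try" -le "$attempts" ]; do
        # Run the command with its arguments; stop on the first success
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is still to come
        if [ "$try" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        try=$((try + 1))
    done
    return 1
}
```

With something like this, the hook could run retry 3 10 mount -t "$fstype" ... and exit non-zero when the share never mounts, rather than carrying on to download the wordlist into an unmounted directory.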

Deploying the Service

If you're still with me, we're ready to deploy this service and try cracking some passwords!  We need to bootstrap our environment, and deploy the stock nfs charm.  Next, branch my charm's source code, and deploy it.  I deployed it here across a whopping 18 units!  I currently have a quota of 20 small instances I can run on our private OpenStack.  Two of those instances are used by the Juju bootstrap node and by the NFS server.  So the other 18 will be NFS clients running john processes.

juju bootstrap
juju deploy nfs
bzr branch lp:~kirkland/+junk/john precise
juju deploy -n 18 --repository=precise local:precise/john
juju add-relation john nfs
juju status

Once everything is up and ready, running and functional, my status looks like this:

machines:
  "0":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.230
    instance-id: 98090098-2e08-4326-bc73-22c7c6879b95
    series: precise
  "1":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.7
    instance-id: 449c6c8c-b503-487b-b370-bb9ac7800225
    series: precise
  "2":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.193
    instance-id: 576ffd6f-ddfa-4507-960f-3ac2e11ea669
    series: precise
  "3":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.215
    instance-id: 70bfe985-9e3f-4159-8923-60ab6d9f7d43
    series: precise
  "4":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.221
    instance-id: f48364a9-03c0-496f-9287-0fb294bfaf24
    series: precise
  "5":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.223
    instance-id: 62cc52c4-df7e-448a-81b1-5a3a06af6324
    series: precise
  "6":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.231
    instance-id: f20dee5d-762f-4462-a9ef-96f3c7ab864f
    series: precise
  "7":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.239
    instance-id: 27c6c45d-18cb-4b64-8c6d-b046e6e01f61
    series: precise
  "8":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.240
    instance-id: 63cb9c91-a394-4c23-81bd-c400c8ec4f93
    series: precise
  "9":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.242
    instance-id: b2239923-b642-442d-9008-7d7e725a4c32
    series: precise
  "10":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.249
    instance-id: 90ab019c-a22c-41d3-acd2-d5d7c507c445
    series: precise
  "11":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.252
    instance-id: e7abe8e1-1cdf-4e08-8771-4b816f680048
    series: precise
  "12":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.254
    instance-id: ff2b6ba5-3405-4c80-ae9b-b087bedef882
    series: precise
  "13":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.60.255
    instance-id: 2b019616-75bc-4227-8b8b-78fd23d6b8fd
    series: precise
  "14":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.1
    instance-id: ecac6e11-c89e-4371-a4c0-5afee41da353
    series: precise
  "15":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.3
    instance-id: 969f3d1c-abfb-4142-8cc6-fc5c45d6cb2c
    series: precise
  "16":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.4
    instance-id: 6bb24a01-d346-4de5-ab0b-03f51271e8bb
    series: precise
  "17":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.5
    instance-id: 924804d6-0893-4e56-aef2-64e089cda1be
    series: precise
  "18":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.11
    instance-id: 5c96faca-c6c0-4be4-903e-a6233325caec
    series: precise
  "19":
    agent-state: started
    agent-version: 1.11.0
    dns-name: 10.99.61.15
    instance-id: 62b48da2-60ea-4c75-b5ed-ffbb2f8982b5
    series: precise
services:
  john:
    charm: local:precise/john-3
    exposed: false
    relations:
      shared-fs:
      - nfs
      workers:
      - john
    units:
      john/0:
        agent-state: started
        agent-version: 1.11.0
        machine: "2"
        public-address: 10.99.60.193
      john/1:
        agent-state: started
        agent-version: 1.11.0
        machine: "3"
        public-address: 10.99.60.215
      john/2:
        agent-state: started
        agent-version: 1.11.0
        machine: "4"
        public-address: 10.99.60.221
      john/3:
        agent-state: started
        agent-version: 1.11.0
        machine: "5"
        public-address: 10.99.60.223
      john/4:
        agent-state: started
        agent-version: 1.11.0
        machine: "6"
        public-address: 10.99.60.231
      john/5:
        agent-state: started
        agent-version: 1.11.0
        machine: "7"
        public-address: 10.99.60.239
      john/6:
        agent-state: started
        agent-version: 1.11.0
        machine: "8"
        public-address: 10.99.60.240
      john/7:
        agent-state: started
        agent-version: 1.11.0
        machine: "9"
        public-address: 10.99.60.242
      john/8:
        agent-state: started
        agent-version: 1.11.0
        machine: "10"
        public-address: 10.99.60.249
      john/9:
        agent-state: started
        agent-version: 1.11.0
        machine: "11"
        public-address: 10.99.60.252
      john/10:
        agent-state: started
        agent-version: 1.11.0
        machine: "12"
        public-address: 10.99.60.254
      john/11:
        agent-state: started
        agent-version: 1.11.0
        machine: "13"
        public-address: 10.99.60.255
      john/12:
        agent-state: started
        agent-version: 1.11.0
        machine: "14"
        public-address: 10.99.61.1
      john/13:
        agent-state: started
        agent-version: 1.11.0
        machine: "15"
        public-address: 10.99.61.3
      john/14:
        agent-state: started
        agent-version: 1.11.0
        machine: "16"
        public-address: 10.99.61.4
      john/15:
        agent-state: started
        agent-version: 1.11.0
        machine: "17"
        public-address: 10.99.61.5
      john/16:
        agent-state: started
        agent-version: 1.11.0
        machine: "18"
        public-address: 10.99.61.11
      john/17:
        agent-state: started
        agent-version: 1.11.0
        machine: "19"
        public-address: 10.99.61.15
  nfs:
    charm: cs:precise/nfs-3
    exposed: false
    relations:
      nfs:
      - john
    units:
      nfs/0:
        agent-state: started
        agent-version: 1.11.0
        machine: "1"
        public-address: 10.99.60.7
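
Scanning that much YAML by eye gets tedious with 18 units.  Here is a rough sketch that counts how many units of a service report agent-state: started.  The count_started_units function is a hypothetical helper of mine, and it assumes the status layout shown above (a unit line such as john/0: followed by its agent-state line); other Juju versions may format their status differently.

```shell
#!/bin/bash
# Sketch only: count_started_units is a hypothetical helper.
# Usage: juju status | count_started_units [SERVICE]
# Reads status output on stdin and prints the number of SERVICE units
# (default: john) whose agent-state is "started".
count_started_units() {
    local service="${1:-john}"
    awk -v svc="$service" '
        # A unit entry looks like "      john/0:"
        $0 ~ ("^ +" svc "/[0-9]+:$") { in_unit = 1; next }
        # The first agent-state line after a unit entry belongs to that unit
        in_unit && $1 == "agent-state:" {
            if ($2 == "started") n++
            in_unit = 0
        }
        END { print n + 0 }
    '
}
```

Once this prints 18 for the deployment above, all of the workers are up.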

Obtaining the Results

And now, let's monitor the results.  To do this, I'll ssh to any of the john worker nodes, move over to the shared NFS directory, and use the john -show command in a watch loop.

keep-one-running juju ssh john/0
sudo su -
cd /var/lib/john
watch john -show target_hashes

And the results...
Every 2.0s: john -show target_hashes

user:260775
user1:73832100
user2:829171kzh
user3:pf1vd4nb
user4:7788521312229
user5:saksak
user6:rongjun2010
user7:2312010
user8:davied
user9:elektrohobbi

10 password hashes cracked, 0 left

Within a few seconds, this 18-node cluster has cracked all 10 of the randomly chosen passwords from the dictionary.  That's only mildly interesting, as my laptop can do the same in a few minutes, if the passwords are already in the wordlist.  What's far more interesting is randomly generating a password, passing that as a new configuration to our running cluster, and letting it crack that instead.

Modifying the Configuration Target Hash

Let's generate a random password using apg.  We'll then need to hash this and create a string in the form of username:pwhash that john can understand.  Finally, we'll pass this to our cluster using Juju's set action.

passwd=$(apg -a 0 -n 1 -m 6 -x 6)
target=$(printf "user0:%s\n" $(mkpasswd -m md5 $passwd))
juju set john target_hashes="$target"

This was a 6-character password drawn from 52 possible characters (a-z, A-Z), almost certainly not in our dictionary.  52^6 = 19,770,609,664, or nearly 20 billion letter combinations we need to test.  According to the john -test command, a single one of my instances can test about 12,500 MD5 hashes per second.  So with a single instance, this would take a maximum of 52^6 / 12,500 / 60 / 60 = 439 hours. Or 18 days :-) Well, I happen to have exactly 18 instances, so we should be able to test the entire wordspace in about 24 hours.
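
For anyone who wants to check that arithmetic, here is a small sketch using shell integer math.  The crack_hours function is a hypothetical helper, and it assumes the keyspace is split perfectly evenly across instances with no overhead; it gives the worst case, and on average a password would turn up in about half that time.

```shell
#!/bin/bash
# Sketch only: crack_hours is a hypothetical helper for the estimate above.
# Usage: crack_hours KEYSPACE HASHES_PER_SECOND INSTANCES
# Prints the worst-case number of whole hours needed to try every candidate.
crack_hours() {
    local keyspace="$1" rate="$2" instances="$3"
    echo $(( keyspace / rate / instances / 3600 ))
}

# 52 possible letters in each of 6 positions
keyspace=$(( 52 * 52 * 52 * 52 * 52 * 52 ))

echo "keyspace:     $keyspace"
echo "1 instance:   $(crack_hours "$keyspace" 12500 1) hours"
echo "18 instances: $(crack_hours "$keyspace" 12500 18) hours"
```

This prints a keyspace of 19770609664, then 439 hours for one instance and 24 hours for 18 instances.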

So I threw all 18 instances at this very problem and let it run over a weekend. And voila, we got a little lucky, and cracked the password, Uvneow, in 16 hours!

In Conclusion

I don't know if this charm will ever land in the official charm store.  That really wasn't the goal of this exercise for me.  I simply wanted to bring myself back up to speed on Juju, play with the port to Golang, experiment with OpenStack as a provider for Juju, and most importantly, write a scalable Juju charm.

This particular application, john, is actually just one of a huge class of MPI-compatible parallelizable applications that could be charmed for Juju.  The general design, I think, should be very reusable by you, if you're interested.  Between the shared filesystem and the keep-one-running approach, I bet you could charm any one of a number of scalable applications.  While I'm not eligible, perhaps you might consider competing for cash prizes in the Juju Charm Championship.

Happy charming,
:-Dustin
