From the Canyon Edge -- :-Dustin

Friday, February 16, 2018

10 Amazing Years of Ubuntu and Canonical

February 2008, Canonical's office in Lexington, MA
10 years ago today, I joined Canonical, on the very earliest version of the Ubuntu Server Team!

And in the decade since, I've had the tremendous privilege to work with so many amazing people, and the opportunity to contribute so much open source software to the Ubuntu ecosystem.

Marking the occasion, I've reflected about much of my work over that time period and thought I'd put down in writing a few of the things I'm most proud of (in chronological order)...  Maybe one day, my daughters will read this and think their daddy was a real geek :-)

1. update-motd / motd.ubuntu.com (September 2008)

Throughout the history of UNIX, the "message of the day" was always manually edited and updated by the local system administrator.  Until Ubuntu's message-of-the-day.  In fact, I received an email from Dennis Ritchie and Jon "maddog" Hall, confirming this, in April 2010.  This started as a feature request for the Landscape team, but has turned out to be tremendously useful and informative to all Ubuntu users.  Just last year, we launched motd.ubuntu.com, which provides even more dynamic information about important security vulnerabilities and general news from the Ubuntu ecosystem.  Mathias Gug helped me with the design and publication.
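Mechanically, update-motd is dead simple: at login, each executable script in /etc/update-motd.d/ runs in lexical order, and their stdout is concatenated into the message of the day.  A rough Python sketch of just that aggregation step (simplified -- the real implementation is PAM plus shell, and caches its output):

```python
import os
import subprocess

def build_motd(parts_dir: str = "/etc/update-motd.d") -> str:
    # Run each executable fragment in lexical order (10-help-text,
    # 50-landscape-sysinfo, 90-updates-available, ...) and join stdout.
    chunks = []
    for name in sorted(os.listdir(parts_dir)):
        path = os.path.join(parts_dir, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            result = subprocess.run([path], capture_output=True, text=True)
            chunks.append(result.stdout)
    return "".join(chunks)
```

The numeric prefixes are the whole ordering scheme -- drop a new executable script into the directory and it just shows up in the motd at the right spot.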

2. manpages.ubuntu.com (September 2008)

This was the first public open source project I worked on, in my spare time at Canonical.  I had a local copy of the Ubuntu archive and I was thinking about what sorts of automated jobs I could run on it.  So I wrote some scripts that extracted the manpages out of each package, formatted them as HTML, and published them into a structured set of web directories.  10 years later, it's still up and running, serving thousands of hits per day.  In fact, this was one of the ways we were able to shrink the Ubuntu minimal image, by removing the manpages, since they're readable online.  Colin Watson and Kees Cook helped me with the initial implementation, and Matthew Nuzum helped with the CSS and Ubuntu theme in the HTML.

3. Byobu (December 2008)

If you know me at all, you know my passion for the command line UI/UX that is "Byobu".  Byobu was born as the "screen-profiles" project, over lunch at Google in Mountain View, in December of 2008, at the Ubuntu Developer Summit.  Around the lunch table, several of us (including Nick Barcet, Dave Walker, Michael Halcrow, and others), shared our tips and tricks from our own ~/.screenrc configuration files.  In Cape Town, February 2010, at the suggestion of Gustavo Niemeyer, I ported Byobu from Screen to Tmux.  Since Ubuntu Servers don't generally have GUIs, Byobu is designed to be a really nice interface to the Ubuntu command line environment.

4. eCryptfs / Ubuntu Encrypted Home Directories (October 2009)

I was familiar with eCryptfs from its inception in 2005, in the IBM Linux Technology Center's Security Team, sitting next to Michael Halcrow who was the original author.  When I moved to Canonical, I helped Michael maintain the userspace portion of eCryptfs (ecryptfs-utils) and I shepherded it into Ubuntu.  eCryptfs was super powerful, with hundreds of options and supported configurations, but all of that proved far too difficult for users at large.  So I set out to simplify it drastically, with an opinionated set of basic defaults.  I started with a simple command to mount a "Private" directory inside of your home directory, where you could stash your secrets.  A few months later, on a long flight to Paris, I managed to hack a new PAM module, pam_ecryptfs.c, that actually encrypted your entire home directory!  This was pretty revolutionary at the time -- predating Apple's FileVault or Microsoft's BitLocker, even.  Today, tens of millions of Ubuntu users have used eCryptfs to secure their personal data.  I worked closely with Tyler Hicks, Kees Cook, Jamie Strandboge, Michael Halcrow, Colin Watson, and Martin Pitt on this project over the years.
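The key trick behind pam_ecryptfs is that your home directory is encrypted with a random "mount passphrase", which is itself stored on disk "wrapped" (encrypted) by a key derived from your login passphrase -- so logging in transparently unlocks $HOME, and changing your password only means re-wrapping a few dozen bytes, never re-encrypting your files.  Here's a deliberately toy sketch of that wrapping idea (stdlib only; real eCryptfs uses its own on-disk format and a proper cipher, and every function name here is mine):

```python
import hashlib
import secrets

def derive_key(password: str, salt: bytes) -> bytes:
    # Derive a 32-byte wrapping key from the login passphrase (PBKDF2).
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def wrap(mount_passphrase: bytes, login_password: str, salt: bytes) -> bytes:
    # Toy "wrap": XOR the random mount passphrase with the derived key.
    # (eCryptfs really uses a proper cipher; this just shows the layering.)
    return xor_bytes(mount_passphrase, derive_key(login_password, salt))

def unwrap(wrapped: bytes, login_password: str, salt: bytes) -> bytes:
    return xor_bytes(wrapped, derive_key(login_password, salt))

salt = secrets.token_bytes(16)
mount_pp = secrets.token_bytes(32)                  # random key guarding $HOME
wrapped = wrap(mount_pp, "login-password", salt)    # this blob lives on disk
assert unwrap(wrapped, "login-password", salt) == mount_pp
assert unwrap(wrapped, "wrong-password", salt) != mount_pp
```

The indirection is the whole point: the login passphrase never touches your files, and rotating it is cheap.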

5. ssh-import-id (March 2010)

With the explosion of virtual machines and cloud instances in 2009 / 2010, I found myself constantly copying public SSH keys around.  Moreover, given Canonical's globally distributed nature, I also regularly found myself asking someone for their public SSH keys, so that I could give them access to an instance, perhaps for some pair programming or assistance debugging.  As it turns out, everyone I worked with, had a Launchpad.net account, and had their public SSH keys available there.  So I created (at first) a simple shell script to securely fetch and install those keys.  Scott Moser helped clean up that earliest implementation.  Eventually, I met Casey Marshall, who helped rewrite it entirely in Python.  Moreover, we contacted the maintainers of Github, and asked them to expose user public SSH keys by the API -- which they did!  Now, ssh-import-id is integrated directly into Ubuntu's new subiquity installer and used by many other tools, such as cloud-init and MAAS.
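The flow is pleasantly simple: GitHub serves a user's public keys as JSON at /users/&lt;name&gt;/keys, and the tool appends them to ~/.ssh/authorized_keys with a tagging comment.  A sketch of just the formatting step, run against a canned response so it needs no network (the helper name is mine, not ssh-import-id's actual internals):

```python
import json

def format_authorized_keys(api_json: str, user: str) -> list:
    # GitHub's /users/<user>/keys returns a list of {"id": ..., "key": ...}.
    # Tag each resulting authorized_keys line with a comment, so the keys
    # can later be identified (and removed) by origin.
    keys = json.loads(api_json)
    return ['{} # ssh-import-id gh:{}'.format(entry["key"], user)
            for entry in keys]

# Canned stand-in for:  GET https://api.github.com/users/alice/keys
sample = '[{"id": 1, "key": "ssh-ed25519 AAAAC3Nza alice"}]'
for line in format_authorized_keys(sample, "alice"):
    print(line)
```

The trailing comment is what makes the import idempotent and reversible: re-running the tool can find and replace exactly the keys it installed.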

6. Orchestra / MAAS (August 2011)

In 2009, Canonical purchased 5 Dell laptops, which was the Ubuntu Server team's first "cloud".  These laptops were our very first lab for deploying and testing Eucalyptus clouds.  I was responsible for those machines at my house for a while, and I automated their installation with PXE, TFTP, DHCP, DNS, and a ton of nasty debian-installer preseed data.  That said -- it worked!  As it turned out, Scott Moser and Mathias Gug had both created similar setups at their houses for the same reason.  I was mentoring a new hire at Canonical at the time, named Andres Rodriguez, and he took over our part-time hacks and we worked together to create the Orchestra project.  Orchestra itself was short-lived.  It was severely limited by Cobbler as a foundation technology.  So the Orchestra project was killed by Canonical.  But, six months later, a new project was created, based on the same general concept -- physical machine provisioning at scale -- with an entire squad of engineers led by...Andres Rodriguez :-)  MAAS today is easily one of the most important projects in the Ubuntu ecosystem and one of the most successful products in Canonical's portfolio.

7. pollinate / pollen / entropy.ubuntu.com (February 2014)

In 2013, I set out to secure Ubuntu at large against a class of attacks stemming from insufficient entropy at first boot.  This was especially problematic in virtual machine instances, in public clouds, where every instance is, by design, exactly identical to many others.  Moreover, the first thing that instance does, is usually ... generate SSH keys.  This isn't hypothetical -- it's quite real.  Raspberry Pis running Debian were deemed susceptible to this exact problem in November 2015.  So I designed and implemented a client (a shell script that runs at boot, and fetches some entropy from one or many sources), as well as a high-performance server (golang).  The client is the 'pollinate' script, which runs on the first boot of every Ubuntu server, and the server is the cluster of physical machines processing hundreds of requests per minute at entropy.ubuntu.com.  Many people helped review the design and implementation, including Kees Cook, Jamie Strandboge, Seth Arnold, Tyler Hicks, James Troup, Scott Moser, Steve Langasek, Gustavo Niemeyer, and others.
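The safety argument for mixing in remote entropy is worth spelling out: hash the remote bytes together with local randomness, and the resulting seed is unpredictable as long as any one input was -- so even a compromised or malicious entropy server can't make things worse.  A small stdlib sketch of that mixing step (the function name is mine; pollinate itself is a shell script):

```python
import hashlib
import os

def mix_entropy(*sources: bytes) -> bytes:
    # Hash all sources together: the digest is unpredictable as long as ANY
    # one source was unpredictable, so a bad source can't reduce the entropy
    # already contributed by the others.
    h = hashlib.sha512()
    for chunk in sources:
        h.update(len(chunk).to_bytes(4, "big"))  # length-prefix each source
        h.update(chunk)
    return h.digest()

local = os.urandom(64)
remote = b"bytes fetched from an entropy server over TLS"  # placeholder
seed = mix_entropy(local, remote)
# On a real system, pollinate-style seeding stirs this into the kernel pool:
#   open("/dev/urandom", "wb").write(seed)
```

Writing to /dev/urandom mixes bytes into the kernel pool without crediting entropy, which is exactly the conservative behavior you want from an external source.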

8. The Orange Box (May 2014)

In December of 2011, in my regular 1:1 with my manager, Mark Shuttleworth, I told him about these new "Intel NUCs", which I had bought and placed around my house.  I had 3, each of which was running Ubuntu, and attached to a TV around the house, as a media player (music, videos, pictures, etc).  In their spare time, though, they were OpenStack Nova nodes, capable of running a couple of virtual machines.  Mark immediately asked, "How many of those could you fit into a suitcase?"  Within 24 hours, Mark had reached out to the good folks at TranquilPC and introduced me to my new mission -- designing the Orange Box.  I worked with the Tranquil folks through Christmas, and we took our first delivery of 5 of these boxes in January of 2014.  Each chassis held 10 little Intel NUC servers, and a switch, as well as a few peripherals.  Effectively, it's a small data center that travels.  We spent the next 4 months working on the hardware under wraps and then unveiled them at the OpenStack Summit in Atlanta in May 2014.  We've gone through a couple of iterations on the hardware and software over the last 4 years, and these machines continue to deliver tremendous value, from live demos at the booth, to customer workshops on premises, or simply accelerating our own developer productivity by "shipping them a lab in a suitcase".  I worked extensively with Dan Poler on this project, over the course of a couple of years.

9. Hollywood (December 2014)

Perhaps the highlight of my professional career came in October of 2016.  Watching Saturday Night Live with my wife Kim, we were laughing at a skit that poked fun at another of my favorite shows, Mr. Robot.  On the computer screen behind the main character, I clearly spotted Hollywood!  Hollywood is just a silly, fun little project I created on a plane one day, mostly to amuse Kim.  But now, it's been used in Saturday Night Live, NBC Dateline News, and an Experian TV commercial!  Even Jess Frazelle created a Docker container for it!

10. petname / golang-petname / python-petname (January 2015)

From "warty warthog" to "bionic beaver", we've always had a focus on fun, and user experience here in Ubuntu.  How hard is it to talk to your colleague about your Amazon EC2 instance, "i-83ab39f93e"?  Or your container "adfxkenw"?  We set out to make something a little more user-friendly with our "petnames".  Petnames are randomly generated "adjective-animal" names, which are easy to pronounce, spell, and remember.  I curated and created libraries that are easily usable in Shell, Golang, and Python.  With the help of colleagues like Stephane Graber and Andres Rodriguez, we now use these in many places in the Ubuntu ecosystem, such as LXD and MAAS.

If you've read this post, thank you for indulging me in a nostalgic little trip down memory lane!  I've had an amazing time designing, implementing, creating, and innovating with some of the most amazing people in the entire technology industry.  And here's to a productive, fun future!

Cheers,
:-Dustin

Sunday, January 17, 2016

Intercession -- Check out this book!

https://www.inkshares.com/projects/intercession
A couple of years ago, a good friend of mine (who now works for Canonical) bought a book recommended in a byline in Wired magazine.  The book was called Daemon, by Leinad Zeraus.  I devoured it in one sitting.  It was so...different, so...technically correct, so...exciting and forward looking.  Self-driving cars and autonomous drones, as weapons in the hands of an anonymous network of hackers.  Yikes.  A thriller, for sure, but a thinker as well.  I loved it!

I blogged about it here in September of 2008, and that blog was actually read by the author of Daemon, who reached out to thank me for the review.  He sent me a couple of copies of the book, which I gave away to readers of my blog, who solved a couple of crypto-riddles in a series of blog posts linking eCryptfs to some of the techniques and technology used in Daemon.

I can now count Daniel Suarez (the award winning author who originally published Daemon under a pseudonym) as one of my most famous and interesting friends, and I'm always excited to receive an early draft of each of his new books.  I've enjoyed each of Daemon, Freedom™, Kill Decision, and Influx, and gladly recommend them to anyone interested in cutting edge, thrilling fiction.

Knowing my interest in the genre, another friend of mine quietly shared that they were working on their very first novel.  They sent me an early draft, which I loaded on my Kindle and read in a couple of days while on a ski vacation in Utah in February.  While it took me a few chapters to stop thinking about it as-a-story-written-by-a-friend-of-mine, once I did, I was in for a real treat!  I ripped through it in 3 sittings over two snowy days, on a mountain ski resort in Park City, Utah.

The title is Intercession.  It's an adventure story -- a hero, a heroine, and a villain.  It's a story about time -- intriguingly non-linear and thoughtfully complex.  There's subtle, deliberate character development, and a couple of face-palming big reveals, constructed through flashbacks across time.

They have published it now, under a pseudonym, Nyneve Ransom, on InkShares.com -- a super cool self-publishing platform (I've spent hours browsing and reading stories there now!).  If you love sci-fi, adventure, time, heroes, and villains, I'm sure you'll enjoy Intercession!  You can read a couple of chapters for free right now ;-)

Happy reading!
:-Dustin

Wednesday, January 28, 2015

Security and Biometrics: SXSW Preview Q&A


Rebecca: Can you give me a brief overview of why you see it as a problem that our personal biometrics, at this point mostly fingerprints, are being used to authenticate our actions rather than identify us?

Dustin: How many emails have you received, to date, from some online service or another saying, "We're sorry, but our site was attacked, and while we don't think your password was compromised, we think you should change it anyway, for good measure"?

Surely you've seen this once or twice, right?  And if you're like me, you kind of take a deep breath, and think, "Oh man, that's inconvenient..."

Now, what if that site used some form of biometrics, instead?  Let's say your fingerprint.  Or your eyeball.  How would that email read?  You want me to change my fingerprints!?!  My eyeballs!?!

That's ridiculous, of course, but it perfectly shows the problem.  Biometrics are not changeable.  You couldn't alter them if you tried.  Being able to change, rotate, and strengthen passwords is one of the most fundamental properties of authentication tokens -- and completely missing from all forms of biometrics!

That's just one of a number of problems with biometrics.  I'll cover more in my talk ;-)

Rebecca: Is biometrics something you've worked with professionally or what has piqued your interest in the area?  What made you want to do a panel on the issue?

Dustin: Sort of.  I've long maintained and developed an encrypted filesystem for Linux, called eCryptfs.  In 2008, I was asked to add eCryptfs support for Thinkpad's fingerprint reader.  After thinking about it for a while, I refused to do so, with the core arguments being much of what I described above.  With that refusal to support fingerprint readers in 2009, I seemed to have picked a few fights and arguments with various users.

All was pretty quiet on the home front, until Apple released an iPhone with a built-in fingerprint reader in late 2013, and I blogged this piece that criticized the idea accordingly: http://blog.dustinkirkland.com/2013/10/fingerprints-are-user-names-not.html

That blog post in October 2013 sort of did the viral thing on social media, I guess, seeing almost a million unique views in about a month.

Rebecca: I feel embarrassed to admit that I had simply never thought of this issue until seeing your panel synopsis.  Then, it seemed incredibly obvious and I found myself looking at my phone's fingerprint scanner suspiciously.  Why do you think the public has had so little response to biometrics in technology, other than seeing it as a neat feature of a particular gadget?

Dustin: On the surface, it seems like such a good idea.  We've all seen Mission Impossible or 007 or countless other spy movies where Hollywood portrays biometrics as the authentication mechanism of the future.  But it's just that...  Bad pulp fiction.

There are plenty of ideas that probably seemed like a good idea at first, right?  Examples: Clippy, The Hindenburg, New Coke, Tanning beds, The Shake Weight, Subprime Mortgages, Leaded Gasoline.  Think about it for just a minute, though.  A passenger blimp filled with Hydrogen?  An annoying cartoon character that always knows more than you?  Massive scale lending to high-risk individuals packed into mortgage-backed securities?  Dig a little deeper and these were actually misapplications from the beginning.  We'll be in the same place with Biometrics, I have no doubt.

Rebecca: Have there been any instances that you're aware of where the technology has been compromised?

Dustin: The Chaos Computer Club has demonstrated compromising Apple's TouchID: http://arstechnica.com/apple/2013/09/chaos-computer-club-hackers-trick-apples-touchid-security-feature/

TouchID is actually pretty high resolution.  The Thinkpad fingerprint readers, until recently, could be fooled with a piece of scotch tape: https://pacsec.jp/psj06/psj06krissler-e.pdf

Rebecca: In the future, if we continue down the current path do you see identity theft including the hacking of our fingerprints and voice patterns in addition to our credit card info?

Dustin: I certainly hope we can curtail this doomed path of technology before we get to that point...

But if we don't, then yes, absolutely.  All of your biometrics are easily collected in public places, without your knowledge.


  • Your fingerprints are on your coffee mug and every beer bottle you've ever picked up with your bare hands.
  • Your hair, dandruff, and dead skin contain your DNA.
  • High resolution digital cameras can pick up your iris in incredible detail (less so for the retina, currently).
  • Facial recognition -- seriously, unless you've taken exorbitant steps, your face is all over Facebook, Google, LinkedIn, etc., and everywhere you go in public today, there are security monitors.
  • The same goes for vocal recognition.  Surely you've heard, "This call may be recorded for training purposes".  Sure, that's fine.  But do you go spilling your master password to all of your accounts to that phone support?  Well, if you use voice recognition for your authentication, then that's exactly what you've done.

Rebecca: Beyond crime, what are the civil liberties issues you see being entwined with biometrics technology?  Could the government theoretically access this information in much the same way they have our email and phone records in the past?

Dustin: Theoretically, yes.  And that "theoretically, yes" is enough for me to be very concerned.

Is Apple colluding with the NSA/FBI/CIA/etc?  I am most certainly NOT making that accusation.

Could they, or anyone else in the biometrics business?  Most certainly.  They could even be coerced or forced to do so.  And they could do so unknowingly.  And it might not even be "the good guys".  Anyone of this magnitude is a target for attacks, by less-than-savory governments or crime organizations.

Moreover, I strongly recommend that everyone consider their biometrics compromised.  As I said above, you leave a trail of your fingerprints, DNA, face, voice, etc. everywhere you go.  Just accept that they're not secret, and don't pretend that they are :-)

Rebecca: What are some places where you see biometrics as appropriate and useful?

Dustin: Back to the title of the presentation, I think biometrics are decent as a "username", just not as a "password".

Is your name secret?  No, not really.  Is your email address secret? No, not really, either.

That's what biometrics are -- they're another expression of your "identity".  They can be used to replace, or rather, look up, your name, username, or email address from a list, as they're just another expression of that information.

Now, a password is something entirely different.  A password is how you "prove" your identity.  It must be long, and very hard to guess.  You have to be able to change it.  And you have to use a different password for each account, so that one compromised account can't be used to compromise another.

Rebecca: What are your thoughts on SXSW Interactive as a venue for such discussion?

Dustin: I think it's a fantastic venue!  I attended SXSW Interactive in 2014, and was very impressed with the quality of speakers and discussion around security, privacy, identity, and civil liberties.  I immediately regretted that I didn't submit this talk for the 2014 conference, and resolved to definitely do so for 2015.  Unfortunately, this subject is still important and topical in 2015 :-(  Which means we still have some work to do!

Rebecca: Finally, are there any other panels you're especially looking forward to?

Dustin: All of the Open Source ones (of which there are a lot!), as that's really my passion.  If I had to pick three right now that I'm definitely attending, they would be:


Cheers,
Dustin

Thursday, May 1, 2014

Double Encryption, for the Win!


Upon learning about the Heartbleed vulnerability in OpenSSL, my first thoughts were pretty desperate.  I basically lost all faith in humanity's ability to write secure software.  It's really that bad.

I spent the next couple of hours drowning in the sea of passwords and certificates I would personally need to change...ugh :-/

As the hangover of that sobering reality arrived, I then started thinking about various systems over the years that I've designed, implemented, or was otherwise responsible for, and how Heartbleed affected those services.  Another throbbing headache set in.

I patched DivItUp.com within minutes of Ubuntu releasing an updated OpenSSL package, and re-keyed the SSL certificate as soon as GoDaddy declared that it was safe for re-keying.

Likewise, the Ubuntu entropy service was patched and re-keyed, along with all Ubuntu-related https services by Canonical IT.  I pushed a new package of the pollinate client with updated certificate changes to Ubuntu 14.04 LTS (trusty), the same day.

That said, I did enjoy a bit of measured satisfaction, in one controversial design decision that I made in January 2012, when creating Gazzang's zTrustee remote key management system.

All default network communications, between zTrustee clients and servers, are encrypted twice.  The outer transport layer network traffic, like any https service, is encrypted using OpenSSL.  But the inner payloads are also signed and encrypted using GnuPG.

Hundreds of times, zTrustee and I were questioned or criticized about that design -- by customers, prospects, partners, and probably competitors.

In fact, at one time, there was pressure from a particular customer/partner/prospect, to disable the inner GPG encryption entirely, and have zTrustee rely solely on the transport layer OpenSSL, for performance reasons.  Try as I might, I eventually lost that fight, and we added the "feature" (as a non-default option).  That someone might have some re-keying to do...

But even in the face of the Internet-melting Heartbleed vulnerability, I'm absolutely delighted that the inner payloads of zTrustee communications are still protected by GnuPG asymmetric encryption and are NOT vulnerable to Heartbleed style snooping.

In fact, these payloads are some of the very encryption keys that guard YOUR health care and financial data stored in public and private clouds around the world by Global 2000 companies.

Truth be told, the insurance that zTrustee bought against crypto library vulnerabilities, by using GnuPG and OpenSSL in combination, was really the secondary objective.

The primary objective was actually to leverage asymmetric encryption, to both sign AND encrypt all payloads, in order to cryptographically authenticate zTrustee clients, ensure payload integrity, and enforce key revocations.  We technically could have used OpenSSL for both layers and even realized a few performance benefits -- OpenSSL is faster than GnuPG in our experience, and can leverage crypto accelerator hardware more easily.  But I insisted that the combination of GPG over SSL would buy us protection against vulnerabilities in either protocol, and that was worth any performance cost in a key management product like zTrustee.
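The layering property is easy to demonstrate concretely: if the inner payload carries its own independent encryption, an attacker who breaks only the transport layer recovers another ciphertext, not keys.  A toy sketch using two independent one-time-pad layers (zTrustee really used GnuPG inside OpenSSL; this stdlib toy just illustrates the defense-in-depth property):

```python
import secrets

def xor_layer(data: bytes, key: bytes) -> bytes:
    # One-time-pad style layer: encrypting and decrypting are the same op.
    return bytes(d ^ k for d, k in zip(data, key))

payload = b"a customer's encryption key"
inner_key = secrets.token_bytes(len(payload))   # the "GnuPG" layer
outer_key = secrets.token_bytes(len(payload))   # the "OpenSSL/TLS" layer

wire = xor_layer(xor_layer(payload, inner_key), outer_key)

# An attacker who compromises ONLY the transport layer (learns outer_key)
# still sees ciphertext, not the payload:
leaked = xor_layer(wire, outer_key)
assert leaked != payload

# The legitimate receiver peels both layers:
assert xor_layer(leaked, inner_key) == payload
```

The cost is doing the crypto twice; the benefit is that a vulnerability in either layer alone discloses nothing.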

In retrospect, this makes me wonder why diverse, redundant, backup encryption isn't more prevalent in the design of security systems...

Every elevator you have ever used has redundant safety mechanisms.  Your car has both seat belts and air bags.  Your friendly cashier will double bag your groceries if you ask.  And I bet you've tied your shoes with a double knot before.

Your servers have redundant power supplies.  Your storage arrays have redundant hard drives.  You might even have two monitors.  You might be carrying a laptop, a tablet, and a smart phone.

Moreover, important services on the Internet are often highly available, redundant, fault tolerant or distributed by design.

But the underpinnings of the privacy and integrity of the very Internet itself are usually protected only once, with transport layer encryption of the traffic in motion.

At this point, can we afford the performance impact of additional layers of security?  Or, rather, at this point, can we afford not to use all available protection?

Dustin

p.s. I use both dm-crypt and eCryptfs on my Ubuntu laptop ;-)

Monday, April 8, 2013

The Horror


I had just finished an awesome lunch with my good friend Josh, at a restaurant his brother, Riley, manages in Dallas.  As I was enjoying the delicious smoked brisket sandwich, Riley walked up to us at the bar and asked if I drive a black Cadillac.  I knew that question couldn't mean good news of any kind -- the only question was just how bad it would be, as this has happened to me before.



Josh and I had driven from Austin to Dallas that morning, to attend the Big Texas Beer Festival, which was going on later that afternoon, and had dipped into this really upscale, beautiful barbecue restaurant, Smoke, at 11:45am.  It's in a bit of a rough neighborhood that is in the process of being cleaned up.  But it looks like it's still in need of a bit more gentrification.

Josh and I each had an overnight bag packed, with a change of clothes.  I also had a small cooler with a growler of my own home brewed Scotch Ale.  Unfortunately, those three bags were in plain sight on the back seat and not in the trunk.  Within 30 minutes, between 11:45am and 12:15pm, in broad daylight, some lowlife reprobate delinquent hoodlum smashed the back window and swiped all three bags.  My Google Plus post, within 5 minutes of learning of the treachery, expressed my feelings at the time.



Besides the growler of home brew, I also lost a change of clothes (of course, my favorite t-shirt, jacket, and flip flops), and my Asus Transformer TF700T and mobile dock.


The latter is, unfortunately, an expensive toy, and the biggest loss here.  All of this falls within my insurance's deductible, so it looks like I'm just paying the Shit Happens tax :-(


So the only silver lining of this whole experience is that my Android tablet was, in fact, encrypted.  I used the stock device encryption option available in Android 4.0+, which, if I understand correctly, uses dm-crypt to encrypt the device.

I'm confident that I have a sufficiently long and complicated pass phrase that it won't be guessed or brute forced easily.  20+ numbers, capital and lower case letters, and special characters.  That has helped me sleep a bit more easily at night, knowing that all I have lost is the physical machinery itself.
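Whether a passphrase like that actually resists brute force is quick arithmetic: entropy is roughly length × log2(alphabet size).  A sanity check (the 94-character alphabet for mixed digits, letters, and specials is my assumption for printable ASCII):

```python
import math

def passphrase_bits(length: int, alphabet: int) -> float:
    # Entropy of a randomly chosen passphrase: length * log2(alphabet size).
    return length * math.log2(alphabet)

# 20 characters drawn from ~94 printable ASCII characters
bits = passphrase_bits(20, 94)
print(round(bits))  # 131 bits -- far beyond brute-force reach
```

Compare that to an 8-character lowercase password at about 38 bits, which is crackable; length and alphabet size multiply out fast.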

While I do know how Ubuntu's Encrypted Home Directory works, inside and out, I will review Android's encryption implementation, so that I can get completely comfortable with it.

A big thanks to Riley, who called the cops (they filed a report), loaned us his computer and printer to print out our beer fest tickets, and a shop vac to clean up the mess of glass shards.

:-Dustin

Monday, August 13, 2012

Data encryption -- Why? Some numbers...


At last weekend's Texas Linux Fest, at the end of my presentation, Data Security and Privacy in the Cloud, an attendee asked a great question.  I'll paraphrase...
So...  What's the actual threat model?  Why are you insisting that people encrypt their data in the cloud?  Where's the risk?  When might unencrypted data get compromised?  Who is accessing that data?
A couple of weeks ago, an article from ComputerWorld made the front page of Slashdot:

'Wall of Shame' exposes 21M medical record breaches -- New rules under the Health Information Technology for Economic and Clinical Health Act, by Lucas Mearian, August 7, 2012 06:00 AM ET


Here are a few absolutely astounding numbers, which the author pulled from the US Department of Health and Human Services Health Information Privacy website.


Since the data is publicly available, I was able to download and import all of it into a spreadsheet, run some numbers, and verify ComputerWorld's article.  I can confirm that Mr. Mearian's numbers are quite accurate, and just as scary.  Since September 2009:

  • 21+ million people have had their health care records exposed
  • 480 breaches have been reported

The top 6 breaches all affected more than 1 million individuals:

  • 4.9 million records: TRICARE Management Activity, the US Department of Defense's health care program, exposed 4.9 million health care records when backup tapes went missing
  • 1.9 million records: Health Net lost 1.9 million records when backup hard drives went missing
  • 1.7 million records: New York City Health & Hospital's Corporation's North Bronx Health Care Network reported the theft of 1.7 million records
  • 1.22 million records: AvMed Health Plans reported the loss of a laptop with 1.22 million patient records
  • 1.05 million records: Nemours Foundation (runs children's hospitals) lost 1.05 million records with missing backup tapes
  • 1.02 million records: Blue Cross Blue Shield of Tennessee exposed 1.02 million records with the loss of an external hard drive


Such breaches are very costly, too.

  • $4.3 million: Cignet Health of Prince George's County civil lawsuit penalty
  • $1.5 million: Blue Cross Blue Shield of Tennessee penalties
    • have since encrypted all of their hard drives, 885TB of data
  • $1.7 million: Alaska Department of Health penalty
    • due to theft of a thumb drive, stolen from an employee's car


Running a few more reports on the public CSV data,

  • Across 480 reported breaches, these were the top reasons given for the incident:
    • 55%: Theft of devices or physical media
    • 26%: Hacking/Unauthorized access
    • 12%: Lost devices, disks, tapes, drives, media
    • 5%: Improper disposal of devices
    • 3%: Other

The most disappointing part, to me, is that 72% of those breaches stemmed from theft, lost devices, and improper disposal -- a total of 15.6 million individuals' health records.  This means that the vast majority of these compromises were easily preventable, through the use of comprehensive data encryption.  And I'd argue that many of the remaining 28% of the breaches attributed to hacking, unauthorized access, and other disclosures could also be thwarted, slowed, or deterred by coupling encryption with advanced key management, access controls, and regular auditing.

So here I am, writing the same thing I've been writing in this blog for 4 years now...
1. Encrypt your data.
2. Help your colleagues, friends, and families encrypt their data.
3. Insist that your employers institute thorough security policies around encryption.
4. Ask hard questions of your health care providers and financial services professionals, about the privacy of the data of yours they have.  Hold them accountable.
There's a wide range of tools available, from free/open source, to paid commercial offerings.  On the free/open source side, I'm a proponent, author, and maintainer of both eCryptfs and overlayroot (which uses dm-crypt).  These can help protect your home directory and your private data in cloud instances.


    And from the commercial side, my employer, Gazzang, sells an enterprise-class encryption product called zNcrypt, and I've architected Gazzang's cloud-compatible key management system, zTrustee. I have no doubt that the combination of these two technologies -- comprehensive data encryption and a robust key management solution -- could have prevented the compromise of millions of these records.

    :-Dustin

    Monday, August 6, 2012

    ecryptfs-utils-100 released

    Most of the original IBM LTC Security Team that designed and implemented eCryptfs, 2005-2008, along with a couple of Gazzangers who have also contributed to eCryptfs.  Gazzang hosted a small reception on Thursday, August 2, 2012.
    I'm pleased to announce the 100th release of the ecryptfs-utils userspace package!

    Somewhat unusually, eCryptfs userspace packages simply increment a single integer revision number, rather than using major and minor revisions.  That project maintenance decision predates my involvement as project maintainer.  But it seems to work, at least for me :-)  I started maintaining the eCryptfs project and package at release 50, so this marks roughly my 50th release too.

    Grepping through the changelog, I counted 157 bugs fixed over the last 6 years.  I'd really like to recognize the contributors who have helped bring a stable and reliable eCryptfs to you:

    Apologies if I missed you...let me know and I'll add you in there ;-)


    Changelog follows.   Here's to another 100!

    ecryptfs-utils (100) precise; urgency=low
    
      [ Tyler Hicks ]
      * src/pam_ecryptfs/pam_ecryptfs.c, src/libecryptfs/key_management.c:
          LP: #1024476
        - fix regression introduced in ecryptfs-utils-99 when Encrypted
          Home/Private is in use and the eCryptfs kernel code is compiled as a
          module
        - drop check for kernel filename encryption support in pam_ecryptfs, as
          appropriate privileges to load the eCryptfs kernel module may not be
          available and filename encryption has been supported since 2.6.29
        - always add filename encryption key to the kernel keyring from pam_mount
    
      [ Colin King ]
      * tests/kernel/inode-race-stat/test.c:
        - limit number of forks based on fd limits
      * tests/kernel/enospc.sh, tests/kernel/enospc/test.c,
        tests/kernel/Makefile.am, tests/kernel/tests.rc:
        - add test case for ENOSPC
    
      [ Tim Harder ]
      * m4/ac_python_devel.m4: LP: #1029217
        - proplery save and restore CPPFLAGS and LIBS when python support is
          enabled
    
     -- Dustin Kirkland Thu, 02 Aug 2012 16:33:22 -0500
    

    Cheers,
    Dustin

    Saturday, August 4, 2012

    Wednesday, August 1, 2012

    Introducing overlayroot -- overlayfs + dmcrypt!

    A beautiful live oak tree with exposed, overlaying roots, near my house in Austin, Texas
    I'm thrilled to introduce what I hope will be an unexpected, innovative, and useful surprise for Ubuntu 12.10!  Along with Scott Moser, I've been hard at work on a new binary package called overlayroot, which is part of the cloud-initramfs-tools source package and project.

    Background

    In a hallway conversation at UDS-Quantal in Oakland, CA in May 2012, I briefly described a concept to Kees Cook, Tyler Hicks, and Scott Moser...



    I look at Ubuntu Cloud AMIs much like Ubuntu Desktop live ISOs...  The root filesystem of every instance that launches is basically "the same".  What's unique about the system -- the system's actual DNA -- all of that is generated after that first boot.
    Deoxyribonucleic acid.  Or so we're told.
    Log files, host-specific configuration files, variable state information, user home directories, 3rd party software, temporary files, and so on...  Ideally, all of that information would be encrypted before landing on a random disk in a cloud data center!

    About two weeks ago, I started working on implementing this, using two awesome Linux kernel technologies:


    1. overlayfs
      • noting that overlayfs is not yet upstream in the Linux kernel, though Ubuntu carries the patch set (thanks to Andy Whitcroft for his maintenance and upstream communications)
    2. dmcrypt
      • which has been upstream in the Linux kernel for quite some time


    A medieval crypt we visited outside of the Glasgow Cathedral in Scotland
    As a base, I started from some documentation and prototype code in the lengthy Ubuntu wiki article, aufsRootFileSystemOnUsbFlash.  There’s some interesting sample code embedded in that page, licensed under the GPLv3 and attributed to Axel Heider, Sebastian P., and Nicholas A. Schembri.  I’ve never met these individuals but I appreciate their efforts and I’m delighted to embrace, extend, package, and publish their work directly in the Ubuntu distribution!  I had started writing something similar from scratch before I found this.  The real magic for me was seeing the mount command called with the --move option.  Very cool!  That’s really what makes all of this work in the initramfs ;-)


    Shameless advertising.  She's gotta feed the monkey man.
    A special thanks to my employer, Gazzang, who have enabled me to work on this open source project, as it lines up well with our broad interests in securing information in the Cloud and Big Data deployments.  In particular, Gazzang’s zTrustee commercial key management service can be used to protect, secure, backup, revoke, and retrieve the keys that make this whole procedure work.  Contact us if you’re interested in that.

    And an even bigger thanks to Scott Moser for working right there with me on this somewhat ad hoc project.  Scott and I had worked together for years, across IBM and Canonical.  It’s been a lot of fun reconnecting with him, hacking on such an interesting, innovative, and broadly useful feature!

    Overlayfs

    Overlayfs is a successor of a series of union filesystem implementations, such as unionfs, aufs, etc.  It creates a single unified view of two directories.  One is the "lower" filesystem -- in our case, this is a read-only mount of our original, pristine Ubuntu AMI.  The second is the "upper" filesystem -- in our case, this is a read-write mount of an encrypted block device.  We’re hopeful that it might one day make it upstream into Linux, though progress on that seems to have stalled.  Thankfully, Ubuntu’s kernels are carrying it for now.
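As a toy illustration (a few lines of my own shell, not how the kernel implements it), the read path of such a union can be thought of as "check the upper layer first, then fall back to the lower":

```shell
#!/bin/sh
# Toy model of overlayfs read semantics: a file in the "upper"
# (read-write) layer shadows the same-named file in the "lower"
# (read-only) layer.
union_read() {
    lower="$1"; upper="$2"; name="$3"
    if [ -e "$upper/$name" ]; then
        cat "$upper/$name"      # the upper layer wins
    else
        cat "$lower/$name"      # fall back to the pristine lower layer
    fi
}
```

In overlayroot's case, the lower directory is /media/root-ro and the upper is /media/root-rw/overlay, as you'll see in the mount table later in this post.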

    dmcrypt

    dmcrypt is a block device encryption scheme (different from eCryptfs, which implements a per-file encryption scheme).  Many of my blog readers know that I'm one of the authors and maintainers of eCryptfs, as well as one of its biggest fans :-)  That said, I recognize that dmcrypt solves a different problem, and I chose dmcrypt for this solution.  In this case, we'd like to encrypt the entire block device, ensuring that all reads and writes are happening to and from encrypted storage.  dmcrypt uses cryptsetup and LUKS to configure, format, and open a block device for encryption.  A virtual block device is "mapped" and presented in /dev/mapper/, and this is what will be mounted with read-write capability onto the overlayfs upper filesystem.

    You can read more at this excellent, detailed blog post about dmcrypt.

    A couple of ways dmcrypt can be used
    Installation

    As of August 1, 2012, the overlayroot package is now in the default Ubuntu cloud daily images for quantal, which will debut in the Ubuntu 12.10 Beta1 milestone.

    For older releases of Ubuntu, we’re also automatically publishing backports of the package to the PPA at ppa:cloud-initramfs-tools/ppa.  I’ve verified the backported functionality on at least Ubuntu 12.04 LTS (Precise).  You can add the PPA and install using:

    sudo apt-add-repository ppa:cloud-initramfs-tools/ppa
    sudo apt-get install -y overlayroot

    Configuration

    Once the package is installed, you need to edit the configuration file at /etc/overlayroot.conf.  The syntax, modes, and capabilities are thoroughly described in the inline documentation in the file itself, but I’ll briefly recap here...

    The file should define a single variable, overlayroot, which is empty by default (meaning that the functionality is disabled).  Alternatively, you can pass the same variable and string definition on the kernel command line.  The functionality works just as well on physical hardware as on cloud instances.  There are basically 3 different modes that overlayroot supports:
    1. The backing device is tmpfs
      • e.g., overlayroot="tmpfs"
      • changes on top of the root filesystem are stored in a temporary filesystem in memory
      • all changes are discarded on reboot
      • the amount of available disk space is limited to ½ of the total memory available on the system
      • the tmpfs is mounted onto /media/root-rw
    2. The backing device is block storage
      • e.g., overlayroot="dev=/dev/xvdb"
      • changes on top of the root filesystem are stored on a separate block device, typically /dev/xvdb in AWS
      • these changes are not encrypted by default
      • changes are preserved across reboots
      • the system root filesystem will have as much storage as available on the block storage device
      • the block device is mounted onto /media/root-rw
    3. The backing device is encrypted
      • e.g., overlayroot="crypt:dev=/dev/xvdb"
      • changes on top of the root filesystem are stored in an encrypted block device
      • typically, /dev/xvdb is the backing device, which is mapped and encrypted using cryptsetup and LUKS
      • /dev/mapper/secure is the encrypted block device
      • /dev/mapper/secure is dmcrypt mounted onto /media/root-rw
      • reboots are unfortunately not supported by default, unless you hardcode the dmcrypt mount passphrase in the initramfs
        • See below for more information about that

    All three of these have the tremendous advantage of keeping your root disk pristine and your changes on a separate partition, for auditability and rollback purposes.  Also, there’s a really nice side effect of all of this -- whereas typically you have a root volume with only a couple of GB of space, if your attached EBS storage backing your overlayroot is 100GB, your root partition now has 100GB of available space.  This is far more useful to me than having to symlink and dance around all of that juicy storage inconveniently located over in /mnt!

    There are several other options documented inline in the configuration file.  I’ll leave those for future posts and as an exercise for the reader ;-)  The three simple configuration options above, however, should get you up and running trivially in AWS and OpenStack clouds with Ubuntu’s 12.10 (Quantal) images.

    Verification

    You might want to verify that the encryption is operating as expected.  You should see a mount table that looks like this:

    ubuntu@ip-10-140-27-223:~$ mount
    overlayroot on / type overlayfs (rw,lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay)
    proc on /proc type proc (rw,noexec,nosuid,nodev)
    sysfs on /sys type sysfs (rw,noexec,nosuid,nodev)
    udev on /dev type devtmpfs (rw,mode=0755)
    devpts on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
    tmpfs on /run type tmpfs (rw,noexec,nosuid,size=10%,mode=0755)
    /dev/xvda1 on /media/root-ro type ext4 (ro)
    tmpfs-root on /media/root-rw type tmpfs (rw,relatime)
    none on /sys/fs/fuse/connections type fusectl (rw)
    none on /sys/kernel/debug type debugfs (rw)
    none on /sys/kernel/security type securityfs (rw)
    none on /run/lock type tmpfs (rw,noexec,nosuid,nodev,size=5242880)
    none on /run/shm type tmpfs (rw,nosuid,nodev)
    /dev/xvdb on /media/root-ro/mnt type ext3 (ro)
    /media/root-ro/mnt on /mnt type overlayfs (rw,lowerdir=/media/root-ro/mnt,upperdir=/media/root-rw/overlay/mnt)


    Note that your original root volume is mounted on /media/root-ro:

    /dev/xvda1 on /media/root-ro type ext4 (ro)

    And your root disk is, in fact, an overlayfs on top of /media/root-ro/ and /media/root-rw/overlay.

    overlayroot on / type overlayfs (rw,lowerdir=/media/root-ro/,upperdir=/media/root-rw/overlay)



    If you’re using tmpfs, you should see:

    tmpfs-root on /media/root-rw type tmpfs (rw,relatime)



    If you’re using unencrypted block storage, you should see:

    /dev/xvdb on /media/root-rw type ext4 (rw,relatime,data=ordered)


    And if you’re using encrypted block storage, you should see:

    /dev/mapper/secure on /media/root-rw type ext4 (rw,relatime,data=ordered)
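If you'd like to check for these signatures mechanically, here's a small helper of my own (a hedged sketch, not something shipped in the overlayroot package) that classifies the backing store from the mount table:

```shell
#!/bin/sh
# Classify the overlayroot backing store from `mount` output, keyed on
# the /media/root-rw mount signatures shown above.
overlayroot_mode() {
    case "$1" in
        *"/dev/mapper/secure on /media/root-rw"*) echo crypt ;;
        *"tmpfs-root on /media/root-rw"*)         echo tmpfs ;;
        *" on /media/root-rw "*)                  echo dev   ;;
        *)                                        echo none  ;;
    esac
}

overlayroot_mode "$(mount)"   # prints crypt, tmpfs, dev, or none
```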


    To verify the dmcrypt setup of your block storage, you can run:

    $ sudo cryptsetup luksDump /dev/xvdb
    LUKS header information for /dev/xvdb

    Version:        1
    Cipher name:    aes
    Cipher mode:    cbc-essiv:sha256
    Hash spec:      sha1
    Payload offset: 4096
    MK bits:        256
    MK digest:      bc be 66 cd 87 3d 33 6c 2c 99 72 00 b6 d2 be b0 69 6a 76 39
    MK salt:        42 14 8d da 89 15 e4 66 a0 84 6e 6a 7f bc 3e 34
                   da a7 44 3d 6b 80 bb 6f 5b 44 77 4a 9c b0 91 4c
    MK iterations:  27625
    UUID:           245e08b9-4458-4e37-a029-68ebd5f65eca

    Key Slot 0: ENABLED
           Iterations:             110583
           Salt:                   a8 fe 2a ec cd 0c 23 7d 13 ef 91 aa 05 88 24 9b
                                   2e a1 cd 54 54 96 0d 0b ce 88 aa cc 72 a8 d4 63
           Key material offset:    8
           AF stripes:             4000
    Key Slot 1: DISABLED
    Key Slot 2: DISABLED
    Key Slot 3: DISABLED
    Key Slot 4: DISABLED
    Key Slot 5: DISABLED
    Key Slot 6: DISABLED
    Key Slot 7: DISABLED

    And if you’re really paranoid, you can run `sudo strings /dev/xvdb` on the block device and look for anything recognizable as clear text data ;-)

    $ sudo strings /dev/xvdb
    LUKS cbc-essiv:sha256 sha1 \9P< kl2d44f1f1-ccff-4fc7-857c-35519faa9158 Ow\rn {(\T UK9Y> 8Ybk < 'Yb* $i"/5X BIS8
    ...
    The Matrix

    Encryption from Instance Genesis

    So the above example requires a first boot, some configuration, and then a reboot, which means that some of the data generated on first boot (like the host ssh keys) actually landed on the read-only root, before overlayroot was able to do its work.

    Michelangelo's Creation of Adam, a Genesis event 
    Well, ideally we’d capture all of those changes in the overlay as well.  Doing that is a little harder than it sounds.  We need to be able to feed that configuration information into a brand new instance at boot time, during the initramfs.  We can’t use user-data, cloud-init, or the metadata service, because networking is not yet up and running in the instance.

    Scott Moser came up with a rather inspired workaround :-)  He created a “minimal”, 1GB snapshot of a volume that has a filesystem on it, and a very basic overlayroot.conf configuration file.  If such a volume is attached to an instance, and has a volume label of OROOTCFG, then the overlayroot hooks in the initramfs will load and use the overlayroot.conf on that device.  You can simply create a new instance, attaching a volume based on this snapshot!  From the command line, you’d need something like this:

    euca-run-instances --key=ec2-keypair --instance-type=m1.small  --block-device-mapping=/dev/sdc=snap-67d20b17:10:0 ami-e300a88a

    The “:10” says that your volume should be 10GB.  And note that the “:0” denotes whether to delete the volume on termination or not (0=false/do-not-delete, 1=true/delete).
    Note that that snapshot was a test we were using and has since been deleted. I'm hoping Gazzang will be willing to host a couple of these snapshots for general usage. I'll update this post when and if that happens!

    And from the GUI, it looks like this:

    AWS Console, launching an image with an EBS snapshot attached


    Entropy and Keys

    Using mode 3 as described above, you can easily ensure that any runtime data and changes to the stock system image are encrypted before being written to disk.  This raises the question, of course, of the keys used to handle this encryption.  I’ll try to explain the process, complexity, and potential solutions now.

    Encryption of any kind -- be it disk encryption, file encryption, network transport layer encryption, etc. -- always requires high-quality keys.  Generating high-quality keys requires high-quality entropy.  And high-quality encryption keys always require “key management”.  I have more information on that in another post, including talks I’m giving on the topic at two conferences in August 2012.


    Keys in the Cloud -- not an easy problem to solve

    A user of overlayroot with encryption can, at their option, specify the dmcrypt mount passphrase itself in the overlayroot.conf file, using the pass=$PASS option.  This has two distinct advantages, and one significant disadvantage.  The advantages are that the user controls the quality and length of the passphrase, and that this passphrase can be used to mount the encrypted volume repeatedly, at every boot, ensuring that the persistent encrypted data is available each time the system reboots.  The disadvantage is that the passphrase is plainly visible inside the initramfs or in /boot on the base partition, rendering your encryption defenseless if an attacker has access to both your read-only root volume and your encrypted root volume.  For some use cases, this might be a reasonable trade-off; for others, it may not be.
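For illustration, such a configuration might look like this (a sketch only -- the exact option syntax is documented inline in /etc/overlayroot.conf, and the passphrase here is obviously a placeholder):

```shell
# /etc/overlayroot.conf -- persistent, encrypted, with a hardcoded mount
# passphrase (readable by anyone with access to your initramfs or /boot!)
overlayroot="crypt:dev=/dev/xvdb,pass=correct-horse-battery-staple"
```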

    In the default case, however, where the pass=$PASS option is not specified, the encryption mount passphrase is automatically generated at boot time.  This presents its own set of challenges for the security-minded, as computers have painfully little entropy at initial boot time, as shown in this USENIX paper.  The problem is exacerbated in cloud computing environments, where these “computers” are actually virtual machines which, by design, have been cloned to appear nearly identical to one another.

    Knowing that this is an issue causing some consternation for security professionals, I’ll describe in detail here how our overlayroot encryption passphrase is generated in the initramfs, and invite your review, feedback, and suggestions.

    Key Generation Design

    An automatically generated overlayroot dmcrypt passphrase is a sha512sum string, which consists of 128 characters of hex [0-9a-f].  Since we’re using cryptsetup with LUKS, this forms a 512-bit wrapping passphrase.  LUKS and cryptsetup will then generate another random 256-bit volume master key.  The wrapping key can be changed easily without re-encrypting all of the data on the device, whereas the volume key cannot.  A compromise of either of these keys will compromise your encrypted data.

    Random bloke cutting keys
    To generate this key, we need to feed the sha512sum utility (which we’ve added to the initramfs) at least 512 bits of random data.  Our current implementation does the following:


    1. We perform a best-effort seeding of /dev/urandom with some pseudo-random data, from two places, if available.  First, we use /random-seed, if found in the initramfs itself.  And second, we use /var/lib/urandom/random-seed, if found in the root image.  The initramfs /random-seed file is created by our overlayroot’s initramfs hooks script, which tries to read up to 4096 bytes of data from /dev/random (true random data).  The root filesystem’s /var/lib/urandom/random-seed file is populated on each system reboot, to carry some randomness across boots.  Neither of these is definitive; they just add some spice to the pool of pseudo-random data multiplexed at /dev/urandom.
    2. Then we seed our input data with the output of `stat -L /dev/* /proc/* /sys/*` in the initramfs.  This adds entropy to our input due to the mostly unpredictable combinations of timestamps associated with the Access/Modify/Change times of each file, directory, and block device in the top level of these kernel filesystems.
    3. Next, we concatenate data from each of 3 places:
      • /proc/sys/kernel/random/boot_id
        • a 16-byte uuid, pseudo-random, “unique” per boot
      • /proc/sys/kernel/random/uuid
        • a 16-byte uuid, pseudo-random, “unique” every time it’s read
      • /dev/urandom
        • 4096 bytes of pseudo-randomness, which is jumbled by our seeding as described above
    4. Then, we read and concatenate the first 4096 bytes from the specified backing block device
      • /dev/xvdb (presumably, in AWS)
        • Here, a sophisticated or paranoid user could write up to 4KB of their own quality random data to the block device from the same or even a different instance
        • But by default, a fresh block device from Amazon contains a filesystem, which actually has its own somewhat unique metadata written to the head of the block device
    5. Once we’ve gathered all of that data, we calculate the sha512sum of the total, and write it to a root-owned file stored in a tmpfs.  Once booted, the root user can recover this value from /dev/.initramfs/overlayroot.XXXXXX.  This string is used by overlayroot and fed to cryptsetup’s --key-file parameter.
    6. Note that cryptsetup also generates the master volume key.  We are using the default cryptsetup option, which generates a pseudo-random key from /dev/urandom.  We have tried to help that along by seeding /dev/urandom with some additional input.
      • Note that we tried using the cryptsetup --use-random option, but our test instances hung indefinitely, blocking on /dev/random.  Sorry.
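Steps 2 through 5 above can be sketched in a few lines of shell (my own hedged approximation, not the actual initramfs code; the urandom seeding in step 1 and the tmpfs bookkeeping in step 5 are omitted):

```shell
#!/bin/sh
# Approximate the overlayroot passphrase derivation described above:
# hash several weak entropy sources into 128 hex characters (512 bits).
gen_passphrase() {
    dev="$1"    # backing block device, e.g. /dev/xvdb
    {
        stat -L /dev/* /proc/* /sys/* 2>/dev/null      # step 2: timestamp jitter
        cat /proc/sys/kernel/random/boot_id 2>/dev/null  # step 3: per-boot uuid
        cat /proc/sys/kernel/random/uuid 2>/dev/null     #   fresh uuid per read
        head -c 4096 /dev/urandom                        #   seeded urandom pool
        head -c 4096 "$dev" 2>/dev/null                # step 4: device header bytes
    } | sha512sum | cut -d' ' -f1                      # step 5: 128 hex chars
}

gen_passphrase /dev/xvdb
```

The result is then handed to cryptsetup via --key-file, as described in step 5.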

    In Conclusion

    If you're still with me here, you should really be proud of yourself :-) This is a bit of a heavy topic, but oh-so-important! We're entrusting more and more of our digital lives, corporate information, trade secrets, cyber property, and critical data to remote storage scattered about the globe.

    How can you possibly know who has physical access to those bits? The sobering reality is that, frankly...you can't.

    I continue to advocate in favor of intelligent cryptographic solutions to this problem, and with overlayroot, you now have a very powerful tool at your disposal, to easily encrypt your information in Ubuntu cloud instances. This covers automatically generated system exhaust spewed to /var, as well as your precious configuration files, user data, and service information.

    :-Dustin

    Printfriendly