From the Canyon Edge -- :-Dustin

Tuesday, August 25, 2015

An Open Letter to Google Nest (was: OMG, F*CKING NEST PROTECT AGAIN!)

[Updates (1) and (2) at the bottom of the post]

It's 01:24am, on Tuesday, August 25, 2015.  I am, again, awake in the the middle of the night, due to another false alarm from Google's spitefully sentient, irascibly ignorant Nest Protect "smart" smoke alarm system.

Exactly how I feel right now.  Except I'm in my pajamas.
Warning: You'll find very little profanity on this blog.  However, the filter is off for this post.  Apologies in advance.


"Heads up, there's smoke in the kids room," she says.  Not once, but 3 times in a 30 minute period, between 12am and 1am, last night.

That's my alarm clock.  Right now.  I'm serious.
"Heads up, there's smoke in the guest bedroom," she says again tonight a few minutes ago, at 12:59am.

There was in fact never any smoke to clear.
Is it possible for anything wake you up more seriously and violently in a cold panic than a smoke alarm detecting something amiss in your 2 year old's bedroom?

Here's what happens (each time)...

Every Nest Protect unit in the house announces, in unison, "Heads up, there's smoke in the kids' room."  Then both my phone and my wife's phone buzzes on our night stands, with the urgent incoming message from the Nest app.  Another few seconds pass, and a another set of alarms, this time delivered by email, in case you missed the first two.

The first and second time it happens, you jump up immediately.  You run into their room and make sure everyone is okay -- both the infant in the crib and toddler who's into everything.  You walk the whole house, checking the oven, the stove, the toaster, the computer equipment.  You open the door and check around outside.  When everything is okay, you're left with a tingling in the back of your mind, wondering what went wrong.  When you're a computer engineer by trade, you're trying to debug the hardware and/or software bug causing the false positive.  Then you set about trying to calm your family and get them back into bed.  And at some point later, you calm your own nerves and try to get some sleep.  It's a work night after all.

But the third, fourth, and fifth time it happens?  From 3 different units?

Well, it never ceases to scare the ever living shit out of you, waking up out of deep sleep, your mind racing, assessing the threat.

But then, reality kind of sets in.  It's just the stupid Nest Protect fucking it all up again.

Roll over, go back to bed, and pray that the full alarm doesn't sound this time, waking up both kids and setting us up for a really bad night and next few days at school.

It's not over yet, though.  You then wait for the same series of messages announcing the all clear -- first the bitch over the loudspeaker, followed by the Android app notification, then the email -- each with the same message:  "Caution, the smoke is clearing..."


20 years later, and the smartest company in the world
creates a smoke detector that broadcasts the IoT equivalent
of PC LOAD LETTER to your smart home, mobile app, and email.
But not this time.  I'm not rolling over.  I'm here, typing with every ounce of anger this Thinkpad can muster. I'm mashing these keys in the guest bedroom that's supposedly on fire.  I can most assuredly tell you that it's a comfy 72 F, that the air is as clean as a summer breeze.

I'm writing this, hoping that someone, somewhere hears how disturbingly defective, and dangerously disingenuous this product actually is.

It has one job to do.  Detect and report smoke.  And it's unable to do that effectively.  If it can't reliably detect normality, what confidence should I have that it'll actually detect an emergency if that dreaded day ever comes?

The sad, sobering reality is: zero.  I have zero confidence whatsoever in the Nest Protect.

What's worse, is that I'm embarrassed to say that I've been duped into buying 7 (yes, seven) of these broken pieces of shit, at $99 apiece.  I'm a pretty savvy technical buyer, and admittedly a pretty magnanimous early adopter.  But while I'm accepting on beta versions of gadgets and gizmos, I am entirely unforgiving on the safety and livelihood of my family and guests.

Michael Larabel of Phoronix recounts his similar experience here.  He destroyed one with a sledgehammer, which might provide me with some catharsis when (not if, but when) this happens again.

Michael Larabel of Phoronix destroyed his malfunctioning Nest Protect
with a 20 lb sledgehammer, to silence the false alarm in the middle of the night
 There's a sad, long, thread on Nest's customer support forum, calling for a better "silence" feature.  I'm sorry, that's just wrong.  The solution is not a better way to "silence" false positives.  Root out the false positives to begin with.  Or recall the hardware.  Tut, tut, tut.

You can't be serious...
This is from me to Google and Nest on behalf of thousands of trusting families out there:  You have the opportunity, and ultimately the obligation.  Please make this right.  Whatever that means, you owe the world that.
  • Ship working firmware.
  • Recall faulty hardware.
  • Refund the product.
Okay, the empassioned rant is over.  Time for data.  Here is the detailed, distressing timeline.
  • January 2015: I installed 5 Nest Protects: one in each of two kids' rooms, the master bedroom, the hallway, and the kitchen/living room
  • February 2015: While on a business trip to, South Africa, I received notification via email and the Nest App that there was a smoke emergency at my home, half a world away, with my family in bed for the night.  My wife called me immediately -- in the middle of the night in Texas.  My heart raced.  She assured me it was a false alarm, and that she had two screaming kids awake from the noise.  I filed a support ticket with Nest (ref:_00D40Mlt9._50040jgU8y:ref) and tried to assure my wife that it was just a glitch and that I'd fix it when I got home.

  • May 23, 2015: We thought it was funny enough to post to Facebook, "When Nest mistakes a diaper change for a fire, that's one impressive poop, kiddo!"  Not so funny now.

  • August 9, 2015: I installed 2 more Nest Protects, in the guest bedroom and my office
  • August 21, 2015, 11:26am: While on a flight home from another business, trip, I receive another set of daytime warnings about smoke in the house.  Another false alarm.
  • August 24, 2015, 12am: While asleep, I receive another 3 false alarms.
  • August 25, 2015, 1am: Again, asleep, another false alarm.  Different room, different unit.  I'm fucking done with these.
I'm counting on you Google/Nest.  Please make it right.

Burning up but not on fire,

Update #1: I was contacted directly by email and over Twitter, by Nest's "Executive Relations", who offer to replace of all 7 of my "v1" Nest Protects with 7 new "v2" Nest Protects, at no charge.  The new "v2" Protect reportedly has an improved design with better photoelectric detector that reduces false positives.  I was initially inclined to try the new "v2" Protects, however, neither the mounting bracket nor the wiring harness are compatible from v1 to v2.  So I would have to replace all of the brackets and redoing all of the wiring myself.  I asked, but Nest would not cover the cost of a professional (re-)installation.  At this point, as expressed my disappointment in this alternative, and I was offered a full refund, in 4-6 weeks time, after I return the 7 units.  I've accepted this solution and will replace the Nest Protects with a simpler, more reliable traditional smoke detector. 
Update #2: I suppose I should mention that I generally like my Nest Thermostat and (3) Dropcams.  This blog post is really only complaining about the Titanic disaster that is the Nest Protect.

Wednesday, August 12, 2015

Ubuntu and LXD at ContainerCon 2015

Canonical is delighted to sponsor ContainerCon 2015, a Linux Foundation event in Seattle next week, August 17-19, 2015. It's quite exciting to see the A-list of sponsors, many of them newcomers to this particular technology, teaming with energy around containers. 

From chroots to BSD Jails and Solaris Zones, the concepts behind containers were established decades ago, and in fact traverse the spectrum of server operating systems. At Canonical, we've been working on containers in Ubuntu for more than half a decade, providing a home and resources for stewardship and maintenance of the upstream Linux Containers (LXC) project since 2010.

Last year, we publicly shared our designs for LXD -- a new stratum on top of LXC that endows the advantages of a traditional hypervisor into the faster, more efficient world of containers.

Those designs are now reality, with the open source Golang code readily available on Github, and Ubuntu packages available in a PPA for all supported releases of Ubuntu, and already in the Ubuntu 15.10 beta development tree. With ease, you can launch your first LXD containers in seconds, following this simple guide.

LXD is a persistent daemon that provides a clean RESTful interface to manage (start, stop, clone, migrate, etc.) any of the containers on a given host.

Hosts running LXD are handily federated into clusters of container hypervisors, and can work as Nova Compute nodes in OpenStack, for example, delivering Infrastructure-as-a-Service cloud technology at lower costs and greater speeds.

Here, LXD and Docker are quite complementary technologies. LXD furnishes a dynamic platform for "system containers" -- containers that behave like physical or virtual machines, supplying all of the functionality of a full operating system (minus the kernel, which is shared with the host). Such "machine containers" are the core of IaaS clouds, where users focus on instances with compute, storage, and networking that behave like traditional datacenter hardware.

LXD runs perfectly well along with Docker, which supplies a framework for "application containers" -- containers that enclose individual processes that often relate to one another as pools of micro services and deliver complex web applications.

Moreover, the Zen of LXD is the fact that the underlying container implementation is actually decoupled from the RESTful API that drives LXD functionality. We are most excited to discuss next week at ContainerCon our work with Microsoft around the LXD RESTful API, as a cross-platform container management layer.

Ben Armstrong, a Principal Program Manager Lead at Microsoft on the core virtualization and container technologies, has this to say:
“As Microsoft is working to bring Windows Server Containers to the world – we are excited to see all the innovation happening across the industry, and have been collaborating with many projects to encourage and foster this environment. Canonical’s LXD project is providing a new way for people to look at and interact with container technologies. Utilizing ‘system containers’ to bring the advantages of container technology to the core of your cloud infrastructure is a great concept. We are looking forward to seeing the results of our engagement with Canonical in this space.”
Finally, if you're in Seattle next week, we hope you'll join us for the technical sessions we're leading at ContainerCon 2015, including: "Putting the D in LXD: Migration of Linux Containers", "Container Security - Past, Present, and Future", and "Large Scale Container Management with LXD and OpenStack". Details are below.
Date: Monday, August 17 • 2:20pm - 3:10pm
Title: Large Scale Container Management with LXD and OpenStack
Speaker: St├ęphane Graber
Location: Grand Ballroom B
Date: Wednesday, August 19 10:25am-11:15am
Title: Putting the D in LXD: Migration of Linux Containers
Speaker: Tycho Andersen
Location: Willow A
Date: Wednesday, August 19 • 3:00pm - 3:50pm
Title: Container Security - Past, Present and Future
Speaker: Serge Hallyn
Location: Ravenna

Monday, August 10, 2015

The Golden Ratio calculated to a record 2 trillion digits, on Ubuntu, in the Cloud!

The Golden Ratio is one of the oldest and most visible irrational numbers known to humanity.  Pi is perhaps more famous, but the Golden Ratio is found in more of our art, architecture, and culture throughout human history.

I think of the Golden Ratio as sort of "Pi in 1 dimension".  Whereas Pi is the ratio of a circle's circumference to its diameter, the Golden Ratio is the ratio of a whole to one of its parts, when the ratio of that part to the remainder is equal.

Visually, this diagram from Wikipedia helps explain it:

We find the Golden Ratio in the architecture of antiquity, from the Egyptians to the Greeks to the Romans, right up to the Renaissance and even modern times.

While the base of the pyramids are squares, the Golden Ratio can be observed as the base and the hypotenuse of a basic triangular cross section like so:

The floor plan of the Parthenon has a width/depth ratio matching the Golden Ratio...

For the first 300 years of printing, nearly all books were printed on pages whose length to width ratio matched that of the Golden Ratio.

Leonardo da Vinci used the Golden Ratio throughout his works.  I'm told that his Vitruvian Man displays the Golden Ratio...

From school, you probably remember that the Golden Ratio is approximately ~1.6 (and change).
There's a strong chance that your computer or laptop monitor has a 16:10 aspect ratio.  Does 1280x800 or 1680x1050 sound familiar?

That ~1.6 number is only an approximation, of course.  The Golden Ratio is in fact an irrational number and can be calculated to much greater precision through several different representations, including:

You can plug that number into your computer's calculator and crank out a dozen or so significant digits.

However, if you want to go much farther than that, Alexander Yee has created a program called y-cruncher, which as been used to calculate most of the famous constants to world record precision.  (Sorry free software readers of this blog -- y-cruncher is not open source code...)

I came across y-cruncher a few weeks ago when I was working on the mprime post, demonstrating how you can easily put any workload into a Docker container and then produce both Juju Charms and Ubuntu Snaps that package easily.  While I opted to use mprime in that post, I saved y-cruncher for this one :-)

Also, while doing some network benchmark testing of The Fan Networking among Docker containers, I experimented for the first time with some of Amazon's biggest instances, which have dedicated 10gbps network links.  While I had a couple of those instances up, I did some small scale benchmarking of y-cruncher.

Presently, none of the mathematical constant records are even remotely approachable with CPU and Memory alone.  All of them require multiple terabytes of disk, which act as a sort of swap space for temporary files, as bits are moved in and out of memory while the CPU crunches.  As such, approaching these are records are overwhelmingly I/O bound -- not CPU or Memory bound, as you might imagine.

After a variety of tests, I settled on the AWS d2.2xlarge instance size as the most affordable instance size to break the previous Golden Ratio record (1 trillion digits, by Alexander Yee on his gaming PC in 2010).  I say "affordable", in that I could have cracked that record "2x faster" with a d2.4xlarge or d2.8xlarge, however, I would have paid much more (4x) for the total instance hours.  This was purely an economic decision :-)

Let's geek out on technical specifications for a second...  So what's in a d2.2xlarge?
  • 8x Intel Xeon CPUs (E5-2676 v3 @ 2.4GHz)
  • 60GB of Memory
  • 6x 2TB HDDs
First, I arranged all 6 of those 2TB disks into a RAID0 with mdadm, and formatted it with xfs (which performed better than ext4 or btrfs in my cursory tests).

$ sudo mdadm --create --verbose /dev/md0 --level=stripe --raid-devices=6 /dev/xvd?
$ sudo mkfs.xfs /dev/md0
$ df -h /mnt
/dev/md0         11T   34M   11T   1% /mnt

Here's a brief look at raw read performance with hdparm:

$ sudo hdparm -tT /dev/md0
 Timing cached reads:   21126 MB in  2.00 seconds = 10576.60 MB/sec
 Timing buffered disk reads: 1784 MB in  3.00 seconds = 593.88 MB/sec

The beauty here of RAID0 is that each of the 6 disks can be used to read and/or write simultaneously, perfectly in parallel.  600 MB/sec is pretty quick reads by any measure!  In fact, when I tested the d2.8xlarge, I put all 24x 2TB disks into the same RAID0 and saw nearly 2.4 GB/sec read performance across that 48TB array!

With /dev/md0 mounted on /mnt and writable by my ubuntu user, I kicked off y-crunch with these parameters:

Program Version:       0.6.8 Build 9461 (Linux - x64 AVX2 ~ Airi)
Constant:              Golden Ratio
Algorithm:             Newton's Method
Decimal Digits:        2,000,000,000,000
Hexadecimal Digits:    1,660,964,047,444
Threading Mode:        Thread Spawn (1 Thread/Task)  ? / 8
Computation Mode:      Swap Mode
Working Memory:        61,342,174,048 bytes  ( 57.1 GiB )
Logical Disk Usage:    8,851,913,469,608 bytes  ( 8.05 TiB )

Byobu was very handy here, being able to track in the bottom status bar my CPU load, memory usage, disk usage, and disk I/O, as well as connecting and disconnecting from the running session multiple times over the 4 days of running.

And approximately 79 hours later, it finished successfully!

Start Date:            Thu Jul 16 03:54:11 2015
End Date:              Sun Jul 19 11:14:52 2015

Computation Time:      221548.583 seconds
Total Time:            285640.965 seconds

CPU Utilization:           315.469 %
Multi-core Efficiency:     39.434 %

Last Digits:
5027026274 0209627284 1999836114 2950866539 8538613661  :  1,999,999,999,950
2578388470 9290671113 7339871816 2353911433 7831736127  :  2,000,000,000,000

Amazing, another person (who I don't know), named Ron Watkins, performed the exact same computation and published his results within 24 hours, on July 22nd/23rd.  As such, Ron and I are "sharing" credit for the Golden Ratio record.

Now, let's talk about the economics here, which I think are the most interesting part of this post.

Look at the above chart of records, which are published on the y-cruncher page, the vast majority of those have been calculated on physical PCs -- most of them seem to be gaming PCs running Windows.

What's different about my approach is that I used Linux in the Cloud -- specifically Ubuntu in AWS.  I paid hourly (actually, my employer, Canonical, reimbursed me for that expense, thanks!)  It took right at 160 hours to run the initial calculation (79 hours) as well as the verification calculation (81 hours), at the current rate of $1.38/hour for a d2.2xlarge, which is a grand total of $220!

$220 is a small fraction of the cost of 6x 2TB disks, 60 GB of memory, or 8 Xeon cores, not to mention the electricity and cooling required to run a system of this size (~750W) for 160 hours.

If we say the first first trillion digits were already known from the previous record, that comes out to approximately 4.5 billion record-digits per dollar, and 12.5 billion record-digits per hour!

Hopefully you find this as fascinating as I!