From the Canyon Edge -- :-Dustin
Showing posts with label openstack. Show all posts
Showing posts with label openstack. Show all posts

Thursday, April 2, 2020

How We've Adapted Ubuntu's Time-based Release Cycles to Fintech and Software-as-a-Service


The Spring launch of the Apex 20a platform on March 26, 2020, marks the beginning of an exciting, new era of product development at Apex Clearing.  We have adopted a number of product management best practices from the software industry and adapted them to address some of the unique challenges within financial services and software-as-a-service.  In the interest of transparency, I’m pleased to share our new processes, what we’ve learned along the way, and where we’re headed next. 


Elsewhere in the Software Industry 

I joined Apex Clearing last year, having spent the previous 20 years as a software engineer, product manager, and executive, mostly around open source software, including UbuntuOpenStack, and Kubernetes.  Albeit IBM, Canonical and Google differ from fintech on many levels, these operating systems and cloud infrastructure technology platforms share a number of similarities with Apex's software-as-a-service platform.  Moreover, there also exists some literal overlap: we’re heavy users of both Ubuntu and Kubernetes here at Apex. 

Ubuntu, OpenStack, and Kubernetes all share similar, predictable, time-based release cycles.  Ubuntu has released every April and October, since October of 2004 – that's 32 major software platform releases, on time, every time, over 16 years.  Ubuntu has set the bar for velocity, quality, and predictability in the open source world.  OpenStack’s development processes have largely mirrored Ubuntu’s, with many of the early project leaders having been ex-Ubuntu engineers and managers.  OpenStack, too, has utilized a 6-month development cycle, since 2010, now on its 20th release.  Kubernetes came along in 2014, and sought to increase the pace a bit, with quarterly release cycles.  Kubernetes is a little bit looser with dates than Ubuntu or OpenStack, but has generally cranked out 4 quality releases per year, over the last 6 years.  I’ve been involved in each of these projects at some level, and I’ve thoroughly enjoyed coaching a number of early stage start-ups on how to apply these principles to their product development methodologies. 

Across the board, users, and especially enterprise customers, appreciate the predictability of these release cycles.  Corporate IT managers are better able to plan, test, and roll out technology changes to their environments, with less risk and disruption.  From a commercial perspective, these methodologies drove many wins against legacy, less predictable platforms.  


The Key Principles of Coordinated Cycles 

As a product development team, you have: 
  1. Time 
  2. Work to complete 
  3. Resources to perform work 
To succeed, you must ensure that two of these three are fixed; and only one may vary.  In the Ubuntu methodology – time and resources are assumed to be fixed.  Time is fixed, in that the product will release, on time.  Resources are fixed, in that it takes many months to recruit and hire and on-board new resources.  We hire new people, of course, but we’ll assume those resources will be productive in a subsequent cycle.  Hence, it’s the amount of work that we will complete, which varies.  We will drop or defer commitments, but we won’t change the release dates, and we won't assume additional resources will have meaningful impact within a cycle. 

Within Ubuntu, OpenStack, and Kubernetes, each cycle would “kick-off” with a summit or conference, that brought together hundreds of developers and leaders from around the industry, to discuss and debate designs for the release.  Anyone who’s participated in an Ubuntu Developer Summit, an OpenStack Design Summit, or a KubeCon Design Summit can tell you how essential these gatherings are, to the success of the project.  Within Canonical, we also held a “Mid-Cycle Summit”, exactly at the half-way point of the cycle.  We used this checkpoint, as product and engineering teams, to right-size the scope, and ensure that we hit our release dates, and with the highest quality standards.  Inevitably, new requirements and priorities would emerge, or some committed work proved more complicated than anticipated.  This checkpoint was critical to the success of each launch, as we adjusted the targeted scope for the remainder of the release.

Adapting these Processes to Apex 

When I arrived at Apex in September 2019 to lead the product organization, I inherited an excellent team of product and project managers, peered with high-quality engineering teams.  Products and projects, however, were managed pretty asynchronously, hence release timelines and new feature commitments were unpredictable.  Of course, I had seen this before at a number of the start-ups that I’ve advised, so the model was quite familiar. 

Launch Cycles 

When adapting the coordinated, time-based release cycle to a given organization, the first thing to consider is the time frame.  After talking to all of our stakeholders, 6-month releases, like Ubuntu or OpenStack felt a little too long, a little too slow, for Apex and our customers.  Most of the engineering teams were already quite agile and utilizing 2-week sprints, so getting product requirements for 26-weeks (13 sprints) seemed a little unwieldy.  Quarterly cycles, however, can be pretty tough to see through, for anything but the smallest individual projects (frankly, Kubernetes struggles with the pace at times).  Moreover, all of the projects I’ve been involved in, have struggled with the end of year holidays, in November, December, and January.  Thus, we actually settled on a 16-week cycle, which amounts to roughly 4-month cycles.  That translates to 3 full cycles per year, with 48 weeks of development, while still allowing for 4 weeks of holidays. 

Our cycles are named for the year in which it will complete (launch), and with a letter as an iterator.  In 2020, we launch Apex 20a (March), Apex 20b (July), and Apex 20c (November), and looking forward to 2021, we should see Apex 21a (March), Apex 21b (July), and Apex 21c (November).  The “c” cycles are a few extra weeks, to account for the holidays near the end of the year.  These aren’t really “versions”, as Apex is more like “software as a service”, rather than “delivered software”, like Ubuntu, OpenStack, and Kubernetes.  Also, conversationally, we're referring to the cycles with the season -- so Apex 20a is our "Spring" launch, 20b will be our "Summer" launch, and 20c will be our "Autumn" launch.


Summits 

Each cycle involves 3 key summits.  As much as possible, these summits are in-person meetings (at least until our travel paused along with the rest of the world).  At this point, we’re proceeding quite seamlessly with virtual summits, instead.  Our summits are recorded in Zoom, and we always take extraordinarily detailed notes in internally shared documents. 
  1. Prioritization Summit 
  2. Planning Summit 
  3. Mid-cycle Summit 


Prioritization 

The Prioritization Summit brings together our product managers with all of our key stakeholders – sales executives working with new business prospects, our client relationship managers working with existing customers, as well as our own IT, operations, and site reliability engineers tasked with keeping Apex running on a day to day basis.  Each product manager works with their stakeholders to gather CUJs (critical user journeys) and map those into patterns of similar, weighted product requests. Product Managers generally spend about 2 weeks on that work, which culminates in a session where the Product Manager presents the consensus priorities for their product area for review by the broader product team.  Based on this work, each product manager then starts working on their PRDs (product requirements documents), for the next 3-4 weeks.  Our Prioritization Summit is about a dozen, hour-long sessions, spread over three days in the same week.  We exit the Prioritization Summit with clear stakeholder consensus on stack-ranked priorities for each product family. 


Planning  

The Planning Summit signals the end of the PRD-writing period, during which product managers worked closely with their engineering counterparts, digesting all of those product requests and priorities, and turn those into product requirements written in RFC2119-style language (must, should, may, etc.).  At the end of that process, each Product Manager and their technical counterpart lead an hour-long session with their plan for the next cycle, including fairly detailed commitments as to the major changes we should expect to be delivered.  Our Planning Summit is about a dozen, hour long sessions, spread over three days in the same week.  We exit the Planning Summit with clear product and engineering consensus on work commitments across the product portfolio, for the upcoming cycle.  This marks the beginning of the development portion of the cycle. 

For the next few weeks, product managers spend the majority of their time with Apex customers and prospects.  Each of us on the product team carry specific OKRs (objectives and key results), to spend meaningful time with our existing correspondents and prospects, communicating our product roadmaps and gathering feedback on their experiences.  We take detailed notes, and all of this data filters directly into our future Prioritization Summits. 


Mid-cycle 

At the middle of our release cycle (week 8 of 16), we bring together the same Product Managers and technical leads to report on status of the first 8 weeks (4 sprints), and recalibrate the remaining work for the cycle.  Without exception, there are always new, late-breaking product requests or requirements that emerge, after the prioritization and planning summits.  Some of these are urgent, and we must accommodate, which usually means something else gets deferred to the next cycle.  Sometimes, we were a little too optimistic with our work estimates, and again we need to adjust.  Occasionally, we’re ahead of schedule and we can cherry-pick some other bite-sized items to bring into scope.  In any case, we will exit the Mid-cycle Summit with a very clear line-of-sight on our deliverables by the end of the cycle. 

With any scope adjustments well understood, the product team shifts into “go-to-market" mode.  Over these next 3 weeks, Product Managers are working with our Marketing counterparts, writing release notes, creating marketing content, educating our sales teams, and working through signoffs on our launch checklists. 

At this point, the cycle repeats itself.  Once our go-to-market activities are complete, Product Managers shift back into prioritization mode, working with our stakeholders, while the engineering team completes their work and the marketing team publishes the launch. 

Speaking of...let’s talk about the Apex 20a Release. 


The Apex 20a and 20b Releases 

Apex 20a launched on March 31, 2020, as our first release using the methodologies described above.  Apex clients can find detailed release notes in the Apex Developer Portal.  This cycle began with a Prioritization Summit in October 2019, a Planning Summit in November 2019, and a Mid-cycle Summit in January 2020.   This cycle involved 17-weeks of development. 

Our work on Apex 20b is already well underway, having held our Prioritization Summit in February 2020, and we’re holding our Planning Summit this week (March 2020).  Our Mid-cycle Summit will be held in May 2020, and we will launch Apex 20b in July 2020. 

It’s important to note that although we do have a very specific “launch date”, which signals the end of the development cycle, each of our engineering teams have developed, tested, and deployed to production hundreds of changesets during the cycle.  Thus, we maintain our agile CI/CD (continuous integration / continuous deployment) systems within every product and engineering team.  To be clear, we don’t “hold” anything specifically until launch date.  This is a very specific differentiation from Ubuntu, OpenStack, and Kubernetes, which are “shipped software”, as opposed to Apex technologies, which amount to “software as a service”.  For these reasons, we try to use the term “launch”, rather than “release”, when we talk about the “launch date” at the end of the cycle.  All, that said, we have found the processes described here very useful in our planning and communications about Apex technology to our customers. 


In Conclusion 

Apex 20a is the first of many coordinated product launch cycles our customers will experience.  We’ve adapted many of the best practices utilized by the open source software industry as well as Silicon Valley, and those practices are helping us work more effectively with our tech-savvy client base.  Apex will have 3 launches in 2020 (20a, 20b, 20c), and at least 3 launches in 2021.  By openly sharing our product stages and delivering a consistent, predictable, and reliable schedule, there are now ample opportunities for both customer input and detailed review and oversight by our leaders, which culminates in secure and stable products for our industry.  We’re delighted at the engagement thus far, and really look forward to more collaboration in the future. 

On behalf of the Apex product team,
:-Dustin

Monday, August 21, 2017

Bare Metal Kubernetes: More Containers, Less Overhead

Earlier this month, I spoke at ContainerDays, part of the excellent DevOpsDays series of conferences -- this one in lovely Portland, Oregon.

I gave a live demo of Kubernetes running directly on bare metal.  I was running it on an 11-node Ubuntu Orange Box -- but I used the exact same tools Canonical's world class consulting team uses to deploy Kubernetes onto racks of physical machines.
You see, the ability to run Kubernetes on bare metal, behind your firewall is essential to the yin-yang duality of Cloud Native computing.  Sometimes, what you need is actually a Native Cloud.
Deploying Kubernetes into virtual machines in the cloud is rather easy, straightforward, with dozens of tools now that can handle that.

But there's only one tool today, that can deploy the exact same Kubernetes to AWS, Azure, GCE, as well as VMware, OpenStack, and bare metal machines.  That tools is conjure-up, which acts as a command line front end to several essential Ubuntu tools: MAAS, LXD, and Juju.

I don't know if the presentation was recorded, but I'm happy to share with you my slides for download, and embedded here below.  There are a few screenshots within that help convey the demo.




Cheers,
Dustin

Thursday, February 23, 2017

The Questions that You're Afraid to Ask about Containers



Yesterday, I delivered a talk to a lively audience at ContainerWorld in Santa Clara, California.

If I measured "the most interesting slides" by counting "the number of people who took a picture of the slide", then by far "the most interesting slides" are slides 8-11, which pose an answer the question:
"Should I run my PaaS on top of my IaaS, or my IaaS on top of my PaaS"?
In the Ubuntu world, that answer is super easy -- however you like!  At Canonical, we're happy to support:
  1. Kubernetes running on top of Ubuntu OpenStack
  2. OpenStack running on top of Canonical Kubernetes
  3. Kubernetes running along side OpenStack
In all cases, the underlying substrate is perfectly consistent:
  • you've got 1 to N physical or virtual machines
  • which are dynamically provisioned by MAAS or your cloud provider
  • running stable, minimal, secure Ubuntu server image
  • carved up into fast, efficient, independently addressable LXD machine containers
With that as your base, we'll easily to conjure-up a Kubernetes, an OpenStack, or both.  And once you have a Kubernetes or OpenStack, we'll gladly conjure-up one inside the other.


As always, I'm happy to share my slides with you here.  You're welcome to download the PDF, or flip through the embedded slides below.



Cheers,
Dustin

Monday, May 9, 2016

Using Containers to Create the World's Fastest OpenStack


Below you can find the audio/video recording of my OpenStack Austin presentation, where I demonstrated Ubuntu OpenStack Mitaka, running on top of Ubuntu 16.04 LTS, entirely within LXD machine containers.  You can also download the PDF of the slides here.  And there are a number of other excellent talks here!



Cheers,
Dustin

Sunday, April 24, 2016

Keep OpenStack Weird



The OpenStack Summit in Austin has already kicked off, and this time, Ubuntu is the official lanyard sponsor at OpenStack Summit Austin.

The sponsorship contract for the OpenStack Summit explicitly states that only the official lanyard sponsor may distribute lanyards. Whilst we understand the reason that clause is there, we don't agree with it. It just doesn't seem very "open" nor in the spirit of OpenStack.

Freedom of choice is an important aspect of all open source communities and one that we certainly champion, so attendees should be free to wear whatever branded lanyard they want with pride at the OpenStack Summit in Austin and we at Canonical will celebrate it.  My hometown here, Austin, prides itself on diversity, where we like to Keep Austin Weird!


So please -- partners, customers, competitors, other OpenStack Sponsors: if you want to distribute your own lanyards then please go ahead safe in the knowledge that Canonical will not complain to the conference organizers.  Let's Keep OpenStack (a little bit) Weird, too!



See you there!
:-Dustin

Thursday, April 14, 2016

Docker 1.10 with Fan Networking in Ubuntu 16.04, for Every Architecture!


I'm thrilled to introduce Docker 1.10.3, available on every Ubuntu architecture, for Ubuntu 16.04 LTS, and announce the General Availability of Ubuntu Fan Networking!

That's Ubuntu Docker binaries and Ubuntu Docker images for:
  • armhf (rpi2, et al. IoT devices)
  • arm64 (Cavium, et al. servers)
  • i686 (does anyone seriously still run 32-bit intel servers?)
  • amd64 (most servers and clouds under the sun)
  • ppc64el (OpenPower and IBM POWER8 machine learning super servers)
  • s390x (IBM System Z LinuxOne super uptime mainframes)
That's Docker-Docker-Docker-Docker-Docker-Docker, from the smallest Raspberry Pi's to the biggest IBM mainframes in the world today!  Never more than one 'sudo apt install docker.io' command away.

Moreover, we now have Docker running inside of LXD!  Containers all the way down.  Application containers (e.g. Docker), inside of Machine containers (e.g. LXD), inside of Virtual Machines (e.g. KVM), inside of a public or private cloud (e.g. Azure, OpenStack), running on bare metal (take your pick).

Let's have a look at launching a Docker application container inside of a LXD machine container:

kirkland@x250:~⟫ lxc launch ubuntu-daily:x -p default -p docker
Creating magical-damion
Starting magical-damion
kirkland@x250:~⟫ lxc list | grep RUNNING
| magical-damion | RUNNING | 10.16.4.52 (eth0) |      | PERSISTENT | 0         |
kirkland@x250:~⟫ lxc exec magical-damion bash
root@magical-damion:~# apt update >/dev/null 2>&1 ; apt install -y docker.io >/dev/null 2>&1 
root@magical-damion:~# docker run -it ubuntu bash
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
759d6771041e: Pull complete 
8836b825667b: Pull complete 
c2f5e51744e6: Pull complete 
a3ed95caeb02: Pull complete 
Digest: sha256:b4dbab2d8029edddfe494f42183de20b7e2e871a424ff16ffe7b15a31f102536
Status: Downloaded newer image for ubuntu:latest
root@0577bd7d5db1:/# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 02:42:ac:11:00:02  
          inet addr:172.17.0.2  Bcast:0.0.0.0  Mask:255.255.0.0
          inet6 addr: fe80::42:acff:fe11:2/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:1296 (1.2 KB)  TX bytes:648 (648.0 B)


Oh, and let's talk about networking...  We're also pleased to announce the general availability of Ubuntu Fan networking -- specially designed to connect all of your Docker containers spread across your network.  Ubuntu's Fan networking feature is an easy way to make every Docker container on your local network easily addressable by every other Docker host and container on the same network.  It's high performance, super simple, utterly deterministic, and we've tested it on every major public cloud as well as OpenStack and our private networks.

Simply installing Ubuntu's Docker package will also install the ubuntu-fan package, which provides an interactive setup script, fanatic, should you choose to join the Fan.  Simply run 'sudo fanatic' and answer the questions.  You can trivially revert your Fan networking setup easily with 'sudo fanatic deconfigure'.

kirkland@x250:~$ sudo fanatic 
Welcome to the fanatic fan networking wizard.  This will help you set
up an example fan network and optionally configure docker and/or LXD to
use this network.  See fanatic(1) for more details.
Configure fan underlay (hit return to accept, or specify alternative) [10.0.0.0/16]: 
Configure fan overlay (hit return to accept, or specify alternative) [250.0.0.0/8]: 
Create LXD networking for underlay:10.0.0.0/16 overlay:250.0.0.0/8 [Yn]: n
Create docker networking for underlay:10.0.0.0/16 overlay:250.0.0.0/8 [Yn]: Y
Test docker networking for underlay:10.0.0.45/16 overlay:250.0.0.0/8
(NOTE: potentially triggers large image downloads) [Yn]: Y
local docker test: creating test container ...
34710d2c9a856f4cd7d8aa10011d4d2b3d893d1c3551a870bdb9258b8f583246
test master: ping test (250.0.45.0) ...
test slave: ping test (250.0.45.1) ...
test master: ping test ... PASS
test master: short data test (250.0.45.1 -> 250.0.45.0) ...
test slave: ping test ... PASS
test slave: short data test (250.0.45.0 -> 250.0.45.1) ...
test master: short data ... PASS
test slave: short data ... PASS
test slave: long data test (250.0.45.0 -> 250.0.45.1) ...
test master: long data test (250.0.45.1 -> 250.0.45.0) ...
test master: long data ... PASS
test slave: long data ... PASS
local docker test: destroying test container ...
fanatic-test
fanatic-test
local docker test: test complete PASS (master=0 slave=0)
This host IP address: 10.0.0.45

I've run 'sudo fanatic' here on a couple of machines on my network -- x250 (10.0.0.45) and masterbr (10.0.0.8), and now I'm going to launch a Docker container on each of those two machines, obtain each IP address on the Fan (250.x.y.z), install iperf, and test the connectivity and bandwidth between each of them (on my gigabit home network).  You'll see that we'll get 900mbps+ of throughput:

kirkland@x250:~⟫ sudo docker run -it ubuntu bash
root@c22cf0d8e1f7:/#  apt update >/dev/null 2>&1 ; apt install -y iperf >/dev/null 2>&1
root@c22cf0d8e1f7:/#  ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 02:42:fa:00:2d:00  
          inet addr:250.0.45.0  Bcast:0.0.0.0  Mask:255.0.0.0
          inet6 addr: fe80::42:faff:fe00:2d00/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:6423 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4120 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:22065202 (22.0 MB)  TX bytes:227225 (227.2 KB)

root@c22cf0d8e1f7:/# iperf -c 250.0.8.0
multicast ttl failed: Invalid argument
------------------------------------------------------------
Client connecting to 250.0.8.0, TCP port 5001
TCP window size: 45.0 KByte (default)
------------------------------------------------------------
[  3] local 250.0.45.0 port 54274 connected with 250.0.8.0 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-10.0 sec  1.05 GBytes   902 Mbits/sec

And the second machine:
kirkland@masterbr:~⟫ sudo docker run -it ubuntu bash
root@effc8fe2513d:/#  apt update >/dev/null 2>&1 ; apt install -y iperf >/dev/null 2>&1
root@effc8fe2513d:/#  ifconfig eth0
eth0      Link encap:Ethernet  HWaddr 02:42:fa:00:08:00  
          inet addr:250.0.8.0  Bcast:0.0.0.0  Mask:255.0.0.0
          inet6 addr: fe80::42:faff:fe00:800/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1450  Metric:1
          RX packets:7659 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3433 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:22131852 (22.1 MB)  TX bytes:189875 (189.8 KB)

root@effc8fe2513d:/# iperf -s
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 250.0.8.0 port 5001 connected with 250.0.45.0 port 54274
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.0 sec  1.05 GBytes   899 Mbits/sec


Finally, let's have another long hard look at the image from the top of this post.  Download it in full resolution to study very carefully what's happening here, because it's pretty [redacted] amazing!


Here, we have a Byobu session, split into 6 panes (Shift-F2 5x Times, Shift-F8 6x times).  In each pane, we have an SSH session to Ubuntu 16.04 LTS servers spread across 6 different architectures -- armhf, arm64, i686, amd64, ppc64el, and s390x.  I used the Shift-F9 key to simultaneously run the same commands in each and every window.  Here are the commands I ran:

clear
lxc launch ubuntu-daily:x -p default -p docker
lxc list | grep RUNNING
uname -a
dpkg -l docker.io | grep docker.io
sudo docker images | grep -m1 ubuntu
sudo docker run -it ubuntu bash
 apt update >/dev/null 2>&1 ; apt install -y net-tools >/dev/null 2>&1
 ifconfig eth0
 exit

That's right.  We just launched Ubuntu LXD containers, as well as Docker containers against every Ubuntu 16.04 LTS architecture.  How's that for Ubuntu everywhere!?!

Ubuntu 16.04 LTS will be one hell of a release!

:-Dustin

Printfriendly