Accelerated Velocity: Optimizing the SDLC with DevOps

Note: this article is part 9 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend reading the earlier parts first for the overall context.

Anyone who’s run a team knows there’s always more to do and pressure to do even more than that.  Most managers respond by asking for more workers and/or pushing the existing workers to put in more hours.  I’ve been there myself, and I learned the hard way that there is a better, faster and cheaper way: focus on efficiency first.  Let’s look at why.

For clarity and simplicity, I’ll frame this in a simple mathematical representation:

[ total output ] = [ # workers ] x [ hours worked ] x [ output / worker-hour (i.e. efficiency) ]

In other words, output is the product of the number of people working, the amount of time they work, and their efficiency.  So if we want to increase total output, we can:

  1. Increase the number of workers.
  2. Increase hours worked.
  3. Increase efficiency.

Pretty straightforward so far, but when we start pulling the threads we quickly discover that the three options are not equal in their potential impact.  The problem with #1 is that it is expensive, has a long lead time and, unless the organization is very mature, generally results in a sub-linear increase in productivity (since it also adds additional management and communication load).  #2 may be ok for a short time, but it’s not sustainable and will quickly lead to a significant loss in efficiency as people burn out, start making expensive mistakes and eventually quit (which exacerbates #1). 

That leaves us with #3.  Efficiency improvements multiply the output of every existing worker-hour, take effect immediately, and are relatively cheap compared to the other options.  This is where the focus should be.

Really?  It can’t be that simple…

It is.  Let’s look at a real-world example.  You have a 20-person software development team.  To double the output you could hire an additional 22-25 FTEs (full-time equivalents) and start seeing increased velocity in maybe 3-6 months.  (Why not just 20?  Because you also need to hire more managers, supporting staff, etc.  You also have to account for the additional burden of communication.  That’s why this is non-linear.)

You could ask them to work twice as many hours, but very quickly you’ll find yourself processing a flood of resignation letters.  Let’s cross this off the list.

Or you could ask each person on the team to spend 10% of their time focusing on tools, techniques, frameworks and other boosts to efficiency.  If done right, you’ll start seeing results right away, and in short order you can cut the development cycle time by as much as half.  In effect you’ve doubled efficiency (and therefore output) for the equivalent of 2 FTEs (20 people x 10%).  In economic terms, that’s 10:1 leverage.
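
To make the arithmetic concrete, here’s a back-of-the-envelope sketch of the output formula above in Python.  The numbers are illustrative assumptions (a ~10% efficiency drag from headcount growth, a halved cycle time from the tooling investment), not measured data:

```python
# Back-of-the-envelope model of: total output = workers x hours x efficiency.
def output(workers: float, hours_per_week: float, efficiency: float) -> float:
    return workers * hours_per_week * efficiency

baseline = output(workers=20, hours_per_week=40, efficiency=1.0)

# Option 1: roughly double headcount; assume ~10% efficiency drag from the
# extra management and communication overhead, plus months of ramp-up time.
more_people = output(workers=44, hours_per_week=40, efficiency=0.9)

# Option 3: everyone invests 10% of their time in tooling; assume cycle time
# drops by half, i.e. efficiency roughly doubles, for ~2 FTE of invested effort.
more_efficient = output(workers=20, hours_per_week=40 * 0.9, efficiency=2.0)

print(baseline, more_people, more_efficient)  # 800.0 1584.0 1440.0
```

Both options roughly double output, but one costs two dozen extra hires and the other costs the equivalent of two.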

Not bad.  This is why I recommend always focusing on efficiency first.

(This isn’t to say that growing the team is the wrong thing to do – Google wouldn’t be doing what Google does with 1/10 the engineers – but this is an expensive and long-term strategy. Team growth is not a substitute for an intense focus on efficiency.  Adding a bunch of people to an inefficient setup is a good way to spend a lot of money with low ROI.)

So what would the team focus on to boost efficiency?  The list is long and includes both technical topics (reducing build times, Don’t Repeat Yourself (DRY), etc.) and non-technical ones (stop wasting time in stupid meetings).  All are valid and should be addressed (especially stupid meetings and time management), but in this article I’m going to focus on leverage through automation and software-driven infrastructure; in other words: devops.

DevOps

Ask ten people and you’ll get ten different answers as to what devops is.  I’m less interested in dogmatic purity and more in the foundational elements of devops that drive the benefits, which to me are:

  1. Tools: an intense focus on increased efficiency through DRY / automation.
  2. Culture: increased efficiency through eliminating arbitrary and stupid organizational boundaries.

Bonial understood this.  When I arrived in 2014, nearly all of the major build and deployment functions were fully scripted and supported by an integrated, aligned “devops” team (though it wasn’t called that at the time), which had even gone so far as to enable interaction with the scripts via a cool chat bot named Marvin.  Julius, Alexander and others on the devops team were wizards with automation and were definitely on the cutting edge of this evolving field.  For the most part we had a full CI capability in place.

Unfortunately, further gains were largely blocked by code and environmental constraints.  No amount of devops can solve problems created by code monoliths and limited hardware.  So, as described in other articles in this series, we invested heavily in breaking up the monolith.  It was painful, but it opened up many pathways for team independence, continuous delivery and moving to cloud infrastructure.

After we’d broken apart and modularized the code monolith, we moved into AWS, which created further challenges.  On one hand we wanted everybody to make full use of the cloud’s speed and flexibility.  On the other hand it was important to ensure governance processes for things like cost control and security.  We balanced those requirements with infrastructure as code (IaC): we standardized on a few automation platforms (Spinnaker, Terraform, etc.) but let teams customize their process to meet their needs.  At this point, the central “devops” team became both a center of excellence and a training and mentoring group.

Our foundation in automation enabled us to rapidly embrace IaaS and explore serverless and container approaches.  It took some time to settle on which automation frameworks would best meet our needs, but once that was done we could spin up entire environments in minutes, starting with only some scripts and ending up with a fully running stack.  Given the sheer speed of change, adopting a “You Own It You Run It” (YOIYRI) approach and moving more responsibilities into the teams came naturally.  All those changes took us to a whole new level.
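
Our actual pipelines ran on Spinnaker and Terraform, so the following is not our setup; it’s just a minimal Python (boto3) sketch of the underlying idea that governance rides along with every scripted environment.  The AMI, instance type and tag names are placeholders:

```python
import boto3

def launch_stage_instance(team: str, stage_name: str, ttl_hours: int = 24) -> str:
    """Launch one EC2 instance for a short-lived stage, tagged for governance.

    The tags make every instance attributable (cost control, security review)
    and give a cleanup job something to key on.  AMI and type are placeholders.
    """
    ec2 = boto3.client("ec2")
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder AMI
        InstanceType="t3.medium",
        MinCount=1,
        MaxCount=1,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [
                {"Key": "team", "Value": team},
                {"Key": "stage", "Value": stage_name},
                {"Key": "ttl-hours", "Value": str(ttl_hours)},
                {"Key": "environment", "Value": "stage"},
            ],
        }],
    )
    return response["Instances"][0]["InstanceId"]
```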

SDLC

The natural next step was to address one of the most painful and persistent bottlenecks in our previous software development lifecycle (SDLC): development and test environments (“stages”).  Previously, all of the development teams had to share access to a handful of “stage” environments, only one of which was configured to look somewhat like production.  This created constant conflict between teams jockeying for stage access, and no-win choices for me on which projects to give priority access.

Automation on AWS changed the game for the SDLC. Now instead of a single chokepoint throttling every team, each team could spin up its own environment and we could have dozens of initiatives being developed and tested in parallel.  (We also had to learn to manage costs more effectively, especially when people forgot to shut off their stage, but that’s another story…) 
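
The cost lesson, incidentally, yields to the same kind of automation: a scheduled job can find stage instances that have outlived their welcome and shut them down.  A hedged sketch along those lines, reusing the hypothetical tags from the example above:

```python
import boto3
from datetime import datetime, timezone

def stop_expired_stages(default_ttl_hours: int = 24) -> list[str]:
    """Stop running stage instances that have outlived their ttl-hours tag."""
    ec2 = boto3.client("ec2")
    paginator = ec2.get_paginator("describe_instances")
    pages = paginator.paginate(Filters=[
        {"Name": "tag:environment", "Values": ["stage"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ])

    expired = []
    now = datetime.now(timezone.utc)
    for page in pages:
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                tags = {t["Key"]: t["Value"] for t in instance.get("Tags", [])}
                ttl = int(tags.get("ttl-hours", default_ttl_hours))
                age_hours = (now - instance["LaunchTime"]).total_seconds() / 3600
                if age_hours > ttl:
                    expired.append(instance["InstanceId"])

    if expired:
        ec2.stop_instances(InstanceIds=expired)
    return expired
```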

We still had the challenge of dependencies between teams – e.g. what if the web team needed a specific version of an API owned by another team in order to test a new change?  One of our engineers, Laurent, solved this by creating a central repository and web portal (“Stager”) that any developer could use to spin up a stage with all the needed subsystems running inside.
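
Stager is an internal tool, so its real interface isn’t public; purely to illustrate the shape of the idea – a developer asks for a stage with specific subsystem versions pinned, and the portal does the rest – here is a hypothetical client call.  The URL, payload fields and versions are invented:

```python
import requests

# Hypothetical request to a Stager-like portal: "give me a stage running
# my branch of the web app plus a pinned version of the offers API".
stage_request = {
    "owner": "web-team",
    "ttl_hours": 48,
    "subsystems": [
        {"name": "web-frontend", "version": "feature/new-checkout"},
        {"name": "offers-api", "version": "2.3.1"},
    ],
}

response = requests.post(
    "https://stager.internal.example.com/api/stages",  # invented URL
    json=stage_request,
    timeout=30,
)
response.raise_for_status()
print("Stage ready at:", response.json()["url"])
```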

As it stands today, the “Stager” subsystem has become a critical piece of our enterprise; when it’s down or not working properly, people make noise pretty quickly.  As befits its criticality, we’ve assigned dedicated people to focus purely on ensuring that it’s up and running and continually evolving to meet our needs.  Per the math above, the leverage is unquestionable and investing in this area is a no-brainer.

Closing Thoughts

  • It’s simple: higher efficiency = more output
  • Automation leads to efficiency
  • Breaking down barriers between development and ops leads to efficiency
  • Invest in devops

Many thanks to Julius for both his contributions to this article and, more importantly, the underlying reality.

What a difference a decade makes…

I frequently fly transatlantic as part of my job.  Over the past few years I’ve been excited to see airlines (Delta, Lufthansa, Air Berlin) begin to offer two things: (1) in-seat AC power and (2) internet access throughout the flight.  Now I can run my laptop the entire flight and, on daytime flights, stay connected with my team back in Berlin.
 
Last week I was fortunate to be rerouted from a Delta codeshare KLM flight (no power, no internet) onto Lufthansa (power, internet).  On the daytime flight from Frankfurt to Chicago I spent nine hours of blissful time catching up on a ton of work that required online access.  I was able to Slack with my team the whole time, send emails, and work on shared documents.  At one point, I was working on a prototype of a voice assistant project – the IDE was running on my laptop and deploying code to Heroku, I was using API.AI to develop the natural language interface, and I used the Amazon Alexa SDK to generate sample Alexa calls.  Traffic was constantly flowing between all of the nodes.  All from my seat on the plane.
 
Ten years ago we didn’t have smartphones.  We were just a few years past modems.  Streaming media was mostly a dream.  There certainly wasn’t Wi-Fi on planes.
 
The jury is still out on whether I’ll miss the eight hours of uninterrupted quiet time on planes binge-watching movies that I probably didn’t want to see – there’s certainly something to be said for being unplugged.  But I sure as heck like the option to stay connected.
 
What a difference a decade makes.  It makes me wonder what the next decade will bring.  Can’t wait – should be a wild ride.

Hello World…

Do you remember the point at which you knew you’d crossed over from being a kid to a grown-up?  When you realized that you were now one of those older / wiser / more successful professionals who used to seem a league apart?
 
I have to admit it was a bit of a shock when I slowly woke up to the fact that I was now “that” person.  I didn’t really feel any different, but everyone else seemed to think I was.  And the truth is, I was different.  I knew things that others hadn’t learned yet, my instincts were honed by successes and failures, I had a wealth of experiences to draw on and share, and I could connect the dots better simply because I had more dots to connect.
 
With this in mind I have a lot of mentors to thank for sharing their experiences and knowledge with me, and I owe it to the next generation to share what I’ve learned.  Hence this blog.
 
So what have I learned?  Well, for starters, a bit about technology.  After studying biochem and serving as a submariner, I moved into the software space, where I’ve run the gamut from coder to CTO.  I know software, hardware, and networks, as well as data storage, transfer and analysis.  I’ve been fortunate to be heavily involved in business activities, including one IPO, so I have a working knowledge of decidedly non-technical topics like sales and marketing, financial modeling, business operations and product management.  Finally, as I’ve been in leadership roles for my entire adult life, I’ve developed a deep appreciation for the fact that businesses, even tech businesses, are about the people first (and the people who lead them).
 
This will be an “agile” blog, adapting as needed given the events in tech and markets.  There will be some war stories, opinions, lessons learned and current events.  I’ll break them down into core themes – tech, leadership, business, general – so you can follow whichever threads interest you the most.  
 
So please subscribe to the blog, comment often with questions or your own experiences, and enjoy!