Accelerated Velocity: Clarifying Processes and Key Roles

In a previous article I argued that great people are needed in order to get great results. To be clear, this principle is asymmetric: great people don’t guarantee great results. Far from it – the history of sports, business and the military is littered with the carcasses of “dream teams” that miserably underperformed.

No, there are several factors that need to be in place for teams to excel. The ability to take independent action, discussed in the previous article, is one of those factors. I’ll discuss others over the next few articles, starting here with clarity around processes and roles.

Even the best people have trouble reaching full potential if they don’t know what’s expected of them. True, some people are capable of jumping in and defining their own roles, but this is rare. Most will become increasingly frustrated, not knowing on any given day what’s expected of them and what they need to do to succeed.

People in teams also need to understand the conventions for how best to work with others. How they plan, collaborate, communicate status, and manage issues all play a part in defining how effective the team is. Too much, too little, or too wrong, and a high potential team will find itself hobbled.

The same applies to teams and teams-of-teams. Teams need clarity about their role within the larger organization. They also need common processes to facilitate working together in pursuit of common goals.

Popular software development methodologies provide the foundation for role and process clarity, with the “agile” family of methodologies being the de facto norm. These frameworks typically come with default role definitions (e.g. scrum master, product owner) as well as best practices around processes and communications. When applied correctly they can be powerful force multipliers for teams, but adopting agile is not a trivial exercise.  In addition, these frameworks only cover a portion of the clarity that’s needed.

Bonial’s Evolution

Bonial in 2014 was maturing as an agile development shop, but there were gaps in role definitions, team processes, and inter-team collaboration that suppressed the team’s potential. Fortunately Bonial has always had an abundance of kaizen – restlessness and a desire to always improve – so people were hungry to change. No-one was particularly happy with the status quo and there was a high willingness to invest in making things better.

We rolled up our sleeves and got started…

We attacked this challenge along multiple vectors. First, we needed a process methodology that would not only guide teams but also provide tools for inter-team coordination and portfolio management. The product and engineering leadership teams chose the Scaled Agile Framework (SAFe) as the over-arching team, program and portfolio management methodology. It was not the perfect framework for Bonial but it was good enough to start with and addressed many of the most pressing challenges.

Second, we spent time more clearly defining the various agile roles and moving responsibilities to the right people. We started with the very basics as broken down in the following table:

Area of Responsibility | Role Name (Stream / Team) | Notes
What? | Product Manager, Product Owner | Ensures that the team is “Doing the right things”
Who and When? | Engineering Manager, Team Lead | Ensures that the team is healthy and “Doing things right” while minimizing time to market
How? | Architect, Lead Developer | Ensures that the team has architectural context and runway and is managing tech debt

We created general role definitions for each position, purposely leaving space and flexibility for the people and teams to adapt as appropriate.  (I know many agile purists will feel their blood pressure going up after reading the table above, but I’m not a purist and this simplicity was effective in getting things started.) 

A quick side note here. One of the unintended consequences of any role definition is that they tend to create boxes around people. They become contracts where responsibilities not explicitly included are forbidden and the listed responsibilities become precious territory to guard and protect. I hate this, so I emphasized strongly that (a) role definitions are guidelines, not hard rules, and (b) the responsibility for mission success lies with the entire team, so it’s ok to be flexible so long as everything gets done.

Third, we augmented the team. We hired an experienced SAFe practitioner to lead our core value streams, organize and conduct training at all levels, and consult on best practices from team level scrum to enterprise level portfolio management. This was crucial; the classroom is a great place to get started, but it’s the day-to-day practice and reinforcement that makes you a pro.

Finally, we placed a lot of emphasis on retrospectives and flexibility. We learned and continually improved. We tried things, keeping those that succeeded and dropping those that failed. Over time, we evolved a methodology and framework that fit our size, culture and mission, eventually driving the massive increases in velocity and productivity that we see today.

Team Leads

There was one more big role definition gap that was causing a lot of confusion and that we needed to close: who takes care of the teams? While agile methodologies do a good job of defining the roles needed to get agile projects done, they don’t define roles needed to grow and run a healthy organization. For example, scrum has little to say regarding who hires, nurtures, mentors, and otherwise manages teams. Those functions are critical and need a clear home.

In Bonial engineering, we put these responsibilities on the “team lead” role. This role remains one of the most challenging, important and rewarding roles in Bonial’s engineering organization and includes the following responsibilities:

  • People
    • Recruiting
    • Personal development
    • Compensation administration
    • Morale and welfare
    • General management (e.g. vacation approvals)
    • Mentoring, counseling and, if needed, firing
  • Process
    • Effective lean practices
    • Efficient horizontal and vertical communications
    • Close collaboration with product owner (PO)
  • Technology
    • Architectural fitness (with support from the architecture team)
    • Operational SLAs and support (e.g. “On call”)
    • “Leading by example” – rolling up sleeves and helping out when appropriate
  • Delivery
    • Accountable for meeting OKRs
    • Responsible for efficient spend and cost tracking

That’s an imposing list of responsibilities, especially for a first-time manager. We’d be fools to thrust someone into this role with no support, so we start with an apprenticeship program – where possible, first-time leads shadow a more experienced lead for several months, only taking on individual responsibilities when they’re ready. We also train new leads in the fundamentals of leadership, management and agile, and each lead has active and engaged support from their manager and HR. Finally, we give them room to succeed, fail and learn.

So far this model has worked well. People tend to be nervous when first stepping into the role, but over time become more comfortable and thrive in their new responsibilities. The teams also appreciate this model. In fact, one of the downsides has been that it’s difficult to recruit into this role since it contains elements of traditional scrum master, team manager and engineering expert – a combination that is rare in the market. As such, we almost always promote into the role.

Closing Thoughts

In the end we know that no one methodology (or even a mashup of methodologies) will satisfy every contingency. To that end there are two important principles underpinning how we operate: flexibility and ownership. If something needs to be done, do it. It’s great if the person assigned a given role does a full and perfect job, but in the end success is everyone’s responsibility – if they can’t or won’t do something, that’s no excuse for the rest of the team.

Some closing thoughts:
• People need to understand their roles and the expectations put on them to be most effective.
• Teams need to have a unifying process to facilitate collaboration and avoid chaos and waste.
• The overarching goal is team success; all members of the team should have that as their core role description.
• Flexibility is key. Methodologies are a means to an end, not the ends themselves.

Accelerated Velocity: Enabling Independent Action

Inefficiency drives me crazy.  It’s like fingernails on a chalkboard.  When I’m the victim of an inefficient process, I can’t help but stew on the opportunity costs and become increasingly annoyed.  This sadly means I’m quite often annoyed, since inefficiency seems to be the natural rest state for most processes.

There are lots of reasons why inefficiency is the norm, but in general they fall into one of the following categories:

1) Poor process design

2) Poor process execution

3) Entropy and chance

4) External dependencies

The good news in software development is that Lean/agile best practices and reference implementations cover process design (#1).  Process execution (#2) can likewise be helped by hiring great people and following agile best practices.  Entropy (#3) can’t, by definition, be eliminated but the effects can be mitigated by addressing the others effectively.

Which leaves us with the bane of efficient processes and operations: dependencies (#4). 

Simply put, a dependency is anything that needs to happen outside of the process/project in question in order for the process/project to proceed or complete.  For example, a software project team may require an API from another team before it can finish its feature.  Likewise a release may require certification by an external QA team before going to production.  In both cases, the external dependency is the point where the process will likely get stuck or become a bottleneck, often with ripple effects extending throughout the system.  The more dependencies, the more chances for disruption and delay.
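The ripple effect can be sketched with a toy model: a project’s earliest finish time is driven by its longest dependency chain, so any slip in an upstream team’s deliverable pushes everything downstream. (The tasks and durations below are invented purely for illustration.)

```python
def finish_times(duration, deps):
    """Earliest finish per task = own duration + latest prerequisite finish."""
    memo = {}

    def finish(task):
        if task not in memo:
            memo[task] = duration[task] + max(
                (finish(d) for d in deps.get(task, [])), default=0
            )
        return memo[task]

    return {t: finish(t) for t in duration}

# The feature itself is 5 days of work, but it waits on another
# team's API (10 days), and the release waits on the feature.
duration = {"api": 10, "feature": 5, "release": 1}
deps = {"feature": ["api"], "release": ["feature"]}
print(finish_times(duration, deps))  # {'api': 10, 'feature': 15, 'release': 16}
```

Five days of actual work becomes a sixteen-day elapsed time – and every day the API slips is a day added to everything behind it.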

So how does one reduce the impact of dependencies?

The simplest way is to remove the dependencies altogether.  Start by forming teams that are self-contained, aligned behind the same mission, and ideally report to the same overall boss.  Take, for example, the age-old divisions between product, development, QA, and operations.  If these four groups report to different managers with different agendas, then the only reasonable outcome will be pain.  So make it go away!  Put them all on the same team. Get them focused on the same goals.  Give them all a stake in the overall success.

Second, distribute decision making and control.  Any central governance committee will be a chokepoint, and should only exist when (a) having a chokepoint is the goal, or (b) when the stakes are so high that there are literally no other options.  Otherwise push decision-making into the teams so that there is no wait time for decisions.  Senior management should provide overall strategic guidance and the teams should make tactical decisions.  (SAFe describes it well here.)

In 2014, Bonial carried a heavy burden of technical and organization dependencies and the result was near gridlock. 

At the time, engineering was divided into five teams (four development teams and one ops team), and each team had integrated QA and supporting ops.  So far, so good.  Unfortunately, the chokepoints in governance and the technical restrictions imposed by a shared, monolithic code-base effectively minimized independent action for most of the teams, resulting in one, large, inter-connected mega-team.

There was a mechanism known as “the roadmap committee” which was nominally responsible for product governance, but in practice it had little to do with the roadmap and more to do with selective project oversight.  One of the roadmap committee policies held that nothing larger than a couple of days’ work could be done without a blessing from this committee, so even relatively minor items languished in queues waiting for upcoming committee meetings.

What little did make it through the committee ran directly into the buzzsaw of the monolith.  Nearly all Bonial software logic was embedded in a single large executable called “Portal3”.  Every change to the monolith had to be coordinated with every other team to ensure no breakage.  Every release required a full regression test of every enterprise system, even for small changes on isolated components.   This resulted in a 3-4 day “release war-room” every two weeks that tied down both ops and the team unfortunate enough to be on duty.

It was painful.  It was slow.  Everyone hated it.

We started where we had to – on the monolith.  Efforts had been underway for a year or more to gradually move functionality off of the beast, but it became increasingly clear with each passing quarter that the “slow and steady” approach was not going to bear fruit in a timeframe relevant to mere mortals. So our lead architect, Al, and I decided on a brute force approach: we assembled a crack team which took a chainsaw to the codebase, broke it up into reasonably sized components, and then put each component back together. Hats off to the team that executed this project – wading through a spaghetti of code dependencies with the added burden of Grails was no pleasant task.  But in a few months they were done and the benefits were felt immediately.

The breakup of the monolith enabled the different teams to release independently, so we dropped the “integrated release” process and each team tested and released on their own.  The first couple of rounds were rough but we quickly hit our stride.  Overall velocity immediately improved upon removing the massive waste of the dependent codebase and labor-intensive releases.

The breakup of the monolith also untethered the various team roadmaps, so around this time we aligned teams fully behind discrete areas of the business (“value streams” in SAFe parlance). We pushed decision making into the teams/streams, which became largely responsible for the execution of their roadmap with guidance from the executive team.  The “roadmap committee” was disbanded and strategic planning was intensified around the quarterly planning cycle.   It was, and still is, during the planning days each quarter that we identify, review and try to mitigate the major dependencies between teams.  This visibility and awareness across all teams of the dependency risk is critical to managing the roadmap effectively.

Eventually we tried to take it to the next level – integrating online marketing and other go-to-market functions into vertically aligned product teams – but that didn’t go so well.  I’ll save that story for another day.

The breakup of the monolith and distribution of control probably had the biggest positive impact in unleashing the latent velocity of the teams.  The progress was visible.  As each quarter went by, I marveled at how much initiative the teams were showing and how this translated into increased motivation and velocity. 

To be sure, there were bumps and bruises along the way.  Some product and engineering leaders stepped up and some struggled.  Some teams adapted quickly and some resisted.  Several people left the team in part because this setup required far more initiative and ownership than they were comfortable with.  But in fairly short order this became the norm and our teams and leaders today would probably riot if I suggested going back to the old way of doing things.

Some closing thoughts:

  • Organize teams for self-sufficiency and minimal skill dependencies
  • Minimize or eliminate monoliths and shared ownership
  • Keep the interface as simple, generic and flexible as possible when implementing shared systems (e.g. APIs or backend business systems) 
  • Be transparent about dependencies and manage them closely

Accelerated Velocity: Growth Path

I recently heard that the average tenure of engineers in tech companies is less than two years.  If true, it’s a mind-boggling critique of the tech industry.  What’s wrong with companies that can’t retain people for more than a year or two?  Seriously – who wants to work for a team where people aren’t around long enough to banter about the second season of Westworld?

I know there are many factors in play, especially in hot tech markets, but there’s one totally avoidable fault that is all too common: being stupid with growth opportunities. 

Software engineering is one of those fields where skills often increase exponentially with time, especially early in a career.  Unfortunately, businesses seem loath to account for this growth in terms of new opportunities or increased compensation.   For example, companies set salaries at the time of hire, and that’s what the employee is stuck with for their tenure at the company – with the exception, perhaps, of an annual cost-of-living increase.  Meanwhile, the employee is gaining experience, adding to their skills portfolio, and generally compounding their market value.  Within a year or two the gap between their new market value and their actual compensation has grown quite large.  As most businesses shudder at the idea of giving large raises on a percentage basis, the gap continues to grow until the employee makes the rational decision to move to another company that will recognize their new market value – leaving the original company with an expensive gap in its workforce and a massive loss of knowledge capital.
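A back-of-the-envelope sketch makes the dynamic concrete (the salary figure and growth rates below are invented for illustration, not Bonial data):

```python
# A flat 2% cost-of-living raise vs. market value compounding at
# ~10%/year early in a career. Both start at the hiring salary.
salary, market = 60_000.0, 60_000.0
for year in range(1, 4):
    salary *= 1.02  # annual cost-of-living increase
    market *= 1.10  # skills (and market value) compounding
    print(f"year {year}: salary {salary:,.0f}, "
          f"market {market:,.0f}, gap {market - salary:,.0f}")
```

After just three years the gap is north of 16,000 – more than a quarter of the original salary – and it keeps widening every year the company fails to act.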

In addition, many companies take a highly individualist approach to compensation with a goal of getting maximum talent for the lowest price.  While this is textbook MBA, it fails in practice simply because it doesn’t take into account human psychology around relative inequality: when people feel they are not being treated fairly they get demotivated.  This purely free-market approach leads to a situation in which people doing the same work have massive disparities in compensation simply because some people are better negotiators than others.  The facts will eventually get out, leaving the person on the low end bitter and both people feeling like they can’t trust their own company.  This is a failing strategy in the long term.

This is what I’ve seen at most companies I’ve been in or around, and this was essentially the situation at Bonial in 2014.  There was a very high variance in compensation – on the extreme end we had cases in which developers were being paid half the salary of other developers on the same team despite similar experience and skills.  Salaries were also static – the contract salary didn’t change unless the employee mustered the courage to renegotiate the contract.  The negotiation sessions themselves were no treat for either the employee or their manager – in the absence of any framework they were essentially contests of wills, generally leaving both parties unsatisfied.

So we set out to develop a system that would facilitate a career path and maintain relative fairness across the organization.  We modeled it on a framework I’d developed previously, which works roughly as follows:

Basically, as a person gains experience (heading from bottom-left to top-right) they earn the chance to be promoted, which comes with higher compensation but also higher expectations.  They can also explore both technical specialist and management tracks as they become more senior, and even move back and forth between them.

The hallmarks of this system are:

    1. Systematic: Compensation is guided by domain skills – actual contributions to the business and market value – not by negotiation skills. 
    2. Fair: People at the same career/skill level will be compensated similarly.
    3. Regular: Conversation about career level and compensation happens at least once per year, initiated by the company. 
    4. Motivational: People have an understanding of what they need to demonstrate to be promoted. 
    5. Flexible: People have three avenues for increased compensation:
      • Raises – modest boosts in compensation for growth within their current career level based on solid performance.  This happens in between promotions.
      • Promotions – increases to compensation based on an employee qualifying for the next career level (with increased expectations and responsibilities).  This is where the big increases are and what everyone should be striving for.
      • Market increases – increases due to adjustment of the entire salary band based on an evaluation of the general market.

From a management perspective, this system also has some additional upsides:

  • Easy to budget.  Instead of planning with names and specific salaries, one can build a budget based on headcount of certain skill/levels. 
  • Easy to adjust.  If the team decides it needs a mobile developer or a test automator instead of a backend developer, for example, it simply trades one of its authorized positions for one of a similar value.  Likewise it can shift around seniority as needed to meet its goals.
  • Mechanism for feedback.  By reserving promotions and raises for the deserving contributors, this system provides an implicit feedback mechanism.
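The budgeting and position-trading mechanics above can be sketched with a simple band table (the levels and figures below are invented; they are not Bonial’s actual bands):

```python
# Hypothetical salary bands per career level: (low, high).
BANDS = {
    "junior":    (45_000, 55_000),
    "mid":       (55_000, 70_000),
    "senior":    (70_000, 90_000),
    "principal": (90_000, 110_000),
}

def band_midpoint(level):
    low, high = BANDS[level]
    return (low + high) / 2

def plan_budget(headcount):
    """Budget from headcount per level -- no names or individual salaries needed."""
    return sum(band_midpoint(level) * n for level, n in headcount.items())

# A team of 2 juniors, 3 mid-level and 1 senior engineer:
print(plan_budget({"junior": 2, "mid": 3, "senior": 1}))  # 367500.0
```

Trading a backend position for a mobile one at the same level leaves the budget untouched, which is exactly what makes the adjustments painless.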

So far the system seems to be working well at Bonial, measured as much by what isn’t happening as what is.  For example, people who have left the team seldom call out compensation as their primary motivator.  We’ve also had few complaints about people feeling they are not being paid fairly compared to their peers.  

As a side note, we conduct regular employee satisfaction surveys and ask how employees feel about their compensation.  Interestingly, their responses on their feeling about compensation vs market do not strongly correlate with their overall satisfaction.  What does correlate?  Their projects, the tech they work with, their growth opportunities, the competence of their team mates, and their leads.  So these are the areas we have and will continue to invest in.

Some closing thoughts:

  • Professionals want to know they are being compensated fairly both within the company and within the market.  That way they can focus on what they’re creating, not be worried about their pay.
  • Professionals want the opportunity to grow and to be recognized (and rewarded) for their growth.  Providing a growth path inside the company improves employee retention and reduces costs related to talent flight.
  • Compensation is an asymmetric demotivator.  Low or unfair compensation will demotivate, but overly high compensation isn’t generally a motivator.  So make sure you’re out of the “demotivating” range and then focus on key motivators, especially in the area of day-to-day satisfaction.

Accelerated Velocity: Situational Awareness

“If a product or system chokes and it’s not being monitored, will anyone notice?”  Unlike the classic thought experiment, this tech version has a clear answer: yes.  Users will notice, customers will notice, and eventually your whole business will notice. 

No-one wants their first sign of trouble to be customer complaints or a downturn in the business, so smart teams invest in developing “situational awareness.” What’s that?  Simple – situational awareness is the result of having access to the tools, data and information needed to understand and act on all of the moving factors relating to the “situation.”  This term is often used in the context of crisis situations or other fast-paced, high-risk endeavors, but it applies to business and network operations as well.

Product development teams most definitely need situational awareness.  The product managers and development leads need to know what their users are doing and how their systems are performing in order to make wise decisions – for example, should the next iteration focus on features, scale or stability.  Sadly, these same product teams often see the tracking and monitoring that is needed for developing situational awareness as “nice-to-have’s” or something to be added when the mythical “someday” arrives. 

The result?  Users having good or bad experiences and no-one knowing either way.  Product strategy decisions being made on individual bias, intuition and incomplete snippets of information.  Not good.

Sun Tzu put it succinctly:

“If you know neither the enemy nor yourself, you will succumb in every battle.”

Situational awareness is a huge topic, so in this series I’m going to limit my focus to data collection (tracking and monitoring) and insights (analytics and visualization) at the product team level.  For the purposes of this series I’ll define “tracking” as the data and tools that show what users/customers are doing and “monitoring” as the data and tools that focus on system stability and performance.  Likewise I’ll use “analytics” to refer to tools that facilitate the conversion of data into usable intelligence and “visualization” as the tools for making that intelligence available to the right people at the right time.  I’ll cover monitoring in this article and tracking in a later article.

At Bonial in 2014 there was a feeling that things were fine – the software systems seemed to be reasonably stable and the users appeared happy.  Revenue was strong and the few broad indicators viewed by management seemed healthy.  Why worry?   

From a system stability and product evolution perspective it turns out there was plenty of reason to worry.  While some system-level monitoring was in place, there was little visibility into application performance, product availability or user experience.  Likewise our behavioral tracking was essentially limited to billing events and aggregated results in Google Analytics.  Perhaps most concerning: one of the primary metrics we had for feature success or failure was app store ratings.  Hmmm.

I wasn’t comfortable with this state of affairs.  I decided to start improving situational awareness around system health so I worked with Julius, our head of operations, to lay out a plan of attack.  We already had Icinga running at the system level as well as DataDog and Site24x7 running on a few applications – but they didn’t consistently answer the most fundamental question: “are our users having a good experience?” 

So we took some simple steps like adding new data collectors at critical points in the application stack.  Since full situational awareness requires that the insights be available to the right people at the right time, we also installed large screens around the office that showed a realtime stream of the most important metrics.  And then we looked at them (a surprisingly challenging final step). 
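A data collector that answers “are our users having a good experience?” can start out very small. The sketch below (the endpoint, timeout and latency threshold are all made up for illustration) probes a critical API end-to-end, the way a user would experience it, rather than looking only at host-level metrics:

```python
import time
import urllib.request

def probe(url, timeout=5.0, slow_ms=1000):
    """Hit an endpoint and classify the user experience.

    Returns (status, latency_ms) where status is 'ok', 'slow' or 'down';
    'slow' and 'down' are what feed the alerting and the wall screens.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            latency_ms = (time.monotonic() - start) * 1000
            if resp.status >= 500:
                return "down", latency_ms
            return ("slow" if latency_ms > slow_ms else "ok"), latency_ms
    except Exception:
        # Connection refused, DNS failure, timeout: the user sees nothing.
        return "down", (time.monotonic() - start) * 1000
```

Run from outside your own network on a schedule, even a probe this simple will surface the outages and degradations that host-level CPU and disk checks miss entirely.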

The Bonial NOC Monitor Wall
One of my “go to” overviews of critical APIs, showing two significant problems during the previous day.

The initial results weren’t pretty.  With additional visibility we discovered that the system was experiencing frequent degradations and outages.  In addition, we were regularly killing our own systems by overloading them with massive online marketing campaigns (for which we coined the term: “Self Denial of Service” or SDoS).  Our users were definitely not having the experience we wanted to provide.

(A funny side note: with the advent of monitoring and transparency, people started to ask: “why has the system become so unstable?”)

We had no choice but to respond aggressively.  We set up more effective alerting schemes as well as processes for handling alerts and dealing with outages.  Over time, we essentially set up a network operations center (NOC) with the primary responsibility of monitoring the systems and responding immediately to issues.  Though exhausting for those in the NOC (thank you), it was incredibly effective.  Eventually we transferred responsibility for incident detection and response to the teams (“you build it you run it”) who then carried the torch forward.

Over the better part of the next year we invested enormous effort into triaging the immediate issues and then making design and architecture changes to fix the underlying problems.  This was very expensive as we tapped our best engineers for this mission.  But over time daily issues became weekly became monthly.  Disruptions became less frequent and planning could be done with reasonable confidence as to the availability of engineers.  Monitoring shifted from being an early warning system to a tool for continuous improvement. 

As the year went on the stable system freed up our engineers to work on new capabilities instead of responding to outages.  This in turn became a massive contributor to our accelerated velocity.  Subsequent years were much the same – with continued investment in both awareness and tools for response, we confidently set and measure aggressive SLAs.  Our regular investment in this area massively reduced disruption.  We would never have been able to get as fast as we are today had we not made this investment.

We’ve made a lot of progress in situational awareness around our systems, but we still have a long way to go.  Despite the painful journey we’ve taken, it boggles my mind that some of our teams still push monitoring and tracking down the priority list in favor of “going fast”.  And we still have blind spots in our monitoring and alerting that allow edge-case issues – some very painful – to remain undetected.  But we learn and get better every time.

Some closing thoughts:

  • Ensuring sufficient situational awareness must be your top priority.  Teams can’t fix problems that they don’t know about.
  • Monitoring is not an afterthought.  SLAs and associated monitoring should be a required non-functional requirement (NFR) for every feature and project.
  • Don’t allow pain to persist – if there’s a big problem, invest aggressively in fixing it now.  If you don’t you’ll just compound the problem and demoralize your team.
  • Lead by example.  Know the system better than anyone else on the team.

In case you’re interested, the workhorses of our monitoring suite include the tools mentioned above: Icinga, DataDog and Site24x7.

Accelerated Velocity: Building Great Teams

Note: this article is part 3 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend that you read the earlier parts so as to have the overall context.

People working in teams are at the heart of every company.  Great companies have great people working in high performing teams.  Companies without great people will find it very difficult to get exceptional results. 

The harsh reality is that there aren’t that many great people to go around.  This results in competition for top talent, which is especially true in tech.  Companies and organizations use diverse strategies in addressing this challenge.  Some use their considerable resources (e.g. cash) to buy top talent though with dubious results – think big corporations and Wall Street banks.  Some create environments that are very attractive to the type of people they’re looking for – think Google and Amazon.  Some purposely start with inexperienced but promising people and develop their own talent – a strategy used by the big consulting companies.  Many drop out of the race altogether and settle for average or worse (and then hire the consulting companies to try to solve their challenges with processes and technology – which is great for the consulting companies).

But attracting talent is only half the battle.  Companies that succeed in hiring solid performers then have to ensure their people are in a position to perform, and this brings us to their teams.  Teams have a massive amplifying effect on the quantity and quality of each individual’s output.  My gut tells me that the same person working on two different teams may be 2-3X as productive depending on the quality of the team. 

So no matter how good a company is at attracting top talent, it then needs to ensure that the talent operates in healthy teams. 

What is a healthy team?  From my experience it looks something like this:

  • Competent, motivated people who are…
  • Equipped to succeed and operate with…
  • High integrity and professionalism…
  • Aligned behind a mission / vision

That doesn’t seem too hard.  So why aren’t healthy teams the norm?  Simple: because they’re fragile.  If any of the above pieces are missing, the integrity of the team is at risk.  Throw in tolerance for low performers, arrogant assholes, and whiners, mix in some disrespect and fear, and the team is broken.

(Note that negative influences outweigh positive ones – as the proverb says: “One bad apple spoils the whole bushel.”  If you play sports you know this phenomenon well – a team full of solid players can easily be undone by a single weak link that disrupts the integrity of the team.)

This leads me to a few basic rules I follow when developing teams:

  1. Provide solid leadership
  2. Recruit selectively
  3. Invest in growth and development
  4. Break down barriers to getting and keeping good people
  5. Aggressively address low-performance and disruption

Bonial had a young team with a wide range of skill and experience in 2014.  Fortunately many of the team members had a bounty of raw talent and were motivated (or wanted to be).  Unfortunately there were also quite a few under-performers and some highly negative and disruptive personalities in the mix.  The combination of inexperience, underperformance and disruption had an amplifying downward effect on the teams.

To build confidence and start accelerating performance we needed to turn this situation around.  We started by counseling and, if behavior didn’t change, letting go of the most egregious low performers and disruptive people – not an easy thing to do, and somewhat frowned upon both in the company and in German culture.  But the cost of keeping them on the team, thereby neutralizing and demoralizing the high performers, was far higher than the pain and cost of letting them go.

(A quick side note: there were concerns among the management that letting low-performers go would demoralize the rest of the team.  Not surprisingly, quite the opposite happened – the teams were relieved to have the burdens lifted and were encouraged to know that their leads were committed to building high performing teams.)

We started doing a better job of mentoring people and setting clear performance goals.  Many thrived with guidance and coaching; some didn’t and we often mutually decided to part ways.  Over time the culture changed to where low performance and negativity were no longer tolerated.

At the same time we invested heavily in recruiting.  We hired dedicated internal recruiters specifically focussed on tech recruits.  We overhauled our recruiting and interview process to better screen for the talent, mentality and personality we needed.  We added rigor to our senior hiring practices, focussing more on assessing what the person can do vs what they say they can do.  And we added structure to the six month “probation” period, placing and enforcing gates throughout the process to ensure we’d hired the right people.  Finally, we learned the hard way that settling for mediocre candidates was not the path to success; it was far better to leave a position unfilled than to fill it with the wrong person.

How did we attract great candidates?  We focussed on our strengths and on attracting people who valued those attributes: opportunities for growth, freedom to make a substantial impact, competent team-mates, camaraderie, a culture of respect, and exposure to cutting-edge technologies.  Why these?  Because year over year, through employee satisfaction surveys and direct feedback, we found that these elements correlated very strongly with employee satisfaction, even more so than compensation and other benefits.  In short, we’ve worked hard to create an environment where our team-mates are excited to come to work every day.

(This is not to say we ignored competitive compensation; as I’ll describe in a later post, we also worked to ensure we paid a fair market salary and then provided a path for increasing compensation over time with experience.)

Over time, as our people became more experienced, our processes matured and our technology set became more advanced, Bonial became a great place for tech professionals to sharpen their skills and hone their craft.  New team members brought fresh ideas and at the same time had an opportunity to learn both from what we already had and from what they helped create.  The result is what we have today: a team of teams full of capable professionals who are together performing at a level many times higher than in 2014.

Some closing thoughts:

  • You’re only as good as the people on the teams.
  • Nurture and grow talented people. Help under-performers to perform. Let people go when necessary.
  • Get really good at recruiting.  Focus on what the candidate will do for you vs what they claim to have done in the past.
  • Don’t fall into the trap of believing process and tools are a substitute for good people.

Footnote: If you haven’t yet, I suggest you read about Google’s insightful research on team performance and how “psychological safety” is critical to developing high performing teams.

Accelerated Velocity: Building Leaders

Note: this article is part 2 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend that you read the earlier parts so as to have the overall context.

Positive changes require a guiding hand.  Sometimes this arises organically from a group of like-minded people, but far more often there’s a motivated individual driving the change.  In short – a leader.

Here’s the rub: the tech industry is notoriously deficient in developing leaders.  Too often the first step in a leader’s journey starts when their manager leaves and they’re blessed with a dubious promotion… “Congratulations, you’re in charge now.  Good luck.”  If they’re fortunate, their new boss is an experienced leader and has time to mentor them.  In a larger organization they may have access to some bland corporate training on how to host a meeting or write a project plan.  But the vast majority of people thrust into leadership and management roles in tech are largely left to their own devices to succeed.

Let me pause for a moment and highlight a subtle but important point: leadership and management are different skills.  Leadership is creating a vision and inspiring a group of people to go after the vision; management is organizing, equipping, and caring for people as well as taking care of the myriad details needed for the group to be successful.  There’s some overlap, and an effective leader or manager has competence in both areas, but they require different tools and a different mindset. This article is focussed on the leadership component.

So what does it take to develop competent and confident leaders?  When I look at some of the best-in-class “leadership-centric organizations” – militaries and large consulting companies for example – I see the following common elements:

  1. Heavy up-front investment in training
  2. High expectations of the leaders
  3. A reasonably structured environment in which to learn and grow
  4. A continuous cycle in which role models coach and mentor the next generation

How did this look at Bonial?

Upon arriving I inherited a 40-ish strong engineering organization broken up into five teams, each headed by a “team lead.” The problem was that these team leads had no clear mandate or role, little or no leadership and management training, and essentially no power to carry out a mandate even if they’d had one.

This setup was intended to keep the organization flat and centralize the administrative burden of managing people so as to allow the team leads to focus on delivery. Unfortunately this put the leads in a largely figurehead role – they represented their teams and were somehow responsible for performance but had few tools to employ and little experience with which to effectively deploy them. They didn’t hire their people, administer compensation or manage any budgets. In fact, they couldn’t even approve vacations.  To this day it’s not clear to me, or them, what authority or responsibility they had.

This arrangement also created a massive chokepoint at the center of the organization – no major decisions were made without approval from “above”. The results were demoralized leads and frustrated teams.

Changing this dynamic was my first priority.  To scale our organization we’d need to operate as a federation of semi-autonomous teams, not as a traditional hierarchical organization.  For this we needed leads who could drive the changes we’d make over the coming years, but this would require a major shift in mindset.  After all, if I couldn’t trust them to approve vacations, why should I trust them with driving ROI from the millions of euros we’d be investing in their teams?  Engineers have the potential to produce incredibly valuable solutions; ensuring they have solid leadership is the first and most important responsibility of senior management.

We started with establishing a clear scope of responsibility and building our toolbox of skills.  I asked the leads if they were willing to “own” the team results and, though a little nervous, most were willing.  This meant they would now make the calls as to who was on the team and how those people were managed. They took over recruiting and compensation administration. They played a much stronger role in ensuring the teams had clarity on their mission and how the teams executed the mission. They received budgets for training, team events and discretionary purchases. And, yes, they even took responsibility for approving vacations.

We agreed to align around the leader-leader model espoused by David Marquet (https://www.davidmarquet.com/) in his book “Turn the Ship Around!”  We read the book together and discussed the principles and how to apply them in daily practice.  The phrase “I intend to…” was baked into our vocabulary and mentality.  We eliminated top-down systems and learned to specify goals, not methods.  We focussed on achieving excellence, not just avoiding errors.  The list goes on.

I also started a “leadership roundtable” – 30 minutes each week where we’d meet in a small group and discuss experiences and best practices around core leadership and management topics: motivating people and teams, being effective, basic psychology, communicating, coaching and mentoring, discipline, recruiting, personal organizational skills, etc.  Over time, dozens of people – ranging from prospective team leads to product managers to people simply interested in leadership and management – participated in the roundtables, giving us a common foundation from which to work.

As I’ll share in a future article, we also created a career growth model that fully supported a management track as well as a technical track and, most importantly, the possibility to move back and forth freely between the two.  We encouraged people to give management a try and offered mentoring and support plus the risk-free option of switching back to their former role if they preferred.  In the early days this was a tough sell – “team lead” had the reputation of being mostly pain with little upside.  Nevertheless a few brave souls gave it a shot and, to their surprise, found it rewarding (and have since grown into fantastic leads).

It wasn’t easy – we had a fair share of mistakes, failures and redos – but the positive effects were felt almost immediately. Over time this first generation of leads grew their teams and created cultures of continuous improvement. As the teams grew, the original leads mentored new leaders to take over the new teams and the cycle continued. As it stands today we have a dozen or so teams/squads led by capable leaders that started as software engineers, quality assurance pros, system engineers, etc. 

For what it’s worth, I believe the number one factor driving Bonial’s accelerated velocity was growth in leadership maturity. If you’re looking to engineer positive change, start here.

Some closing thoughts:

  • Positive change requires strong leadership.
  • A single leader can start the change process, but large-scale and enduring change requires distributed leadership (e.g. ”leader-leader”).
  • Formal training can be a great source of leadership and management tools, but mastering those tools requires time, a safe and constructive environment and active coaching and mentoring.
  • Growing a leadership team is not a linear or a smooth process.  The person driving and guiding the development must commit to the long game and must be willing to accept accountability for inconsistent results from first generation leads as they learn their trade.

Read part 3: Building Great Teams

Accelerated Velocity: How Bonial Got Really Fast at Building Software

My boss, Max (Bonial Group’s CEO), and I sat down recently for a “year-in-review” during which we discussed the ups and downs of 2017 as well as goals for the new year.  In wrapping up the conversation, I shared with him my gut feeling that velocity and productivity had improved over the past couple of years and were higher than they’d ever been at Bonial – perhaps as much as double when compared to 2014.  

He asked if I could quantify the change, so on a frigid Sunday a couple of weeks ago I sat down with a mug of hot tea and our development records to see what I could do. We’ve used the same “product roadmap” format since 1Q14 (described here), which meant I could use a “points” type approach to quantify business value delivered during each quarter.  As I was looking for relative change over time and I was consistent in the application, I felt this was a decent proxy for velocity.  

It took me a couple of hours but was well worth the effort.  Once I’d finished scoring and tabulating, I was pleasantly surprised to find that I’d significantly underestimated the improvements we’d made.  Here’s a high level overview of the results:

7X Velocity! Bonial team size, value delivered and productivity over time.

The net-net is that in 1Q 2018 we’ll be delivering ~630% more business value than we delivered in the first quarter of 2014, largely driven by the fact that each person on the team is ~250% more productive.  
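As a rough sanity check, the implied team growth can be backed out of those two multiples.  The numbers below are just my reading of the figures above, not additional data:

```python
# ~630% more value → a 7.3x multiple; ~250% more productive → 3.5x per person
value_multiple = 1 + 6.3
productivity_multiple = 1 + 2.5

# Whatever per-person productivity doesn't explain must come from team growth
implied_team_growth = value_multiple / productivity_multiple
print(round(implied_team_growth, 1))  # → 2.1, i.e. the team roughly doubled
```

In other words, roughly twice the people delivering at roughly three and a half times the per-person rate compounds to the ~7X overall figure.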

Sweet.

The obvious next question: how did we do this?

The short answer is that there is no short answer.  There was no single magic button that we pushed to set us on this path to accelerated velocity; this was a long campaign that started small and grew, eventually spanning people, process, technology and culture.  Over time these learnings, improvements, changes and experiments – some large, some small, some successful, some not – built on each other and eventually created an environment in which the momentum sustained itself.  

Over the next few weeks I’ll summarize the major themes here in this blog for both myself as well as anyone who’s interested.  Along this journey I plan to cover (and will link when available):

  1. Building Leaders
  2. Building Great Teams
  3. Creating Situational Awareness
  4. Providing a Growth Path
  5. Enabling Independent Action
  6. Clarifying Processes and Key Roles
  7. Creating an Architecture Runway
  8. Optimizing the SDLC with DevOps
  9. Getting Uncomfortable
  10. Doing the Right Things
  11. Taking Ownership
  12. Building on What You’ve Got

Each of those topics could alone make for a small book, but I’ll try to keep the articles short and informative by focussing only on the most important elements.  If there’s an area in which you’d like me to dig deeper, let me know and I’ll see what I can do.  Assuming I get through all of those topics I’ll wrap things up with some final thoughts.

So let’s get started with part 2: Building Leaders

Special Forces Architecture

Architects scanning for serious design flaws

 

I’ve been spending some very enjoyable time recently with our architecture team working through some of the complexities that we’ll be facing in our next planning iteration.  Many of those topics make for interesting posts in their own right, but what I want to discuss in this post is the architecture team itself.  

Why?  Because I’m pretty happy with how we’ve evolved to “do” architecture here.  

And why is that noteworthy?  Because too many of the software architecture teams I’ve worked in, with or around have had operating models that sucked.  In the worst cases, the teams have been “ivory tower” prima donna chokepoints creating pretty diagrams with methodological purity and bestowing them upon the engineering teams.  At the other end of the scale, I’ve seen agile organizations run in a purely organic mode with little or no architectural influence until they ran up against a tangled knot of incompatible systems and technologies burdened with massive architectural debt.  And everything in between.

So, how do we “do” architecture at Bonial?  I think it helps to start with the big picture, so I brainstormed with Al Vilegas (our chief architect) and we came up with the following ten principles that we think clearly and concisely articulate what’s important when it comes to architecture teams:

  1. Architects/teams should think strategically but operate tactically.  They should think about future requirements and ensure there is a reasonable path to get there.  On the flip side, only just enough should be developed iteratively to meet the current requirements while leaving a reasonable runway. 
  2. Architects/teams should have deep domain expertise, deep technical expertise, and deep business context.   Yes, that’s a lot, but without all three it’s difficult to give smart guidance in the face of ambiguity – which is where architects need to shine.  It takes time to earn this experience and the battle scars that come with it; as such, I generally call BS when I hear about “architects” with only a few years of experience.
  3. Architects must be team players.  They should be confident but humble.  Arrogance has no place in architecture.  They should recognize that the engineering teams are their customers, not their servants and approach problems with a service-oriented mindset.  They should listen a lot.  
  4. Architects/teams should be flexible.  Because of their skills and potential for impact, they’ll be assigned to the most important and toughest projects, and those change on a regular basis.  
  5. Architects/teams should be independent and entrepreneurial.  They should always be on the lookout for and seize opportunities to add value.  They shouldn’t need much daily or weekly guidance outside of the mission goals and the existing/target enterprise architecture.  They should ask lots of questions and keep their finger on the pulse of the flow of technical information.
  6. Architects must practice Extreme Ownership.  They should embrace accountability for the end result and expect to be involved in projects from start to finish.  This means more often than not that they will operate as part of the specific team for the duration of the project.  They may also assist with the implementation, especially the most complex or most strategic elements.  “You architect it, you own it.”
  7. Architects/teams should be solid communicators.  They need to be able, through words, pictures and sometimes code, to communicate complex concepts in a manner that is understood by technical and non-technical people alike.  
  8. Architects/teams should be practical.  They need to be pragmatic and put the needs of the business above technical elegance or individual taste.  “Done is better than perfect.”
  9. Architects/teams should be mentors.  They should embrace the fact that they are not only building systems but also the next generation of senior engineers and architects.  
  10. Architects/teams must earn credibility and demonstrate influence.  An architect that has no impact is in the wrong role.  By doing the above this should come naturally.

If you take the principles above and squint a little bit, you’ll see more than a few analogs to how military special forces teams structure themselves and operate, as illustrated below: 

High-performing Architecture Teams | Military Special Forces Teams
Small | Small
Technical experts and domain specialists | Military experts and domain specialists
Extensive experience, gained by years of practice implementing, fixing and maintaining complex systems | Extensive experience, gained by years of learning through intense training and combat
Flexibly re-structure according to the mission | Flexibly re-structure according to the mission
High degree of autonomy under the canopy of the business goals and enterprise architecture | High degree of autonomy under the canopy of the mission and rules of engagement
Often join other teams to lead, support, mentor and/or be a force multiplier | Often embed with other units to lead, support, mentor and/or be a force multiplier
Accountable for the end results | Accountable for mission success

Hence the nickname I coined for this model (and the title of this post): “Special Forces Architecture.”  

How does this work in practice?  

At Bonial, our 120 person engineering team has two people with “architect” titles, but another half dozen or so are working in architecture roles and are considered part of the “architecture team.”  An even broader set of people, primarily senior engineers, regularly attends a weekly “architecture board” where we share plans and communicate changes to the architecture.  We recognize that almost everyone has a hand in developing and executing our architectural runway, so context is critical.  To paraphrase Forrest Gump: “Architect is as architect does,” so we try to be as expansive and inclusive as possible in communicating.

The members of the architecture team are usually attached to other teams to support major initiatives, but we re-assess this on a quarterly basis to make sure the right support is provided in the right areas.  In some cases, the architecture team itself self-organizes into a project team to develop a complex framework or evaluate a critical new capability.  

Obviously there’s a lot going on – we typically have 8-10 primary work streams each with multiple projects – so the core architecture team can’t be closely involved with every project.  To manage the focus, we use a scoring system from 1-5 (1 = aware, 3 = consulting, 5 = leading) for what level of support or involvement is provided to each team or initiative.  In all cases, the architects need to ensure there’s a “big picture” available (including runway) and communicated to all of the engineers responsible for implementing.
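To make that scoring concrete, here’s a minimal sketch of how such an involvement map might look; the initiative names and scores are hypothetical illustrations, not Bonial’s actual assignments:

```python
# 1 = aware, 3 = consulting, 5 = leading (hypothetical assignments)
involvement = {
    "content-platform": 5,
    "data-platform": 5,
    "user-model-poc": 3,
    "search-migration": 1,
}

# Initiatives where an architect is expected to lead day-to-day
leading = sorted(name for name, level in involvement.items() if level == 5)
print(leading)  # → ['content-platform', 'data-platform']
```

Reviewing a list like this quarterly makes it obvious where the team is spread too thin and where an initiative is running without enough architectural cover.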

For example, right now we have team members embedded in our critical new content management and publishing platform and our Kraken data platform.  We have one person working on a design and PoC updating the core user model.  Several members of the team are also designing and prototyping a new framework for managing machine learning algorithm lifecycles and testing.  And a few people have individual assignments to research or prepare for future runway topics.  In this way we expect to stay just far enough in front of the rest of the team to create runway without wasting time on phantom requirements.

Is this model perfect?  No.  But perfection isn’t our goal, so we optimize for the principles above – adaptability, autonomy, expertise, ownership, impact – and build around that.  Under this model, the Bonial platform has gone from a somewhat organic collection of monolithic apps running on a few dozen servers to a coherent set of domains consisting of APIs, micro-services, legacy systems and complex data stores running on hundreds of cloud instances across multiple continents.  I have my doubts that this would have happened in some of the more traditional models. 

I’m happy to answer questions about this model and talk about the good and the bad.  I’d also love to hear from you – what models have worked well in your experience?

Fear Factor: Guns vs Burgers

 

In my previous article I analyzed the statistics around terrorism vs gun deaths and found that, at least as of 2015,  Americans have a higher probability of dying at the hands of another American with a gun than a European has of being killed in a terror attack.  I also noted that the risk seemed to be inversely proportional to the fear.

Stepping back further, let’s look more broadly at how terrorism and gun deaths compare to other preventable causes of death:

At over 480,000 deaths per year, smoking dwarfs the deaths caused by either guns or terrorism in the US (even, it must be noted, when considering the ~3,000 deaths caused by the 9/11 attacks).  Obesity is rapidly overtaking smoking at 374,000 deaths per year and rising.  It seems that Americans should fear Big Macs and Marlboros far more than terrorists.

As with guns vs terror, I can’t help but note that the fear factor seems to be almost directly inverse to the risk – Americans seem to be terrified of terrorists, afraid of guns, and relatively indifferent to the rest.  Why so irrational?  I’m afraid that answering that question is likely out of the realm of data science and more in the realm of psychology or evolutionary biology.  It does call to mind, however, the argument made by Levitt and Dubner in Freakonomics when discussing the statistics around swimming pools vs guns.  If I recall correctly they use the term “dread” to describe the emotion that drives some irrational choices.  Perhaps the same thing is going on here – the idea of a truck slamming into a joyful Christmas market creates more dread than the somewhat abstract idea of dying from obesity or smoking.

One final observation about guns in America – people are horrified when mass shootings happen but as a society they choose to do nothing to prevent the next massacre.  This is in marked contrast to terrorism in which the same people are willing to spend billions of dollars, close the borders and sacrifice civil liberties to prevent the next terrorist attack.  It seems to me the dread factor is high in both cases, but the response is highly asymmetrical.  Perhaps a topic for another day…

Fear Factor: Guns vs Terrorism

I’ve been pretty quiet here recently – some intense projects and my travel schedule haven’t left me much time to write.  I do have a few half-written posts that I’ll try to finish up soon.  In the meantime, here’s a short series that veers a bit from pure technology and into the interconnected realm of data analytics and social sciences…

A few weeks ago I was talking to a family member in the U.S. (I’m a U.S. citizen currently living in Germany) and we were discussing the recent spate of weather and other natural disasters that were hammering the states. When we were done he said, “Well as crazy as it is here I’d take this any day over what you’re dealing with.”

I was a bit confused, and asked what disaster he was referring to. He clarified, “No, I mean all of the terrorists driving trucks into crowds and setting off bombs on trains and stuff.”

Ah, right. I’ve heard similar statements several times since I moved to Europe and never quite understood them – after all, while horrific, the sheer number of terror related deaths in either Europe or the U.S. is in the dozens or low hundreds; I was pretty confident that the probability of being a victim of a terrorist is far lower than many other forms of violent crime or preventable death. I replied, “You know, there are more gun deaths each day in the US than terrorism deaths in Europe every year. What you should be afraid of is walking out your door.”

Not surprisingly, we agreed to disagree and the conversation ended cordially. However, it got me thinking: Was I right that someone in Europe is less at risk from an Islamic (or other radical) terrorist than an American is from another American with a gun?  If not, why is the fear factor from terrorism so much greater than gun violence?

The first question sounded like a straightforward data analytics exercise, so I busted out a Jupyter notebook to explore, grabbed some data and challenged the hypothesis.

To analyze terrorism I chose the Global Terrorism Database (GTD), a very comprehensive collection of worldwide terrorism incidents over the last half century.  Gun violence datasets were harder to come by, in part due to successful lobbying by the National Rifle Association (NRA) to block government research on gun violence, so I chose to work with the Centers for Disease Control (CDC) Multiple Causes of Death dataset, which classifies all deaths in the US, including deaths by firearms.  The latest year in which the GTD and CDC datasets fully overlap is 2015, so I chose that as the year to focus on.

Terrorism 

Let’s start by looking at terrorism.  Worldwide, there was a significant spike in terrorism over the most recent decade, with the vast majority of the increase coming from the Middle East, Africa, and South Asia.

 

If we zoom into this decade and look only at the US and Western Europe, this is what we see:

Look at the Y axis on both of the above graphs – it’s clear that it’s much safer to be in Europe or the US than in many other parts of the world (two orders of magnitude safer).  While Europe has seen a relative spike in terrorism related deaths since the end of 2015, it also has roughly double the population of the US, so to get a better picture of how this compares to US deaths we need to look at deaths per million residents.  Here’s what we get:

2015 terror deaths EU: 171.0 total, or 0.23 per million residents
So in 2015 a European had roughly a 1 in 4,000,000 chance of dying in a terrorist attack. That sounds pretty small.  Just out of curiosity, I wonder how that compares to terrorist attacks on American soil:

2015 terror deaths US: 44.0 total, or 0.14 per million residents
I hate to write this because some knucklehead will quote it out of context, but on the surface Europeans had roughly twice the probability of being terror victims as Americans when adjusted for population (in 2015 at least).  But that’s like saying a person is twice as likely to be killed by a bear as by a shark – both numbers are so low that doubling either is still a low number.  (In fact, the odds of dying in a shark or bear attack aren’t too far off from the odds of death by a terrorist, but that’s for another article.)
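For the curious, the per-million arithmetic behind those figures is simple enough to sketch.  The population figures below are rough approximations I’m assuming, not values pulled from the datasets:

```python
def per_million(deaths, population):
    """Deaths per million residents."""
    return deaths / population * 1_000_000

# 2015 terror death totals, with assumed rough populations
eu_rate = per_million(171, 743_000_000)  # Europe: ~0.23 per million
us_rate = per_million(44, 323_000_000)   # US:     ~0.14 per million
print(round(eu_rate, 2), round(us_rate, 2))  # → 0.23 0.14
```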
Let’s look at the other side of the problem.

Gun Deaths in the U.S. (round 1)

Ok, how does that compare to the risk of dying from a gun in the US? Here’s a high-level breakdown of US gun deaths in 2015:

suicide     22060
homicide    13018
accident      489
other         284

The rough numbers/ratios above have been quoted quite a bit over recent years – roughly 35K gun deaths per year with ~1/3 homicides and ~2/3 suicides – so no big surprises there. 
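Those ratios can be checked directly from the counts above:

```python
# CDC 2015 gun-death counts quoted above
gun_deaths = {"suicide": 22060, "homicide": 13018, "accident": 489, "other": 284}

total = sum(gun_deaths.values())
shares = {cause: count / total for cause, count in gun_deaths.items()}
print(total)                         # → 35851, i.e. "roughly 35K"
print(round(shares["suicide"], 2))   # → 0.62, roughly 2/3
print(round(shares["homicide"], 2))  # → 0.36, roughly 1/3
```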
 
Since terror attacks are essentially homicides, let’s look at gun homicides per million so we can compare with the terrorist threat:

2015 gun homicides US: 13018 total, or 40.29 per million residents
So, at ~40 gun homicides per million residents, an American is ~175x more likely to die from a gun homicide in the US than a European is from a terrorist in Europe.  Hmm.
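The ~175x figure falls straight out of the two rates.  Again, the US population of ~323 million and EU population of ~743 million are my assumed round numbers:

```python
us_gun_homicide_rate = 13018 / 323_000_000 * 1e6  # ~40.3 per million
eu_terror_rate = 171 / 743_000_000 * 1e6          # ~0.23 per million

print(round(us_gun_homicide_rate / eu_terror_rate))  # → 175
```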
 
But… it could be argued that this isn’t a fair comparison.  I’ve heard several arguments that go something like this: “Terrorists tend to strike at random, killing innocent, unsuspecting victims.  U.S. gun violence mostly happens in places like Chicago, St Louis and Detroit and involves gangs and criminals.  In other words, U.S. gun violence is about ‘them’, and we’re not ‘them’.”
 
So how can we whittle the dataset down to “not them”?

Gun Deaths in the U.S. (round 2) 

Let’s see what we can find as we drill into the CDC data…

On an absolute basis, American men are ~6X more likely than women to be victims of gun violence, while on a percentage basis, men and women show a similar breakdown of intent, with suicide being the major contributor.

How about race?

The differences here are striking – blacks and hispanics are far more likely to die from homicide, while whites are overwhelmingly likely to take their own lives. To get a different perspective, let’s look at this on a percentage basis:

Again, some striking differences in intent between different racial groups.  (My gut tells me the homicide rate roughly correlates with average income level, but that’s an analysis for another day.)
 
Ok, maybe education plays a role, either directly or as a proxy for socio-economic status:

Again, a pretty strong correlation.

And now let’s look at age. Here are two views, one broken down by intent and the other by race:

(Note: the bump around 50 is due to a spike in white male suicide…  Remember, remember the month of Movember…)

While tragic, the suicides, accidents and undetermined-cause events aren’t relevant to this analysis, so we’ll exclude them to focus exclusively on homicides, and revisit the age vs. race graph in this light:

So, it appears that gun deaths skew heavily toward young black and hispanic males without college degrees. It feels wrong to remove men from the equation, since most of the comments I’ve heard relating to this hypothesis have come from men, so let’s filter on just the other dimensions and look at whites over 30 with college degrees:

2015 gun homicides in the US (white, over 30, college degree): 392 total, or 1.21 per million residents

So even this limited demographic is still ~5X more likely to die from a gun homicide in the US than a European is to die from a terrorist attack.
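The ~5X figure follows directly from the numbers above. The European terror-death rate here is an assumed ~0.23 per million (consistent with the “less than one in a million” figure for 2015), and, as elsewhere in this analysis, both rates divide by the full population of the respective region:

```python
narrowed_homicides = 392   # white, over 30, college degree (from the drill-down above)
us_population = 323e6      # assumed: approximate US population, 2015
eu_terror_rate = 0.23      # assumed: European terror deaths per million residents, 2015

narrowed_rate = narrowed_homicides / (us_population / 1e6)
print(f"{narrowed_rate:.2f} per million, ~{narrowed_rate / eu_terror_rate:.1f}x "
      "the European terror-death rate")
```

The narrowed rate comes out to ~1.2 per million, roughly 5x the assumed European figure.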

Conclusion

Ok, let’s review:

  • In 2015 a person in Europe had less than one in a million chance of being killed by a terrorist.
  • That same year, a person in the U.S. had a roughly 40 in a million chance of being killed by another American with a gun.

At this point I think I can be pretty confident that my original hypothesis is correct: an American is at much higher risk of being killed by another American with a gun than a European is of being killed by a terrorist.

In the course of exploring this data I have to admit I was surprised at some of the things I found and want to explore them further – for example:

  • What’s going on with terrorism in the rest of the world?
  • How does the casualty rate from guns and terrorism compare with other preventable deaths?
  • Why is the fear factor so out of proportion to the actual risks?

Stay tuned.

(If interested, you can look at the code for the analysis in this Kaggle kernel.)