Accelerated Velocity: Getting Uncomfortable

Note: this article is part 10 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend that you read the earlier parts so as to have the overall context.

“Confident.  Cocky.  Lazy.  Dead.” This admonishment against complacency was the mantra of Johnny “Dread” Wulgaru, the villain in Tad Williams’ Otherland saga.  As true as it is for assassins like Dread, it’s also true (though perhaps not as literally lethal) for teams and companies that choose to rest on their laurels and stop challenging themselves.

Complacency is the enemy of innovation.  This has been proven over and over throughout history, in every domain, as once-successful or dominant players suddenly found themselves lagging behind.  Complacency is also a leadership failure: good leaders strive to prevent comfort from becoming complacency.

Jeff Bezos at Amazon has baked this into the DNA of the company as enshrined in his “Day 1” message to shareholders.  Of note is what Mr. Bezos says happens when companies get comfortable on Day 2:

“Day 2 is stasis. Followed by irrelevance. Followed by excruciating, painful decline. Followed by death. And that is why it is always Day 1.”

Confident.  Cocky.  Lazy.  Dead.

It’s not easy, though.  Comfort is the reward for success after all.  “Don’t fix something that ain’t broke.”  Right?

Wrong.  The reward for success is being in position to hustle and build on that success.  Period. 

Getting Uncomfortable

It’s no different in software development teams. 

By 2016 we’d made significant strides in velocity and efficiency in Bonial product development.  Processes were in place.  An architecture roadmap existed.  Teams were healthy.  The monolith was (mostly) broken up.  AWS was being adopted.  All signs pointed to very successful changes having taken place. 

At the same time I felt a certain sense of complacency settling in.  The dramatic improvements over the past couple of years had some thinking that it was now “good enough.”  Yet we still had projects that ran off the rails and took far longer than they should have.  We still couldn’t embrace the idea of an MVP.  We still had mindsets that change was dangerous and scary.  And, perhaps most important, many had a belief that we were as fast as we could or needed to be. 

Yes, we were better and faster, but I knew we had only begun to tap our potential.  It was time to get uncomfortable.

Engineering Change

Changing a deep-seated mindset in a large organization using a head-on approach is tough.  An easier and often more effective approach is to engineer and successfully demonstrate change in smaller sub-groups and spread out from there.  Once people see what is possible or, better yet, experience it themselves, they tend to be quite open to change. 

So we looked for opportunities to challenge individual teams to “think different”.  For example, on several occasions the company needed an important feature insanely fast.  Rather than say no, we asked teams to work in “hackathon mode” – essentially, to do whatever it took to get something to market in a few days even if the final solution was wrapped in duct tape and hooked to life support.  Not surprisingly we usually delivered and the business benefitted massively.  Yes, we then had to spend time refactoring and hardening to make the solution really stable, but the feature was in the market, business was reaping the benefits and the teams were proud of delivering fast.

On another occasion we had a team that struggled with velocity due, in part, to lack of test automation and an over-reliance on manual testing.  So I challenged the team to deploy the next big feature with zero manual testing; they had to go to production with only automated tests.  This made them very uncomfortable.  I told them I had their backs if it didn’t work out – they only needed to give it their best shot.  To their credit, the team signed up for the challenge and the release went out on time and had no production bugs.  This dramatic success made a strong statement to the rest of the organization.

Paradigm Shift

We also took advantage of our new app ecosystem.  Over the past few years the company has started several new “incubation” initiatives to explore new possibilities and expand our product portfolio.  We didn’t want to do these initiatives with our core product development teams because (a) we didn’t want to be continually wrestling with questions as to whether to focus on the new or old products, and (b) we feared that doing things like the “core” teams would be too slow. 

So we spun out standalone teams with all of the resources needed to operate independently.  Not surprisingly these “startup” teams moved much faster than any of our core teams.  In part this was because they were not burdened by legacy systems, technical debt, and risk/exposure of making mistakes that affect millions of users. But I think the bigger part was sheer necessity.  We ran our incubation projects like mini startups – they received funding, a target and a timeline and they had to hit those targets (or at least show significant progress) in order to receive more funding.  As a result, the teams were intensely focussed on delivering MVPs as quickly as possible, measuring the results in the market, and pivoting if needed. 

Between 2015 and 2017 we ran three major incubation projects and each one was faster than the last.  The most recent, Fashional, went from funding to launch in less than 12 weeks, which included ramping up development teams in two countries, building web and native mobile apps and lining up initial partners and marketing launch events.  This proven ability to move fast made a strong statement to our other teams. 

We soon had “core” teams making adjustments and shifting their thinking.  Over the next few quarters, every team embraced a highly iterative, minimalistic approach to delivery that enabled us to try more things more quickly and, when needed, take more aggressive risks.  Now each team strives to deliver demonstrable, user-facing value every sprint.  Real value, not abstract progress.  Just like the agile book says.  This isn’t easy, but it is fundamental to driving minimalistic, iterative thinking.  The result is a dramatic improvement in velocity, and more fun along the way (success is fun). 

For sure this hasn’t been perfect.  Even today we still have teams that struggle to plan and deliver iteratively and we still have projects that take way too long.  On the flip side we have a much deeper culture of challenging ourselves, getting uncomfortable and continually improving. 

Confident.

Closing Thoughts

  • It’s easy to become complacent, especially after a period of success.  This is deadly.
  • Leaders must act to remove complacency and force themselves and their teams to “get uncomfortable” and push their own limits.
  • Break the problem into smaller chunks.  Work with entrepreneurial teams on initiatives that challenge the status quo.  Have them show the way.
  • Reward and celebrate success and make sure you have the team’s back.  Honor your commitments.

Accelerated Velocity: Creating an Architectural Runway

Note: this article is part 8 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend that you read the earlier parts so as to have the overall context.

Most startups are, by necessity and by design, minimalistic when it comes to feature development.  They build their delivery stack (web site or API), a few tools needed to manage delivery (control panel, CMS) and then race to market and scramble to meet customer requests.  Long term architecture thinking is often reduced to a few hasty sketches and technical debt mitigation is a luxury buried deep in the “someday” queue. 

At some point success catches up and the tech debt becomes really painful.  Engineers spend crazy amounts of time responding to production issues – time they could have used to develop new capabilities.  New features take longer and longer to implement.  The system collapses under new load.  At this point tweaks won’t save the day.  An enterprise architecture strategy and runway are needed.

What is an architecture runway?  In short, it’s a foundational set of capabilities, aligned to the big-picture architecture strategy, that enables rapid development of new features.  (SAFe describes it well here.)  In plain English – it’s investing in foundational capabilities so features come faster.

The anchor of the architecture runway is, of course, the architecture itself.   I’m not going to wade into the dogmatic debate about “what is software architecture”; rather, I’ll simply state that a good architecture creates and maintains order and adaptability within a complex system.  The architecture itself should be guided by a strategy and long-term view on how the enterprise architecture will evolve to meet the needs of the business in a changing market and tech-space.   

In developing an architecture strategy and runway, architects should start with the current state.  At the very least, create a simple diagram that gives everyone on the team context as to what pieces and parts are in the system and how they play together.  Once the “as is” architecture is identified and documented, the architects can roll up their sleeves and develop the “to be” picture, identify the gaps between the two states, and then develop a strategy for moving towards the “to be”.  The strategy can be divided into discrete epics / projects, and construction of the runway can begin.
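To make the “gaps to epics” step concrete, here’s a minimal, purely hypothetical sketch of how a gap analysis might be captured as structured data.  The domains and epics below are invented for illustration (loosely echoing examples later in this article), not a record of our actual analysis:

# Purely illustrative: a hypothetical gap analysis captured as data.
from collections import namedtuple

Gap = namedtuple("Gap", ["domain", "as_is", "to_be", "epic", "priority"])

gaps = [
    Gap("content publishing", "embedded in the monolith", "standalone service",
        "Extract a content publishing service", 1),
    Gap("tracking", "billing events only", "full behavioral data pipeline",
        "Overhaul tracking and data pipelines", 2),
    Gap("hosting", "self-managed servers", "AWS with managed services",
        "Migrate the platform to AWS", 3),
]

# The runway backlog is simply the prioritized list of gap-closing epics.
for gap in sorted(gaps, key=lambda g: g.priority):
    print(f"{gap.epic}: {gap.as_is} -> {gap.to_be}")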

Bonial’s Architecture Runway

Success had caught up to Bonial in 2014.  Given the alternative I think everyone would agree that that’s the right problem to have, but it was a problem nonetheless.  The majority of the software was packaged into a single, huge executable called “Portal3,” which contained all of the business logic for the web sites, mobile APIs, content publishing system and a couple dozen batch jobs.  There were a few ancillary systems for online marketing and some assorted scripts, but they were largely “rogue” projects which didn’t add to the overall enterprise coherence.  While this satisfied the immediate needs and had fueled impressive early growth and business success, it wasn’t ready for the next phase.

One of my first hires at Bonial was Al Villegas, an experienced technologist who I asked to focus on enterprise architecture.  He was a great fit as he had the right mix of broad systems perspective and a roll-up-his-sleeves / lead-from-the-front mentality.  He and I collaborated on big-picture “as-is” and “to-be” diagrams that highlighted the full spectrum of enterprise domains and showed clearly where we needed to invest going forward.   Fortunately we version and save the diagrams, so here are the originals:

Original 2014 “As Is” High Level Enterprise Architecture
Original “To Be” 2015 High Level Enterprise Architecture

These pictures served several purposes: (1) they gave us an anchor point for defining and prioritizing long-term platform initiatives, (2) they let us identify the domains that were misaligned, underserved or needed the most work, and (3) they gave every engineer additional context as they developed their solutions on a day-to-day basis.

Then the hard work started.  We would have loved to do everything at once, but given the realities of resource constraints and business imperatives we had to prioritize which runways to develop first.  As described in other articles of this series, we focussed early on our monitoring frameworks and breaking up the monolith.  In parallel we also started a multi-phase, long-term initiative to overhaul our tracking architecture and data pipelines.  Later we moved our software and data platforms to AWS in phases and adopted relevant AWS IaaS and SaaS capabilities, often modifying or greatly simplifying elements of the architecture in the process.  Across the span of this period, we continually refined and improved our APIs, moving to a REST-based, event-driven micro-services model from the dedicated/custom approach previously used. We also invested in an SDLC runway, building tools on top of the already mature devops capabilities to further accelerate the development process. 

The end result is a massive acceleration effect.  For example, we recently implemented a first release of a complex new feature involving sophisticated machine-learning personalization algorithms, new APIs and major UI changes across iOS, Android and web.  The implementation phase was knocked out in a couple of sprints.  How?  In part because the cross-functional team had available a rich toolbox of capabilities that had been laid down as part of the architecture runway: REST APIs, a flexible new content publishing system, a massive data-lake with realtime streaming, a powerful SDLC / staging system that made spinning up new production systems easy, etc.  The absence of any of these capabilities would have added immensely to the timeline.

The architecture continues to evolve.  We’ve recently added realtime machine learning and AI capabilities as well as integrations with a number of external partners, both of which have extended the architecture and brought both new capabilities and new (and welcome) challenges.  We are continually updating the “as is” picture, adapting the architecture strategy to match the needs of the business, and investing into new runway.

And the cycle continues.

Closing Thoughts

  • It’s fine for companies to start with a simple, single solution – it’s important to live to fight another day.  But eventually you’ll need a defined architecture and runway.
  • Start with a “big picture” to give everyone context and drill down from there.
  • Don’t forget the business systems: sales force automation, order management, CRM, billing, etc.  As much as everyone likes to focus on product delivery, it’s the enterprise systems that run the business.
  • Create a long-term architectural vision to help guide the big, long-term investments.

Accelerated Velocity: Situational Awareness

Note: this article is part 4 of a series called Accelerated Velocity.  This part can be read stand-alone, but I recommend that you read the earlier parts so as to have the overall context.

“If a product or system chokes and it’s not being monitored, will anyone notice?”  Unlike the classic thought experiment, this tech version has a clear answer: yes.  Users will notice, customers will notice, and eventually your whole business will notice.

No-one wants their first sign of trouble to be customer complaints or a downturn in the business, so smart teams invest in developing “situational awareness.” What’s that?  Simple – situational awareness is the result of having access to the tools, data and information needed to understand and act on all of the moving factors relating to the “situation.”  This term is often used in the context of crisis situations or other fast-paced, high-risk endeavors, but it applies to business and network operations as well.

Product development teams most definitely need situational awareness.  The product managers and development leads need to know what their users are doing and how their systems are performing in order to make wise decisions – for example, whether the next iteration should focus on features, scale or stability.  Sadly, these same product teams often see the tracking and monitoring needed for developing situational awareness as “nice-to-haves” or something to be added when the mythical “someday” arrives. 

The result?  Users having good or bad experiences and no-one knowing either way.  Product strategy decisions being made on individual bias, intuition and incomplete snippets of information.  Not good.

Sun Tzu put it succinctly:

“If you know neither the enemy nor yourself, you will succumb in every battle.”

Situational awareness is a huge topic, so in this series I’m going to limit my focus to data collection (tracking and monitoring) and insights (analytics and visualization) at the product team level.  For the purposes of this series I’ll define “tracking” as the data and tools that show what users/customers are doing and “monitoring” as the data and tools that focus on system stability and performance.  Likewise I’ll use “analytics” to refer to tools that facilitate the conversion of data into usable intelligence and “visualization” as the tools for making that intelligence available to the right people at the right time.  I’ll cover monitoring in this article and tracking in a later article.

At Bonial in 2014 there was a feeling that things were fine – the software systems seemed to be reasonably stable and the users appeared happy.  Revenue was strong and the few broad indicators viewed by management seemed healthy.  Why worry?   

From a system stability and product evolution perspective it turns out there was plenty of reason to worry.  While some system-level monitoring was in place, there was little visibility into application performance, product availability or user experience.  Likewise our behavioral tracking was essentially limited to billing events and aggregated results in Google Analytics.  Perhaps most concerning: one of the primary metrics we had for feature success or failure was app store ratings.  Hmmm.

I wasn’t comfortable with this state of affairs.  I decided to start improving situational awareness around system health so I worked with Julius, our head of operations, to lay out a plan of attack.  We already had Icinga running at the system level as well as DataDog and Site24x7 running on a few applications – but they didn’t consistently answer the most fundamental question: “are our users having a good experience?” 

So we took some simple steps like adding new data collectors at critical points in the application stack.  Since full situational awareness requires that the insights be available to the right people at the right time, we also installed large screens around the office that showed a realtime stream of the most important metrics.  And then we looked at them (a surprisingly challenging final step). 
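For a flavor of what “data collectors at critical points” can look like, here’s a minimal sketch of a synthetic probe that asks the fundamental question directly.  The endpoint names are invented and the results are just printed; in practice this kind of check fed tools like Icinga and DataDog rather than stdout:

# Minimal sketch of a synthetic "are users having a good experience?" probe.
# Endpoint names are hypothetical; results would normally go to a metrics backend.
import time
import requests

CRITICAL_ENDPOINTS = {
    "brochure_api": "https://api.example.com/brochures",
    "search_api": "https://api.example.com/search?q=aldi",
}

def probe(name, url, timeout=5.0):
    start = time.time()
    try:
        response = requests.get(url, timeout=timeout)
        ok = response.status_code == 200
    except requests.RequestException:
        ok = False
    latency_ms = (time.time() - start) * 1000
    print(f"{name} ok={ok} latency_ms={latency_ms:.0f}")
    return ok, latency_ms

if __name__ == "__main__":
    for name, url in CRITICAL_ENDPOINTS.items():
        probe(name, url)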

The Bonial NOC Monitor Wall
One of my “go to” overviews of critical APIs, showing two significant problems during the previous day.

The initial results weren’t pretty.  With additional visibility we discovered that the system was experiencing frequent degradations and outages.  In addition, we were regularly killing our own systems by overloading them with massive online marketing campaigns (for which we coined the term: “Self Denial of Service” or SDoS).  Our users were definitely not having the experience we wanted to provide.

(A funny side note: with the advent of monitoring and transparency, people started to ask: “why has the system become so unstable?”)

We had no choice but to respond aggressively.  We set up more effective alerting schemes as well as processes for handling alerts and dealing with outages.  Over time, we essentially set up a network operations center (NOC) with the primary responsibility of monitoring the systems and responding immediately to issues.  Though exhausting for those in the NOC (thank you), it was incredibly effective.  Eventually we transferred responsibility for incident detection and response to the teams (“you build it you run it”) who then carried the torch forward.

Over the better part of the next year we invested enormous effort into triaging the immediate issues and then making design and architecture changes to fix the underlying problems.  This was very expensive as we tapped our best engineers for this mission.  But over time daily issues became weekly became monthly.  Disruptions became less frequent and planning could be done with reasonable confidence as to the availability of engineers.  Monitoring shifted from being an early warning system to a tool for continuous improvement. 

As the year went on, the stable system freed up our engineers to work on new capabilities instead of responding to outages.  This in turn became a massive contributor to our accelerated velocity.  Subsequent years were much the same – with continued investment in both awareness and tools for response, we confidently set and measured aggressive SLAs.  Our regular investment in this area massively reduced disruption.  We would never have been able to get as fast as we are today had we not made this investment.

We’ve made a lot of progress in situational awareness around our systems, but we still have a long way to go.  Despite the painful journey we’ve taken, it boggles my mind that some of our teams still push monitoring and tracking down the priority list in favor of “going fast”.  And we still have blind spots in our monitoring and alerting that allow edge-case issues – some very painful – to remain undetected.  But we learn and get better every time.

Some closing thoughts:

  • Ensuring sufficient situational awareness must be your top priority.  Teams can’t fix problems that they don’t know about.
  • Monitoring is not an afterthought.  SLAs and associated monitoring should be a required non-functional requirement (NFR) for every feature and project.
  • Don’t allow pain to persist – if there’s a big problem, invest aggressively in fixing it now.  If you don’t you’ll just compound the problem and demoralize your team.
  • Lead by example.  Know the system better than anyone else on the team.


Accelerated Velocity: How Bonial Got Really Fast at Building Software

My boss, Max (Bonial Group’s CEO), and I sat down recently for a “year-in-review” during which we discussed the ups and downs of 2017 as well as goals for the new year.  In wrapping up the conversation, I shared with him my gut feeling that velocity and productivity had improved over the past couple of years and were higher than they’d ever been at Bonial – perhaps as much as double when compared to 2014.  

He asked if I could quantify the change, so on a frigid Sunday a couple of weeks ago I sat down with a mug of hot tea and our development records to see what I could do.  We’ve used the same “product roadmap” format since 1Q14 (described here), which meant I could use a “points”-style approach to quantify the business value delivered during each quarter.  Since I was looking for relative change over time and applied the scoring consistently, I felt this was a decent proxy for velocity.  
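For those curious about the mechanics, the tally boils down to something like this sketch – the per-item scores below are made up, since only the relative comparison between quarters matters:

# Rough sketch of the tally with made-up numbers: each roadmap item gets a
# business-value "points" score, and quarters are compared relative to 1Q14.
roadmap_scores = {
    "1Q14": [3, 5, 2, 8],
    "1Q16": [5, 8, 8, 3, 13, 5],
    "1Q18": [13, 8, 20, 8, 13, 20, 13],
}

baseline = sum(roadmap_scores["1Q14"])
for quarter, scores in roadmap_scores.items():
    delivered = sum(scores)
    print(f"{quarter}: {delivered} points, {delivered / baseline:.1f}x vs 1Q14")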

It took me a couple of hours but was well worth the effort.  Once I’d finished scoring and tabulating, I was pleasantly surprised to find that I’d significantly underestimated the improvements we’d made.  Here’s a high level overview of the results:

7X Velocity! Bonial team size, value delivered and productivity over time.

The net-net is that in 1Q 2018 we’ll be delivering ~630% more business value than we delivered in the first quarter of 2014, largely driven by the fact that each person on the team is ~250% more productive (the remainder comes from the team itself roughly doubling in size over the same period).  

Sweet.

The obvious next question: how did we do this?

The short answer is that there is no short answer.  There was no single magic button that we pushed to set us on this path to accelerated velocity; this was a long campaign that started small and grew, eventually spanning people, process, technology and culture.  Over time these learnings, improvements, changes and experiments – some large, some small, some successful, some not – built on each other and eventually created an environment in which the momentum sustained itself.  

Over the next few weeks I’ll summarize the major themes here in this blog, for myself as well as anyone who’s interested.  Along this journey I plan to cover (and will link when available):

  1. Building Leaders
  2. Building Great Teams
  3. Creating Situational Awareness
  4. Providing a Growth Path
  5. Enabling Independent Action
  6. Clarifying Processes and Key Roles
  7. Creating an Architecture Runway
  8. Optimizing the SDLC with DevOps
  9. Getting Uncomfortable
  10. Doing the Right Things

Each of those topics could alone make for a small book, but I’ll try to keep the articles short and informative by focussing only on the most important elements.  If there’s an area in which you’d like me to dig deeper, let me know and I’ll see what I can do.  Assuming I get through all of those topics I’ll wrap things up with some final thoughts.

So let’s get started with part 2: Building Leaders

What I learned from writing an AI voice assistant and chat bot

I have a confession: despite being in management I still love to code.  Since I don’t get to program as much as I’d like or stay up on the latest trends and technologies, I set a goal for myself to learn at least one new technology every year (and more than one on a good year).  This learning hobby is how I made the leap from back-end to full-stack developer, how I learned iOS and Android, and how I stepped into the hallowed halls of Data Science.

This year I decided to explore chat bots and voice assistants.  As I learn best by doing, I generally think up a fun or useful project and then learn through building it.  For this project I decided to tackle an unending source of stress in my household: bickering and arguing over screen time for our kids.  

Enter ChronosBot

The idea behind ChronosBot is simple.  Parents set up screentime accounts for each child as well as an automatic allowance that puts time in the accounts.  After linking their account to Alexa, Google Assistant, Facebook Messenger, etc., they can say or write things like, “Alexa, ask ChronosBot to withdraw 30 minutes from Axel’s account” or “… what’s everyone’s balance?”

With the idea in place, I had to choose my tech stack.  Google has a robust platform built on API.AI.  API.AI supports a dozen or so chat integrations (Allo, Messenger, Telegram, Kik, etc.) as well as a voice interface for Google Home, allowing developers to (theoretically) write one interface for both voice and chat.  At the time I started, Amazon Alexa had a rudimentary platform for speech dialog development using structured text.  In both platforms the interface designer creates “intents” that match what the user says to something the bot can do and then provides appropriate responses, and both platforms hand off the business logic to a backend app using web hooks.

For the backend, I decided to sharpen my python skills and implement in Django on top of Postgres.  For deployment I decided to give Heroku a try.  
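To give a sense of how the webhook side hangs together, here’s a heavily stripped-down sketch of a fulfillment endpoint in Django.  The intent names, parameters and canned responses are invented for illustration and aren’t ChronosBot’s real code; the request/response fields follow the API.AI-style webhook format:

# Stripped-down sketch of a fulfillment webhook; intent names, parameters
# and responses are invented for illustration, not ChronosBot's actual code.
import json
from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def webhook(request):
    payload = json.loads(request.body)
    intent = payload.get("result", {}).get("action")         # API.AI-style field
    params = payload.get("result", {}).get("parameters", {})

    if intent == "withdraw_time":
        minutes = int(params.get("minutes", 0))
        child = params.get("child", "someone")
        speech = f"Okay, I withdrew {minutes} minutes from {child}'s account."
    elif intent == "get_balance":
        speech = "Axel has 45 minutes left."                  # would really query the DB
    else:
        speech = "Sorry, I didn't understand that."

    # API.AI expects "speech"/"displayText"; Alexa uses a different envelope.
    return JsonResponse({"speech": speech, "displayText": speech})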

Development of the basic use cases took me a couple of weeks in the late evenings and on weekends.  I submitted to both Amazon and Google and waited for a week or so in each case for the review.  Both rejected my app, but for reasons that I hadn’t expected.  Amazon told me that my app violated the Alexa terms and conditions because it “targeted children” (huh?) and told me to not resubmit the app ever again (seems they relented).  Google gave me the boot because my invocation name couldn’t be recognized properly, but a very helpful person from Google worked with me to resolve the issue and now it’s live.  

I’ve since continued development and added new features like “rewards” and “penalties” (requested by my wife) and “mystery bonus” (requested by the kids).  I’ve enabled Telegram and Messenger and have adapted the platform to support both visual and audio surfaces.  And the Alexa version was finally approved earlier this week.

Lessons Learned

So, what have I learned while navigating the ins and outs of the Google and Alexa development platform and publication process? 

1)  Amazon and Google have very different approaches.  Google has taken the bold approach of enabling all community developed actions and using an intent matching algorithm to route users to the correct action.  Amazon requires users to enable specific skills via a Skill Store.  In both cases, discovery is a largely unsolved challenge.

2)  Too early to tell who will be king.  Amazon Alexa has a crazy head start, but Google seems to have a more robust speech development platform.   With a zillion Android devices already on the market one certainly can’t count them out.  On the other hand, not a month seems to go by without a new Alexa form factor hitting the market.  

3)  It’s early days.  Both platforms are being developed at a lightning fast pace.  Google had a big head start with API.AI.  The original Alexa interface was frustratingly primitive, but they’ve since upgraded to a new UI (which suspiciously bears a strong resemblance to API.AI) that has great promise.  

I have to take my hat off to both companies for creating a paradigm and ecosystem that makes voice assistant and natural language development accessible to the broader development community.  It’s so straightforward that even my kids gave it a try – my daughter (10) developed “The Oracle,” which answers deeply profound questions like “Who’s awesome?” (she’s awesome).  My son (12) wrote a math quiz game with which he is happy to challenge anyone to beat his top score. 

4)  Conversational UX is easy; good conversational UX is really hard.  I’ve known this since I was involved with Nuance and the voice web in the late 1990’s (and I also happen to be married to an expert in the space).  Making it easy to build a conversational UX is a very different thing than helping developers build a high quality conversational UX (especially a Voice UX).  Both Amazon and Google have tried to address this with volumes of best practice documentation, but I expect most developers will ignore it.

5)  Conversational UX is limited.  There are some use cases that work for serial interactions (voice or chat) and some that work better in parallel interactions (visual).  Trying to force one into the other typically doesn’t make sense or only applies to “desperate users”.  You see the effect of this to some degree already in the Alexa Skill Store – there are some clear clusters evolving (home automation, information retrieval, quiz games).

6)  Multi-modal UX is the next natural step.  I’m very excited about the Amazon Echo Show as I expect that will unleash a wave of interesting multi-modal interaction paradigms.  

7)  It’s fun.  There’s just something about the natural language element of voice assistants that allows for a richer, more human interaction than what GUIs can provide.  

All in all I’m really excited about the potential of this space, and I’m not alone – just look at the growth of the Alexa Skills Store.  The tech press is also taking a critical look at these capabilities (e.g. a recent article featuring yours truly) and I expect most companies are at least thinking about how these capabilities will play in their business.  My company, Bonial, is investing in several actions/skills to explore the potential of voice and chat interfaces.  To date we’ve already launched a bot that allows users to search for local deals and will shortly launch a voice assistant interface to our shopping list app, Out of Milk.  We’ve learned a lot and we’ll share more on those projects in other posts.  

The Micro-service Conundrum

 

Micro-services have been the rage in software circles over the past couple of years.  A natural evolution of service oriented architectures (SOA), and popularized by successful implementations at companies like Spotify, Soundcloud and many others, micro-services have become the “must have gadget this holiday season”: if you aren’t doing them, you must be doing something wrong.  

But is that true?  As much as people (and especially engineers) love black and white, the answer here is a firm “maybe.”  Here are some of the positives and negatives from one CTO’s perspective.

On the plus side, micro-service architectures provide an excellent canvas for rapid development and continuous integration.  Hard dependencies are minimized, business logic is localized, and the resulting services are typically cloud ready.  Developers tend to like micro-services because they allow for a great deal of independence.  It’s hard to overstate the potential pain savings and optimizations – people, process and technology – that can be driven by moving to this type of architecture.

But it doesn’t come for free.  For starters, you’ll likely have a lot more moving pieces in terms of individual components and running executables.  A few weeks ago I wrote a post on the architectural heuristic: Simplify Simplify Simplify in which I posited that simple is better when it comes to minimizing TCO.  In that vein, one must ask if micro-services follow the rule.  Yes, each individual service itself is simpler than a bloated monolith as a result of the small size and tight boundaries.  But the total business logic in your enterprise hasn’t changed, and now you may have hundreds or thousands of additional code modules to manage and executables to orchestrate.  The good news is that cloud hosting providers like AWS provide an ever increasing set of tools to help with managing micro-service architectures (e.g. Lambda, Container Services), but it still requires a good deal of cultural and process change.

Another side effect of the proliferation of executables is a potential increase in cost – many hosting providers and software vendors (e.g. APM providers) still price based on the number of processes or agents.  If you take the same processing load and 10X the number of running processes, you might find yourself in a world of hurt pretty quickly.

Finally, in moving to micro-services, you’ll find yourself needing to address a host of new challenges that you may not have faced previously – service discovery, versioning, transactions and eventual consistency, event tracing, security, etc.  At a minimum, the upside benefits you realize will be partially offset by the cost of developing the competency and code to solve these new challenges.

So, what does this mean for the typical company?  If you have applications that are bloated monoliths, those are fantastic candidates for breaking down into smaller components or micro-services.  On the other hand, if you have a reasonably well architected system with decent boundaries in place already, I’d carefully weigh the cost-benefits – maybe run a few trial projects to get a better sense of how it would fit into your platform.  Just realize that in many ways you’re “squeezing the balloon” – trading one set of problems for another.  So long as you’re happier with the new problems (and the corresponding benefits), you win.

In closing, whether you move to micro-services or not, I do think there are great lessons to be learned from applying the discipline required by micro-services – namely, enforcing clear boundaries around business logic and using “API thinking” to service a variety of clients.  I wonder if there isn’t a compromise to be had in which one uses the principles for developing and organizing the code, but you still deploy in a more constrained manner – “Code Micro, Deploy Macro.”  But that’s a discussion for another time. 

Conversations with Amazon Alexa

(Warning: this article will delve into technical design and code topics – if you’re not in the mood to geek-out you might want to skip this one.)
 
I’m excited about Alexa and its siblings in the voice assistant space – the conversational hands-free model will facilitate “micro moment” interactions to a degree that even mobile apps couldn’t achieve.  These new apps and interactions can be quite powerful, but as the saying goes – “with great power comes great responsibility.”  In this case the responsibility is to build voice interfaces that don’t suck, and that’s not trivial.  We’ve all used bank or airline automated systems that have infuriated us by being confusing, wasting our time, or leaving us stuck in “IVR hell,” unable to be understood or to get where we want to go.
 
Fortunately there are solutions.  First, there is a UX specialty known as Voice User Interface Design (VUI Design) whose practitioners are highly skilled in the art, science, psychology, sociology and linguistics required to craft quality speech interactions.  Unfortunately they are rare and will likely be in extremely high demand as voice assistant skills blossom.
 
Second, there are online frameworks for developing speech interactions that fill much the same role as bumpers at the bowling alley – they won’t make you a better bowler, but they’ll protect you from some of the most egregious mistakes.  Perhaps the best tool on the market today is API.AI, which is primarily a natural language interpretation engine that can be the brains behind a variety of conversation interfaces – chat bots like Facebook Messenger and Telegram, voice assistants like Google Home, etc.
 
The Alexa Skills Kit (ASK) also comes with an online tool for developing interactions, but it’s quite primitive and cumbersome to use for anything but the simplest of skills.  Probably the biggest gap in the ASK is the lack of support for “slot filling”.  Slot filling is what speech interfaces do when they don’t get all the info needed to complete a task.  For example, let’s say you’re developing a movie ticket purchase skill.  In a perfect world every user would properly say, “I’d like two adult tickets to the 5:00 PM showing of Star Wars today.”  Given that our users will be rude and not behave the way we want them to, it’s likely they’ll say something like, “I want two tickets to Star Wars.”  It’s our skill’s responsibility to discover the [ ticket type ], [ showtime ], and [ show date ].  Our skill would likely next ask the user: “How many tickets do you want to buy?” and so on.  That’s slot filling.
 
Alexa provides no native tools for managing slot filling, so it’s left to the developer to implement the functionality on their own service (which Alexa calls via “web hooks”).  Here’s an approach we use here at Bonial:
 
  • Create a Conversation object (AlexaConversation) that encapsulates the current state of the dialog and the business logic for determining next steps.  The constructor takes the request model from Alexa, which includes a “Session” context.   Conversations expose three methods:
    1. get_status() – whether the current dialog is complete or not
    2. get_next_slot() – if the dialog is not complete, which slot needs to be filled next
    3. get_session_context() – the new session context JSON to be sent back to Alexa (and then returned to the app on the next call) – basically the dialog state
from abc import ABCMeta, abstractmethod


class Conversation(metaclass=ABCMeta):

    # pass in the underlying model or data needed to assess the current
    # state of the dialog (for Alexa, the request JSON including the session)
    def __init__(self, model):
        self.model = model
        self.status = None
        self.type = None

    @abstractmethod
    def get_status(self):
        """Whether the current dialog is complete or not."""

    @abstractmethod
    def get_next_slot(self):
        """If the dialog is not complete, the slot that needs to be filled next."""

    @abstractmethod
    def get_session_context(self):
        """The session context (dialog state) to send back to Alexa."""
  • When a request from Alexa arrives, we simply create an AlexaConversation with the request JSON and ask whether the current dialog is complete or not.  If it is complete, we then pass the dialog to the business logic layer for interpretation and processing (more on this later).  If not complete, we respond to Alexa with a prompt asking for the next slot (sketched below).  Repeat.
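Putting it together, the request handling loop looks roughly like this sketch; build_response, process_dialog and prompt_for are placeholder helpers rather than our actual implementation:

# A stripped-down sketch of the loop described above.  AlexaConversation is the
# subclass of Conversation from this article; the helper functions are placeholders.
def handle_alexa_request(request_json):
    conversation = AlexaConversation(request_json)

    if conversation.get_status() == "complete":
        # All slots filled: hand the dialog to the business logic layer.
        result = process_dialog(conversation)
        return build_response(result, session=conversation.get_session_context())

    # Otherwise prompt for the next missing slot and return the dialog state
    # so Alexa hands it back to us on the following request.
    next_slot = conversation.get_next_slot()
    return build_response(prompt_for(next_slot),
                          session=conversation.get_session_context())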
 
So far it’s working well and reduces the complexity of the processing code.  Unfortunately both the dialog rules (how many slots, which are required, which order) and the slot prompts live in the code.  Our next step will be to move both into a declarative format so the VUI designers have the flexibility to edit them without involving the coders.
 
We assume this will be a stop-gap until the ASK and other resources have proper slot-filling capabilities.  We’d also love to hear how you’re approaching this challenge.

What a difference a decade makes…

I frequently fly transatlantic as part of my job.  Over the past few years I’ve been excited to see airlines (Delta, Lufthansa, Air Berlin) begin to offer two things: (1) in-seat AC power and (2) internet access throughout the flight.  Now I can run my laptop the entire flight and, on daytime flights, stay connected with my team back in Berlin.
 
Last week I was fortunate to be rerouted from a Delta codeshare KLM flight (no power, no internet) onto Lufthansa (power, internet).  On the daytime flight from Frankfurt to Chicago I spent nine hours of blissful time catching up on a ton of work that required online access.  I was able to Slack with my team the whole time, send emails, and work on shared documents.  At one point, I was working on a prototype of a voice assistant project – the IDE was running on my laptop and deploying code to Heroku, I was using API.AI to develop the natural language interface, and I used the Amazon Alexa Skills Kit to generate sample Alexa calls.  Traffic was constantly flowing between all of the nodes.  All from my seat on the plane.
 
Ten years ago we didn’t have smart phones.  We were just a few years past modems.  Streaming media was mostly a dream. There certainly wasn’t wifi on planes.
 
The jury is still out on whether I’ll miss the eight hours of uninterrupted quiet time on planes binge-watching movies that I probably didn’t want to see – there’s certainly something to be said for being unplugged.  But I sure as heck like the option to stay connected.
 
What a difference a decade makes.  It makes me wonder what the next decade will bring.  Can’t wait – should be a wild ride.

Here to Stay

As we slide into 2017, speech recognition is all the rage – it was the darling of CES and you can’t pick up a business or tech journal without reading about the phenomenon.  The sudden explosion of interest and coverage could lead one to assume this is yet another hype bubble waiting to burst.  I don’t think so, and I’ll explain why below.  But first let’s roll back the calendar and look at the evolution of the technology… 2016… 2015… 2014… 2010… 2005…

In the late 1990’s and early 2000’s, speech recognition was a hot topic.  Powered by early versions of machine learning algorithms and improvements in processing power, and boosted by the advent of VoiceXML which brought web methods to voice, pundits preached of the day when a “voice web” would rise to rival the dot-com bubble.

It never happened.

Implementors quickly found an Achilles’ heel in speech interfaces: the single-dimensional information stream provided by audio was no match for the two-dimensional visual presentation of the web and apps – it was simply too cumbersome to consume large quantities of information via audio.  This relegated speech to “desperation” scenarios where visual simply wasn’t an option or to human automation scenarios (e.g. call centers).

Fast forward a decade and a half to Siri which, for all that it’s been maligned, was a watershed moment for speech.  It came with reasonable accuracy and with a set of natural use cases (hands-free driving, message dictation, cold-weather hands-free operation, etc.).  It took speech mainstream.

What Siri started, Amazon Echo took to the next level.  Rather than requiring the user to interrupt the natural flow of their lives to fiddle with their phone, Alexa is always on and ready to go (so long as you’re near it, of course).  This means Alexa enables Micro-Moments and integrates into one’s normal flow of life.  The importance of that can’t be overstated.

Over the last six months other tech giants have started falling over themselves to respond to the market success of Echo and the surprising stats coming in from the market: 20% of mobile searches via speech, 42% of people using voice assistants, etc.  Google recently released “Home” and is plugging Assistant into its Pixel phone and other devices.  Facebook and others are trailing close behind.  And Apple is working to regain its early lead by freeing Siri from the confines of the phone and laptop.

So where’s it all going?

To speculate on that we should probably look at why consumer speech recognition is succeeding this time.  First, improvements in processing power and neural network / “deep learning” algorithms dropped the cost and radically improved the accuracy of speech recognition.  This has allowed speech + AI to subtly creep into more and more user-facing apps (e.g. Google Translate), which both conditioned users and helped train the speech engines.  The technology is still limited to single-dimensional streams, but the enormous popularity of chat and, more recently, bots shows that there is plenty of attraction to this single dimension.

But speech is still limited – for example, the best recognition engines need a network connection to use cloud resources, and the noisy environments common to cityscapes continue to confound recognition engines.  This is why the Echo approach is genius – an always-on device with a great microphone in a (somewhat) quiet environment.  But will speech move beyond the use case of playing music or getting the weather?

Yes.  Advanced headphones like the Apple AirPods will expand “always on” beyond the home.  Improved algorithms will handle noisy environments.  But perhaps most important – multi-modal interfaces are now eminently possible.

What’s multi-modal?  Basically an interaction that spans multiple interfaces simultaneously.  For example, you may start an interaction via voice but complete it via a mobile device – like asking your voice assistant to read your email headers but then forwarding an email of interest to the nearest screen to be read and responded to.  Fifteen years ago there simply weren’t too many options for bouncing between speech and graphical interfaces.  Today, the ubiquity of connected smart phones changes the equation.

Will this be the last big wave of speech?  No.  Until the speech interface is backed by full AI it can’t reach its full potential.  Likewise there’s still a lot of runway in terms of interface devices.  But this time it’s here to stay.