Wednesday, July 28, 2010

The Hidden Cost of Agile Software Development

Perhaps my favorite class so far at MIT is the “Business of Software” MIT Sloan class taught by Professor Michael Cusumano. This is a seminar-style course aimed at anyone interested in the business of software (products and services) and software-based digital platforms. Anyone interested in working for, founding, or analyzing and consulting for companies relying on software technology or digital platforms such as mobile phones should find this course of interest. Many of the issues we discussed were also highly relevant for companies whose businesses are heavily dependent on software, such as in e-business, digital content and publishing, financial services, or embedded software for industrial applications. The class material was augmented by visits from various CEOs and software professionals, who joined our classes each week to provide a first-person view of a given software sector. The list of visitors was long and included Brian Halligan (CEO of HubSpot), Ted Schadler (Forrester Research), Harvard Business School Prof. Tom Eisenmann (platforms strategy concepts), Jeremy Allaire (CEO of Brightcove and the innovator behind Macromedia Flash), Zia Yusuf (Stanford D-School, serial entrepreneur, and former SAP executive VP for platform ecosystem development), and others.

One topic near and dear to me that we examined was that of large-scale software development approaches. Specifically, we compared the Waterfall and the Iterative software development approaches. We discussed the pros and cons of each, and developed an understanding of the internal (firm) and external (market) conditions that may call for one versus the other. I will not go into that discussion at this time; these two development styles have been compared and pitted against each other for decades now...




However, it seemed to me that the overall impression people have is that an Iterative/Agile model of software development is the superior approach in today’s market, as more and more software development organizations adopt or transition to the Agile/Iterative approach, especially the software behemoth du jour: Google. This may be true, but Agile software development comes with its own unique issues and complexities that require special attention to Software Configuration Management (SCM), and therefore added costs. Note that Configuration Management is also one of the Key Process Areas (KPAs) in the Capability Maturity Model (CMM) for a level 2 organization. Another way to put this is: Agile may allow you to develop a software product faster, but not necessarily cheaper. I argue that Agile in fact requires more stringent SCM and planning, which incurs more management effort and cost. Let’s take a closer look at some of the SCM issues that a manager might face when adopting an Agile development model. My point of view is based on my personal experience with large distributed software system development, having spent the last 12 years in software development for aerospace and defense systems at Raytheon Network Centric Systems and Raytheon Integrated Defense Systems.

Version Control:
I had never understood why Windows Longhorn was a failure until I learned how the Windows group had operated: nightly builds were constantly breaking as various component teams would “check-in” their changes to the common code base. “Making even small changes in one part of the product led to unpredictable and destabilizing consequences in other parts since most of the components were tied together in complex and unpredictable ways” (M. Cusumano, 2006, What Lies ahead for Microsoft and Windows.)
This reminded me of one of the major challenges I face in my professional life when managing multiple teams of developers working simultaneously on several “threads” in an iterative manner (we call features or capabilities threads – multiple threads of software development are woven into a particular release.) Many a developer has known the shame of walking into the office early in the morning and having others give him/her the “you broke the nightly build” look.
The answer to this problem is Version Control: you allow parallel development activities to occur on “branches” and “sub-branches” that fork off from an existing baseline. Each team has its own “view”. This allows each team to develop its piece in a stable environment. Later, all of the threads are “merged” together into a common baseline. In the example in figure 1 below, the same file “prog.c” is modified by two threads of development, “db_optimize” and “pat_usability”; db_optimize is merged back to the main baseline and is part of release 1.4.
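To make the “prog.c” situation concrete, here is a minimal sketch of how one might flag files that are being modified on more than one branch before a merge is attempted. It is not tied to ClearCase or any other tool: the function name `files_at_risk`, the input format, and the extra file names `db.c` and `ui.c` are assumptions made for illustration.

```python
from collections import defaultdict


def files_at_risk(changes_by_branch):
    """Return {file: sorted branch names} for files modified on 2+ branches.

    changes_by_branch maps a branch name to the set of files changed on it
    (a hypothetical input format, not the output of any real SCM tool).
    """
    touched = defaultdict(set)
    for branch, files in changes_by_branch.items():
        for path in files:
            touched[path].add(branch)
    return {
        path: sorted(branches)
        for path, branches in touched.items()
        if len(branches) > 1
    }


# Branch names come from the figure; db.c and ui.c are made up.
changes = {
    "db_optimize": {"prog.c", "db.c"},
    "pat_usability": {"prog.c", "ui.c"},
}

print(files_at_risk(changes))
```

A file that appears in this report is not necessarily a conflict, since a merge tool can often reconcile independent edits, but it is a reasonable place to start the merge planning conversation.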


Figure 1 – Example Version Tree (source: IBM Software Information Center)

Version Control is just one of the tools for managing the configuration of one of the essential work products of a software development organization: source code files. Many products exist in the industry to implement such version control strategies. Personally, I am quite experienced with the Rational ClearCase suite of tools, which provides us, among other things, with a version tree browser, a “diff tool” that highlights the differences between any two given versions of a file, and a merge tool that facilitates the reconciliation of conflicts between two versions of a given file. Another popular tool is CVS (Concurrent Versions System), which is Open Source and supports many environments. I must also pay homage here to the grandfather of them all, SCCS (Source Code Control System), which I used in the old UNIX command-line days. While it is considered obsolete today, I still use it to control versions of my intranet webpage.
To be clear, these Version Control concepts are not unique to Agile development. As I hinted with my UNIX command-line reference, Version Control concepts predate Agile, and are used also in Waterfall models.


My point however is that Agile calls for even more stringent SCM, and that the complexities of your version tree and therefore of your merge efforts are going to increase dramatically as you create more branches and sub-branches to cater to more parallel and iterative cycles. In such an environment, you have to plan and coordinate the merging of the sub-threads and perform some higher level testing of your functionality.

A common pitfall for a software product development manager is to think that development work is complete when all of the threads have each individually completed their iterations. This is not the case! One has to plan for problems and account for some refactoring or rework that may arise when merge conflicts are irreconcilable.

Having managed a large team working in this fashion for the last few years, here are some rules of thumb that I now live by:
1- Merge early and merge often: The last thing you want to do is merge everything near the end of your schedule, right before a big test or sell-off, only to encounter a host of conflicts that break your schedule and possibly your reputation. It is much better to identify conflicts and address them early.
2- Make sure your team understands the branching strategy and that everyone is on the same page. Before the start of every release’s development cycle, I hold an “SCM Strategy” meeting with all of the development leads, as well as the integration team, to go over the branching plan and get team buy-in.
3- Have a CM-meister. In our case, it was very hard convincing senior management that we needed an individual working full time to manage our branches, views, etc. and coordinate merges. This person needed to be an expert ClearCase user. We never got such a person, so the job ended up being done by committee.
4- Always plan for SCM activities in your project schedule. In the past, our Integrated Master Schedule only included tasks for requirements analysis, design, code, etc. up to sell-off but had nothing for merges, syncs, etc. This resulted in the “development is complete, but why can’t we start tests yet?” question as we scrambled to reconcile merge conflicts and get the software to run. I have now learned to plan and leave room for SCM – the size of this effort is calculated as a function of the number of threads in the release and the number of other active releases (bug fixes are ported across releases, further complicating things.)
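Rule 4 says the SCM effort is a function of the number of threads and the number of other active releases, but the post does not give the formula. The sketch below shows one possible shape for such an estimate. Everything in it is an assumption made for illustration: the function name, the linear form, and every default coefficient are placeholders to be replaced with figures from your own project history.

```python
def scm_effort_days(
    threads,
    other_active_releases,
    merges_per_thread=3,    # assumed: "merge early and often"
    days_per_merge=0.5,     # assumed average cost of one merge
    ported_fixes=4,         # assumed bug fixes ported per other release
    days_per_port=0.25,     # assumed average cost of porting one fix
):
    """Rough schedule allowance (in days) for SCM work in one release.

    Hypothetical linear model: merge work grows with the number of
    threads, porting work grows with the number of other active releases.
    """
    merge_days = threads * merges_per_thread * days_per_merge
    port_days = other_active_releases * ported_fixes * days_per_port
    return merge_days + port_days


# Example: 6 threads in this release, 2 other releases still active.
print(scm_effort_days(threads=6, other_active_releases=2))  # 11.0
```

The value of writing the estimate down, even in a crude form like this, is that merges and ports appear as explicit line items in the schedule instead of being discovered after development is declared complete.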


--> Proper version control helps rein in some of the chaos that may result from Agile development; however, its complexities translate directly into added costs for the program as a whole.


Interface Control:
Take the example of an interface changing on one branch, while procedure calls to the old interface were added in another branch, resulting in a code merge conflict that has to be remedied when the two threads come together. Such situations can be avoided by using a strict change control mechanism. In our case, interfaces between components, as well as common/shared data structures, are treated as a contract between software subsystems. Any change of contract has to go through a formal review and authorization process. In order to change an interface, one must submit an Interface Change Request (ICR.) All of the stakeholders are then tasked to review and approve the change, and the ICR is reworked until it is approved – then the implementation of the change is coordinated across the threads of development. This process is driven/managed by the Software Change Control Board (SCCB.) The workload of the SCCB also increases dramatically when multiple concurrent threads of development are occurring, and what once used to be a part-time role for a few senior software architects becomes a full-time responsibility.
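The review loop described above can be sketched as a small state holder: an ICR is only approved once every stakeholder has signed off, and any rework sends it back for a full re-review. This is a minimal sketch, not our actual SCCB tooling. The class, its fields, the rule that rework clears earlier approvals, and the example interface and subsystem names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceChangeRequest:
    """Hypothetical model of an ICR moving through stakeholder review."""

    interface: str
    stakeholders: frozenset
    approvals: set = field(default_factory=set)
    revision: int = 1

    def approve(self, who):
        # Only registered stakeholders may sign off on the change.
        if who not in self.stakeholders:
            raise ValueError(f"{who} is not a stakeholder for {self.interface}")
        self.approvals.add(who)

    def rework(self):
        # Assumption: a revised ICR must be re-reviewed by everyone.
        self.revision += 1
        self.approvals.clear()

    @property
    def approved(self):
        # Approved only when every stakeholder has signed off.
        return set(self.stakeholders) <= self.approvals


# Made-up interface and subsystem names.
icr = InterfaceChangeRequest("track_report", frozenset({"tracker", "display"}))
icr.approve("tracker")
print(icr.approved)   # False: "display" has not signed off yet
icr.rework()          # change requested, so review starts over
icr.approve("tracker")
icr.approve("display")
print(icr.approved)   # True
```

Even in this toy form, the cost is visible: every rework cycle multiplies the number of reviews, which is why the SCCB workload climbs with the number of concurrent threads.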

--> Slow turn-around by the SCCB can be a bottleneck to development and eat heavily into your schedule as development teams wait for concurrence on an interface change.



Change Management/Request tools:
Now that you have many concurrent threads of development and interface changes to manage, how are you ever going to document all of this activity, compile objective evidence for CMMI auditors, or even know what is going on in any given release? Clearly you need a database to house information related to changes being made to the software, documentation, and other work products. You also need a place to document code changes, and manage the authorization and review process.

This is where change management tools come in. My company has its own proprietary tool, but there are many other tools on the market such as Borland’s StarTeam, or even 37signals’ Basecamp.
I highly recommend using such tools to track all changes to your software. This becomes more important as the size of the project increases. In fact, this is imperative for any self-respecting software organization. These tools are also very valuable as defect metrics collection mechanisms and can provide you with a wealth of information. They allow you to collect important statistics and, for example, identify problem areas that need special attention. These types of statistics recently led me to the finding that our process for performing SWIT was inadequate: a large and disproportionate number of defects that should have been caught in that phase were escaping it and being caught in subsequent phases of the program (and we all know the sooner you find it the cheaper it is to fix.)
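The kind of statistic behind that SWIT finding can be sketched as a per-phase escape rate: of the defects that should have been caught in a phase, what fraction were actually found somewhere else? This is a minimal sketch with made-up records. The record format, the function name, and the phase labels other than SWIT are assumptions, and it assumes that a defect found outside its expected phase was found later.

```python
from collections import Counter


def escape_rates(defects):
    """Return {phase: fraction of its defects found in another phase}.

    defects is an iterable of (expected_phase, found_phase) pairs, a
    hypothetical export format from a change management tool.
    """
    total = Counter()
    escaped = Counter()
    for expected, found in defects:
        total[expected] += 1
        if found != expected:
            escaped[expected] += 1
    return {phase: escaped[phase] / total[phase] for phase in total}


# Made-up defect records for illustration only.
records = [
    ("SWIT", "SWIT"),
    ("SWIT", "system test"),
    ("SWIT", "sell-off"),
    ("unit test", "unit test"),
]

for phase, rate in escape_rates(records).items():
    print(f"{phase}: {rate:.0%} escaped")
```

A phase whose escape rate stands well above the others is the one whose process deserves a closer look.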
Change tracking also adds another dimension of documentation: beyond well-commented code, the iTracker system allows me to further investigate why certain changes to the code might have been made. File versions on the version tree are labeled with corresponding CR numbers, and now you have a full story of the evolution of any given file.

--> Again, this is not a new concept that comes with Agile development. However, as before, when the activity level increases and the number of concurrent development teams explodes, the burden placed on the people who have to manage the Change Request system increases dramatically.




Conclusion
One may assume that Agile means “less overhead”; however, I have learned that when you take down some of the barriers and allow the engineering/programming part of software development to happen in a more creative, agile and iterative manner, there is a higher cost on the management side of this style of development. The process requires control at a higher level to rein in the inevitable chaos and avoid the spaghetti-code phenomenon. While an Agile approach may result in quicker beta releases and more innovative software features, if not managed properly it can turn into a nightmare of broken nightly builds and never-ending refactoring (Longhorn, anyone?) With proper SCM tools and techniques, much of the risk can be mitigated, albeit at a higher cost.