Technical debt is not merely a topic for the CIO or head of IT. Its business impact makes it a broader strategic issue and, as such, a topic for the board.
It Is Not Just About Coding
Legacy and New Software: A Technical Debt Recipe
Many organizations intend to increase their competitiveness by continuously adding new products or increasing their service offerings. However, adding and/or expanding features may not be supported by the existing IT architecture and might require integrating new code with existing software. This touchpoint between old and new carries the risk of creating and accumulating technical debt.
Technical debt (TD) is a metaphor coined by Ward Cunningham, one of the authors of the Agile Manifesto. It borrows the financial terms principal and interest to explain the long-run impact of non-optimal code. It refers to intentionally or unintentionally prioritizing features or project constraints like deadlines over the design of code or system architecture. Examples include combining existing or legacy solutions with new applications, or customizing off-the-shelf software with bespoke adjustments. This may postpone replacing existing code or switching to a whole new code base (i.e., paying the principal) while still enabling the new features. However, such add-ons or combinations of systems make future maintenance more costly (i.e., interest accumulates). They may further make the evolution of the system – to address any future requirements – cumbersome or even impossible without a full rewrite or replacement.
The increased maintenance costs associated with patchwork code may use up most of an IT budget, limiting the ability of an organization to react to market opportunities, implement new features or innovate in general. These cost increases (i.e., interest payable on TD) are often neglected when a company decides between a fast rollout or additional features and a more future-proof solution that will require higher upfront investment and longer lead-times. While cash-strapped start-ups or other companies working towards a minimum viable product (MVP) in general may want to ignore TD and bring a product in front of their target customers early, the same may not be advisable for established companies with proven offerings in the market.
Sub-optimal technical decisions like non-optimal code allow for a quicker go-to-market. However, the downstream costs incurred by taking this short-cut will need to be repaid, unless the whole product/service will either be retired, abandoned or never upgraded. At some point patching up or upgrading the different layers of legacy-systems will no longer be possible and a full replacement or rewrite is required. In other words, the repayment of the principal is due. Until this time, the non-optimal code makes maintenance more costly, leading to the accumulation of further interest.
The concept of TD and its negative impact on the evolution of software solutions provides ample reason to move away from monolithic software architecture to more modular approaches and data lakes. These allow for easier refactoring and future development without putting the whole system at risk when removing bugs or adding new features.
Does this imply that TD should be avoided at all costs? No, but it needs to be identified and managed carefully. The total costs of TD, i.e., principal repayment plus accumulated interest, have to be considered right from the start to better estimate the return on investment.
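The arithmetic behind this estimate can be sketched in a few lines of Python. This is an illustration only: it assumes a constant interest charge per period (real maintenance overhead usually varies over time), and all function names and figures are hypothetical.

```python
def total_td_cost(principal: float, interest_per_period: float, periods: int) -> float:
    """Principal repayment plus the interest accumulated until repayment.

    Assumption: interest is a fixed extra maintenance cost per period.
    """
    if periods < 0:
        raise ValueError("periods must be non-negative")
    return principal + interest_per_period * periods


def roi(benefit: float, cost: float) -> float:
    """Simple return on investment: net gain divided by cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (benefit - cost) / cost


if __name__ == "__main__":
    # Hypothetical figures: a shortcut costs 100,000 upfront, adds 10,000 of
    # extra maintenance per quarter, and needs an 80,000 rewrite after 8 quarters.
    shortcut_cost = 100_000 + total_td_cost(80_000, 10_000, 8)
    # Hypothetical future-proof alternative with a higher upfront investment.
    clean_cost = 200_000
    expected_benefit = 400_000
    print(f"shortcut ROI: {roi(expected_benefit, shortcut_cost):.2f}")
    print(f"clean ROI:    {roi(expected_benefit, clean_cost):.2f}")
```

The sketch deliberately ignores the value of an earlier market entry, which is often the main reason to accept TD in the first place; a fuller estimate would add that benefit to the shortcut scenario.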
There is No Magic Potion to Measure and Manage Technical Debt
Identify – Quantify – Repay
Making informed decisions on TD, whether to incur it, if and when to reduce it or to fully ignore it, depends on a three-step process: identify, quantify and repay.
Whether TD should be repaid depends mostly on the expectations of the business: whether the debt-laden IT systems are expected to be changed or upgraded, whether they are stable, or whether they are planned to be retired in the near future.
Until recently, identifying TD was a predominantly manual process unless it had been labeled as such, e.g., in a ticketing system or an issue tracker. However, early studies have begun using machine learning to identify TD in tickets that do not directly refer to it. For software applications, code auditing tools like Sonar or Cast can be used to crawl the code and rate it.
A popular model to quantify TD is SQALE (Software Quality Assessment based on Life-cycle Expectations). It starts with defining the non-functional requirements of right code, establishing a list of bad coding practices and subsequently developing a debt-estimation model. Based on this model, every identified bad coding practice is assigned an estimated remediation cost. TD is defined as the sum of the required remediation work. Debt density, i.e., TD divided by code size, is then used to measure the severity of TD in the organization.
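The summation and the density calculation can be illustrated with a minimal Python sketch. It is not the SQALE reference implementation: the rule names, the remediation costs in hours and the choice to express density per 1,000 lines of code are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One bad coding practice and how often it was found (hypothetical)."""
    rule: str
    count: int


# Assumed remediation cost in hours per occurrence; a real debt-estimation
# model would be calibrated by the organization itself.
REMEDIATION_HOURS = {
    "duplicated_block": 2.0,
    "missing_docs": 0.5,
    "overlong_function": 1.5,
}


def technical_debt_hours(findings, costs=REMEDIATION_HOURS) -> float:
    """TD as the sum of the estimated remediation work for all findings."""
    return sum(f.count * costs[f.rule] for f in findings)


def debt_density(td_hours: float, lines_of_code: int) -> float:
    """TD per 1,000 lines of code (the unit is an assumption of this sketch)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return td_hours / (lines_of_code / 1000)


if __name__ == "__main__":
    findings = [
        Finding("duplicated_block", 10),
        Finding("missing_docs", 40),
        Finding("overlong_function", 4),
    ]
    td = technical_debt_hours(findings)
    print(f"TD: {td} hours, density: {debt_density(td, 23_000)} hours per KLOC")
```

Normalizing by code size makes applications of very different sizes comparable, which is what allows the density figure to be used across a whole portfolio.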
Paying off TD typically requires rewriting all or at least part of the code to make it easier to modify and integrate with other systems. Such an investment only makes sense if the system is likely to be upgraded in the future, as interest accumulates mainly when the code base is changed or extended.
Agile and DevOps Framework can Help
More and more organizations are becoming aware of TD, its costs and its dampening effect on innovation. To manage TD, it first needs to be identified and quantified. Only once the costs are visible can a business decision be made whether to eliminate it, reduce it or ignore it.
In recent years a rising number of companies have been using Agile and DevOps to address TD. Some proponents even argue that Agile and DevOps should be combined, as both aim to increase velocity and deliver value to customers more rapidly. While Agile emphasizes team interactions, culture and values, DevOps focuses on delivery pipelines and flow. An integrated approach may therefore get the best of both worlds.
Considering the accelerating rate of change in many industries, Agile has become very popular as a project management methodology. Agile encourages adopting a mindset that promotes teamwork, accountability and self-management, all with a clear focus on the customer. It addresses the diminishing returns of long-term planning in environments in which requirements are either unknown at the outset of the project or change throughout its life-cycle. Hence, small teams collaborate in short intervals (Sprints) to constantly validate that the developed solution meets the needs of the customer.
To address TD through Agile software development, it is important to prioritize quality over the scope of the release. The conflicting interests of the business, which typically focuses on functional features, and operations, which prioritizes technical stability, need to be addressed and resolved. Each technical feature not implemented increases TD. Extreme Programming (XP) is an Agile software development framework that prioritizes code quality. Assigning time in each Sprint backlog for refactoring and paying down TD can keep the accumulation of new TD to a minimum while incrementally reducing existing TD.
DevOps is a methodology that combines software development (Dev) with operations (Ops). It addresses the tension between faster development cycles on the one hand and quality and security on the other by automating the code pipeline, containerizing deployments and building security and quality assurance into the process while eliminating manual testing and interventions. Shared ownership among software stakeholders, encompassing current development, maintenance and future releases, is its foundation. The focus on automated testing, continuous integration and continuous deployment mitigates the risk of accumulating TD and allows existing TD to be reduced at a steady pace.
Companies like Intel use the Identify-Pay-Prevent framework. It combines Agile and DevOps approaches to address three sources of TD:
Deliberate & Prudent – introduced when quick changes are made to reduce time-to-market
Accidental or Outdated Design Debt – the result of systems evolving over time. Old designs may not scale and therefore require substantial refactoring
Bit Rot Debt – the result of complexity introduced over time through many incremental changes and deviations from the original design and intent. This debt is difficult to fix or pay off and is better avoided or prevented.
In its white paper "IT@Intel White Paper: Enterprise Technical Debt Strategy & Framework", Intel proposes three steps to address TD within an organization.
In a first step all software applications are categorized into continuous-development, maintaining status quo, or phase-out.
Based on this categorization Gartner’s TIME (tolerate, invest, migrate, eliminate) model is applied.
In a last step the further accumulation of TD is mitigated by integrating TD management into the DevOps model of continuous refactoring.
According to this framework, accumulation of TD should be deliberate and closely monitored, ensuring that TD doesn’t get out of control and that the benefits of accumulating TD exceed the costs.
Case Study: How Effective Coding Reduces Technical Debt
One of Cuelebre’s clients in the logistics domain has a legacy data lake solution with thousands of Hive scripts, SQL queries and Java code. Over time the client experienced increasing delivery times even for small changes, and management wanted to understand the root cause.
Even though the client’s team was highly skilled, they had difficulties understanding the existing system and code base. After an in-depth analysis, Cuelebre identified the following problems that caused or increased the delays:
- No proper naming convention
- Duplicate and redundant code base
- Absence of continuous integration (CI) and continuous deployment (CD) processes
- Mono-repo structure, which made it difficult to work simultaneously on different use cases
- Lack of documentation and appropriate comments in the code
- Important functionality and features developed a few years earlier were missing from the repository
Approach to resolve the accumulated TD
The technical experts from Cuelebre reviewed the code thoroughly and came up with recommendations. Cuelebre also prepared a detailed action plan to reduce the TD. Various tools like SonarQube were used in order to review the code and reduce the TD. Such static analyzers help identify redundant and repetitive code.
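To give a feel for how redundant code can be detected automatically, the Python sketch below hashes sliding windows of whitespace-normalized lines and reports windows that occur more than once. It is a deliberately simplified illustration, not a description of how SonarQube works internally; the function names, the window size and the sample SQL snippets are all hypothetical.

```python
import hashlib
from collections import defaultdict


def normalize(line: str) -> str:
    """Collapse whitespace so formatting differences do not hide duplicates."""
    return " ".join(line.split())


def find_duplicate_blocks(files: dict[str, str], window: int = 3) -> dict[str, list[tuple[str, int]]]:
    """Map a block digest to every (file name, start line) where it occurs.

    Only blocks of `window` consecutive non-blank lines that appear at least
    twice are returned.
    """
    seen = defaultdict(list)
    for name, text in files.items():
        # Keep the original 1-based line numbers while skipping blank lines.
        numbered = [
            (number, normalize(line))
            for number, line in enumerate(text.splitlines(), start=1)
            if line.strip()
        ]
        for k in range(len(numbered) - window + 1):
            chunk = numbered[k:k + window]
            block = "\n".join(line for _, line in chunk)
            digest = hashlib.sha256(block.encode("utf-8")).hexdigest()
            seen[digest].append((name, chunk[0][0]))
    return {digest: places for digest, places in seen.items() if len(places) > 1}


if __name__ == "__main__":
    sample = {
        "a.sql": "SELECT id\nFROM orders\nWHERE status = 'open'\n",
        "b.sql": "-- report\nSELECT  id\nFROM orders\nWHERE status = 'open'\nLIMIT 10\n",
    }
    for places in find_duplicate_blocks(sample).values():
        print(places)
```

Exact-match hashing of this kind only finds copy-and-paste duplicates; code that was copied and then renamed or reordered needs token-based or syntax-aware comparison, which is where dedicated analyzers earn their keep.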
The Cuelebre Method
Cuelebre’s detailed action plan spanned multiple sprints, with solutions grouped into the following classes:
- Short term goals – something which can be fixed within the same sprint or in a few days within the regular development routine.
- Mid-term goals – tasks which can span multiple sprints and lead to substantial performance improvement.
- Long term goals – something which can only be achieved over a long period of time (typically years), but has the highest impact and provides the most performance and efficiency gains
Key results and added value from the DevOps approach
✓ Code review with pull request
✓ Static code analysis
✓ Following naming conventions
✓ Setting up a CD and CI approach
✓ Implementing a coding guideline
✓ Removing duplicate code
✓ Refactoring of the code from mono repo to multi repo
✓ Adopting new technologies
Together, these changes contributed to the following outcomes:
- Time to release for a feature was reduced by 75% compared with the previous release cycle
- Developers’ efficiency doubled as fewer Git merge conflicts occurred
- CI/CD reduced the time for deployment and reduced manual effort by a factor of 3
- Due to an improved modular organizing structure, the team was able to migrate one module without impacting another module
- Removing duplicate code decreased code base by 40%
- Code review identified several architectural issues that have been rectified
Take Action to Identify, Manage and Mitigate the Risk
Technical debt is a suitable metaphor to describe the cost-benefit trade-offs required when managing a growing IT infrastructure encompassing numerous applications. Decision makers have to consider the future costs of short-cuts that allow a quicker roll-out or go-to-market compared with solutions that require more upfront resources, especially time. There is nothing wrong with taking a loan or incurring TD as long as paying back interest and principal has been factored into the decision. Software tools like Sonar or Cast can monitor code quality and security and help identify existing TD. DevOps, XP or similar methodologies and frameworks that use iterative sprints seem to be well suited to address existing TD as well as mitigate further accumulation.
Sia Partners has developed strong expertise and capabilities in assessing the maturity and effectiveness of the Software Development LifeCycle. In addition, Sia Partners has assisted its customers in managing and delivering their projects by adopting Agile and DevOps frameworks and methodologies, ensuring full alignment between business and development teams and a faster time-to-market.
This article was written in collaboration with Joerg Riebel, MBA student at the University of Hong Kong.