Saturday, February 05, 2011

Maintainability of code

Most of my career has involved maintaining code built by others. I'd wager that most developers' careers have been the same, even if they don't want to admit it. Sometimes maintenance goes under the name of updating your own code, so people delude themselves into thinking they're not maintaining code, they're just creating the next version. From a maintainability POV, any code that's built is legacy.

And yet most (all?) methodologies of software development either address completely start-from-scratch greenfield development, or propose ideal practices that assume mature, passionate developers who are in the field for the love of programming.

The reality, however, is that most development involves enhancing or maintaining code - usually somebody else's, and mostly crufty; and in all likelihood the crufty code is from somebody whose presence in the industry is a happenstance rather than a planned event. If you're lucky it merely represents somebody's state of maturity at the time of writing, and that person has now improved.

A comprehensive methodology for maintainable software, therefore, must:
  • To address the legacy code angle:
    • Provide mechanisms to create "good enough" comprehension of the code
      • But avoid attempts at large scale or totalitarian model-driven comprehension. Such attempts will always fail in the real world simply because there will always be something outside the model
      • That is, allow for leaky abstractions
      • That is, allow for manual steps. There will always be manual steps. The methodology should allow for that by exclusion or explicit statement. The former is easier.
    • Help identify what changes and what doesn't. Easier: help identify what has and hasn't changed in the past, and let the humans extrapolate to the future.
    • Provide an incremental means to migrate from what is to what should be.
  • To address the maturity concern:
    • Allow for different levels of maturity
    • Allow for the ability to define "good enough" architecture, design and code; and the ability to easily enforce it
    • Allow quick enough comprehension of these definitions
    • Allow for gradual adoption, and a means to measure progress
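
One way to approach "what has and hasn't changed in the past" is to count how often each file shows up in version control history. The sketch below is only an illustration, assuming the code lives in a git repository; the function names, the threshold and the example paths are hypothetical and not part of any existing tool.

```python
import subprocess
from collections import Counter


def read_git_log(repo_dir="."):
    """Return the paths touched by each commit, one per line (requires git)."""
    return subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout


def churn_from_log(log_text):
    """Count how many times each path appears in `git log --name-only` output."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())


def stable_and_volatile(all_paths, churn, threshold=1):
    """Split paths into those changed at most `threshold` times and the rest."""
    volatile = sorted(p for p in all_paths if churn[p] > threshold)
    stable = sorted(p for p in all_paths if churn[p] <= threshold)
    return stable, volatile
```

Feeding `read_git_log()` into `churn_from_log()` gives a rough change count per file; where to draw the line between stable and volatile, and what to conclude about the future, is left to the humans.
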
The usual solution is to relegate this to Software Engineering, which typically spirals into talks of CMM and suchlike - process heavy, people agnostic.

The reality, however, is that software development is largely a human effort, precisely because it lacks the usual shackles of other human endeavors. A mechanical, electrical or electronics engineer will always hit upon some natural limit. Not so the software engineer. His limits are the limits of the mind. If you can think it, you can make it.

And therein lies the problem. If you can think it, so can a multitude of other software engineers; and each such mind can think of at least one variation to the same problem using the same essential solution. This is why I believe we will not see Software ICs any time soon. Most process oriented methodologies tend to gravitate towards this concept, or to the equivalent one of "resources tasked to do X will repeatably produce output of increasing metric Y".

Meantime, in the real world, human programmers are finding ways to game that system. 

As software practitioners, what are we doing to better this? There seem to be promising starts with the BDD and (to a smaller extent) TDD movements, and (on a less focused but more general scale) with the general move towards declarative programming. There are some incremental gains from tooling in general, but those gains are largely in the area of reducing the steps that the interface requires you to go through to achieve any particular task. There's also some progress in the architecture analysis, validation and enforcement that constructs such as DSM and its ilk provide - if practiced regularly.

However, all of these lean toward the mature, self-aware programmer. By the time it reaches Joe Developer, it's an organization mandate, not something he WANTS to do. So the gaming cycle begins. This time, however, it's insidious because it projects the impression that code quality has improved. We therefore need tools and methods that "work in the trenches". I don't have a good enough answer, but here are some interim steps that I can think of:
  • Allow for easy documentation of what exists, and in such a way that the document follows changes. Working diagrams are a good way of documenting the subset of the code that's understood. 
  • Use tagging as a means of collation of documentation created thus. It's easy and "good enough".
  • Don't lose what you gain. Developers gain a lot of system knowledge during debugging. There's no easy way of saving that for posterity, so the knowledge is lost. Do not expect developers to write a wiki page - they don't have the time. This area is ripe for tools such as:
    • bookmarking and code traversal trails being saved. 
    • A "tweet my XApp-fu" app is what's required here.
    • A way to share all of this that's searchable
  • Make creating tests a 1-click affair. Any run of the software should be convertible into a test, especially a manual run during the normal write-compile-test cycle that the developer engages in.
  • Allow cheating on setting up the context. In most production systems, it's not easy to inject mocks or test doubles. Test automation should allow for as little of the system being mocked out as possible.
  • Mindset changes: Somebody on the team needs to evangelize these:
    • "Works on integration box" is less useful than "PROVABLY works on any box". This implies CI without requiring buying into the CI religion.
    • Exemplar test data is better than just test data. Exemplars are test data sets that "stand for" a scenario, not a particular test run. 
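
As a rough illustration of "any run should be convertible into a test", here is a minimal sketch that records the inputs and outputs of a function during a normal run and replays them later as a regression check. Everything in it is an assumption made for the example: `recording`, `replay` and `shipping_cost` are hypothetical names, and a real system would need to capture far more context than plain arguments.

```python
import functools
import json


def recording(fn, log):
    """Wrap fn so each call's inputs and output are appended to log."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)
        log.append({"args": list(args), "kwargs": kwargs, "expected": result})
        return result
    return wrapper


def replay(fn, log):
    """Re-run every recorded call; return the entries whose output changed."""
    return [e for e in log if fn(*e["args"], **e["kwargs"]) != e["expected"]]


# Hypothetical function under maintenance, standing in for real code.
def shipping_cost(weight_kg, express=False):
    base = 5.0 + 1.5 * weight_kg
    return base * 2 if express else base


if __name__ == "__main__":
    log = []
    traced = recording(shipping_cost, log)
    traced(2)                    # a "manual" run by the developer
    traced(10, express=True)
    saved = json.dumps(log)      # persist the run so it can be shared
    print(replay(shipping_cost, json.loads(saved)))  # no mismatches expected
```

Note that nothing is mocked out here: the replay calls the real function, which is in the spirit of keeping mocking to a minimum. The saved entries only work as long as arguments and results survive a JSON round trip.
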
