Sunday, May 15, 2011

Code as workflow



At work, we have a couple of core components that are essentially workflow engines. I call them workflow engines because of the following properties:
  • The components house named business processes
  • The processes have granular steps and they are named, too
  • Data is passed between steps via a shared context - essentially a data bus
  • The processes may be long lived, and therefore have asynchronous steps
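
To make those properties concrete, here is a minimal sketch in Python. It is not the engine we use at work: the `Workflow` class, the step functions and the order fields are all hypothetical names chosen for illustration, and asynchronous steps are left out entirely.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]            # the shared context, i.e. the data bus
Step = Callable[[Context], None]    # a step reads from and writes to the context


class Workflow:
    """A named process made of named steps that share one context."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.steps: List[Tuple[str, Step]] = []

    def add_step(self, name: str, step: Step) -> "Workflow":
        self.steps.append((name, step))
        return self

    def run(self, context: Context) -> Context:
        # Steps run in order and communicate only through the context.
        for _name, step in self.steps:
            step(context)
        return context


# Hypothetical business steps, amounts in cents to keep the arithmetic exact.
def price_order(ctx: Context) -> None:
    ctx["total_cents"] = ctx["quantity"] * ctx["unit_price_cents"]


def apply_discount(ctx: Context) -> None:
    ctx["total_cents"] -= ctx["total_cents"] // 10  # 10% off


order_workflow = (
    Workflow("process_order")
    .add_step("price_order", price_order)
    .add_step("apply_discount", apply_discount)
)
result = order_workflow.run({"quantity": 3, "unit_price_cents": 1000})
print(result["total_cents"])  # 2700
```

Because each step touches nothing but the context, removing `apply_discount` or inserting a new step before it is a one-line change to the process definition.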
This model, while decidedly a leaky abstraction in implementation, got me thinking about plain old code, though:

Take the smallest organizing unit of modular programming - the function. It has a name, it has granular steps (although they are not named) and data is passed via a shared context - the stack.

I mention the similarity between the function and the concept of a workflow only to highlight that such a parallel is possible. In principle, any organizing unit - the class, the program, package or application could be modeled as a workflow, IMO.

I contend, therefore: At a suitably small scale, all code can be treated as workflow.

What benefit, if any, do we have with taking such a view of code, though? Business logic is expressed as workflow when we know the following:
  • The individual steps have meaning to the business
  • The overall process is likely to change over time, so its implementation must be able to change quickly to reflect the new reality.
  • The change usually causes reordering of steps, removal of steps, or introduction of steps. The process still remains the same, as does the implementation logic within each step.
It therefore behooves us to create a framework where the steps are named and their communication is through a standard data bus so that they can be easily removed/updated/added.

Now think of code in general, and read out the reasons I mention above for needing workflow engines. Except for the scale and the "implementation logic remains same" part, they're the same reasons you have code around as well.
  • If you think each line of code doesn't have business meaning, you've obviously not had a big-impact bug that was fixed with a one-line change. Admittedly, not all lines have business meaning, however.
  • Code does need to change constantly to reflect business reality
  • All edits on code reorder the steps, remove them or add new ones. In addition, we also typically change existing steps in place. Aside from this difference, there's essentially nothing different between editing code and editing a workflow, and even that can be modeled as:
update = delete + insert
I'd go so far as to call normal code a form of "compiled" workflow - it IS a series of steps that have business meaning, it's just that we've deemed that particular series of steps optimal enough that no change is expected. Until the next time we change our minds, that is.

What if we treated code as workflow?
Imagine edits being made on code exactly as if it were a workflow, where the operators available for editing are not at the character or word level, but at the step level. The developer would decide how to reorder steps to achieve the newly expected functionality, or whether the better approach would be to do away with the entire function (read: superstep in a hierarchical workflow). Imagine the following kinds of operators:

  • Add step
  • Remove step
  • Update step (= remove + add)
  • Promote step (up one level)
  • Demote step (down one level)
  • Coalesce steps
  • Explode step
As might be obvious, what we do today with our editors is the textual equivalent of these. The advantage of this conceptual hair splitting, however, is that we now have a semantic model for changes made on code. With suitable expansion, for example, it could be shown that promote step is the process of creating an abstract class (or interface).
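
A rough sketch of some of these operators, in Python, might look like the following. The `Step` dataclass and the function names are my own hypothetical choices, a function body is modeled as a plain list of steps, and promote/demote are left out because they need a richer model of levels than this one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    children: List["Step"] = field(default_factory=list)  # non-empty for a superstep


def add_step(steps: List[Step], index: int, step: Step) -> List[Step]:
    return steps[:index] + [step] + steps[index:]


def remove_step(steps: List[Step], index: int) -> List[Step]:
    return steps[:index] + steps[index + 1:]


def update_step(steps: List[Step], index: int, step: Step) -> List[Step]:
    # update = delete + insert
    return add_step(remove_step(steps, index), index, step)


def coalesce_steps(steps: List[Step], start: int, end: int, name: str) -> List[Step]:
    # Fold steps[start:end] into a single named superstep.
    return steps[:start] + [Step(name, steps[start:end])] + steps[end:]


def explode_step(steps: List[Step], index: int) -> List[Step]:
    # Replace a superstep with its children, in place.
    return steps[:index] + steps[index].children + steps[index + 1:]


body = [Step("validate"), Step("price"), Step("ship")]
body = add_step(body, 2, Step("charge"))
body = update_step(body, 1, Step("price_with_tax"))
grouped = coalesce_steps(body, 1, 3, "bill")

print([s.name for s in body])     # ['validate', 'price_with_tax', 'charge', 'ship']
print([s.name for s in grouped])  # ['validate', 'bill', 'ship']
```

In this model, explode is the inverse of coalesce: `explode_step(grouped, 1)` gives back `body`.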

Imagine next, an environment where changes to code are recorded as such a series of steps. That series of steps is itself a workflow. This opens up a lot of interesting possibilities:

  • A version control system that records changes to code as these workflow steps
  • A build/deploy system that allows code migrations similar to current forays into automated data migration (like Rails' ActiveRecord migrations). Essentially, deploying a new version of code involves running code that changes the existing version in place, not replacing the old version with an entire new snapshot containing the new version
  • Pattern recognition applied to a set of such code edit workflows; and many similar code analyses that can now be done on the change stream itself, not just the end product.
  • This is obviously the tip of the proverbial iceberg.
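
As a sketch of the first three ideas, assume a version of the code is just a list of step names and a change is a small tuple. The tuple format and the `apply_edit` and `migrate` names are hypothetical; no existing version control or deployment tool is implied.

```python
from collections import Counter
from typing import List, Tuple

Edit = Tuple  # ("add", index, step_name) or ("remove", index)


def apply_edit(steps: List[str], edit: Edit) -> List[str]:
    op = edit[0]
    if op == "add":
        _, index, name = edit
        return steps[:index] + [name] + steps[index:]
    if op == "remove":
        _, index = edit
        return steps[:index] + steps[index + 1:]
    raise ValueError(f"unknown edit: {op}")


def migrate(steps: List[str], change_log: List[Edit]) -> List[str]:
    # "Deploying" a version means replaying the recorded edits on the old
    # version, rather than replacing it with a new snapshot.
    for edit in change_log:
        steps = apply_edit(steps, edit)
    return steps


v1 = ["validate", "price", "ship"]
change_log = [("add", 2, "charge"), ("remove", 1), ("add", 1, "price_with_tax")]
v2 = migrate(v1, change_log)
print(v2)  # ['validate', 'price_with_tax', 'charge', 'ship']

# Analysis on the change stream itself, not on the end product.
op_counts = Counter(edit[0] for edit in change_log)
print(op_counts["add"], op_counts["remove"])  # 2 1
```

The change log is data, so it can be stored, diffed, replayed and mined for patterns like any other workflow definition.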

All's not well in workflow world, though

In almost every workflow-based system/framework I've come across - be it in-house like the ones mentioned above, or commercial ones like webMethods - I've seen some major issues:
  • Polluted data bus: Since the shared data bus is the primary means of communication, authors of individual steps have no trust in its contents as a whole. They do trust their immediate inputs, and will almost always take defensive copies of the input (in whole or substantial subsets of it). It's quite common to find multiple copies of the same data in the data bus, which obviously leads to inefficiencies and slowness.
  • Leaky Abstraction: Implementing a clean workflow is not easy. It requires discipline in using the data bus, and that alone as the communication mechanism. Any out-of-band communication between the steps means the premise of being able to take steps out, or reorder them is lost. Any framework built on a general purpose language will always have to contend with the sneaky programmer who got around the pesky data bus limitation :)
These are the reasons I shy away from asserting that code IS workflow. It's useful to think of code AS workflow, however. The baby in all of this bathwater is: "Can we use the concept of workflow to model changes to code in a useful way?"

I think yes.


The trail to the big ball of mud


  • Check out the first version of an app
  • Run code analysis tools like Structure 101 or Lattix on it, and set up rules for architectural violations
  • Repeat for each version to date
  • Project the results on a time-lapse display tool that shows change in architectural violations over time
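
The last step can be sketched in a few lines of Python, assuming the analysis tool has already given you a violation count per version. The version labels and counts below are made-up placeholders, not measurements from any real app, and the function names are hypothetical.

```python
from typing import List, Tuple

History = List[Tuple[str, int]]  # (version, number of architectural violations)


def violation_deltas(history: History) -> History:
    """Change in violation count for each version relative to the previous one."""
    return [
        (version, count - prev_count)
        for (_, prev_count), (version, count) in zip(history, history[1:])
    ]


def largest_jump(history: History) -> Tuple[str, int]:
    """The version where violations grew the most, a candidate turning point."""
    return max(violation_deltas(history), key=lambda pair: pair[1])


# Placeholder data for illustration only.
history = [("1.0", 0), ("1.1", 2), ("1.2", 3), ("2.0", 15), ("2.1", 17)]

print(violation_deltas(history))  # [('1.1', 2), ('1.2', 1), ('2.0', 12), ('2.1', 2)]
print(largest_jump(history))      # ('2.0', 12)
```
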
This will show you:
  • The inflexion points where the implementation deviated from the original intent
  • Impact of adding new resources
  • Impact of not having policy manifested in the system, or of not having documentation
  • Impact of tribal knowledge
I posit that this will also show you:
  • Why thought leaders that build great systems need not always make great teachers
  • Why tribes/inner circles are a bad idea
  • Why NIH is a bad idea
  • Why publicly available implementations/frameworks are better than proprietary ones in general
  • How a well documented proprietary framework with a clearly manifested policy could be a long way from becoming a Big Ball of Mud Architecture (BBOMA). Although you might not find very many examples of such a framework :)

Monday, May 09, 2011

I would LOVE to be a tool developer, but...

developer faces a problem as part of normal app development
developer fixes the problem
developer faces the same problem again.
developer fixes the problem again.
developer faces the same problem 5 more times
developer builds a tool that automates the fix

time passes

tool gains popularity
developer now is a tool developer and spends all his time on the tool

time passes

tool gains even more popularity
developer is now part of (or owns) a company (or opensource project) whose product is the tool

another developer has a problem in his app domain that should be a feature on the tool
unfortunately original developer will never see this because his domain is now the tool itself.

Big Ball of Mud Architecture is like cancer; tools and policy are like chemo

..meaning they're the best answer we have, and all they can do is inhibit the spread.

Any enterprise has software that has technical debt, and it will keep increasing.

Tools that will help:

  • Visualization of the architecture - logical and deployment; show multiple views.
  • Tools that inhibit deviations from the blessed architecture instead of tribal control
  • Tools that embody promised features in the code
  • Code review via tools
  • Incremental code quality tools
Policy
  • Reduce impedance mismatch between the logical and the physical namespaces. E.g., the Java package namespace is different from the namespace of the deployed jars.
  • Map source control artifacts and deployment artifacts - make it 1-1 as much as possible
  • Make setup and run for newbies as easy as possible. Not ok to have your bootcamp say "this is a difficult environment. Deal with it". Early success is a big confidence booster.