Friday, October 28, 2011

Jack and Turing completeness

From the previous posts on Jack's primitive capabilities it should be pretty obvious that it is Turing complete. It supports sequences and selection naturally, and allows iteration over any node's children.
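To make that claim concrete, here is a minimal sketch in TypeScript of an evaluator over three node shapes - sequence, selection and iteration over children. The node shapes and names are assumptions for illustration only; they are not Jack's actual syntax.

// Hypothetical node shapes; not Jack's real representation.
type JackNode =
  | { kind: "const"; value: number }
  | { kind: "block"; children: JackNode[] }                          // sequence
  | { kind: "if"; cond: JackNode; then: JackNode; else: JackNode }   // selection
  | { kind: "sum"; children: JackNode[] };                           // iteration over children

function evaluate(node: JackNode): number {
  switch (node.kind) {
    case "const":
      return node.value;
    case "block": {                 // run children in order; value of the last one
      const results = node.children.map(evaluate);
      return results[results.length - 1];
    }
    case "if":                      // conditionally execute one of the child nodes
      return evaluate(node.cond) !== 0 ? evaluate(node.then) : evaluate(node.else);
    case "sum":                     // fold over the node's children
      return node.children.map(evaluate).reduce((total, v) => total + v, 0);
  }
}

// evaluate({ kind: "sum", children: [{ kind: "const", value: 1 }, { kind: "const", value: 2 }] }) === 3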

The intent, however, is to NOT use Jack as a primary programming language. I do not see a hello world in Jack's future - at least not in the conventional sense.

Jack's Turing completeness is intended to provide a full complement of computing power so that program logic may be effectively TRANSFORMED. A more appropriate set of "hello world" programs might be:

  • A program that edited another program and represented the editing as a sequence of Jack steps
  • A program that reasoned about another program using Jack
  • A program that expressed the history of a program and exposed its "why" using Jack
  • A program that expressed the future roadmap of a program using Jack

Jack: the whys and wherefores

As I pen down these thoughts about Jack, I realize there's a little bit of "losing the forest for the trees" going on: while the initial posts were what-oriented, the latter ones have been decidedly how-oriented. The "why" is kinda sprinkled around. This post attempts to correct that gap.

Jack is the culmination of a series of vaguely connected concepts that have been swimming below the surface of my consciousness. Some of the individual strands are:

  • Code today suffers from the want of a better abstraction for itself. The simplest manifestation of this is the conflation of its persistence, modeling and editing formats into one single form - text. I posit that treating code directly in its true form - an AST - will reap greater benefits (a small illustration follows this list).
    • There are sub-strands to this where concepts from Subtext (and the whole language workbench movement, for example) find resonance with my thinking and have therefore been subsumed into it.
  • True code structure is latent and the structure made apparent by its organization into files and folders is misleading. I posit that all code will look like graphs (a la social networks) once we represent (and visualize) the actual relations between code components. At that point, graph CRUD techniques can be used to manage the code and graph analysis techniques can be applied on it to reduce the comprehension overhead.
  • Today's applications require connecting together at least 4-5 different languages. There is no conceptual glue to hold these together, splintering comprehension and adding to the accidental complexity. I posit that treating apps as a graph containing code components will allow for easier comprehension and maintainability.
  • Today's development practice is woefully snapshot-based. There is no way to represent the code's past (so that you may learn from it or make corrections easily) nor its future (so that you may plan for it or code in a certain way).
  • The average developer follows Reality Driven Development, no matter what methodology they say they follow. 
  • All of this has a direct bearing on EXISTING code. Most approaches to bettering software development talk about tools and techniques that you can use on your NEXT project. I'd like to come up with tools and techniques that you can use on the CURRENT or PREVIOUS project, or even that ANCIENT one from the '80s - the one that's already in production and making money for the business, will never be replaced (at least not completely), and that you have to either maintain or talk to. This is also the hard and unglamorous side of the development cycle - the maintenance side. Yet it is where code spends most of its time, and where it is mistreated the most.
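As a rough illustration of the first strand above (a sketch only; the node shapes are made up and are not Jack's actual format): the same fragment that lives as text today could be persisted, modeled and edited directly as a tree.

// Textual form:  if (x > 0) { log("positive"); }
// A hypothetical AST form that serves as the persistence, modeling
// and editing format all at once:
const fragment = {
  kind: "if",
  cond: { kind: "call", name: ">", args: [{ kind: "ref", name: "x" }, { kind: "const", value: 0 }] },
  then: {
    kind: "block",
    children: [
      { kind: "call", name: "log", args: [{ kind: "const", value: "positive" }] },
    ],
  },
};

// An edit is then a tree operation rather than a text diff:
fragment.then.children.push({ kind: "call", name: "audit", args: [] });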

Jack is the code name I use for my solution to these issues. It started out addressing the format and representation issue and has burgeoned into covering the rest of the concerns as well. It has therefore gained the larger qualities that any solution to the problems above should have:

  • It should not favor any particular language or framework - at least not in design, and definitely not in the ultimate implementation.
  • It should not introduce another language either; instead, it should use existing language runtimes as host environments and introduce ways to work with the code by adding metadata and the ability to refactor it programmatically.
  • It should introduce a runtime environment that is live in the same way a LISP REPL or Smalltalk IDE is. Compiled environments should get as close as possible to this ideal. This is Fluent.
  • It should allow for models of the code to be built at varying levels of abstraction from the code itself. These models would be used to understand code and build it. It should therefore subsume disparate technologies used to build working apps or make it easy to add new ones into the modeling platform.
  • It should allow incremental adoption of itself. No big bang, lose-your-religion type steps because we're dealing with maintenance code and developers following RDD.
  • By definition, therefore, all tools and techniques within should have models for incompleteness: incomplete code, design, architecture, relations, etc. This should be used to help comprehension of the code in "comfortable bite sizes" and to allow description of future code whose specifics are not yet known.
  • It should have native access to the history of the code so that comprehension can follow. This is the reason the previous post has the scm FFI.

So there you have it: the WHY and WHAT of Jack in one post.

Thursday, October 27, 2011

Jack: some new thoughts, a few iterations

Some strands of thought have come together of late as I re-read my posts about Jack, watched Subtext and talked to people about CodeViz. Things fell into place as I revisited OMeta and decided to give it a shot as a prototyping platform.

Iteration 0
As a precursor to trying OMeta out, I listed the features that Jack was supposed to have, based on my posts. Here's what I came up with:

Transforming this to OMeta syntax gave me the following:

As you can see, some compression has happened from my BNF-ish syntax to the OMeta one.

Iteration 1
Pumped by this quick success, I immediately started counting my unhatched chickens. How would people easily share Jack code, I wondered? I'd already thought of YAML as a storage format, so I quickly wrote up the YAML version of the function from above:


I also came up with a whole slew of pipelines that would bring Jack (and some part of Fluent) to life:


text syntax   --JackTxtParser-------> JackAST
JackAST       --JackUIRenderer------> JackViewTree
JackViewOp    --JackUIModifier------> modifiedJackViewTree
JackViewTree  --JackInterpreter-----> JS Evaluation
JackViewTree  --JackCompiler--------> JS Source
JackViewTree  --JackTxtGenerator----> text syntax output
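Sketching the stages above as function signatures helps pin down what flows where. Everything here is hypothetical - the type and function names merely restate the pipeline; none of them exist yet.

// Placeholder types for the artifacts flowing through the pipeline.
type JackAST = unknown;
type JackViewTree = unknown;
type JackViewOp = unknown;

// The pipeline stages as hypothetical function signatures.
declare function jackTxtParser(source: string): JackAST;
declare function jackUIRenderer(ast: JackAST): JackViewTree;
declare function jackUIModifier(tree: JackViewTree, op: JackViewOp): JackViewTree;
declare function jackInterpreter(tree: JackViewTree): unknown;   // JS evaluation
declare function jackCompiler(tree: JackViewTree): string;       // JS source
declare function jackTxtGenerator(tree: JackViewTree): string;   // text syntax output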


Iteration 2
Then reality set in. What are the primitives I'd need to build for this to take off? A reverse read of all the Jack posts revealed the following set of "things to be built":


  • A primitive conditional
  • The ability to make a module out of anything and vice versa
    • make module (code list)
    • devolve module into codelist(module)
  • A way to denote a function as optimized via a Foreign Function Interface (FFI). This could be in the fact language, but that means the fact language and the compiler should talk. This is probably required anyway; the interpreter should be able to query the analyzer for facts' veracity. Note: this facility is like annotations, but doesn't entail arbitrary side effects the way an annotation processor does by running arbitrary code. Each FFI, however, should expose a way to call the underlying optimization, with ways to map values back and forth.
  • The ability to refer to any single piece of code: built into the structure already. URL TBD.
  • The ability to refer to any set of code pieces: define a continuous code range, or define an arbitrary list of code pieces (is this required?)
  • The ability to comment on such a set of code pieces, i.e. attach a comment to a code set. This is somewhat similar to the modularize requirement above.
  • An FFI to scm tools with the basic functions supported (a sketch of what this might look like follows this list):
    • check in/out
    • commit
    • branch
    • merge
    • snapshot
  • A functional version expression
  • The ability to represent WIP code
  • Most importantly, an interpreter that supports all this
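For the scm FFI in particular, here is a sketch of the kind of surface it might expose. Only the operations come from the list above; the names, parameters and types are assumptions.

// Hypothetical shapes for illustration only.
interface NodeRef { id: string }        // a reference to a piece of Jack code
interface Version { revision: string }

interface ScmFFI {
  checkin(ref: NodeRef, message: string): Version;
  checkout(ref: NodeRef, version: Version): NodeRef;
  commit(ref: NodeRef, message: string): Version;
  branch(ref: NodeRef, name: string): Version;
  merge(left: NodeRef, right: NodeRef): NodeRef;
  snapshot(ref: NodeRef): Version;
}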

Iteration 3
Then I took a step back and looked at the growing complexity. Could some of it be abstracted away? Here's the outcome of some furious (re)thinking (a rough sketch in code follows the list):
  • There are nodes
  • Nodes have attributes
  • Standard attributes are:
    • [id] is used to uniquely identify a node. it can be system-generated or user-provided
    • [comment] is used to provide a comment about the node.
    • [fact] is used to state a fact about the node using FOPL, and is typically used to derive some "higher order knowledge" about the node
    • [name] is used to provide a referenceable alias for the node.
    • [kind] is used to identify the type of node. this may be dynamically assigned to implement duck-typing.
    • [context] is used to "run" or execute the node.
  • Other attributes can be added at will and used in execution
  • A collection of nodes is also a node, and therefore can have the same attributes and be executed similarly.
  • Standard nodes are:
    • base:
      • block : a continuous sequence of nodes to be executed one after the other
      • if : a node that conditionally executes one of its child nodes
      • define : a node that adds a name to a node
      • function : a named block
      • module : a collection of functions
      • app : a collection of modules
    • meta:
      • group : a node that groups other nodes (and optionally names the group)
      • split : a node that splits an existing group
      • insert : a node that inserts a node at a given point in a collection
      • delete : a node that deletes a node from a collection
    • versioning:
      • checkout : a node that checks out a version of its input node ref from scm
      • commit : a node that commits a version of its input node ref to scm
      • branch : a node that branches a version of its input node into a new branch
      • merge : a node that merges two node refs
      • standin : a node that can stand in for any other node, to be used for nodes that don't exist yet
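To keep myself honest about how small this model really is, here is a rough sketch of it in TypeScript. Every shape here is an assumption made for illustration; only the attribute and node names come from the lists above.

// A node is a bag of attributes; a collection of nodes is also a node.
interface JackNode {
  id: string;                         // [id]: system-generated or user-provided
  name?: string;                      // [name]: a referenceable alias
  kind: string;                       // [kind]: may be assigned dynamically (duck typing)
  comment?: string;                   // [comment]
  facts?: string[];                   // [fact]: FOPL statements about the node
  attrs?: Record<string, unknown>;    // other attributes, added at will
  children?: JackNode[];              // a collection of nodes is itself a node
  context?(self: JackNode, run: (n: JackNode) => unknown): unknown;  // [context]: how to "run" the node
}

// Executing a node defers to its [context] attribute.
function run(node: JackNode): unknown {
  return node.context ? node.context(node, run) : undefined;
}

// Example: a block runs its children one after the other and yields the last value.
const block: JackNode = {
  id: "b1",
  kind: "block",
  children: [],
  context: (self, runChild) => self.children?.map(runChild).pop(),
};
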
Thoughts
I reached this far and realized that some larger abstractions are possible:
  • Jack code could be stored in any data format that can handle trees - backward compatible data format!
  • Jack could use any language as host - backward compatible code! Jack is "just another scripting language on top of $my_fav_lang".
So, it sounds like:
  • Jack's true role will be to consolidate sequences to specific hosts, create and break interfaces and manage the change that happens - a sort of super shell.
  • Ultimately Jack should enable code comprehension and legacy support.
Maybe Jack should be renamed Glue.

Tuesday, October 18, 2011

Intent to sms contact

Write an Android intent to allow SMS-ing a contact, because the default phone app doesn't have one.

Update: This has been implemented. Woohoo!

Saturday, October 08, 2011

Physics of Software: a review

I stumbled upon an interesting series of blog posts that attempts to bring in a "physics" for software. The attempt in and of itself seems a worthy cause to me, as IMO Computer Science could do with some foundations like these. There's too much on the engineering side and too little on the science side.

The other reason I liked the series (which I'm still reading) is my own latent ideas about applying FEM-ish analysis to the structure of software.

Halfway through the series, however, it seems to me that the analysis is overly reliant on real-world physical concepts. Software having a center seems fine, as do mass and inertia; but extrapolating that to speed and acceleration seems contrived.

My own take is that software DOES have a physics of its own; concepts from Natural Physics may be applicable as analogies, but not as direct expressions of this physics. My suspicion is that the physics of software would be multi-dimensional (which the author of this series concedes as well), graph-like (which he also alludes to with his attraction-repulsion diagrams), fractal (also stated, but not evolved fully) and much in need of a set of operators that define the effect of change (which I've not found yet in these papers).

Still, a very worthy exercise indeed; and one that introduced me to Coplien's forays into the area from the late 90s. Bonus: it reintroduced me to Christopher Alexander's design pattern concepts.

PS: While the posts don't talk about change, I was led to them via a link from Michael Feathers' attempt to quantify design change by measuring the correlation between methods/classes changed in a single commit.