Thursday, October 27, 2011

Jack: some new thoughts, a few iterations

Several strands of thought have come together of late as I re-read my posts about Jack, watched Subtext and talked to people about CodeViz. Things fell into place as I revisited OMeta and decided to give it a shot as a prototyping platform.

Iteration 0
As a precursor to trying OMeta out I listed out the features that Jack was supposed to have based on my posts. Here's what I came up with:
Transforming this to OMeta syntax, here's what I came up with:
As you can see, some compression has happened from my BNF-ish syntax to the OMeta one.

Iteration 1
Pumped by this quick success, I immediately started counting my unhatched chickens. How would people easily share Jack code, I wondered? I'd already thought of YAML as a storage format, so I quickly wrote up the YAML version of the function from above:


I also came up with a whole slew of pipelines that would bring Jack (and some part of Fluent) to life:


text syntax   --JackTxtParser-------> JackAST
JackAST       --JackUIRenderer------> JackViewTree
JackViewOp    --JackUIModifier------> modifiedJackViewTree
JackViewTree  --JackInterpreter-----> JS Evaluation
JackViewTree  --JackCompiler--------> JS Source
JackViewTree  --JackTxtGenerator----> text syntax output
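
One way to read the table above is as a small graph: each row is an edge from one representation to another, labelled with the stage that does the conversion. The Python sketch below is only an illustration of that reading. The `STAGES` list copies the rows of the table, while the `route` helper and its breadth-first search are assumptions added here, not part of Jack.

```python
from collections import deque

# Each row of the pipeline table as (input form, stage, output form).
STAGES = [
    ("text syntax", "JackTxtParser", "JackAST"),
    ("JackAST", "JackUIRenderer", "JackViewTree"),
    ("JackViewOp", "JackUIModifier", "modifiedJackViewTree"),
    ("JackViewTree", "JackInterpreter", "JS Evaluation"),
    ("JackViewTree", "JackCompiler", "JS Source"),
    ("JackViewTree", "JackTxtGenerator", "text syntax output"),
]


def route(source, target):
    """Return the stage names that lead from source to target, or None.

    Hypothetical helper: a plain breadth-first search over STAGES.
    """
    queue = deque([(source, [])])
    seen = {source}
    while queue:
        form, path = queue.popleft()
        if form == target:
            return path
        for src, stage, dst in STAGES:
            if src == form and dst not in seen:
                seen.add(dst)
                queue.append((dst, path + [stage]))
    return None


# Which stages are needed to get from text to compiled JS?
stages_to_js = route("text syntax", "JS Source")
```

Read this way, going from text to JS source chains the parser, the renderer and the compiler, and the JackViewOp row stands apart because nothing else in the table produces a JackViewOp.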


Iteration 2
Then reality set in. What are the primitives I'd need to build for this to take off? A reverse read of all the Jack posts revealed the following set of "things to be built":


  • A primitive conditional
  • The ability to make a module out of anything and vice versa
    • make module (code list)
    • devolve module into codelist(module)
  • A way to denote a function as optimized via a Foreign Function Interface (FFI). This could be expressed in the fact language, but that means the fact language and the compiler should talk. This is probably required anyway; the interpreter should be able to query the analyzer for the veracity of facts. Note: this facility is like annotations, but doesn't entail the arbitrary side effects that an annotation processor allows by running arbitrary code. Each FFI, however, should expose a way to call the underlying optimization, with ways to map values back and forth.
  • The ability to refer to any single piece of code: built into the structure already. URL TBD
  • The ability to refer to any set of code pieces: define a continuous code range, define an arbitrary list of code pieces (is this required?)
  • The ability to comment on such a set of code pieces: i.e. attach a comment to a code set. This is somewhat similar to the modularize requirement above
  • An FFI to SCM tools with the basic functions supported
    • check in/out
    • commit
    • branch
    • merge
    • snapshot
  • A functional version expression
  • The ability to represent WIP code
  • Most importantly, an interpreter that supports all this

Iteration 3
Then I took a step back and looked at the growing complexity. Could some of it be abstracted away? Here's the outcome of some furious (re)thinking:
  • There are nodes
  • Nodes have attributes
  • Standard attributes are:
    • [id] is used to uniquely identify a node. It can be system-generated or user-provided.
    • [comment] is used to provide a comment about the node.
    • [fact] is used to state a fact about the node in first-order predicate logic (FOPL), and is typically used to derive some "higher order knowledge" about the node.
    • [name] is used to provide a referenceable alias for the node.
    • [kind] is used to identify the type of node. This may be dynamically assigned to implement duck typing.
    • [context] is used to "run" or execute the node.
  • Other attributes can be added at will and used in execution
  • A collection of nodes is also a node, and therefore can have the same attributes and be executed similarly.
  • Standard nodes are:
    • base:
      • block : a continuous sequence of nodes to be executed one after the other
      • if : a node that conditionally executes one of its child nodes
      • define : a node that adds a name to a node
      • function : a named block
      • module : a collection of functions
      • app : a collection of modules
    • meta:
      • group : a node that groups other nodes (and optionally names)
      • split : a node that splits an existing group
      • insert : a node that inserts a node at a given point in a collection
      • delete : a node that deletes a node from a collection
    • versioning:
      • checkout : a node that checks out a version of its input node ref from SCM
      • commit : a node that commits a version of its input node ref to SCM
      • branch : a node that branches a version of its input node into a new branch
      • merge : a node that merges two node refs
      • standin : a node that can stand in for any other node, to be used for nodes that don't exist yet
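
To make the node model above a little more concrete, here is a minimal Python sketch of "nodes with attributes" where [kind] decides how a node is run. It is an illustration under stated assumptions, not Jack's design. The `Node` class, the `run` function and the `env` dictionary are hypothetical, and the `literal` and `ref` kinds are invented here only so the example has leaves to evaluate. Only `block`, `if` and `define` come from the list.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)  # source of system-generated ids


@dataclass
class Node:
    kind: str                                      # [kind], pulled out for convenience
    children: list = field(default_factory=list)   # a collection of nodes is also a node
    attrs: dict = field(default_factory=dict)      # [id], [name], [comment], [fact], ...

    def __post_init__(self):
        # [id] is system-generated unless the user provided one.
        self.attrs.setdefault("id", f"n{next(_ids)}")


def run(node, env):
    """Execute a node. env maps names to nodes (an assumption of this sketch)."""
    if node.kind == "literal":      # hypothetical leaf kind, not in the list above
        return node.attrs["value"]
    if node.kind == "ref":          # hypothetical: run whatever a name refers to
        return run(env[node.attrs["name"]], env)
    if node.kind == "block":        # run children one after the other
        result = None
        for child in node.children:
            result = run(child, env)
        return result
    if node.kind == "if":           # conditionally run one of the child nodes
        cond, then_node, else_node = node.children
        return run(then_node if run(cond, env) else else_node, env)
    if node.kind == "define":       # add a name to a node
        env[node.attrs["name"]] = node.children[0]
        return None
    raise ValueError(f"unknown kind: {node.kind}")


program = Node("block", [
    Node("define", [Node("literal", attrs={"value": 42})], {"name": "answer"}),
    Node("if", [
        Node("literal", attrs={"value": True}),
        Node("ref", attrs={"name": "answer"}),
        Node("literal", attrs={"value": 0}),
    ]),
], {"comment": "a block is a node too, so it carries attributes as well"})

result = run(program, {})
```

The block here carries a [comment] and gets an [id] exactly like its children do, which is the "a collection of nodes is also a node" idea in miniature. The meta and versioning nodes would need a real tree editor and an SCM binding, so they are left out.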
Thoughts
I got this far and realized that some larger abstractions are possible:
  • Jack code could be stored in any data format that can handle trees - backward compatible data format!
  • Jack could use any language as host - backward compatible code! Jack is "just another scripting language on top of $my_fav_lang"
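
As a small illustration of the first point, the Python sketch below writes one and the same node tree to JSON and to XML and reads it back unchanged. The tree shape (a `kind`, optional string attributes and a `children` list) and the helper names `to_xml` and `from_xml` are assumptions made for this example. Jack does not define a storage schema here.

```python
import json
import xml.etree.ElementTree as ET

# A tiny, hypothetical Jack-like tree: a function holding an empty block.
tree = {
    "kind": "function",
    "name": "greet",
    "children": [{"kind": "block", "children": []}],
}


def to_xml(node):
    """Map a node to an XML element: kind becomes the tag, other keys become attributes."""
    elem = ET.Element(node["kind"])
    for key, value in node.items():
        if key not in ("kind", "children"):
            elem.set(key, str(value))
    for child in node.get("children", []):
        elem.append(to_xml(child))
    return elem


def from_xml(elem):
    """Inverse of to_xml. Attribute values come back as strings."""
    node = {"kind": elem.tag, **elem.attrib}
    node["children"] = [from_xml(child) for child in elem]
    return node


as_json = json.dumps(tree)
as_xml = ET.tostring(to_xml(tree), encoding="unicode")

# Both formats carry the same tree.
same_via_json = json.loads(as_json) == tree
same_via_xml = from_xml(ET.fromstring(as_xml)) == tree
```

The XML round trip is only lossless because every attribute value in this tree is already a string. A real format mapping would need a rule for numbers, booleans and nested values.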
So, it sounds like:
  • Jack's true role will be to consolidate sequences to specific hosts, create and break interfaces and manage the change that happens - a sort of super shell.
  • Ultimately Jack should enable code comprehension and legacy support.
Maybe Jack should be renamed Glue.
