Thought Train to Nowhere: fluent

Friday, March 01, 2013

Groovy DSL Voodoo... or why I think imperative programming makes more evolutionary sense

I recently had to consider the Groovy space for a project proposal, as the client had decided on Groovy/Grails as the platform of choice. While evaluating the platform, I naturally gravitated towards Groovy's DSL capabilities and, obviously, started thinking about using them for my personal projects. Abu seems like a natural fit (why didn't I think of Groovy before, I wonder? I'd hit a wall with JRuby due to generics anyway), as does Jack.

Anyhoo, with all this background, I hit SlideShare and found tons of decks from the language leads on the specific topic of DSLs. One example in particular piqued my interest: slide 106 of Paul King's deck, which I've replicated here as a gist:

I wanted to both understand how this worked and to replicate it by myself. Here's all the flailing I went about in trying to achieve those two goals:

As you can see, I failed miserably. My last trial had a near-working version, but it left me completely disillusioned about Groovy's DSL capabilities. I came out of the process feeling that all the work was done by the single 'of' closure, with the rest playing dummies to the Groovy language's optional syntax rules. After all, in my English-y version you could switch the order of the closures passed into 'of' and it would still work fine.

However, all of this didn't explain the "extra code" in the original, which left me with the nagging feeling that no language author would be trying to pull this much wool over people's eyes. So I went back and expanded the original expressions like so:

Now it made much better sense.

Mind you, I still haven't figured out how you go from a problem statement like "I want to express the ability to compute square roots in an English-y fashion" to the solution presented above. This is as far as I can get:

I'll obviously need a way to call Math.sqrt and then print the result. This needs to be a function with an English-y name.
An English-y description could be something like "Show the square root of 100". Using Groovy's existing rules, that makes "of" the candidate for the function name.
How do we now make the rest of the words work? Well, "show" and "square root" are exactly the two actions to be taken. So as long as I can define those as functions and compose them, I'm good.
How do I pass in the functions without naming them?

Obviously, it's all the baggage of years of imperative programming knowledge holding me back. Somebody with more skill in functional programming might find this intuitive enough to roll out.

But I have to wonder: there's too much magic going on here. The imperative approach to the stated requirement would be to define a grammar, write a parser and allow actual interpretation of such a language. Painstaking and error-prone, sure; but explicit and clearly out of the programmer's head and into a language that is so dumb that there's no doubt as to what's happening.
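For contrast, here is a minimal sketch of that imperative route, written in Python rather than Groovy. The grammar, the `interpret` function and the `FUNCTIONS` table are all assumptions made up for illustration; nothing here comes from Paul King's deck.

```python
import math

# Hypothetical grammar for the toy language (an assumption for illustration):
#   sentence := ["please"] ["show"] ["the"] FUNCTION "of" NUMBER
FUNCTIONS = {
    "square_root": math.sqrt,
    "cube_root": lambda x: x ** (1.0 / 3.0),
}


def interpret(sentence):
    """Parse one sentence of the toy grammar and return the computed value."""
    tokens = sentence.lower().split()
    pos = 0
    # Optional filler words, accepted only in this fixed order.
    for filler in ("please", "show", "the"):
        if pos < len(tokens) and tokens[pos] == filler:
            pos += 1
    if len(tokens) - pos != 3:
        raise SyntaxError("expected: FUNCTION of NUMBER, got %r" % tokens[pos:])
    name, keyword, number = tokens[pos:]
    if name not in FUNCTIONS:
        raise SyntaxError("unknown function %r" % name)
    if keyword != "of":
        raise SyntaxError("expected 'of', got %r" % keyword)
    return FUNCTIONS[name](float(number))


if __name__ == "__main__":
    for text in ("please show the square_root of 100",
                 "show the square_root of 100",
                 "square_root of 100"):
        print(text, "->", interpret(text))
```

In this version the optional words are an explicit decision written into the grammar rather than a side effect of the host language's syntax rules, at the cost of writing the parser by hand.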

In other words, this sounds like I'm propounding Worse Is Better, but there's something more: I realized that the cognitive load required to understand the Groovy version is higher. More importantly, the cognitive load required to retain that understanding over time is higher still - for "normal" developers who have not walked the functional way for very long. That's the majority in today's world, at least.

More insidious, however, is the implied promise that this is real; that the Groovy DSL is actually English. Could I, for example, expect the slightly less polite "show the square_root of 100" to work? Or the even curter "square_root of 100"? As an English speaker, why would I not?

As a programmer, I see why 'please' and 'the' are the required glue to make the English-y sentence work within Groovy's pretend-English world. It's extensible in that you could replace square_root with cube_root, for example; but not so that you could change the grammar of the sentence itself. That would require a different set of closures, like the ones you'd find in slide 105 of the same deck, for example. Note that in that version it's 'the()' that is the active closure. But I fail to see why this should prevent me from expressing myself naturally as an English speaker.

This then, IMO, is the larger problem with DSLs. When you make something like its real-world counterpart, the human mind immediately taps into the huge body of latent real-world knowledge and tries to use it with that thing: although this DSL doesn't promise it, I immediately wanted to use English structure on it. It's all fine that your programming language has risen to talk your user's language, but has it captured enough of the user's world to go beyond "training wheel use"? To paraphrase Carl Sagan: To make an apple pie DSL, you must make a universe that's up to scratch.

How will the DSL author handle this? In general, not very well, I think. I realize I'm picking on a toy example here, but on extrapolation I cannot imagine a domain where all possible concepts and all of their possible combinations are easily enumerated; and then retain their meaning over time and across minds.

I begin to wonder then, at attempts like the VPRI's FONC, where one of the overarching ideas seems to be to create a multitude of DSLs to produce an OS in under ~20 KLOC; and at feeble attempts like my vaporware designs of Jack and Fluent: would they build a better world or is the imperative code bloat of Windows and Linux the more sustainable way?

Wednesday, September 19, 2012

Idea: Use a map framework to depict code

Today's XKCD comic and its interpretation as a zoomable view using Leaflet had me thinking of the possibilities this presents:

Software cartography already demonstrated how code could be converted into a map. It even has the interesting property that it attempts to map the mental model of the code instead of its specific implementation - which IMO is way better than something like Code City simply because the city (or country) looks the same even if a few buildings disappear - if you know what I mean.

The only missing piece is scale - how do you scale this up to larger and larger codebases? Well, using a map engine is one way, IMO.

The problems of scale have already been solved there, as has that of display form factor: most of the map frameworks are already mobile-ready. The UI metaphors are familiar to most people too.

The only possible thing that detracts from my grandiose view of an n-dimensional version of CodeBubbles to depict the true complexity of code is that map engines are decidedly two-dimensional. But even that is a weak argument - layers provide sufficient degrees of freedom to annotate the display appropriately.

Sunday, June 10, 2012

What if there IS no source code?

...yeah, kinda linkbait-y heading, I know.

What I mean is: What if there is no single source of the code?

Let me explain.

Typically we have source code. It's written by someone, stuck in a source control system somewhere, changed by others and so on.

The projectional editing school of thought modifies that picture somewhat by suggesting that we could have different views of the same code - a functional/domain-specific one, a folded one, a UML(ish) one, a running trace one and so forth. The relationship between the source and the multiple views, however, remains decidedly one-to-many.

What if instead of this master-slave relation, the different views themselves were the source? That is, the "whole picture" is distributed across the views - like a peer network?

Assuming the views are consistent with each other, modifying the source in one view should retain the true intent. But is that even possible? Views are, by definition, projections of the code, meaning some parts are included in the view and some aren't. So how would we maintain consistency across multiple views?

Two paths lead from here:
One: We cannot. This is why we need the "one true source" that's the parent of all views.
Two: Maybe we don't need consistency all the time. Maybe we can do with the "eventual consistency" that the big data/NoSQL guys are raving about?

#2 seems like an interesting rabbit hole to explore :)

Wednesday, April 18, 2012

Fluent in the real world: Light Table

Today's HN story on Light Table only tells me that I won't have to build Fluent myself - someone else will, soon enough!

In the discussion that ensued there was, however, this interesting comment:

Yes, but our code was entirely in these utterly unusable changeset files that couldn't work nicely with the version control that everyone else in the entire world was using; his version still uses files under the hood. There's a team that's trying to back Monticello with Git, I believe by saving each method into its own file in part of a Git source tree; that looks promising as a compromise.

That's a good lesson to learn from, methinks. Just as "text as code and not just its serialization format" has stood the test of time, so has version control using textual diffs. Anything else will face the migration gap.

I can't be bothered to find the link, but a while ago there was a post by somebody big in the blogosphere (Spolsky or Cringely) that spoke about a bill pay start-up that solved the migration problem by allowing users to forward their paper bills directly to them to have the bills digitized and paid. The idea was that once the bills were digitized they could be paid electronically from then on, but it was the first-time setup that was the hump that users didn't want to go over. Take that pain away, and you get yourself converts in droves.

TripIt does something similar with trip management by parsing booking confirmation emails from travel websites.

I'm not so sure such an approach will apply to developers and their tools for a couple of reasons:

Text is actually not a format that you want to get away from (unless you're in the structured camp).
Providing the gateway solution involves fixing all the tools to work with the non-text format you come up with. And you must fix ALL of them - the IDE, the debugger, the build tool, you name it.
Alternatively, you could change the environment to something where text input is actually less efficient. Like the Tablet. Then you have a chance.

It's a long, tough road ahead :)

Wednesday, March 14, 2012

Early thoughts on Jack

I found this writeup from 2009 in one of my umpteen storage media and thought it was evocative enough to publish. My later ramblings on the topic of Jack have been attempts to concretize the ideas rather than gaze starry-eyed at the wonder that is that ephemeral topic of a versionable language. This one is more of the latter kind than the others. There are 2 parts to the writeup: an exposition on top and notes at the bottom. The notes are actually the outline of the exposition, which is essentially unfinished. Except for some cosmetic formatting changes, I've reproduced the piece as is:

Think of an aspect. An aspect usually takes a chunk of code and, in the most general case (called the around advice), wraps an envelope around that chunk. This envelope acts as a guardian to any entry into the chunk and is not only capable of altering the expected behavior of the chunk by altering its inputs, but is also capable of deciding if the chunk should execute at all.

Think of a monad. It effectively does the same, except it does it to separate non-functional bits from functional ones.

Think of an ESB. It too does the same, except that it does it at a much higher level of granularity - at that of a service. Taking the generally understood definition that services are suitably large components that house some specific business functionality, an ESB orchestrates the sequence of operations between these components to achieve the effect of an application.

Now think of how we modify code. We do the same thing - alter the expected behavior, decide if the code should execute at all, (re)orchestrate the sequence of operations to achieve the effect we're expecting.

We are the aspect, the monad, the ESB.

However there are major differences between us and these constructs that act to the disadvantage of humans while maintaining code:

Each of these constructs has the feature that it describes the change it effects on the component being changed. The human's changes are available only as diffs in an external tool - the version control system.

The human's change is at a textual, character level; not at a statement/method/package/module/unit of execution level. So the change is not perceived as a change of language elements, but as numerous characters being shuffled around.

While this might seem like a tirade against text-based programming and possibly a case for structural editors, there's more to it than that. The key problem is that the unit of execution is not identifiable. Statements in modern languages are anonymous, except by the line number in the source file. If they were identifiable, we could express something like "I had to move the 5th if statement to after the assignment of var foo" instead of "cut lines 150-234, paste at line 450". The former is what we usually say when we talk about it, but there's no direct way of enshrining that in a machine-readable way.

What's the use of such a feature, you ask? Well, imagine a language that allowed us to identify the statements, and then express addition, deletion and modification of statements within itself. Something like:

insertStmt module3.class4.method1.if#5, AFTER, module3.class4.method1.letvarfoo

The fact that the statements of code are addressable allows us to refer to them in a logical manner, and the fact that the operations carried out to cause the code to change are operators allows us to maintain the change itself as code. Similar to insertStmt there'd be addStmt, deleteStmt and modifyStmt operators; and obviously we can extend the concept of a function to these operators too, so that the complete conversion from one version to another is expressed as a single operation - a changeset in code, if you will. Producing the next version of your app is no longer a snapshot activity - it can become incremental. Further, multiple changesets can be "compiled" into a fix pack of changes to produce any version at will. And all changes are expressed as logical changes, not textual deltas.
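A rough sketch of what "the change itself as code" could look like, in Python. The statement ids, the list-of-pairs representation and the operator names (`insert_stmt`, `delete_stmt`, `move_stmt`, `apply_changeset`) are all hypothetical stand-ins; the writeup itself only names the operators, it does not define them.

```python
# Hypothetical model: a method body is an ordered list of (statement_id, text)
# pairs. The ids and operator names below are illustrative assumptions.

def insert_stmt(body, stmt, after):
    """Return a new body with stmt placed right after the statement id `after`."""
    ids = [sid for sid, _ in body]
    index = ids.index(after)  # raises ValueError if the anchor does not exist
    return body[:index + 1] + [stmt] + body[index + 1:]


def delete_stmt(body, stmt_id):
    """Return a new body without the statement identified by stmt_id."""
    if stmt_id not in [sid for sid, _ in body]:
        raise KeyError(stmt_id)
    return [(sid, text) for sid, text in body if sid != stmt_id]


def move_stmt(body, stmt_id, after):
    """A move is a delete followed by an insert, built from the operators above."""
    stmt = next(s for s in body if s[0] == stmt_id)
    return insert_stmt(delete_stmt(body, stmt_id), stmt, after)


def apply_changeset(body, changeset):
    """A changeset is a list of (operator, args) pairs applied in order."""
    for operator, args in changeset:
        body = operator(body, *args)
    return body


version_1 = [
    ("if#5", "if foo is None: return"),
    ("let_foo", "foo = load()"),
    ("call#1", "use(foo)"),
]

# "Move the 5th if statement to after the assignment of var foo", kept as data.
changeset = [(move_stmt, ("if#5", "let_foo"))]
version_2 = apply_changeset(version_1, changeset)
```

Because a changeset is just a list, several of them can be concatenated and replayed against an old version, which is one way to read the "fix pack" idea.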

More importantly, think of what the language (or its units of execution - the statements) would have to support for this to work. They would have to become self-contained modules. Self-contained micro services, if you will, which can then be "orchestrated" by moving them to the location in the code which will cause whichever version we desire to be effected via these transformation functions. Code therefore becomes easier to change.

Now let's take it to the next level, and define these operators at all levels of abstraction/modularization that the language supports. So we'd have addMethod, addMethodArg, addModule, etc.

Notes:

Identity
modularity
esb-style orchestration
not just at run time, but also statically, we can express the change occurring. which makes it amenable to machine learning.
automatic modularization/aggregation is the key to useful versioning.
forget versioning. what i'm really trying to do is to discover the steps in the computation being carried out that can be abstracted out such that an esb can act on it.
deriving the higher order sequence from base description
there are only 3 basic constructs in programming - the assignment, the goto (including the implied goto by the instruction pointer), and the conditional, i.e. if. if is usually followed by the goto, and all instructions of the JZ, JNZ variety are combinations of the if/goto. so the real inflection point for the sequence of operations is the if. we can consider any block of code before an if as a single block with appropriate inputs, outputs and context, and similarly any block of code after an if. each if clause represents a micro service. if therefore is the micro esb.
so partitioning of code can be done based on ifs. now if we take all the ifs at the same peer level - within a method, class or package/module (or even app) - and find the same conditions being checked, those paths can be collapsed, or refactored - this is similar to the ideas in subtext. find the right partitioning of code so that it can be expressed easily.
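The last two notes can be tried out on Python source with the standard `ast` module. This is only a sketch under assumptions: the `SOURCE` function is invented, only top-level ifs in one function body are considered, and `partition_at_ifs` and `conditions` are hypothetical helper names.

```python
import ast

# Invented example function; it is only parsed, never executed.
SOURCE = '''
def handle(order):
    total = order.price * order.qty
    if total > 100:
        total = total * 0.9
    receipt = make_receipt(total)
    if order.express:
        ship_fast(receipt)
    log(receipt)
'''


def partition_at_ifs(function_source):
    """Split a function body into blocks, cutting at every top-level if."""
    func = ast.parse(function_source).body[0]
    blocks, current = [], []
    for stmt in func.body:
        if isinstance(stmt, ast.If):
            if current:
                blocks.append(("block", current))
            blocks.append(("if", [stmt]))
            current = []
        else:
            current.append(stmt)
    if current:
        blocks.append(("block", current))
    return blocks


def conditions(function_source):
    """List the source text of each top-level if condition, in order."""
    return [ast.get_source_segment(function_source, stmts[0].test)
            for kind, stmts in partition_at_ifs(function_source)
            if kind == "if"]


if __name__ == "__main__":
    print([kind for kind, _ in partition_at_ifs(SOURCE)])
    print(conditions(SOURCE))
```

Comparing the `conditions` lists of peer methods would be the starting point for spotting the repeated checks that could be collapsed.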

Sunday, December 11, 2011

Information Density and textual vs visual

I was reading the Wikipedia page on Information Theory as part of my read the wiki project, when it struck me that there's possibly an objective way of measuring the effectiveness of text vs visual programming languages using Information Theory.

The central concepts (whose math I'm admittedly unable to fathom) are those of information, information rate, entropy and SNR. One of the age-old cases for text-based programming (and therefore against non-textual programming languages) has been that it has a very high SNR, that is, very little noise, and that the "information density" is high for given screen real estate.

Is that really true, though? How much "noise" does syntax add? On the other side of the spectrum, I've seen infographics that assuredly deliver more "understanding" in the given screen space than the equivalent textual description. Is it possible to design an "infographic-style" programming language that packs more power per square inch than ASCII?
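As a first, very crude handle on the textual side of that question, Shannon entropy per character, H = -sum(p * log2(p)), can be computed in a few lines of Python. Treating single characters as the symbols is an assumption made for simplicity; it says nothing about how much of the text is syntax "noise", and `entropy_bits_per_symbol` is a hypothetical helper name.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text):
    """Shannon entropy H = -sum(p * log2(p)) over the character frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    for sample in ("aaaa", "abab", "abcd", "if (x) { y(); }"):
        print(repr(sample), entropy_bits_per_symbol(sample))
```

A fairer comparison would probably count tokens rather than characters, and would need some equivalent measure for the visual notation, which is the hard part.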

It would be interesting to do some analysis on this area.

Tuesday, February 08, 2011

Jack/Webster/Fluent: Use YAML as the text format

YAML seems like a nice fit for a text-based format for Jack or Webster/Fluent. The things that attracted me to YAML are:

Strings don't need to be quoted unless absolutely required. This is a huge advance over JSON.
YAML has references.

It still might not be the best way to represent code as data, but it's close.

Sunday, December 05, 2010

OT + Grammar = New way of defining languages?

Operational transformation is a way of defining documents via the series of steps required to create the document - among many other things.
Language grammars are usually defined by rules on how to recognize a document once it's created.

What if the latter was done in the former's style?

Need to think this through, but if the rule set is not prohibitively large, this would be a nice way to achieve incremental parsing, not to mention the ability to have "islands of known information" mixed with WIP unknown fragments, methinks.
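One way to make this concrete is a toy in Python where every grammar production is an operation that fills in a hole in the document. Everything here is an assumption for illustration: the `PRODUCTIONS` table, the `<name>` hole notation and the `apply_op`/`replay` helpers. It also covers only the "document as a log of steps" half of operational transformation; there is no transformation of concurrent operations.

```python
# Hypothetical toy grammar: each production rewrites one hole into a template
# that may itself contain further holes, written as <name>.
PRODUCTIONS = {
    "assign": ("<stmt>", "<name> = <expr>"),
    "add": ("<expr>", "<expr> + <expr>"),
    "literal_1": ("<expr>", "1"),
    "name_x": ("<name>", "x"),
    "var_x": ("<expr>", "x"),
}


def apply_op(document, production):
    """Apply one operation: rewrite the leftmost matching hole, or fail."""
    hole, template = PRODUCTIONS[production]
    if hole not in document:
        raise ValueError("%r is not valid here: no %s hole" % (production, hole))
    return document.replace(hole, template, 1)


def replay(ops, start="<stmt>"):
    """Rebuild a document by replaying its operation log from the start symbol."""
    document = start
    for op in ops:
        document = apply_op(document, op)
    return document


if __name__ == "__main__":
    log = ["assign", "name_x", "add", "literal_1"]
    print(replay(log))              # still contains one unfilled hole
    print(replay(log + ["var_x"]))  # fully expanded
```

A document built this way is valid by construction at every step, and any remaining hole marks exactly where the unknown, work-in-progress fragment sits.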

Tuesday, June 30, 2009

FEM - like analysis on code

This idea is partly from a section I remember from "The Inmates Are Running the Asylum". Alan Cooper compares code to a huge stack of cards, one on top of the other, with just a single flick threatening to bring the whole thing crashing down.

How about treating code as a structure, and using finite element method-like analysis on it? Of course, the math would be discrete, not continuous. But the possibilities are interesting - we already create software in layers, with upper ones depending on the lower ones. Surely some definitions of structural stability can be derived from those? Similarly, the dynamic nature of programs could possibly be modeled as forces...
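As a very rough discrete starting point, and nothing like real finite element analysis, one could count how much of the system rests on each module. The sketch below is in Python; the `DEPENDS_ON` graph is invented, and "load" as the number of transitive dependents is just one assumed definition among many possible ones.

```python
# Hypothetical layered system: each module maps to the modules it rests on.
DEPENDS_ON = {
    "ui": ["services"],
    "reports": ["services", "db"],
    "services": ["db", "utils"],
    "db": ["utils"],
    "utils": [],
}


def supports(module, depends_on):
    """Return the set of modules that would be affected if `module` gave way."""
    affected = set()
    frontier = [module]
    while frontier:
        current = frontier.pop()
        for other, deps in depends_on.items():
            if current in deps and other not in affected:
                affected.add(other)
                frontier.append(other)
    return affected


def load(depends_on):
    """Assumed 'load' measure: how many modules rest on each module."""
    return {m: len(supports(m, depends_on)) for m in depends_on}


if __name__ == "__main__":
    for module, weight in sorted(load(DEPENDS_ON).items(), key=lambda kv: -kv[1]):
        print(module, weight)
```

Modules with a high count are the cards at the bottom of the stack; modeling the "forces" would need run-time data such as call frequencies, which this sketch ignores.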

Food for thought... but that's as far as I've got with this idea.

Saturday, May 30, 2009

A Web-based MPS Editor

I have been playing around with JetBrains' Meta Programming System (MPS) for a couple of days now, and I find it a very interesting and promising tool. To teach myself its constructs, I've started writing an IDE for Robot scripts. I've got a basic editor and generator done and it looks pretty good, especially for completely declarative code.

I was wondering, however, at the applicability of the concept to the web - as in how easy it would be to port the concept of a declarative structural editor to the web. The MPS editor language is surprisingly similar to divs and spans - all one would need to move the editor to the web would be to slap on a generator that creates D/HTML/5 widget wizardry!

Another project to start....

Implementation thoughts:
This might not need a full-fledged editor like EditArea - since it's structural, it might actually be cumbersome if it were. It might be easier to have a simple div/span structure or even a table with variable row and col spans, with a single in-place editor component for the current element being edited. Of course, keypress handlers would have to be written to make it usable for keyboarders (like me), and we'd have to find a way to always show a cursor.

Interesting ideas on continuation-style parsers for JavaScript editors: http://marijn.haverbeke.nl/codemirror/manual.html

Further implementation idea: use the MPS Editor language to define it, and create a generator for the editor from it that outputs GWT! This way the editor is web-enabled!