
Saturday, September 15, 2012

Why new bottles with the same old wine work... every time!

I attended the second BangaloreJS meetup today. The sessions were good and so was the food, and I met some really interesting people; so all in all, a fun time was had.

Before the sessions started, however, Jon - the meetup organizer - showed a quick demo of Yeoman: a new tool released by Paul Irish and some of his friends at Google that eases building client-side JS projects.

I had a senior moment when I finally realized what it was.

This was Rake in client-side JS clothing, which was Ant in Ruby clothing, which was make in Java clothing, which was... I don't know what came before make because that was waaay before I was born, but I know there was something else that all the cool kids left for make.

Yes, I realize Yeoman does way more than a simple build tool, much like Rake does way more than just build code. And it did it in a way that seemed magical when it first came out, too. You could do all of that with Ant tasks/plugins, but it was not the same thing. That's beside the point, though: at its core, it is a build tool.

Why do we as developers keep.reinventing.these.common.tools?

Answer: For the same reason tool vendors have different versions of the same tool by language.

See, I happened to be browsing JetBrains' website the other day and saw that they claim their non-Java IDEs share the "same features as other JetBrains IDEs", or something to that effect. It struck me that they surely have a generic base version of the IDE that is skinned and customized for each "vertical", i.e., language-specific IDE.

Why then do they have one tool per language?

Because there are very few polyglot programmers in the world. The majority of us stick to one language, maybe a platform with a few related ones. That's IT. So we want something that "speaks to us, knows our problems", not something that can boil the PL ocean. And why would we need all those generic, all-language features anyway, right?

So the vendors sell us the shiny new bottles. And we drink it up.

And that's the way it should be. Why should client-side JS devs (who are in an unprecedented hype cycle after a long winter) be begrudged a "robust and opinionated client-side stack, comprised of tools and frameworks that can help developers quickly build beautiful web applications"?

Note to self: When you finally finish creating Fluent, remember to brand it by language. FluentC, FluentRuby, FluentJava all sound very usable :)

Wednesday, June 27, 2012

Tools are the bane of the Beginner

"Huh? Isn't that counter-intuitive?", you ask?
I realizing something particular as a result of some experiences and wrote that line down, but to me even it sounds counter-intuitive; so let me hasten to explain.

A beginner, for the purposes of this discussion, is somebody who's beginning something. This could be a novice starting to learn a skill; but it could also be an expert who's beginning a new project within his area of expertise.

A beginner (thus defined) is hindered by the presence of tools and frameworks for two reasons:
  • They obscure the what and the how of the problem at hand by hiding it within themselves (usually for novice beginners).
  • They prevent easy exploration of the why of the problem and its solution space, through ceremony and by preventing access (usually for experienced beginners).
Allow me to present the experiences that led to this realization. It all began one mundane work day in the not-too-distant past....


Story#1

My team had just got a bunch of freshers. They were recent engineering graduates (presumably with some exposure to programming) who had passed through the company's training program (which again presumably imparts further such exposure). We, however, found that they couldn't do some simple tasks, like write code outside Eclipse. They didn't know how to deploy a web application except through the Eclipse plugin, had never debugged an application via logs, and in fact didn't know about webapps as an idea independent from "Tomcat". Their OO concepts were shaky at best, but they had implemented small-but-complete web applications using Tomcat, Struts and Hibernate, and passed an EJB exam. When asked to build the same study app from scratch using the command line, however, they were lost. When asked to build a different application (than the one they'd done) *using Eclipse*, they were similarly lost.

While a large portion of the blame should rightly lie with the teaching methodology (or lack of one), the tools and frameworks too, IMO, should bear some of it. "Deploying" for them meant clicking on the Tomcat icon in Eclipse, so they had no use for knowing what "web.xml" did, nor did they know that it is no longer even required. The same wizards and menu options that make the life of a practitioner easy actually obscure the underlying process (and why it's required) from a beginner.
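For instance, since Servlet 3.0 an annotation in the code can replace the web.xml entry entirely - exactly the kind of detail the wizards hide. A minimal sketch (the class name and URL pattern are made up for illustration):

    // A minimal servlet, assuming a Servlet 3.0+ container (e.g., Tomcat 7+).
    // The @WebServlet annotation replaces the <servlet>/<servlet-mapping>
    // entries that older containers required in web.xml.
    import java.io.IOException;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet("/hello")
    public class HelloServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            resp.getWriter().println("Hello from a servlet with no web.xml");
        }
    }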


Story#2

The same group of freshers was slowly getting on track with (re)learning the basics of programming when I thought it might be a good idea to instill in them, at this "early age", the values of Test Driven Development. I immediately checked myself, however, because they'd have to learn JUnit and how to use it. On second thought, however, I realized that they didn't HAVE to use JUnit or any such framework to do TDD. All they had to do was write a test before writing the actual code, have it fail, write the code and have it pass the test. The test could be code in the same function, or in main(), or a series of calls to the program stored as a shell script, or JUnit tests. All of these are equally valid as "tests". We generally, however, recognize only the last of them as tests. The concept of TDD has been usurped by the concepts of the tools that implement it - to the extent that TDD doesn't seem to have a life outside of those tools.
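To make that concrete: here's a minimal sketch of what a framework-free test could look like in Java - the class, the check() helper and the FizzBuzz example are all mine, invented for illustration:

    // TDD with no framework: the "test" is just a main() that calls the
    // unit and complains loudly on failure. Run it before the implementation
    // exists (red), implement, run again (green), repeat.
    public class FizzBuzzTest {
        public static void main(String[] args) {
            check("1", FizzBuzz.say(1));
            check("Fizz", FizzBuzz.say(3));
            check("Buzz", FizzBuzz.say(5));
            check("FizzBuzz", FizzBuzz.say(15));
            System.out.println("All tests passed");
        }

        static void check(String expected, String actual) {
            if (!expected.equals(actual)) {
                throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
            }
        }
    }

    class FizzBuzz {
        static String say(int n) {
            if (n % 15 == 0) return "FizzBuzz";
            if (n % 3 == 0) return "Fizz";
            if (n % 5 == 0) return "Buzz";
            return String.valueOf(n);
        }
    }

No JUnit anywhere, but it's red-green-repeat all the same.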

Tools, therefore, seem to be actively scuttling the consumption and adoption of concepts, even though they were created explicitly for the purpose of automating the repeated application of known concepts.

I personally have been struggling with this - the guilt of not doing TDD vs the allure of just seeing working code - especially when I'm beginning something and still feeling the problem and solution spaces out. In some cases, the solution space doesn't have readily available tools (BDD on the browser, anyone?) and in others there are tools, but I'm still not ready to commit to them because I don't know what my solution is yet (should I build the parser first to figure out the syntax, or the AST interpreter to see how it would run?). My liberation came when I declared that a test will be whatever I call a test for the situation at hand, not what some framework determines to be one. Since then, the test-red-code-green-repeat cycle is a much more doable one.

Full Circle

Back to the initial "Huh?" moment from the beginning: why then do we generally consider tools to be useful - especially for beginners? Tools are generally time-savers. They do one thing and they do it well; and that is their value. They do, however, have an "operating range" in which they're most useful. Below that range they're overkill, and above it they're obstructive - as depicted in this highly accurate graph of problem size vs tool effectiveness:

[Graph: tool effectiveness vs problem size - low below the operating range (overkill), high within it, dropping off above it (obstructive)]
So when we usually talk about tools being useful, we're talking about the useful operating range. Specifically for beginners, tools are solution accelerators at that range. The stories presented here represent the two ends of the spectrum, however, where tools are sub-optimal.

Note: I've glossed over frameworks in this discussion, but the concept is the same - or applies even more so - for frameworks. Frameworks by definition are a standard solution to a common problem, with room for customization so that application specifics can still be implemented. A framework is one because it has a known world view and exposes an interface that allows operations on that world view. The concept of an operating range is therefore well-ingrained, as are the limits on either side. So please read "tools/frameworks" wherever you see "tools" in this article.

So...

Armed with this framework for evaluating tools, we can start asking some interesting questions.
  1. What is a good tool?
  2. When are tools not required?
  3. When are tools required?
  4. How do we determine the operating range of a tool, then? 
  5. What can we do to use tools more effectively?
  6. What can tool builders do to make effective tools?
Attempts at answers to these questions in part 2 of this article.

Monday, August 22, 2011

On Software Testing and the case for the Common Test Scenario Database


TL;DR


As a participant in (and sometimes contributor of ideas to) Test Automation efforts (strategy, tool selection and implementation), I have long held the opinion that the true gains from automation can be achieved only when Test Scenarios (enshrined as exemplar test data) are a heavily reused resource across the organization. This page is an attempt to "arrive" at that opinion by progressively optimizing the software testing process.

I take the example of testing the hotel search feature and progressively optimize it through 4 stages to illustrate the value of each such optimization; and the value of shared test scenarios in particular.

What is Software testing?


Software Testing is the art and science of validating that a particular piece of software behaves as expected. Since most words in the previous sentence are loaded with meaning, I'll dissect it:
  • Art and science: Art because some of it is based on the skill of the tester, and science because most bits are tractable into well-defined, repeatable procedures with predictable outcomes.
  • Validating: This might involve the atomic step of input-process-output, or a sequence of many such atomic steps (typically known as "following a script"). Each such step entails comparing an observed value (or, more generally, outcome) with an expected one.
  • Expected: This implies that there are some requirements that the software is supposed to meet, which are an accumulation of the most important (or all) of the comparisons mentioned above.

A Concrete Example - Testing Hotel Search functionality

Any travel website has a hotel search functionality. You enter a destination, start and end dates and number of guests; and the website returns a list of hotels. Each hotel card has some details (name, location, address, picture, reviews, rating, description etc); and the list itself has some details (count, sort order, #pages, etc).

Notice also that there are some implicit inputs (eg: choice of POS, choice of currency) and some implicit outputs (eg: sort order).

To truly test this feature, you need to validate that:
  • The functionality is right:
    • The list is as expected (has the expected number of hotels which are in the expected order, paginated at the right place, etc)
    • Each hotel card is as expected (Has the name as expected, image shows up, description is correct, etc)
  • The display is right:
    • Visual characteristics for each data item is correct
    • No visual elements overlap
    • And so on
  • Adding the feature didn't break anything else:
    • tests on features that already exist pass
    • tests on features that are in-flight at the same time and targeted to the same release pass
  • The performance is within acceptable limits

Focusing on functional testing alone for a moment, we see that:
  • Each combination of values that we input produces a unique output; and
  • Each such output has details that need to be validated against expected values.

The universe of all things that need to be tested, therefore, is of the order of:

             Number of input combinations  x  Size of output for each input combination

Now, for the hotel search scenario:

            Number of input combinations = Product of (size of the set that each input belongs to) over all inputs
and
            Size of each output = Size of result-set details + Size of details for all hotel cards

Note: inputs include both implicit and explicit ones.


Using the details mentioned above and ignoring the etc.'s for now:

Size of inputs for hotel search

    Number of inputs = 2 implicit + 4 explicit (counting start and end dates separately) = 6

    Product of sizes
    = Size of the set of destinations
    x Size of the set of possible start dates (including invalid ones)
    x Size of the set of possible end dates (including invalid ones)
    x Size of the set of number of guests (subset of integers)
    x Size of the set of POSs allowed (subset of integers)
    x Size of the set of currencies allowed

    = 50000 (assumed)
    x 365 (assuming a year of allowed booking)
    x 365 (assumed similarly)
    x 4 (assumed to be the max allowed guests per booking)
    x 1 (assuming US POS only)
    x 1 (assuming USD only)

Size of each output

    = 3 result-set details + N x (7 hotel card details)

    where N = size of the result set itself

If N = 100, the Test Universe = 50000 x 365 x 365 x 4 x 1 x 1 x (3 + 100 x 7) ≈ 1.9E13 Tests.
Onto this set of ~1.9E13 Tests, we'll have to add the Regression, UI Validation and Performance Tests as well.
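The arithmetic is mechanical enough to script; a throwaway sketch (in Java, using the sizes assumed above):

    // Back-of-the-envelope size of the exhaustive test universe for
    // hotel search, using the set sizes assumed in the text.
    public class TestUniverse {
        public static void main(String[] args) {
            long destinations = 50000; // assumed
            long startDates = 365;     // a year of allowed booking
            long endDates = 365;       // assumed similarly
            long guests = 4;           // max allowed guests per booking
            long pos = 1;              // US POS only
            long currencies = 1;       // USD only

            long inputCombos = destinations * startDates * endDates
                             * guests * pos * currencies;

            long n = 100;                // hotels per result set
            long outputSize = 3 + n * 7; // result-set details + per-card details

            System.out.printf("%,d input combos x %,d checks = %,d tests%n",
                    inputCombos, outputSize, inputCombos * outputSize);
            // prints: 26,645,000,000 input combos x 703 checks = 18,731,435,000,000 tests
        }
    }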


Sidebar for the mathematically inclined
All testing can be considered validation of functions. Given an abstract function y = f(x), testing can be considered validation that for every x in X (the input set), f produces the expected y in Y (the output set).
The domain of the function determines the size of the input and the range determines the size of the output; their Cartesian product is, therefore, the set of checks to be performed to validate the function.


Obviously, this is a lot of testing to do; and if we're actually able to do all of it, that would be Exhaustive Testing. Also obviously, anything larger than the simplest feature would quickly be intractable due to the combinatorial explosion of tests required; so we apply the "art and science" part and try to pare the problem down a bit. I'll focus on the functional testing here, but most of the concepts apply to UI validation and Performance as well.

Before we start, however, I'll define some factors with which to evaluate the efficacy of our optimizations:
  • The number of tests required to be run (or reduced as a result of the optimization)
  • The cost of running the tests (in terms of time, resources, $, etc)
  • Whether the overall quality of the product remains within acceptable limits

Optimizing testing - Round 1: Scenario Based Testing

This optimization concept is very simple - reduce each set mentioned above to a representative subset. Each element of this subset would "stand for" or represent an entire class of values within the original set. A success or failure of the chosen value is deemed a success or failure of the entire class.

To continue the Hotel Search example, the inputs could be reduced to the following scenarios (for example):

Destinations
  • Top 10 destinations
  • 1 non-US destination
  • 3 destinations with "known issues" or special edge cases
Booking Dates
  • Last-minute bookings (1 date in the next 3 days)
  • Planned bookings (4 dates in the next 2 quarters)
  • Peak-date bookings (2 national holidays)
  • Weekend bookings (2 weekends in the next 2 months)
Guests
  • 4 is a small enough number to test all combinations, but we could still pick a subset, say 1 and 4 only

With this, the input size drops to 14 x 9 x 2 = 252,
and the test universe becomes 252 x (3 + 100 x 7) = 177,156.

That's still a huge number, but 8 orders of magnitude less already! We could optimize further by reducing the validation on the output values if we wanted to. Realistically, we can probably get away with checking 4 of the 7 details for the first 25 hotels, and checking the result-set details just once throughout. So the test universe reduces to:
252 x (25 x 4) + 3 = 25,203
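For concreteness, a throwaway sketch of how the 252 representative input combinations from above could be enumerated and fed to a test runner (the scenario names are invented placeholders):

    // Enumerate the representative scenario combinations from Round 1:
    // 14 destinations x 9 date scenarios x 2 guest counts = 252 scenarios.
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class ScenarioMatrix {
        public static void main(String[] args) {
            List<String> destinations = new ArrayList<>();
            for (int i = 1; i <= 10; i++) destinations.add("TopDest" + i);
            destinations.add("NonUSDest1");
            for (int i = 1; i <= 3; i++) destinations.add("EdgeCaseDest" + i);

            List<String> dateScenarios = Arrays.asList(
                "LastMinute1",
                "Planned1", "Planned2", "Planned3", "Planned4",
                "PeakHoliday1", "PeakHoliday2",
                "Weekend1", "Weekend2");

            int[] guestCounts = {1, 4};

            int count = 0;
            for (String dest : destinations)
                for (String dates : dateScenarios)
                    for (int guests : guestCounts)
                        count++; // a real runner would execute the scenario here
            System.out.println(count + " scenarios"); // prints: 252 scenarios
        }
    }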

How has this impacted our evaluation factors?
  • The number of tests required to be run has obviously come down.
  • The cost of running each of the tests still remains the same; we haven't optimized that yet.
  • The resultant overall quality depends on the scenarios chosen. If we've chosen well, it should be within acceptable limits.

Note that there are more algorithmic ways of arriving at a subset of tests - Orthogonal Array testing, to name one. I'll not elaborate on these further, as the optimization is the same in principle: reducing the total number of tests required to validate a feature.

Similarly, on the regression testing side of the house, scenario-based reduction of tests can be done by carefully analyzing the areas that changed code is likely to impact; aka Impact Analysis.

Optimizing testing - Round 2: Automation

When you have many features similar to the one depicted above, scale effects come to bear:
  • The total number of tests to be run is still a large overall number
  • The cost of regression is a constant despite reducing the regression test suite using impact analysis.

The optimization to counter this is conceptually simple - relegate repeatable tests to a machine so that human cycles can be spent on the unique ones. This is easier said than done in practice; and the quickest way to get started is - surprisingly similar to BDD precepts - Outside In. That is, start at the outermost layer and automate tests at that layer. Work gradually inwards; or even not at all. Automating regression alone can have significant benefits.
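As a sketch of the Outside-In idea: the outermost automated check could be as simple as hitting the search page and asserting on the response. The URL and its parameters below are hypothetical placeholders, not a real API:

    // An "outermost layer" regression check: request a search results page
    // and assert on the response. Hypothetical endpoint and parameters.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class HotelSearchSmokeTest {
        public static void main(String[] args) throws Exception {
            URL url = new URL("https://example.com/hotels/search?dest=LAS&guests=1");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            if (conn.getResponseCode() != 200) {
                throw new AssertionError("HTTP " + conn.getResponseCode());
            }
            StringBuilder body = new StringBuilder();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) body.append(line);
            }
            if (!body.toString().contains("hotel")) {
                throw new AssertionError("no hotels in response");
            }
            System.out.println("search smoke test passed");
        }
    }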

One of the biggest issues with automation, however, is that you miss out on the human ingenuity bit. Scripts WILL break if data changes over time, so environments have to be stable; something that the human tester would easily sidestep by validating "outside the box" that the changed data is indeed still valid.

To continue with the Hotel Search example, assuming both the human and the machine take the same time per test, the human cycles saved at various levels of automation are:

Feature          Manual tests    Human cycles saved at automation level of
                                 10%       25%       50%       75%
Hotel Search     25,203          2,520     6,301     12,602    18,902

(e.g., at 10% automation, 25,203 x 0.10 ≈ 2,520 human test cycles are saved)

Reviewing our factors again, we see that with this additional optimization,
  • The number of tests required to be run by humans has come down again, and the tests handed to machines can be repeated at will.
  • The cost of running each of the tests has reduced, but the cost of maintaining test environments has gone up, as both manual and automated environments have to be kept running.
  • The resultant overall quality still depends on the scenarios chosen and our trust in our automated tests. If we've chosen well and trust our scripts, it should still be within the same acceptable limits.

Optimizing testing - Round 3: Component based testing

The cost of maintaining test environments mentioned above is typically just the tip of the iceberg. All testing espoused to this point has been strictly end-to-end, i.e., the environment has been a live one from the UI all the way to the database (or back end). There is a non-trivial cost associated with maintaining these environments, and a collateral cost of maintaining scripts (or known data for use by the scripts) as those environments change. Additionally, some kinds of testing may not be possible in live environments. Booking scenarios are typically such tests - contractual obligations or the cost of test bookings may deter such tests from being done on a large scale, or at all.

In addition, end-to-end testing forces the entire organization into a train wreck of dependencies. Since all testing is done from the UI, all code must be built and integrated before ANY testing can start. This not only delays testing, it also puts pressure on changes to the inner layers of the application - that code has to be completed WAY IN ADVANCE of the UI code, but its owners cannot validate their output until the UI is done.

Component testing attempts to fix these issues by testing each component at ITS interface, not at the final user interface. That way, the owners of that component know for sure that they produce valid output for a given input; a large live environment need not be maintained; and the validation of a test scenario is spread across multiple component tests which together comprise the larger validation.

Component testing almost always predicates the mocking of dependent components, because the cost gains are not realized otherwise. That is, if A -> B -> C is a chain of components involved in a particular test scenario, C must be mocked out to test B, and B must be mocked out to test A; otherwise we've taken on the additional job of maintaining separate instances of A, B and C solely for component testing purposes, thereby increasing the cost of maintaining environments beyond what it was before.
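A minimal sketch of this in code, with invented names: to test B (a search service), its dependency C (a rate source) is replaced by a canned, hand-rolled mock - no live back end required:

    // Hand-rolled mock for the A -> B -> C chain: to test B, C is
    // replaced with a canned implementation.
    import java.util.Arrays;
    import java.util.List;

    interface RateSource {                      // "C"
        List<String> ratesFor(String destination);
    }

    class SearchService {                       // "B", depends on C
        private final RateSource rates;
        SearchService(RateSource rates) { this.rates = rates; }

        List<String> search(String destination) {
            return rates.ratesFor(destination); // real logic would filter/sort
        }
    }

    public class SearchServiceComponentTest {
        public static void main(String[] args) {
            // Mock C: a canned response standing in for the live back end.
            RateSource mockRates =
                dest -> Arrays.asList("Caesar's Palace @ $100", "MGM @ $125");

            SearchService service = new SearchService(mockRates);
            List<String> result = service.search("LAS");

            if (result.size() != 2 || !result.get(0).startsWith("Caesar's")) {
                throw new AssertionError("unexpected result: " + result);
            }
            System.out.println("component test passed");
        }
    }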

Component testing also typically requires some means of creating mock data - manual means will not suffice, especially if the request-response payloads are huge object graphs.

The choice, adoption and usage of an organization-wide mocking framework is therefore a non-trivial task and I will not go into the details of how to achieve this. I will, however, analyze the impact of adopting such component testing on the evaluation factors mentioned above.

To continue the Hotel Search example, a hotel search typically involves a string of internal components:
                                                               |------------> GDS
UI -> Business  -> Business  -> Business  ->    Back End-------|------------> CRS1
Dispatcher        Facade     Logic Executor  Abstraction Layer
                                                               |------------> CRS2
(Some details may be missing; sorry. I'm trying to make a larger point here).

Let's take the following Test Scenario:

Input: Top Destination (LAS), Weekend (start: next Friday, return: next Sunday), 1 guest
Expected output:
  • 1st page has 25 hotels
  • 1st hotel is Caesar's Palace @ $100
  • 2nd hotel is MGM @ $125
  • …and so on
…and break it into component tests:


Component: UI Layer
  Input (provided by a test script): LAS, next Fri, next Sun, 1 guest
  Expected output: 1st page has 25 hotels; 1st hotel is Caesar's Palace @ $100; 2nd hotel is MGM @ $125; …and so on
  Mock data used: Business Dispatcher response containing the 25 hotels

Component: Business Dispatcher / Facade
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other Dispatcher-required params
  Expected output: ArrayList of Business Layer objects
  Mock data used: Executor response containing the 25 hotels

Component: Business Logic Executor
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other Executor-required params
  Expected output: ArrayList of Executor-specific objects
  Mock data used: LAS-to-internal-location-object response; Back end Abstraction Layer responses

Component: Back end Abstraction Layer
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other Back end-required params
  Expected output: ArrayList of Back end Abstraction Layer objects
  Mock data used: Back end-specific responses (1 per link)

We'd have to do a similar exercise for each such scenario identified before as required, but if we did, the impact on the factors would be:
  • The number of tests required to be run by humans has come down again, and the tests run by machines can be run with fewer resources.
  • The cost of running each of the tests has reduced; so has the cost of maintaining test environments as live environments no longer need be maintained.
  • The resultant overall quality now depends largely on the fidelity with which the end-to-end scenarios have been mapped to component tests. If there is a high correlation between the two, overall quality should still remain within the original acceptable limits.

This last is not easy to do for various reasons:
  • There needs to be a coherent effort to ensure scenarios match up between components; else component owners could define their sphere of influence differently from the scenarios deemed organizationally important.
  • The larger the size of mock data to be created, the more difficult it is to create it with high fidelity. Shortcuts such as record-replay mechanisms might help, but only if the recordings have been sanitized to remove volatile data and then made generic enough to match the expected scenarios.
  • Ownership is shared, so the division of labor should be clearly defined via policy. For example, each component owner should be responsible for her front-end interface, i.e., provide mock data and a set of regression test scripts to her clients; she can then rely on her dependents to do the same. Without such a policy, key pieces of the puzzle will be missing and the overall quality will suffer.

Optimizing testing - Round 4: Common Test Scenario Database

These issues with the implementation of component testing may even lead to a regression back to manual testing. The crux of the problem is that the cohesive force of the end-to-end test is very easily lost in component testing.

The central idea of the common test scenario database is to retain the benefits of component testing while bringing back that cohesion via data: we need to ensure that the test data distributed across the various component test scripts still has the same tie-in to the original scenario. That way, every component owner involved in a particular scenario refers to it using the same language. While we're at it, it would also be beneficial to change the mock data in two ways:
  • Replace significant pieces of live data with values that stand for the class of test data used in the specific scenario. E.g., the destination data item, when used in a scenario where it represents a top destination, could be given the canonical name "Top Dest1". Alternatively - assuming this is clearly understood company-wide - a real value can be used as a canonical one; e.g., "Las Vegas" could stand for top destination, but then it shouldn't be used in any other scenario.
  • Clear out any recorded values from the mock data so only known values remain. This eliminates side effects from remnant data but requires a higher degree of discipline.

The larger change would be to introduce full-fledged exemplar data sets for application domain concepts that cannot be confused with live data, but clearly denote the exact scenario in which they can be used; and use policy to drive adoption of these exemplar data sets as the mock data backbone.

To continue on the hotel search example, the first step would be to define the following exemplar data:

Concept                        Exemplar Data              Comment
Top Destination                LAS
Regular Destination            USDEST1
Special Destination            Acme Cozumel               "Acme" added to the destination to call out that this is a test value
Next Week Friday               (computed value)           Mock data framework should be able to generate such values and respond appropriately
Hotel Chain                    Acme Hotels LLC
Hotel                          Grand Acme Ritz Chicago
Top Hotel @ Top Destination    Acme Hotel and Casino

The component test from before can then be rewritten like so:

Component: Web-wl
  Input (provided by a test script): LAS, next Fri, next Sun, 1 guest
  Expected output: 1st page has 25 hotels; 1st hotel is Acme Hotel & Casino @ $100; 2nd hotel is Acme MGM @ $125; …and so on
  Mock data used: TBS response containing the 25 hotels (Note: other hosts required to bring wl up ignored for now)

Component: TBS / Plugin
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other TBS-required params
  Expected output: ArrayList of BookProduct objects
  Mock data used: HSE response containing the 25 hotels

Component: HSE
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other HSE-required params
  Expected output: ArrayList of objects
  Mock data used: Market/Markup response; Supplier Link responses

Component: SL Host(s)
  Input: LAS, mm/dd/yyyy, mm/dd/yyyy, 1 + other Supplier Link-required params
  Expected output: ArrayList of objects
  Mock data used: SL-specific responses (1 per link)

More importantly, when a second scenario has to be mapped to component tests, the exemplar data table above should be checked to see if the concepts in that scenario already exist, and if so they should be reused.

So, to convert the following scenario:

Input: Top Destination, Peak Weekend, 4 guests
Expected output:
  • 1st page has 25 hotels
  • 1st hotel is Acme Hotel & Casino @ $100
  • 2nd hotel is Acme MGM @ $125
  • …and so on

...into component tests, the following data items will have to be reused:
  • Top Destination (LAS)
  • Top Hotels (Acme Hotel & Casino, Acme MGM)
…and some new data items will have to be added:
  • Peak Weekend (Thanksgiving dates, for example)

…which will further be reused when automating the scenario:

Input: Top Packaging Destination, Peak Weekend, n guests - which maps to:
  • Acme Mexico Dest1
  • Labor Day Weekend dates
  • 2 guests
Expected output:
  • 1st page has 25 hotels
  • 1st hotel is Acme Cozumel Resort1 @ $100
  • 2nd hotel is Acme Cozumel Resort2 @ $125
  • …and so on
…which will further require new data items to be created, and so on.

When a new feature is added, say separate pricing for children or prepaid hotel rooms, that's the time for a completely new set of hotel chains and hotels to be created.
Over time, this practice of reusing test scenarios results in the creation of the Test Scenario Database, which becomes the lingua franca across the organization when talking about quality issues.

Let's see how our measuring factors score with this optimization:
  • The number of tests required to be run by humans hasn't changed since the last round.
  • The cost of running each of the tests remains as before; however, any collateral cost from having to fall back to live environments due to an inability to trust component tests is now removed.
  • The resultant overall quality still depends largely on the fidelity with which the end-to-end scenarios have been mapped to component tests; but there's a direct correlation possible because the organization now has the "common language" of the scenarios enshrined in the test data. This is the killer quality of the common scenario database.

Notes on implementation


  • Just as with automation, creating a Test Scenario Database is easier said than done. Policy from "up top" will certainly help, but a grassroots approach is also possible because the database can be built one scenario at a time. It does require a few enthusiastic converts, but convincing some key component owners will create a kernel of good, reusable (and reused) test scenarios, which can then be supported via policy. Once their use is common, network effects will take care of their popularity.
  • The Quality Organization should own the Test Scenario database and gate-keep its use and reuse. Developers may create test scenarios, but should get approval from the Quality org.
  • Component owners should be responsible for their component and its front-end interface, and expect their dependents to do the same. That way they have a responsibility towards their clients and can expect the same from their dependent servers.


Todo

  • Create a Test Scenario database tie-in for the Robot Framework

Sunday, March 06, 2011

The need to code - my version

JacquesM (a long-timer on Hacker News) posted about his need to code in a passionate article that resonated with me a lot.

My experiences were similar, if not the same. We had some simple computer classes in school which were mostly spent playing Digger and Frogger, but we wrote some BASIC programs too. There was something visceral about writing something on a screen and seeing it come alive. BASIC as a language and DOS BASIC as an environment nailed that aspect - and how! You started the PC, typed "basic" at the prompt, and were dropped into an editor that let you type programs that just ran! And BASIC made no presumption of modularity, so you could just use graphics commands in your program because the language had them.

Compare that to any attempt these days to teach programming to kids - all bound up in libraries to be imported before the first step is taken - and I'm including specific attempts like Shoes et al.

When it came time to pick an elective for Pre-University, I decided to pick Computers simply because I didn't want to do Biology, and I was neutral about the other option - Electronics. The instructors were indifferent and the syllabus was not that great, but I was hooked. I found a friend who also knew BASIC, and we devoured the Peter Norton book on x86 programming. We'd write assembly programs using PEEK and POKE in BASIC - mainly TSRs, for the fun of it. Our other project was writing a 2D graphics editor in BASIC. This took us all year because we wrote it on paper using pencils (to erase and rewrite lines of code that needed to be shifted), and went to another friend's house to use his dad's PC to enter the programs and see them run. There were a lot of GOSUB XXX lines (read: spaghetti code), but we pretty much carried the code in our heads and didn't stop talking about it. We finally did manage to get it working, and I believe I still have the Batman portrait I drew using it. Have you ever printed anything by calling the DOS interrupt to dump the screen to the printer? That's what our editor's print function did :)

We graduated to Pascal from there because it was in our syllabus, and quickly discovered the joys of Turbo Pascal and all its cool extensions - especially the asm blocks and the graphics libraries. Of course, we wrote fractal programs and marveled at fractint.

From there it was on to Turbo C - to revel in the freedom of C. I remember my first C program being pretty mundane - it converted a number into words. The challenge was that it had to print out the number the Indian way, which is not the simple thousands, millions, billions model. Instead we have the thousands, lakhs, crores model; and I remember agonizing over it to make it work. Mind you, I still didn't have a computer, so this was all still paper and pencil and mental debugging.

Here's why I completely agree with Jacques that coding is a drug: I was so happily engrossed in doing it that I didn't get good enough overall scores to get into a CS bachelor's. So I picked Mechanical instead, and focused on the areas that were computer-centric - the Finite Element Method, graphics, etc. My interest in graphics had led me to read the Schaum's book on Computer Graphics, and I already knew the vector math for those pieces long before we did vector math in class. When we started doing engineering drawing using pencil and paper, projection systems were already familiar to me - because I was on my way to building my own 3D graphics engine using Turbo C++. All thanks to Robert Lafore for making OO click for me. No other book since has made it so lucid, at least for me.

That's the other thing - books. Unlike today, there was no easy access to books in India. My best sources were the scores of street booksellers who sold old books - and what a treasure trove of books they had. I learnt of Russian expert systems and computer architectures like nothing else, of APL the language from a book written by Iverson, tons of old British computer magazines that introduced me to editors like Brief and HyperCard clones, and a whole lot more - in all, a heady whiff of the ocean of opportunity that lay outside the staid old land of E. Balaguruswamy and Yashwant Kanetkar (who are probably still revered in Indian academia). Sadly, those booksellers now exclusively sell pirated Harry Potters and self-help bestsellers, but thankfully there are Indian editions of good books nowadays.

But I digress. Throughout my undergrad, my only access to computers was at college - so most of my time was spent trying to get as much lab time as possible. Lots of social engineering went into this endeavor (sucking up to the CS department head to allow use of their lab, helping the lab assistant in the Robotics lab so I could use the only Sun machine, etc), and precious little code came out of it, but it was a heady time because anything was possible. I did manage to get the basic 3D graphics engine working (it proudly spun a sphere around, IIRC), and managed to present a paper on (what is now obviously basic) AI at the Computer Society of India student convention while doing mechanical engineering.

Fast forward to today: I've been an IT professional for 13 years now. I still code as much as possible at work (architecture/design decisions, or helping my teams with complex fixes), and I have a healthy GitHub account and some more side projects at work. Coding makes me happy. I don't want to stop.

Thanks, JacquesM, for helping me remember why I do this.