I have been thinking lately about visualizations of a codebase, spurred on by recently rediscovering Software Cartography and its successor, Codemap. Coincidentally, I also wanted to create my personal website as a "visual representation of the web of connections that it is", which essentially boiled down to needing a stable visualization.
When I looked at the tools currently available for this, they seemed overly complex. The closest was Gource, but it focuses on the people who worked on the code and doesn't generate a stable visualization.
So here's my idea:
- The visualization will be created from the commit history of the codebase.
- Once created, the visualization is not a snapshot, but can be enhanced over time to show changes. So the output format should contain the history of changes.
- The visualization is essentially a Treemap-ish diagram with time along the X-axis and size along the Y.
- Each object (file or directory) is drawn as it comes to life in the commit log and is represented as a rectangle.
- Position: The first object that is created gets the position x=0 within its parent, the second gets x=1 and so forth. Once assigned, these positions are permanent even after the object is moved or removed.
- Dimensions: The width rule is the same for all objects: a file has a width of 1 unit and a directory has a width equal to the sum of the widths of its contents. The height is equal to the size of the file.
- When an object is changed, its old size shows up as a faded outline within the newly sized rectangle - somewhat akin to the age rings of trees. Size reductions may show age rings outside the current rectangle.
- When an object is moved, its old position shows a faded outline and objects after it do not move to take up the position.
- Similarly when an object is deleted, its old position shows a faded outline.
- Keeping the visualization contained: This is where the Treemap concepts are helpful. The complete visualization's size will be calculated inside-out: the size of the deepest directory will control the % contribution of its parent and therefore transitively its grandparent, and so forth. This way, the visualization can be contained in a finite space. At its smallest size, each "rectangle" will be reduced to a line: the position still remains as described above, the width is reduced to 1 pixel and the length is still the size of the file. No rings are possible at this level of compaction.
- Controls: The visualization will have:
  - Play: A way to see the evolution of the codebase a la Gource.
  - Zoom: Zoom in and out.
  - Time Filter: A way to filter out older rectangles. This will essentially show the current state of the codebase, but since all positions are fixed, it will give an idea of how far the current state is from the original.
  - Object Highlight: This will highlight a particular file or directory to "show where it is in the map".
  - Object Trace: This will highlight the path of the object throughout its evolution in the codebase.
  - Commit Highlight: Highlight all files in a commit.
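To make the position and dimension rules above concrete, here is a minimal sketch in Python. It is only one possible reading of the idea: the `Node` class, the `ensure` and `apply` helpers, and the `("add" | "modify" | "delete", path, size)` event tuples are all hypothetical names invented for illustration, and the events are assumed to have already been extracted from the commit log somehow. A move could be modeled as a delete followed by an add.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    slot: int                  # permanent position within the parent
    is_dir: bool = False
    alive: bool = True         # False once deleted; the slot is never reused
    sizes: list = field(default_factory=list)     # size history ("age rings")
    children: dict = field(default_factory=dict)  # name -> Node

    def width(self) -> int:
        """A file is 1 unit wide; a directory is the sum of its contents."""
        if not self.is_dir:
            return 1
        # Deleted children still count, so later objects never shift left.
        return sum(child.width() for child in self.children.values())


def ensure(root, path):
    """Walk to `path`, creating missing nodes with the next free slot."""
    node = root
    parts = path.split("/")
    for i, part in enumerate(parts):
        if part not in node.children:
            is_last = i == len(parts) - 1
            # Children are never removed, so len() is the next unused slot.
            node.children[part] = Node(part, slot=len(node.children),
                                       is_dir=not is_last)
        node = node.children[part]
    return node


def apply(root, event):
    """Apply one (kind, path[, size]) event to the tree."""
    kind, path, *rest = event
    # Simplification: a delete for an unknown path creates a dead placeholder.
    node = ensure(root, path)
    if kind == "delete":
        node.alive = False     # keep the node so a faded outline can be drawn
    else:
        node.alive = True
        node.sizes.append(rest[0])


# Made-up history, oldest event first.
root = Node("", slot=0, is_dir=True)
events = [
    ("add", "README", 120),
    ("add", "src/main.py", 300),
    ("add", "src/util.py", 80),
    ("modify", "src/main.py", 450),
    ("delete", "README"),
    ("add", "LICENSE", 30),
]
for event in events:
    apply(root, event)
```

In this sketch the deleted README keeps slot 0, so LICENSE lands in slot 2 rather than moving into the gap, and the size history of `src/main.py` is what the age rings would be drawn from.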
The advantage I see in such a visualization is that it combines a stable spatial representation of the code with its evolution over time. Using a treemap representation keeps it bounded, so the view could be embedded in current developer environments without taking up too much screen space.
Implementation notes:
- A quick way to implement this might be using html divs.
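As a rough illustration of the div approach, the sketch below generates one absolutely positioned div per recorded file size, with older sizes and deleted files drawn as faded outlines. Everything here is an assumption made for illustration: the `render` function, the `(name, x_slot, sizes, alive)` tuple shape, and the `UNIT` and `SCALE` constants. It only handles a flat list of files; nesting directories would need nested divs.

```python
from html import escape

UNIT = 12     # pixels per width unit (assumed value)
SCALE = 0.5   # pixels per unit of file size (assumed value)


def render(files):
    """Build an HTML string from (name, x_slot, sizes, alive) tuples."""
    divs = []
    for name, x_slot, sizes, alive in files:
        # One div per recorded size; every size but the live one is faded.
        for i, size in enumerate(sizes):
            current = alive and i == len(sizes) - 1
            style = (
                f"position:absolute;left:{x_slot * UNIT}px;bottom:0;"
                f"width:{UNIT}px;height:{round(size * SCALE)}px;"
                "box-sizing:border-box;border:1px solid #369;"
                f"opacity:{1.0 if current else 0.3};"
            )
            if current:
                style += "background:#9cf;"
            divs.append(f'<div title="{escape(name)}" style="{style}"></div>')
    container = '<div style="position:relative;height:300px">'
    return container + "".join(divs) + "</div>"


# Made-up input: one live file that grew, and one deleted file.
page = render([
    ("main.py", 0, [300, 450], True),
    ("README", 1, [120], False),
])
```

Writing `page` to a file and opening it in a browser should be enough to eyeball the result; the controls (play, zoom, filters) would need JavaScript on top of this.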