sum-csv

Reflections on a little thing I made, to learn how to better create the bigger ones.

After the last blog categories reorganization, I realized that I talk less about what I do and more about what others do. That make sense, as this blog is part of my learning process and I’m always looking around to find ways to improve myself. Yet, I’d like to start writing more about the little things I do. Writing helps me to reflect upon the how, so eventually I’ll learn more about my thought processes. These are likely to be very small things.

sum-csv

sum-csv is a small utility I have built to help me to crunch some statistics I was working with. I had a complete dataset in a CSV file, but what I wanted was an ordered list of the number of times something happened.

Original CSV: What I wanted:

A data transformation

This is a small task – my old self whispered. Yet, instead of opening the editor and start coding right away, the first thing I did was drawing things. I am a visual person and drawing helps me to gain understanding. The algorithm I came up with was a succession of mathematical transformations: Which is to say:

  • transpose the original matrix
  • eliminate the rows I was not interested in
  • for each row, group all numerical values (from column 1 onwards) by adding them, to calculate the total
  • sort the rows by the total

Now, I was prepared to write some code. Amusingly, the gist of it is almost pure English:

d3-array.transpose( matrix ).filter( isWhitelisted ).map( format ).sort( byCount );

Reflection

Creating production-ready code took me four times the effort of devising an initial solution: finding good and tested libraries for some of the operations not built-in in the language such as reading a CSV into a matrix or transpose the matrix itself, creating the tests for being able to sleep well at night, distributing the code in a way which is findable (GitHub/npm) and usable by others/my future self, and, actually, writing the code.

I am not always able to write code as a series of mathematical transformations, but I find pleasure when I do: it is much easier to conceptually proof whether the code is correct. I also like how the code embodied some of the ideas I’m more interested in lately, such as how a better vocabulary helps you to make things simpler.

A more expressive vocabulary for programming

map and friends are more precise, sophisticated ways to talk about consistent patterns in data manipulation. Using them over for is analogous to using the word “cake” instead of “the kind of food that you make by whipping egg whites and maybe adding sugar”.

Interestingly, you can eventually add new layers of category on top of established layers: just like saying that butter cakes constitute a specific family of cakes, one could say that pluck is a specialization of map.

Of vocabulary and contracts, Miguel Fonseca.

Simple made easy

Precise words make communication more efficient. Arguably, software development is about managing conceptual complexity. Simple made easy, by Rich Hickey is a talk that tackles those two topics.

Two takeaways from this talk:

  • The differences between simple and easy.
    • Simplicity is an objective measure, and its units are the level of interleaving (of concepts). Simplicity should not be measured by the cardinality, the number of concepts/units.
    • Easiness is a subjective measure, and it is related to how familiar you are with some topic or your past experience.
  • We can make things simple with the tools we already have by favoring concepts that make things simpler to reason about, not quicker to write. He recommends:

See also: