Code and decision trees

A lot of what programs do is transforming inputs into outputs. Take, for example, a piece of JavaScript code like this:

itemsToMarkup( items, viewType, galleryType ) {

    let markup;

    switch ( viewType ) {
        case 'gallery':
            if ( 'individual' === galleryType ) ) {
                markup = getHTML( items );
            } else {
                markup = getShortcode( items );
            }
            break;

        case 'image':
        default:
            markup = getHTML( items );
    }

    return markup;
}

What’s the purpose of this code? It takes some input data structures and outputs a markup, either proper HTML or a code to be processed by later stages of the pipeline. At the core, what we are doing is taking a decision based on the input’s state so it can be modeled as a decision tree:

tree-original

By restating the problem in a more simple language, the structure is made more evident. We are free of the biases that code as a language for thinking introduces (code size, good-looking indenting, a certain preference to use switch or if statements, etc). In this case, conflating the two checks into one reduces the tree depth and the number of leaves:

cof

Which back to code could be something like:

itemsToMarkup( items, viewType, galleryType ) {

    let markup;
    const isGalleryButNotIndividual = ( view, gallery ) => view === 'gallery' && gallery !== 'individual';

    if( isGalleryButNotIndividual( viewType, galleryType ) ) {
        markup = getShortcode( items );
    } else {
        markup = getHTML( items );
    }

    return markup;

}

By having a simpler decision tree, the second piece of code makes the input/output mapping more explicit and concise.

In this paper, we make the case that the high-productivity digital firms are starting to generate a new middle class. It’s a virtuous circle. Consumers flock to those firms because they offer lower prices and better service. Workers migrate there from low-productivity firms because the high-productivity firms offer better wages for the same occupations—and, often, steadier hours and better benefits.

— The Creation of a New Middle Class?: A Historical and Analytic Perspective on Job and Wage Growth in the Digital Sector. [PDF]

The Google repository

I’ve been reading how Google organizes its codebase: they maintain a hyper-large repository containing everything, since the beginning of the company. I guess you may find Gmail, Photos, or AdWords there. You won’t find Android or Chrome, though – these are open source projects.

The repository is 86Tb of data, 1 billion of files, and 35 billion of commits. To manage this complexity, they needed to build their own tools: a home-grown Version Control System that can work effectively with such a repository at this scale, editor integration, building and automated testing tools, etc.

They develop all the code against trunk/master, meaning that if you are updating a library, you’ll also need to fix all applications that depend on it. Every project will be up-to-date, even abandoned projects.

The advantages

The main reasons they claim this approach works for them are: it makes easier reusing blocks of knowledge company-wide and reduces the friction to contribute between projects/teams. UI primitives, building tools, etc, all are shared by any project that wants them, it’s just a matter of depending on the master version. It minimizes the costs of versioning/integration and the curse of being left behind when something is updated and you cannot keep up with the changes (the experts will do it for you!).

As a side effect, when working on libraries/frameworks it’s easier to understand the performance/impact/etc of a specific change (you can run tests on real projects) and to put together a task-force to fix issues affecting several applications.

The disadvantages

This approach comes with downsides as well: they mention the amount of maintenance this setup requires even with all the tooling they have already built. With a monolithic repo, it’s easy to run into unnecessary dependencies that bloat the binary size of a project (and they do), the costs inherent to updating basic blocks used through the whole company, etc.

Another point is that it makes difficult having external contributors. Although they have a space in the repository for public/open-sourced projects, the article is unclear on how they manage 3rd-party contributions there – external programmers don’t have access to the internal building tools that Google programmers have. High-profile products like Android or Chrome -where outside contributors are expected and encouraged- have walked away from this approach.

Coda

I highly recommend reading the paper, it’s a pretty unique approach, and the article does a good job on presenting a balanced perspective.

Ryanair boarding songs

Some years ago, when boarding on Ryanair, the music played over the speakers was a classical and vibrant song – being Ryanair, likely a royalty-free one. Some declare it was Bach’s Brandenburg Concerto No. 5:

Lately, they’ve changed to a more modern electronic style, at least as vibrant as before:

I like the new one much better. For a few seconds, I thought «they have probably hired someone with a better sense for music», but then I realized that it’s the same kind of song you hear on the teenager stores these days. And I know why they changed it: to keep you moving!

Faster, if possible.