The Google repository

I’ve been reading how Google organizes its codebase: they maintain a hyper-large repository containing everything, since the beginning of the company. I guess you may find Gmail, Photos, or AdWords there. You won’t find Android or Chrome, though – these are open source projects.

The repository is 86Tb of data, 1 billion of files, and 35 billion of commits. To manage this complexity, they needed to build their own tools: a home-grown Version Control System that can work effectively with such a repository at this scale, editor integration, building and automated testing tools, etc.

They develop all the code against trunk/master, meaning that if you are updating a library, you’ll also need to fix all applications that depend on it. Every project will be up-to-date, even abandoned projects.

The advantages

The main reasons they claim this approach works for them are: it makes easier reusing blocks of knowledge company-wide and reduces the friction to contribute between projects/teams. UI primitives, building tools, etc, all are shared by any project that wants them, it’s just a matter of depending on the master version. It minimizes the costs of versioning/integration and the curse of being left behind when something is updated and you cannot keep up with the changes (the experts will do it for you!).

As a side effect, when working on libraries/frameworks it’s easier to understand the performance/impact/etc of a specific change (you can run tests on real projects) and to put together a task-force to fix issues affecting several applications.

The disadvantages

This approach comes with downsides as well: they mention the amount of maintenance this setup requires even with all the tooling they have already built. With a monolithic repo, it’s easy to run into unnecessary dependencies that bloat the binary size of a project (and they do), the costs inherent to updating basic blocks used through the whole company, etc.

Another point is that it makes difficult having external contributors. Although they have a space in the repository for public/open-sourced projects, the article is unclear on how they manage 3rd-party contributions there – external programmers don’t have access to the internal building tools that Google programmers have. High-profile products like Android or Chrome -where outside contributors are expected and encouraged- have walked away from this approach.


I highly recommend reading the paper, it’s a pretty unique approach, and the article does a good job on presenting a balanced perspective.

A more expressive vocabulary for programming

map and friends are more precise, sophisticated ways to talk about consistent patterns in data manipulation. Using them over for is analogous to using the word “cake” instead of “the kind of food that you make by whipping egg whites and maybe adding sugar”.

Interestingly, you can eventually add new layers of category on top of established layers: just like saying that butter cakes constitute a specific family of cakes, one could say that pluck is a specialization of map.

Of vocabulary and contracts, Miguel Fonseca.

Simple made easy

Precise words make communication more efficient. Arguably, software development is about managing conceptual complexity. Simple made easy, by Rich Hickey is a talk that tackles those two topics.

Two takeaways from this talk:

  • The differences between simple and easy.
    • Simplicity is an objective measure, and its units are the level of interleaving (of concepts). Simplicity should not be measured by the cardinality, the number of concepts/units.
    • Easiness is a subjective measure, and it is related to how familiar you are with some topic or your past experience.
  • We can make things simple with the tools we already have by favoring concepts that make things simpler to reason about, not quicker to write. He recommends:

See also:

Taking PHP seriously

Slack se une al club de los que usan PHP:

Most programmers who have only casually used PHP know two things about it: that it is a bad language, which they would never use if given the choice; and that some of the most extraordinarily successful projects in history use it. This is not quite a contradiction, but it should make us curious. Did Facebook, Wikipedia, WordPress, Etsy, Baidu, Box, and more recently Slack all succeed in spite of using PHP? Would they all have been better off expressing their application in Ruby? Erlang? Haskell?

Relacionados: inversión y amortización en la selección tecnológica.

Dos ideas funcionales

En la evolución de cómo se hacen aplicaciones web, la mutación principal del ciclo anterior habría sido la separación API/interfaz. En el actual, apostaría que lo fundamental se deriva de que el ecosistema JavaScript (tanto el lenguaje como las librerías a su alrededor) ha interiorizado dos ideas del mundo de la programación funcional: las funciones puras y la inmutabilidad. Esta hipótesis serviría para explicar, por ejemplo, la popularidad de React y Redux que no son más que la aplicación de estos conceptos a la creación de interfaces y la gestión del estado.