Brain registers and code readability
Some background
At my work I organise a Tech Talk Tuesday. Every Tuesday I try to find an interesting technical video. Anyone who’s interested can sit down together at lunchtime and watch it. During the current Coronavirus lockdown, when we’re working from home, we’ve continued the tradition by using watch2gether.
Last Tuesday’s video was The Mental Game of Python by Raymond Hettinger (whose blog, I notice, is updated even less frequently than this one!). This was the keynote he delivered at PyBay2019.
It’s an interesting talk, well worth watching. Most of it is language-agnostic: it’s about patterns of behaviour for successful problem solving. In particular, it’s about how to tackle complicated things by doing (more of) the simpler things instead.
The first strategy Raymond talks about is “chunking and aliasing”. This is essentially breaking up big things into chunks, and giving names to those chunks. If you’ve spent any time thinking about function lengths in your code then this will be very familiar to you. But as part of the explanation of this point, he talked about brain registers. That set me thinking, and the rest of this post is just about that one small point from the talk.
Registers
Central processing unit (CPU) registers
Before talking about brain registers, let’s consider computer registers. Raymond said that the computer he was presenting his talk from had a CPU with “16 general-purpose registers.”
If you’ve done any programming in assembly language (what little I’ve done was a long time ago) you’ll know that a register is a location in the processor for storing the values being used in the current instruction. These values might be addresses (essentially pointers into the main memory) or data values. A general-purpose register can hold either type.
Registers are really fast to access, but there aren’t many of them. If a program needs to sum a list of 1,000 numbers, it’s going to have to do it a bit at a time and orchestrate moving those numbers between main memory and the CPU registers.
So, what is a brain register?
Suppose we think of the human brain as analogous to a computer processor. Miller’s Law, from a paper published in 1956, states that humans can only hold seven (plus or minus two) objects in short-term memory at once. It’s reasonable to equate these five to nine available slots in human short-term memory with CPU registers.
This idea often comes up in things like GUI design, where we are encouraged not to have more than, say, seven items in a menu because otherwise it will be hard to use (by the time people have read to the bottom, they will have forgotten what was at the top).
Raymond applies the same idea to understanding code. If we can write the same thing using fewer “chunks” then we’re less likely to exhaust our capacity to understand it.
Chunking
Chunking is the process of replacing long sequences of instructions with fewer, combined instructions. Many of the refactorings available in modern IDEs are different types of chunking.
Suppose you have some Python code where you want to do something if a datetime is during the working day and not a holiday. It might be coded like this:
if 0 <= d.weekday() <= 4 and 9 <= d.time().hour <= 17 and d.date() not in holidays:
    ...
That’s throwing lots of details at the person reading the code, using up lots of their mental registers. There are a couple of approaches for reducing this cognitive load. First, we could “extract variable” a few times, the result of which might look like this:
is_in_working_week = 0 <= d.weekday() <= 4
is_working_hour = 9 <= d.time().hour <= 17
is_holiday = d.date() in holidays
if is_in_working_week and is_working_hour and not is_holiday:
    ...
This gives us less to think about at once. Whichever way you want to count the brain registers, I think this version requires fewer of them (at the expense of longer code, if we care about that, and losing the lazy evaluation).
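To make the point about lazy evaluation concrete, here is a minimal sketch. The `check` helper and the `calls` list are hypothetical, invented purely to record which conditions actually get evaluated; they stand in for the real datetime tests.

```python
# Hypothetical helper: records the name of each condition that is evaluated.
calls = []


def check(name, result):
    calls.append(name)
    return result


# Single combined condition: `and` short-circuits at the first False,
# so the later checks never run.
calls.clear()
if check("weekday", False) and check("hour", True) and check("holiday", True):
    pass
inline_calls = list(calls)

# Extracted variables: every check runs before the `if` is reached.
calls.clear()
is_weekday = check("weekday", False)
is_hour = check("hour", True)
is_not_holiday = check("holiday", True)
if is_weekday and is_hour and is_not_holiday:
    pass
extracted_calls = list(calls)

print(inline_calls)     # only the first check ran
print(extracted_calls)  # all three checks ran
```

For cheap checks like these the difference rarely matters, but it could if one of the conditions were expensive or had side effects.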
Alternatively, we might “extract function”. Then the problematic line simply becomes:
if is_at_work(d):
    ...
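The body of is_at_work isn’t shown here, but one possible sketch, reusing the variable names from the “extract variable” version, might look like the following. The HOLIDAYS set and its single entry are assumptions for illustration only, and the hour bounds are copied unchanged from the original condition.

```python
from datetime import date, datetime

# Hypothetical holiday calendar, for illustration only.
HOLIDAYS = frozenset({date(2020, 12, 25)})


def is_at_work(d: datetime, holidays=HOLIDAYS) -> bool:
    """Return True if d is in working hours on a weekday that is not a holiday."""
    is_in_working_week = 0 <= d.weekday() <= 4  # Monday is 0, Friday is 4
    is_working_hour = 9 <= d.time().hour <= 17  # same bounds as the original condition
    is_holiday = d.date() in holidays
    return is_in_working_week and is_working_hour and not is_holiday


if __name__ == "__main__":
    print(is_at_work(datetime(2020, 4, 7, 10, 0)))   # a Tuesday morning
    print(is_at_work(datetime(2020, 4, 11, 10, 0)))  # a Saturday morning
```

The details are still there for anyone who needs them, but the reader of the calling code only has to hold one name in their head.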
Aliasing
Aliasing can go even further. If we can replace a chunk by a reference to something we already know, then we’ve freed up its register completely. This is effectively how extraordinary feats of memory are achieved, using things like memory palaces, tying back new things to other things that we are very familiar with.
One example of the use of aliasing would be to identify any handcrafted code that is just replicating some functionality that’s already present in our standard library. Replacing that with a call to the library function instead frees up those registers.
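As a small illustration (my own example, not one taken from the talk), here is a handcrafted word count next to the equivalent call to collections.Counter from Python’s standard library. The function names are hypothetical.

```python
from collections import Counter


def count_words_by_hand(words):
    """Handcrafted version: the reader has to trace the loop to see what it does."""
    counts = {}
    for word in words:
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def count_words(words):
    """Aliased version: Counter is something a Python reader already knows."""
    return Counter(words)


if __name__ == "__main__":
    words = ["spam", "eggs", "spam"]
    print(count_words_by_hand(words))
    print(dict(count_words(words)))
```

Anyone who already knows Counter can read the second version at a glance, without keeping the loop and its branches in short-term memory.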
A similar idea, not explicitly mentioned in the talk, is using design patterns so that we understand how whole groups of classes interact with each other just from their names, without having to keep the details in our short-term memory.
What does this have to do with code readability?
I’m passionate about clean code. I also have a terrible memory. Raymond’s talk set me thinking about whether the two things might be related, and how brain registers might fit in.
To some extent, clean code is a coping strategy for having a poor memory in general. I’m drawn towards using descriptive names so that I don’t have to remember the mapping from an abbreviation to the real thing. I add comments quite freely because without them I know I won’t remember why I chose to do certain things.
So, this chunking is a natural thing to me. My register count is at the bottom end of the scale. If I encounter code that exceeds my ability to process it, I’ll start breaking it up. Everything is so much nicer afterwards. Things make sense again.
But I’ve worked with some very smart people over the years. Some of them take a very different approach. They have excellent memories. They can remember what the half-dozen cryptic variables represent. They don’t need comments because it’s obvious what’s going on.
Now, here’s the interesting part. Splitting complex code out into variables, functions and classes can impair the ability of those clever people to work with it. Instead of having all the information available in one place, they now have to jump around looking for the details. They had plenty of spare registers with the original version: one person’s simplification is another’s unnecessary complexity.
Does this mean I’m going to stop simplifying things? Well, no. But I might consider raising the bar for acceptable complexity just a little, rather than optimising the code to minimise my own discomfort.