The Rule of Seven: how to overload the brain of a programmer for no good reason

There is an old but very important psychology paper called The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information. It measures the limits of the brain's capacity for processing information and puts a number on it: the human brain can juggle 5 to 9 concepts simultaneously. This has many interesting implications, but two of them are brutally important for us software developers:

Simple constructs (models, implementations, designs, patterns...) are better, because they need fewer concepts to describe them.
Well-constructed abstractions that have very few special rules and compose without surprises are better, because they need fewer concepts to describe them.

Both come down to the same battle: using the minimum amount of mental space for each concept, because mental space is scarce.

Simple constructs

As Hoare said:

There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies.

Unfortunately, he also added:

The first method is far more difficult.

In software development it's very easy to add another module, another class, more and more code. Each addition covers another use case, but it also means more code to maintain, with diminishing returns. As Jamie Zawinski said:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

Essentially, we expand programs to do more and more. We add more modules, more classes and more code, and each new piece has to work with all the previous code, in a never-ending expansion of interactions between pieces that grows much faster than the code itself, because every new piece can potentially interact with every existing one. We do this because it is easy, at least in the beginning.

On the other hand, making something as simple as possible is actually very hard. It requires looking for underlying patterns and behavior, and eliminating whatever does not fit. But, ultimately, it is better, because it lets you describe the thing with fewer concepts and less brainpower, which means you can concentrate on the bigger picture instead of on the details.

Detecting overcomplicated constructs

It's actually very easy to detect overcomplicated and bloated constructs. You just look at the manual. If it has a section describing the meaning of many words within the context of the construct, it's probably overcomplicated.

Let's look at an example: the Abstract Factory pattern. To describe how it works, we need to describe a lot of names:

AbstractFactory interface
<<creates>> action
Factory1 (or ConcreteFactory)
ProductA and ProductB interfaces
ProductA1 and ProductB1 classes
new keyword
Abstract Factory class diagram
:Client and factory:Factory1 sequence diagram

All these names and diagrams exist because it's easier to create complexity than it is to reduce it. The pattern also forces the programmer into the over-specific solution of creating objects, ruling out other possible solutions such as reusing objects stored somewhere. Instead, we could reduce complexity by:

Removing the "Factory" name from the whole pattern.
Removing the new keyword and the <<creates>> action.

Now the pattern is about an object that will return an instance of ProductA or ProductB. It may do it by instantiating the class, fetching a pre-built object from a data structure or stealing it from Area 51, but we don't care, because all we care about is that we start without the object and we end with an object. Now the list of names is:

Two operations:
- The first one obtains an object with interface ProductA
- The second one obtains an object with interface ProductB

You are free to put the operations in a class, or in a structure, or in pointers, or in your living room. To use it, you call either of the operations with whatever parameters it may require, and it gets you the corresponding object. To set it up, you set the first operation to anything that produces a ProductA, and the second operation to anything that produces a ProductB. There are no class diagrams, no sequence diagrams, no extra classes, no unneeded words. There is no ambiguity, because functions and operations are nothing more than values with arguments: all you can do with an operation is call it, pass it as a parameter, return it and store it. How do you get something with interface ProductA? What things can return a ProductA? The first operation, therefore that's what you have to use.
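As a rough illustration, the two operations could be sketched in Java with the standard Supplier interface. Everything named here beyond ProductA and ProductB (the label method, ProductA1, ProductB1, ProductSources, TwoOperations) is a hypothetical name chosen for the example, and the record syntax assumes Java 16 or later.

```java
import java.util.function.Supplier;

// Hypothetical stand-ins for the article's ProductA and ProductB interfaces.
interface ProductA { String label(); }
interface ProductB { String label(); }

// Hypothetical concrete products.
record ProductA1() implements ProductA {
    public String label() { return "A1"; }
}
record ProductB1() implements ProductB {
    public String label() { return "B1"; }
}

// The two operations, stored together only because this caller finds it convenient.
record ProductSources(Supplier<ProductA> getA, Supplier<ProductB> getB) {}

class TwoOperations {
    // Setting it up: each operation is set to anything that produces the right product.
    static ProductSources setUp() {
        return new ProductSources(ProductA1::new, ProductB1::new);
    }

    public static void main(String[] args) {
        ProductSources sources = setUp();
        // Using it: call the operation and get the object back.
        ProductA a = sources.getA().get();
        ProductB b = sources.getB().get();
        System.out.println(a.label() + " " + b.label()); // prints "A1 B1"
    }
}
```

The record is optional. The two suppliers could just as well live in two local variables, a map, or two method parameters.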

Well-constructed abstractions that compose well

We already described a well-constructed abstraction above, in the Abstract Factory fix. A well-constructed abstraction that composes well has two obvious parts:

Well-constructed abstraction

The AbstractFactory pattern is a horrible abstraction because it leaks all the insignificant details we don't care about, such as it being abstract, it having to create objects, and it using the new keyword. As its user, I couldn't care less about these details, but they are all thrown in my face, and now I have to deal with them.

Instead, I want to have "an operation that gets me an object with interface ProductA". I call it, and it hands me back a ProductA. I don't know and I don't care how it was obtained.
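One way to picture this, as a sketch only: a caller written against a plain Supplier works unchanged whether the ProductA is built on the spot or fetched from somewhere. The names Widget, label, SourceAgnostic and describe are invented for the example, and the record syntax assumes Java 16 or later.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for the article's ProductA interface.
interface ProductA { String label(); }

// Hypothetical concrete product.
record Widget(String label) implements ProductA {}

class SourceAgnostic {
    // The caller sees only "an operation that gets me a ProductA".
    static String describe(Supplier<ProductA> getProductA) {
        return "got " + getProductA.get().label();
    }

    public static void main(String[] args) {
        // Source 1: build a new object on every call.
        Supplier<ProductA> fresh = () -> new Widget("fresh");

        // Source 2: hand back a pre-built object stored in a data structure.
        Map<String, ProductA> prebuilt = Map.of("default", new Widget("prebuilt"));
        Supplier<ProductA> stored = () -> prebuilt.get("default");

        System.out.println(describe(fresh));  // prints "got fresh"
        System.out.println(describe(stored)); // prints "got prebuilt"
    }
}
```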

That composes well

The AbstractFactory pattern is a horrible abstraction (again) because it doesn't compose. See how it has two methods: one to create a ProductA, the other to create a ProductB. Why do I have to put them together in the same Factory? Why can't I have the two operations separated, so that whoever wants them together can choose to put them in a tuple? Locking them inside a class also makes them hard to combine with anything else. What if I have another operation somewhere that requires a ProductA? Does that operation need to know about the AbstractFactory chain of mistakes?

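// Obtain a ProductA through the factory hierarchy.
AbstractFactory myFactory = new Factory1();
ProductA myProduct = myFactory.createProductA();
// A second factory is needed just to consume that ProductA.
SomethingDifferentFactory somethingDifferent = new SomethingDifferentFactory();
SomethingDifferent result = somethingDifferent.fabricate(myProduct);
return result;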

Instead, if I have "an operation that gets me an object with interface ProductA", and another operation that requires a ProductA to construct something different, I call the first operation to get the ProductA, and then the second operation with it. In terms of types:

operation : () -> ProductA
otherOperation : ProductA -> SomethingDifferent

Therefore: otherOperation(operation()). Done.
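The same composition can be written in Java with the standard Supplier and Function interfaces. This is only a sketch: ProductA1, label, the contents field and the Composition class are hypothetical names, and the record syntax assumes Java 16 or later.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical stand-ins for the types named in the article.
interface ProductA { String label(); }
record ProductA1() implements ProductA {
    public String label() { return "A1"; }
}
record SomethingDifferent(String contents) {}

class Composition {
    // operation : () -> ProductA
    static final Supplier<ProductA> operation = ProductA1::new;

    // otherOperation : ProductA -> SomethingDifferent
    static final Function<ProductA, SomethingDifferent> otherOperation =
        a -> new SomethingDifferent("built from " + a.label());

    // otherOperation(operation())
    static SomethingDifferent run() {
        return otherOperation.apply(operation.get());
    }

    public static void main(String[] args) {
        System.out.println(run().contents()); // prints "built from A1"
    }
}
```

Neither operation knows about the other. The only thing connecting them is that the output type of the first matches the input type of the second.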

Detecting good compositionality

Unfortunately, good compositionality is hard to detect by inspection, because we developers are so accustomed to stuff that doesn't compose that we just don't know the "shape" of things that compose well. It is, however, easy to feel. It feels like Lego bricks, where each piece fits easily with the others, and making big programs is just a case of:

I have something with a triangle prong, let's see what's available with a triangle hole.

When you have good compositionality, you don't use adapters (or you don't even name them adapters, because they are so obviously simple). When you have good compositionality, you just connect the parts to construct a whole, without having to worry that the parts will interact with each other in unexpected ways.

The total is the exact sum of its parts, nothing more and nothing less.

Conclusion

We have seen how the AbstractFactory brutally violates the Rule of Seven by corrupting the single, simple concept of "gets a ProductA" into:

A Factory that is Abstract which implements a createProductA method that uses new to <<creates>> a ProductA1 that has a ProductA <<interface>>.

That sentence alone holds 10 words and concepts, which is more than even the smartest brains are happy to handle. Even worse, the pattern is too rigid, as it only allows creating objects, not reusing them.

This is one of the reasons why programming is hard: because we make it needlessly hard.

Javier Casas

A random walk through computer science