Unitiliy

Immutability, Abstraction and Composition

In the previous article we talked about how we naturally solve large problems by breaking them down into smaller ones. When writing code, we break our code down into individual units and test them in isolation. We then compose those units together to form programs. In this article we discuss how to write units so we can compose them together easily and reliably.

Public interfaces and abstraction

Suppose we are writing a FirstNameReportWriter unit that requires a list of first names to generate a report. We will need to inject a dependency to provide us with that list. What type should that dependency be? It doesn't actually matter, as long as it provides us with a list of first names.

We generally define a type that exposes only the necessary functionality, and our dependency will be of that type. So for the FirstNameReportWriter unit, we define a FirstNameDataStore type and inject it into the constructor. We say that the FirstNameReportWriter class is the consumer of the FirstNameDataStore.

public interface FirstNameDataStore {
    Collection<String> getAllFirstNames();
}

public class FirstNameReportWriter {
    private final FirstNameDataStore firstNameDataStore;

    public FirstNameReportWriter(FirstNameDataStore firstNameDataStore) {
        this.firstNameDataStore = firstNameDataStore;
    }

    // ...
}

Any unit that can perform the FirstNameDataStore functionality can be used, for example a unit that reads from a database.

public class Database implements FirstNameDataStore {
    private final String url;

    public Database(String url) {
        this.url = url;
    }

    @Override
    public Collection<String> getAllFirstNames() {

        DatabaseClient databaseClient = DatabaseClientFactory.newclient(url);

        return databaseClient.runQuery(
                "select firstname from users");
    }
}

This is called an abstraction. When you create an abstraction you hide the details. The consumer of the FirstNameDataStore doesn't know anything about SQL or database tables; these details have been abstracted away.

The less we have to know about the inner workings of a unit, the easier it is to compose. The more details we hide, the less information we need to keep in our heads, and the easier and faster we can work.
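To make the substitution concrete, here is a minimal sketch of a second implementation that keeps the names in memory. InMemoryFirstNameDataStore and FirstNameCounter are hypothetical names invented for this illustration, the classes are package-private only so that everything fits in one file, and List.copyOf assumes Java 10 or later.

```java
import java.util.Collection;
import java.util.List;

// The abstraction from the article, repeated so the sketch is self-contained.
interface FirstNameDataStore {
    Collection<String> getAllFirstNames();
}

// Hypothetical implementation: no database, no SQL, just a fixed list.
class InMemoryFirstNameDataStore implements FirstNameDataStore {
    private final List<String> firstNames;

    InMemoryFirstNameDataStore(Collection<String> firstNames) {
        // Defensive copy, so later changes to the caller's collection are not visible here.
        this.firstNames = List.copyOf(firstNames);
    }

    @Override
    public Collection<String> getAllFirstNames() {
        return firstNames;
    }
}

// Hypothetical consumer: it only knows about the FirstNameDataStore type.
class FirstNameCounter {
    private final FirstNameDataStore firstNameDataStore;

    FirstNameCounter(FirstNameDataStore firstNameDataStore) {
        this.firstNameDataStore = firstNameDataStore;
    }

    int countFirstNames() {
        return firstNameDataStore.getAllFirstNames().size();
    }
}
```

The consumer compiles and behaves the same whether it is handed this in-memory store or the Database class, which is what makes the in-memory version convenient for testing a consumer in isolation.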

The public interface of a unit consists of the types of its functions' inputs and outputs and, in some situations, the names of its functions. We must always know the public interface of a unit to consume it as a dependency. The better a unit's public interface abstracts away its implementation, the easier that unit is to compose.

So what's a good abstraction? Our goal is to abstract away as much as possible, and no more. Imagine if our Database implementation looked like this:

public class Database {
    private final String url;

    public Database(String url) {
        this.url = url;
    }

    public Collection<String> getAllFirstNames(String tableName) {

        DatabaseClient databaseClient = DatabaseClientFactory.newclient(url);

        return databaseClient.runQuery(
                String.format("select firstname from %s", tableName));
    }
}

Now the consumer must know which table the users are stored in. We haven't abstracted away as much as we could, so we must now know more about the internals of that class when we consume it as a dependency.

Conversely, we also need to make sure that we do not abstract away too much, otherwise unexpected things might happen when we compose our units together. If our Database implementation was

public class Database implements FirstNameDataStore {
    private final String url;

    public Database(String url) {
        this.url = url;
    }

    @Override
    public Collection<String> getAllFirstNames() {
        if (OnlineBanking.isLoggedIn()) {
            OnlineBanking.setBalance(0);
        }

        DatabaseClient databaseClient = DatabaseClientFactory.newclient(url);

        return databaseClient.runQuery(
                "select firstname from users");
    }
}

then we'll probably see a developer swearing at the monitor sometime soon.

The last example may be quite extreme, but the point is that we must be able to form a strong intuition about what a unit does by looking only at its public interface, i.e. without looking at its implementation. We want a unit's public interface to be a good abstraction.

Mutability

If we write our programs using individually tested units, and we strive to write good public interfaces, then we are close to having composable units of code. There is however one problem waiting to bite us, and that is mutability.

A type is mutable if the value of an instance can be changed. It sounds pretty innocuous, doesn't it? We may have been mutating variables all our lives, but the sad truth is that this makes it really difficult to compose units of code without knowing about their implementation.

Mutable state creates a spooky-action-at-a-distance problem. Imagine we have the following class:

public class GameState {
    int numberOfPlayers = 0;

    public int getNumberOfPlayers() {
        return numberOfPlayers;
    }

    public void setNumberOfPlayers(int i) {
        numberOfPlayers = i;
    }
}

and the following code:

GameState gameState = new GameState();
gameState.setNumberOfPlayers(1);
dependency.playGame(gameState);
System.out.println(gameState.getNumberOfPlayers());
System.out.println(gameState.getNumberOfPlayers());
System.out.println(gameState.getNumberOfPlayers());

What do we expect to be printed to the screen? The answer is that it could be anything. We have seen on the second line that the number of players in the GameState can be mutated, and we have no idea what happens in the playGame function. We have to learn about the internals of the dependency unit before we can compose its functionality. We need to consider a much larger proportion of the codebase when reasoning about code.
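As a sketch of why the answer could be anything, here is one possible implementation hiding behind playGame. GameRunner and its behaviour are invented for this illustration (the article never shows the dependency), and the classes are package-private only so that they fit in one file.

```java
// The mutable GameState from above, repeated so the sketch is self-contained.
class GameState {
    private int numberOfPlayers = 0;

    int getNumberOfPlayers() {
        return numberOfPlayers;
    }

    void setNumberOfPlayers(int i) {
        numberOfPlayers = i;
    }
}

// Hypothetical dependency. Nothing in the signature of playGame
// warns the caller that its argument will be changed.
class GameRunner {
    void playGame(GameState gameState) {
        gameState.setNumberOfPlayers(gameState.getNumberOfPlayers() + 3);
    }
}

class SpookyActionSketch {
    // The caller sets one player, yet a different number comes back.
    static int playersAfterGame() {
        GameState gameState = new GameState();
        gameState.setNumberOfPlayers(1);
        new GameRunner().playGame(gameState);
        return gameState.getNumberOfPlayers();
    }
}
```

With this particular GameRunner the caller reads 4 rather than the 1 it set, and a different implementation could produce any other number without the calling code changing at all.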

The spooky action at a distance is a value changing because of a completely different part of the codebase. This issue really begins to drive you insane when running multiple threads. Imagine that the dependency.playGame function starts a new thread that mutates the gameState periodically. It is then possible that each of the three print statements prints a different value.

It gets worse: a thread could mutate the value of the gameState whilst another thread is reading it. Remarkably, the reader may see something that doesn't match the value before or after the mutation. This can be solved with locks, but locks really don't compose well. You may have two units that use locks and work perfectly well individually, but calling them at the same time from a different unit may introduce a deadlock.
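The lock composition problem can be sketched as follows. LockedCounter and LockCompositionSketch are hypothetical names made up for this illustration. The sketch deliberately uses tryLock with a timeout instead of a plain lock() so that it terminates, and the two latches exist only to force the unlucky interleaving every time.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical unit: correct on its own, because every access takes its lock.
class LockedCounter {
    final ReentrantLock lock = new ReentrantLock();
    private int value = 0;

    void add(int amount) {
        lock.lock();
        try {
            value += amount;
        } finally {
            lock.unlock();
        }
    }
}

class LockCompositionSketch {
    // Takes the lock of 'from', then tries to take the lock of 'to'.
    // Returns true only if the second lock could be obtained.
    static boolean tryMove(LockedCounter from, LockedCounter to,
                           CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        from.lock.lock();
        try {
            // Wait until both threads hold their first lock.
            bothHoldFirst.countDown();
            bothHoldFirst.await();

            // A plain lock() here would block forever; the timeout lets the sketch end.
            boolean gotSecond = to.lock.tryLock(100, TimeUnit.MILLISECONDS);
            if (gotSecond) {
                try {
                    from.add(-1); // fine: ReentrantLock allows re-entry by its owner
                    to.add(1);
                } finally {
                    to.lock.unlock();
                }
            }

            // Keep the first lock until the other thread has finished trying.
            bothTried.countDown();
            bothTried.await();
            return gotSecond;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            from.lock.unlock();
        }
    }

    // Runs a -> b and b -> a concurrently and reports whether either move succeeded.
    static boolean eitherGotBothLocks() {
        LockedCounter a = new LockedCounter();
        LockedCounter b = new LockedCounter();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        AtomicBoolean anySuccess = new AtomicBoolean(false);

        Thread first = new Thread(() -> {
            if (tryMove(a, b, bothHoldFirst, bothTried)) anySuccess.set(true);
        });
        Thread second = new Thread(() -> {
            if (tryMove(b, a, bothHoldFirst, bothTried)) anySuccess.set(true);
        });
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return anySuccess.get();
    }
}
```

Each LockedCounter is safe in isolation, yet neither thread can obtain its second lock, so eitherGotBothLocks() returns false. Avoiding this typically means agreeing on a lock order across units, which is exactly the kind of implementation detail a public interface is supposed to hide.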

Units that mutate state almost always abstract away too much. Look at the previous example again:

dependency.playGame(gameState);

It is very difficult to know what the unit is doing from its public interface. The playGame function doesn't return anything, so it must either do nothing or cause a side effect. There's no point calling a function that does nothing, so my money's on it causing a side effect. The only question is: what? It's probable that the playGame function only mutates the gameState, but it's possible that it launches a full nuclear strike.

Immutability

Ultimately mutable state impedes abstraction and therefore makes it harder to compose units. Given my definition of good code in article 1, I would call programs with lots of mutable state bad code.

Let's now consider how the code looks if we use an immutable data structure. Here is an alternative declaration of the GameState class:

public class GameState {
    public final int numberOfPlayers;

    public GameState(int numberOfPlayers) {
        this.numberOfPlayers = numberOfPlayers;
    }
}

The calling code would now probably look something like this:

GameState gameState = new GameState(1);
GameState newGameState = dependency.playGame(gameState);

We still can't be sure the playGame function isn't launching nukes (at least not in Java; in Haskell we can), but at least we know we can access both gameState and newGameState without some other thread messing about with them as we do.
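A minimal runnable sketch of the immutable version follows. GameRunner and the rule that a game adds three players are assumptions made purely for illustration, and the classes are package-private only so that they fit in one file.

```java
// The immutable GameState from above, repeated so the sketch is self-contained.
final class GameState {
    public final int numberOfPlayers;

    GameState(int numberOfPlayers) {
        this.numberOfPlayers = numberOfPlayers;
    }
}

// Hypothetical dependency: it cannot change its argument,
// so the only way to report a change is to return a new value.
class GameRunner {
    GameState playGame(GameState gameState) {
        return new GameState(gameState.numberOfPlayers + 3);
    }
}

class ImmutableStateSketch {
    // Returns the player counts of the original and of the new state.
    static int[] playersBeforeAndAfter() {
        GameState gameState = new GameState(1);
        GameState newGameState = new GameRunner().playGame(gameState);
        return new int[] { gameState.numberOfPlayers, newGameState.numberOfPlayers };
    }
}
```

The original gameState still holds 1 after the call, whatever playGame does internally, and the change is visible in the return type rather than hidden in a side effect.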

Conclusion

So how does immutability affect our previous discussion about individually tested units?

  1. We must add a restriction that our workflow units are stateless: they must not have any fields other than their dependencies. We only need to add this restriction to workflow units because pure units must be stateless by definition, and conversely side effect units are stateful by definition.

  2. We also add a restriction to the public interfaces of all units that they must only pass immutable data types.

  3. When we need mutable state we wrap that state in a side effect unit, but still pass immutable data structures through its public interface.
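The third point can be sketched like this. PlayerRegistry is a hypothetical side effect unit invented for this illustration, and List.copyOf assumes Java 10 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical side effect unit: the mutable list never leaves the class.
class PlayerRegistry {
    private final List<String> players = new ArrayList<>();

    // The only way to change the state is through this method.
    synchronized void addPlayer(String name) {
        players.add(name);
    }

    // Callers receive an unmodifiable snapshot, never the internal list.
    synchronized List<String> getPlayers() {
        return List.copyOf(players);
    }
}
```

A snapshot returned by getPlayers() does not change when more players are added later, and attempts to modify it throw an UnsupportedOperationException, so consumers can hold on to it without worrying about what the registry does next.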

If we revisit the complete FirstNameReportWriter from the previous article, it is an example of a stateless workflow unit.

public class FirstNameReportWriter {
    private final ReportCreator reportCreator;
    private final FirstNameDataStore firstNameDataStore;
    private final ReportWriter reportWriter;

    public FirstNameReportWriter(
            ReportCreator reportCreator,
            FirstNameDataStore firstNameDataStore,
            ReportWriter reportWriter
    ) {
        this.reportCreator = reportCreator;
        this.firstNameDataStore = firstNameDataStore;
        this.reportWriter = reportWriter;
    }

    public void writeReport(String reportTitle) throws IOException {
        Collection<String> firstNames = firstNameDataStore.getAllFirstNames();

        String report = reportCreator.createReport(reportTitle, firstNames);

        reportWriter.writeReport(report);
    }
}

Its only fields are the dependencies that are injected into its constructor. Its public interface also only exposes immutable data types, which is true for all the units in the previous article.

We end up with something that looks quite functional where we have services and data. Services are stateless units that have other services injected into their constructors. Data are the immutable data structures that flow through the services. This way of writing code is best shown in the visualisation of the previous article's example.

If we write our code like this we have individual units that we know work on their own, and we know can be composed together robustly. Therefore we have code that can be further developed quickly and easily. This is good code.