
The SOLID principles

In the previous articles we discussed a generic way of writing code that we can change quickly and easily. We will now show that this code obeys the SOLID principles.

Single responsibility

Each unit should have a single responsibility.

But what does it mean to have a single responsibility? Let's go back to the very first code example we looked at: a single unit of code that generated a report of all the first names in a database.

public class App {
    public static void main(String[] args) throws IOException {
        new App().run(args[0]);
    }

    private void run(String reportTitle) throws IOException {
        DatabaseClient databaseClient = DatabaseClientFactory.newClient(
            "jdbc:sqlserver://db1:2000;databaseName=prodDb");

        Collection<String> firstNames = databaseClient.runQuery(
                "select firstname from users");

        Map<String, Integer> counts = new HashMap<>();

        for (String firstName : firstNames) {
            String capitalisedFirstName = Character.toUpperCase(firstName.charAt(0)) +
                    firstName.substring(1);

            Integer currentCount = counts.get(capitalisedFirstName);
            counts.put(capitalisedFirstName, currentCount == null ? 1 : currentCount + 1);
        }

        StringBuilder report = new StringBuilder();
        report.append(reportTitle);
        report.append(System.lineSeparator());

        for (Map.Entry<String, Integer> nameAndCounts : counts.entrySet()) {
            report.append(
                    String.format(
                            "%s:%d%s",
                            nameAndCounts.getKey(),
                            nameAndCounts.getValue(),
                            System.lineSeparator()
                    )
            );
        }

        FileUtils.writeStringToFile(
                new File("/home/reports/firstNames.txt"),
                report.toString(),
                StandardCharsets.UTF_8
        );
    }
}

Now does this unit have a single responsibility? I would say that it does. Its responsibility is to 'write a report of all the first names in a database'. Just as we talked about breaking down big problems into lots of small problems, responsibility composes in exactly the same way. It is the CEO's responsibility to manage a company, but he/she delegates parts of that responsibility to other people. Extracting dependencies is entirely equivalent to delegating part of a unit's single responsibility.

So when should a unit delegate some of its responsibility to a dependency? And how much should it delegate?

If our only rule is that each unit must have a single responsibility then it actually doesn't matter how we extract our dependencies. They are almost always going to have a single responsibility. We could extract dependencies like this:

  • FirstNameReportWriter -> write a report of all the first names in a database

    • FirstNameDataStore -> get the first name data
    • ReportCreator -> generate a report from the data
    • ReportWriter -> save the report

or like this:

  • FirstNameReportWriter -> write a report of all the first names in a database

    • FirstNameDataStore -> get the first name data
      • QueryGenerator -> generate a SQL query
      • QueryRunner -> run a SQL query
    • StringCapitaliser -> capitalise a list of strings
    • StringCounter -> count occurrences of items in a list
    • FormattedCountWriter -> write a formatted list of counts to a file

They will always have single responsibilities because the way we compose units together mirrors how responsibility is delegated.
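To make the finer-grained decomposition concrete, two of the pure units from the second list might look something like this (the class names come from the list above; the method signatures are just one plausible choice):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Pure unit: capitalise a list of strings. No side effects; the output
// depends only on the input.
class StringCapitaliser {
    List<String> capitalise(List<String> strings) {
        List<String> result = new ArrayList<>();
        for (String s : strings) {
            result.add(Character.toUpperCase(s.charAt(0)) + s.substring(1));
        }
        return result;
    }
}

// Pure unit: count occurrences of items in a list.
class StringCounter {
    Map<String, Integer> count(List<String> strings) {
        Map<String, Integer> counts = new HashMap<>();
        for (String s : strings) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }
}
```

Each of these units has a single responsibility, and each is a single type (pure), so they compose and test trivially.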

It is now more useful to say that all units should have a single type (pure, side effect or workflow), rather than a single responsibility. If we follow this rule then, whenever we are considering extracting dependencies from a unit:

  • we must extract all or none of the logic. If we extract only some of the code the unit will not be a single type: it will have dependencies like a workflow unit but some remaining logic like a pure or side effect unit.

  • if we choose to extract a dependency then we must extract at least two. There is no point having a workflow unit with a single dependency (just as there is no point having a worker who delegates all his/her responsibility to a single person).

  • side effect units are difficult to test, so we should delegate as little responsibility to them as possible. We must extract them, otherwise we would end up with a single side effect unit (all programs perform side effects). As we saw at the very start of the first article, a unit like this can't be tested.

  • we should extract as much logic as possible into pure units, because this allows us to abstract away the most code, as discussed in the previous article. The larger a unit is compared to its public interface, the fewer details we have to worry about when using it.

With this in mind, if we look again at the FirstNameReportWriter unit, we must extract the side effects ReportWriter and FirstNameDataStore, keeping them as small as possible. We should then extract all the remaining logic into a single pure ReportCreator unit.

... And so we end up at the very same place we finished the first article. Our FirstNameReportWriter unit looks like this

public class FirstNameReportWriter {
    private final ReportCreator reportCreator;
    private final FirstNameDataStore firstNameDataStore;
    private final ReportWriter reportWriter;

    public FirstNameReportWriter(
            ReportCreator reportCreator,
            FirstNameDataStore firstNameDataStore,
            ReportWriter reportWriter
    ) {
        this.reportCreator = reportCreator;
        this.firstNameDataStore = firstNameDataStore;
        this.reportWriter = reportWriter;
    }

    public void writeReport(String reportTitle) throws IOException {
        Collection<String> firstNames = firstNameDataStore.getAllFirstNames();

        String report = reportCreator.createReport(reportTitle, firstNames);

        reportWriter.writeReport(report);
    }
}

and all the units have a single responsibility but, more importantly, they all have a single type.
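The pure ReportCreator unit, which holds all the extracted logic, might be implemented along these lines (the interface shape is inferred from how writeReport uses it above, so treat it as a sketch rather than the article's exact code):

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

interface ReportCreator {
    String createReport(String reportTitle, Collection<String> firstNames);
}

// Pure unit: no side effects, which makes it trivial to test.
class ReportCreatorImpl implements ReportCreator {
    @Override
    public String createReport(String reportTitle, Collection<String> firstNames) {
        // count the capitalised first names
        Map<String, Integer> counts = new HashMap<>();
        for (String firstName : firstNames) {
            String capitalised =
                    Character.toUpperCase(firstName.charAt(0)) + firstName.substring(1);
            counts.merge(capitalised, 1, Integer::sum);
        }

        // format the counts into the report body
        StringBuilder report = new StringBuilder();
        report.append(reportTitle).append(System.lineSeparator());
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            report.append(String.format(
                    "%s:%d%s", entry.getKey(), entry.getValue(), System.lineSeparator()));
        }
        return report.toString();
    }
}
```

Because it is pure, we can exercise every branch of this unit with plain inputs and outputs, with no mocks or I/O.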

Open closed

Each unit should be open for extension, but closed for modification.

This means that we should be able to extend the functionality of our units without modifying them. We can't break anything if we don't change it, right? So, as an example, let's pretend our product manager has asked us to extend the FirstNameReportWriter to report on the first names in a second database as well. Our first thought might be to simply inject a second FirstNameDataStore dependency and combine the results like this:

public class FirstNameReportWriter {
    private final ReportCreator reportCreator;
    private final FirstNameDataStore firstNameDataStore;
    private final FirstNameDataStore firstNameDataStore2;
    private final ReportWriter reportWriter;

    public FirstNameReportWriter(
            ReportCreator reportCreator,
            FirstNameDataStore firstNameDataStore,
            FirstNameDataStore firstNameDataStore2,
            ReportWriter reportWriter
    ) {
        this.reportCreator = reportCreator;
        this.firstNameDataStore = firstNameDataStore;
        this.firstNameDataStore2 = firstNameDataStore2;
        this.reportWriter = reportWriter;
    }

    public void writeReport(String reportTitle) throws IOException {
        // copy, as the returned collection may not support addAll
        Collection<String> firstNames = new ArrayList<>(firstNameDataStore.getAllFirstNames());
        firstNames.addAll(firstNameDataStore2.getAllFirstNames());

        String report = reportCreator.createReport(reportTitle, firstNames);

        reportWriter.writeReport(report);
    }
}

but we have modified our unit, which goes against the open closed principle. We can actually extend the functionality of the FirstNameReportWriter to act on multiple data stores without modifying it, and here's how: we create a new FirstNameDataStore that queries multiple other FirstNameDataStores.

public class MultipleFirstNameStores implements FirstNameDataStore {
    private final Collection<FirstNameDataStore> stores;

    public MultipleFirstNameStores(Collection<FirstNameDataStore> stores) {
        this.stores = stores;
    }

    @Override
    public Collection<String> getAllFirstNames() {
        List<String> result = new ArrayList<>();

        for (FirstNameDataStore store : stores) {
            result.addAll(store.getAllFirstNames());
        }

        return result;
    }
}

We can now inject this into the original (unmodified) version of the FirstNameReportWriter and we have extended its behaviour. If we can always modify our programs without changing code (only by adding new code) then they are going to be more robust. The only part of the codebase we need to modify is the bit that creates the dependencies, so for the ongoing example in these articles we would change the root App from

public class App {
    public static void main(String[] args) throws IOException {
        FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
                new ReportCreatorImpl(),
                new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"),
                new FileWriter()
        );


        firstNameReportWriter.writeReport(args[0]);
    }
}

to

public class App {
    public static void main(String[] args) throws IOException {
        FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
                new ReportCreatorImpl(),
                new MultipleFirstNameStores(
                        Arrays.asList(
                                new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"),
                                new Database("jdbc:sqlserver://db2:2000;databaseName=prodDb")
                        )
                ),
                new FileWriter()
        );

        firstNameReportWriter.writeReport(args[0]);
    }
}

But this is a relatively easy modification: because we are only calling constructors, the type system has our back.

So what was special about the FirstNameReportWriter unit that allowed us to extend its functionality without modifying it? It comes down to abstraction, as discussed previously. When we inject a dependency we inject the most general public interface that provides the functionality we need. This means that there is a wide range of implementations that could fulfil the requirement. The better our abstractions, the more open closed our code is.
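Concretely, the dependency consumed here is assumed to be nothing more than a one-method interface, which is exactly why so many implementations (a single database, a composite of databases, an in-memory fake) can satisfy it:

```java
import java.util.Arrays;
import java.util.Collection;

// The most general interface that still provides what FirstNameReportWriter needs.
interface FirstNameDataStore {
    Collection<String> getAllFirstNames();
}

// Any implementation will do; here is a hard-coded in-memory one,
// useful for tests (the names are illustrative).
class InMemoryFirstNames implements FirstNameDataStore {
    @Override
    public Collection<String> getAllFirstNames() {
        return Arrays.asList("steve", "mike");
    }
}
```

The narrower the interface, the wider the range of units that can stand behind it.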

Liskov substitution principle

Objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.

This principle states that if a class Dog inherits from a class Animal, then any usage of an instance of Animal should be replaceable with an instance of Dog without changing the program's behaviour. We haven't mentioned anything about inheritance up till now. In fact, I would go so far as to say you shouldn't use inheritance. But this doesn't mean we don't have to pay attention to this principle.

As discussed in the last section, we should consume the most general public interface possible. This means that there are many units that can be injected into a given unit. The Liskov substitution principle can be thought of as a restriction on the open closed principle: it states, don't inject something you shouldn't.

Let's imagine our FirstNameDataStore public interface dependency was a little more powerful

public interface FirstNameDataStore {
    void write(String value);
    Collection<String> getAllFirstNames();
}

and our FirstNameReportWriter unit injected this as its dependency. Now our product manager asks us if we can write the report for all the people who work at our company. It's very tempting to try and bend that list into something that looks like a FirstNameDataStore: then we can inject it into our FirstNameReportWriter unit and we are finished. So we create this unit

public class EmployeeNames implements FirstNameDataStore {

    private Collection<String> firstNames = Arrays.asList(
            "steve", "mike", "sarah", "claire"
    );

    @Override
    public void write(String value) {}

    @Override
    public Collection<String> getAllFirstNames() {
        return firstNames;
    }
}

This works fine for the report writing, but elsewhere in the codebase we have a unit that looks like this

public class StoreUpdator {
    private final FirstNameDataStore firstNameDataStore;

    public StoreUpdator(FirstNameDataStore firstNameDataStore) {
        this.firstNameDataStore = firstNameDataStore;
    }

    public Collection<String> addNames(Collection<String> names) {
        for (String name : names) {
            firstNameDataStore.write(name);
        }
        return firstNameDataStore.getAllFirstNames();
    }
}

When we inject EmployeeNames into this unit we get very strange results. We may have code that assumes (quite reasonably) that the names passed into addNames will appear in the return value. We could make the EmployeeNames unit behave like a proper FirstNameDataStore by implementing write; it would be simple to update its internal list of first names when write is called. But then we'd have another problem: our EmployeeNames data store would contain names of people who aren't employees of the company.
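To see the strangeness concretely, here is a self-contained sketch that re-states the units from above and composes them:

```java
import java.util.Arrays;
import java.util.Collection;

interface FirstNameDataStore {
    void write(String value);
    Collection<String> getAllFirstNames();
}

// Silently ignores writes, breaking the contract implied by the interface.
class EmployeeNames implements FirstNameDataStore {
    private final Collection<String> firstNames =
            Arrays.asList("steve", "mike", "sarah", "claire");

    @Override
    public void write(String value) {}

    @Override
    public Collection<String> getAllFirstNames() {
        return firstNames;
    }
}

class StoreUpdator {
    private final FirstNameDataStore firstNameDataStore;

    StoreUpdator(FirstNameDataStore firstNameDataStore) {
        this.firstNameDataStore = firstNameDataStore;
    }

    Collection<String> addNames(Collection<String> names) {
        for (String name : names) {
            firstNameDataStore.write(name);
        }
        return firstNameDataStore.getAllFirstNames();
    }
}
```

Calling addNames with a new name returns a collection that does not contain it: the caller's reasonable assumption is violated, which is precisely the substitution failure the principle warns about.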

The underlying problem is that we have used something that isn't a real FirstNameDataStore as per the contract of that public interface. That's exactly what the Liskov substitution principle is there to protect against. If we ignore it, strange things happen when we add dependencies, and we can't compose units quickly and easily.

Interface segregation

No client should be forced to depend on methods it does not use.

If we revisit the last section, it was actually a pretty reasonable expectation to be able to inject the EmployeeNames unit into the FirstNameReportWriter. The reason we couldn't is that (for just that section of the article) the FirstNameReportWriter required a FirstNameDataStore with both a getAllFirstNames and a write function, despite the fact that write wasn't used. If we had obeyed the interface segregation principle, the FirstNameReportWriter unit would have consumed a dependency that only had a getAllFirstNames function. Then we would have been able to implement it with the EmployeeNames unit.

The interface segregation principle is usually discussed in terms of preferring many small interfaces to one big one. In this context we could have split the previous section's FirstNameDataStore into two interfaces, one with a write function and one with a getAllFirstNames function. In the context of this series of articles, however, it's pretty much equivalent to 'abstract away as much as you can'. E.g. in the previous example there was no need to worry about the write function, so we could have just abstracted it away.
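Split out, the two interfaces might look like this (FirstNameSource and FirstNameSink are names I've invented for illustration):

```java
import java.util.Arrays;
import java.util.Collection;

// Two narrow interfaces instead of one wide FirstNameDataStore.
interface FirstNameSource {
    Collection<String> getAllFirstNames();
}

interface FirstNameSink {
    void write(String value);
}

// EmployeeNames only has names to read, so it honestly implements
// just the source side of the contract.
class EmployeeNames implements FirstNameSource {
    @Override
    public Collection<String> getAllFirstNames() {
        return Arrays.asList("steve", "mike", "sarah", "claire");
    }
}
```

A FirstNameReportWriter that consumes FirstNameSource can now accept EmployeeNames, while the StoreUpdator, which genuinely needs both halves, would depend on both interfaces.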

Dependency inversion

High level modules should not depend on low level modules.

We can summarise this principle as 'all units should know as little as possible about their dependencies'. We have come across the benefits of this when discussing composition and the open closed principle. We learnt that the public interface of our dependencies should be as general as possible.

We may not have mentioned it at the time, but this principle has been driving the design of our coding style right from the start. If a unit must know as little as possible about its dependencies, it is not able to construct any. In Java, a unit would need to know about the concrete type to call new, and, as a result, would know everything about its dependency. It would be strongly coupled to it and would violate the dependency inversion principle.

If a unit cannot create its dependencies, it must be given them. This is why we have injected a unit's dependencies into its constructor. This is called the dependency injection pattern; it can be thought of as an implementation of the dependency inversion principle.

In any application there should only be one place where dependencies are created. It should be decoupled from any logic. In our ongoing example this is in the App class

public class App {
    public static void main(String[] args) throws IOException {
        FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
                new ReportCreatorImpl(),
                new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"),
                new FileWriter()
        );

        firstNameReportWriter.writeReport(args[0]);
    }
}

We don't actually have to create the object hierarchy ourselves. We can use a dependency injection (DI) framework to do it for us: we simply register all our units and the framework handles creating them when needed. Here is what it would look like in Java using Guice. First we define which units to inject.

public class ReportModule extends AbstractModule {
    @Override
    protected void configure() {
        // a concrete class needs no explicit target; its constructor should be
        // annotated with @Inject so Guice can construct it
        bind(FirstNameReportWriter.class);
        bind(ReportCreator.class).to(ReportCreatorImpl.class);
        bind(FirstNameDataStore.class).toInstance(
            new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"));
        bind(ReportWriter.class).to(FileWriter.class);
    }
}

bind(x).to(y) means 'whenever a unit requires an x, inject a y'. So, for example, the above code defines that whenever a unit requires the general ReportWriter interface, a FileWriter unit is injected. Then we create a container from the definitions (it contains all our units) and ask it for the root unit when we start our app.

public class App {
    public static void main(String[] args) throws IOException {
        Injector injector = Guice.createInjector(new ReportModule());
        injector.getInstance(FirstNameReportWriter.class).writeReport(args[0]);
    }
}

At first glance the two approaches (with or without a DI framework) seem pretty equivalent. The main difference is that the manual approach is checked against the tree structure of the dependencies at compile time: if we create our units manually with the new keyword, the code won't compile if we forget to inject something. So let's pretend we forget about the FileWriter dependency; the manual approach won't compile:

public class App {
    public static void main(String[] args) throws IOException {
        FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
                new ReportCreatorImpl(),
                new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb")
        );
        firstNameReportWriter.writeReport(args[0]);
    }
}

With a DI container this code will compile

public class ReportModule extends AbstractModule {
    @Override
    protected void configure() {
        bind(FirstNameReportWriter.class);
        bind(ReportCreator.class).to(ReportCreatorImpl.class);
        bind(FirstNameDataStore.class).toInstance(
            new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"));
    }
}

but it will throw an exception when it tries to create an instance of FirstNameReportWriter, because not all of its dependencies have been registered. On the flip side, if your dependency tree is very large, a DI framework can significantly simplify the creation code.

A DI framework also lets us mock out a dependency deep down in the tree with very little effort. Imagine we are testing the FirstNameReportWriter unit and want to mock the ReportWriter unit. If we are manually creating units, our test will have some code like this

FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
        new ReportCreatorImpl(),
        new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"),
        mock(FileWriter.class)
);

if we are using a DI framework it might look like this

Module module = Modules.override(new ReportModule()).with((Module) binder -> {
    binder.bind(FileWriter.class).toInstance(mock(FileWriter.class));
});

Injector injector = Guice.createInjector(module);
FirstNameReportWriter firstNameReportWriter = injector.getInstance(FirstNameReportWriter.class);

Again, the manual case looks a bit simpler for this unit because its dependencies don't have dependencies of their own. If the mocked unit were much further down the dependency tree, the DI approach would often be easier, require fewer updates, and more closely match production.

The DI container really comes into its own when you want to control the lifecycle of your units. Imagine we've built a web framework and we want to inject a new Logger unit for every user, or for every request. It would be quite a task to modify the manual creation process to do this, but with a DI framework it can be pretty straightforward.

So should we use one or not? I don't want to be too opinionated on this, but I'd suggest we start by manually creating units. As long as we only do it once at the start of our application, and decouple it from any logic, we can always switch to a DI framework when the unit hierarchy grows large, or if we want to do clever things with the units' lifecycles.

What should we inject?

We want to completely decouple the lifecycle of our dependencies from our units. We don't want to write logic that checks whether a dependency has been created or still exists, and, if necessary, we want to be able to delegate the responsibility of managing unit lifecycles to a DI framework. We should be careful not to inject anything with a predefined lifecycle. E.g. if we inject a GameState dependency, the lifecycle of this class is determined by the people playing the game. We can't easily hand over the responsibility of managing that lifecycle to a DI container, so we shouldn't inject it.

As we discussed at the end of the previous article, our coding pattern consists of stateless services and immutable data. We only inject services into other services, and the data flows through them. Because our services are stateless, they have no predefined lifecycle. Usually we create them once at the start of the app and they are singletons, but we are free to create as many as we want, whenever we want. This makes them ideal for DI based lifecycle management. So, on the whole, we only inject services and we don't inject data.

You may have noticed that our Database unit took a String (data) in its constructor. This was a bit of a shortcut to keep the examples simple. We could just as easily have extracted a ConfigProvider unit that would read the application config (urls, ports, etc.) and keep it in its state (remember, it's ok for side effect units to have state). The config would then be passed into the Database unit for the query.
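A sketch of that refactoring might look like this (the ConfigProvider's loading logic is stubbed out with a hard-coded map; a real one would read a file or environment variables, and the method names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical side effect unit that reads the application config once and
// keeps it in its state.
class ConfigProvider {
    private final Map<String, String> config = new HashMap<>();

    ConfigProvider() {
        // stand-in for actually loading config from a file or the environment
        config.put("db.url", "jdbc:sqlserver://db1:2000;databaseName=prodDb");
    }

    String get(String key) {
        return config.get(key);
    }
}

// Database now takes a service rather than raw data, so a DI container could
// construct it from its type alone.
class Database {
    private final ConfigProvider configProvider;

    Database(ConfigProvider configProvider) {
        this.configProvider = configProvider;
    }

    String connectionUrl() {
        return configProvider.get("db.url");
    }
}
```

With only services in its constructor, Database can be registered with the container by type, just like every other stateless service.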

If we revisit the earlier example, we can see that injecting data into the Database unit meant that we couldn't hand lifecycle management over to the DI container.

bind(FirstNameReportWriter.class);
bind(ReportCreator.class).to(ReportCreatorImpl.class);
bind(FirstNameDataStore.class).toInstance(
    new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"));
bind(ReportWriter.class).to(FileWriter.class);

For all of our units that are stateless services, we could just inform the DI container of their type. In this case the DI container is able to destroy and create instances as it wishes. The Database unit needed to be constructed with data, so we had to register a specific instance. In that case the DI container doesn't know how to create instances for itself, so the instance we gave it will last the lifetime of the application.

Summary

So that's it: if we write code as we have learnt to in these articles, we can be sure it's going to obey the SOLID principles. It's an unfair way of wording it though, because these principles have been driving the design of our coding style right from the start.