Unitiliy

Functional programming

Up to now I have claimed that all the techniques we have learnt in the previous articles are applicable to any language. However, I've only shown examples in Java. In this article we will apply what we have learnt not only to a different language, but to a completely different programming paradigm: functional programming. We'll reproduce the final result from the first article in Haskell.

I don't expect many people to know Haskell, so I'll keep the examples as simple as possible. Hopefully we'll be able to follow the code even if we don't understand every part of it. I'll use as few language features as possible and I'll ignore some idiomatic aspects of the language. If you're a keen Haskeller, please try to see past this.

A very brief introduction to Haskell

We'll start with a few simple examples to learn the syntax. The first class citizen in Haskell is the function. We can declare a function add and call it like this

add x y = x + y
result = add 3 2 -- equals 5

The equivalent in Java would be something like this

int add(int x, int y) {
    return x + y;
}
int result = add(3, 2); // equals 5

We can see that Haskell doesn't use brackets in function calls; it simply uses spaces. This takes a little getting used to, but I'd argue it's much simpler. It's just not what we are used to.

Functions in Haskell have a type, and if we wish we can declare it (we don't need to, the compiler can infer it).

add :: Int -> Int -> Int
add x y = x + y

We can read the type signature add :: Int -> Int -> Int left to right as: add is a function that takes an Int, takes another Int and returns an Int. Let's have a look at another example

increment :: Int -> Int
increment x = x + 1

increment is a function that takes an Int and returns another Int. It increments its argument by one. We can now write a type alias for all functions of this type

type UnaryInt = Int -> Int
increment :: UnaryInt
increment x = x + 1

This is simply giving the type Int -> Int a different name, but it hints at something interesting. What if we rewrite our add function like this

add :: Int -> UnaryInt
add x y = x + y

This implies that we can also interpret the type signature Int -> Int -> Int as something that takes an Int and returns a function that takes an Int and returns an Int. An example should clear this up.

add :: Int -> Int -> Int
add x y = x + y

add3 :: Int -> Int
add3 = add 3

result1 = add3 2 -- equals 5
result2 = add3 0 -- equals 3
result3 = add3 10 -- equals 13

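This is called partial application. It works because every Haskell function is curried: it takes its arguments one at a time, so supplying only some of them gives us back a function that is waiting for the rest.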
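A small sketch of why this is useful, reusing add and add3 from above. Because add 3 is itself a function, it can be handed to anything that expects an Int -> Int, such as map. The names explicit and addThreeToAll are invented for this illustration.

```haskell
-- add is curried: supplying one argument yields a new function
add :: Int -> Int -> Int
add x y = x + y

-- Partial application: fix the first argument to 3
add3 :: Int -> Int
add3 = add 3

-- add 3 2 is really (add 3) 2, so the brackets here change nothing
explicit :: Int
explicit = (add 3) 2

-- A partially applied function can be passed straight to map
addThreeToAll :: [Int] -> [Int]
addThreeToAll = map (add 3)
```

In GHCi, addThreeToAll [0, 2, 10] evaluates to [3,5,13], matching the three results above.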

In Haskell there's a special type IO. It tells us that a function interacts with the outside world (causes a side effect). Let's consider these two functions

transformAString1 :: String -> String
transformAString2 :: String -> IO String

transformAString1 takes a String and returns a String, so I can be sure this is a pure function: for the same input, it will always return the same output. This is not true for transformAString2. It returns IO String, which tells us that it returns a String but in doing so causes a side effect. We may get something different each time we run it, e.g. if we are reading from a database.

The last thing we need to know for the time being is that the equivalent of Java's void in Haskell is called unit, written with the symbol (). So a function with type

aFunction :: String -> IO ()

takes a single String as its input, returns nothing and causes a side effect.
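To make the distinction concrete, here is a minimal sketch of one function of each kind. The names shout and printShouted are hypothetical and only serve this illustration.

```haskell
import Data.Char (toUpper)

-- Pure: the result depends only on the argument
shout :: String -> String
shout text = map toUpper text ++ "!"

-- Impure: printing is a side effect, so the type says IO,
-- and there is nothing useful to return, so the result is ()
printShouted :: String -> IO ()
printShouted text = putStrLn (shout text)
```

Note that the impure function is a thin wrapper around the pure one. Keeping the IO part this small is what makes most of the logic easy to test.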

At this point we have learnt enough Haskell to start implementing what we have learnt in the previous articles, so let's do that now.

Composable units

Let's revisit our program to create a report of all the first names in a database. We visualised our code like this

This visualisation is language agnostic (ignore the Java function signature, we could have written that in some sort of pseudocode) so let's create these units in Haskell. It's a functional language so our units will be functions.

We'll work from the inner units outwards, testing them in isolation (car driven development). We begin with the pure unit ReportCreator. It takes a title and a list of first names, and outputs a report. Firstly we'll create an alias for its type

type ReportCreator = String -> [String] -> String

Now any unit that needs to create a report can consume a ReportCreator dependency. Now all we need to do is implement it.

import Data.Char (toUpper)
import Data.List (nub)

createReport :: ReportCreator
createReport title names = foldl addRow title (nub capitalisedNames)
    where
        capitalisedNames = map capitaliseFirstLetter names
        capitaliseFirstLetter [] = []
        capitaliseFirstLetter (x:xs) = toUpper x : xs
        -- one row per distinct name, in order of first appearance
        addRow content name = content ++ createLine name (countOf name)
        countOf name = length (filter (== name) capitalisedNames)
        createLine name count = "\n" ++ name ++ ":" ++ show count

Unsurprisingly, this is a more functional approach than the equivalent Java unit. There are also a few constructs here we haven't discussed yet. I'm not going to go into the details because I don't want this article to become an intro to Haskell. We just need to remember that the actual implementation isn't that important if we have a good set of tests. Writing those tests in Haskell is quite simple; all the code we need is

createReportTest1 = "an interesting title\nSteve:1\nMike:1"
    @=? createReport "an interesting title" ["steve", "mike"]
createReportTest2 = "an interesting title\nMike:1\nSteve:2"
    @=? createReport "an interesting title" ["mike", "steve", "steve"]
createReportTest3 = "an interesting title\nMike:2\nSteve:3"
    @=? createReport "an interesting title" ["mike", "steve", "mike", "steve", "steve"]

This uses a package called HUnit, and we can read @=? as must equal, with the expected value on the left and the actual value on the right.
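The assertions on their own don't run anything. As a rough sketch of how they could be collected and executed with HUnit, the block below wraps two assertions in a TestList and hands it to runTestTT. The createLine function here is a hypothetical stand-in for the unit under test, and runAllTests is an invented name.

```haskell
import Test.HUnit

-- A stand-in for the unit under test (hypothetical, for this sketch only)
createLine :: String -> Int -> String
createLine name count = "\n" ++ name ++ ":" ++ show count

-- Each assertion is wrapped in a TestCase and given a label
tests :: Test
tests = TestList
    [ TestLabel "single name" (TestCase ("\nSteve:1" @=? createLine "Steve" 1))
    , TestLabel "larger count" (TestCase ("\nMike:12" @=? createLine "Mike" 12))
    ]

-- runTestTT runs the tests and prints a summary to the terminal
runAllTests :: IO Counts
runAllTests = runTestTT tests
```

The returned Counts value records how many cases were tried and how many failed, so a build script can use it to decide whether to stop.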

We now create the other units that don't have dependencies. FirstNameDataStore is defined like this

type FirstNameDataStore = IO [String]

It has no inputs, returns a list of first names and causes a side effect (reading from the database). We can implement a Database version

database :: String -> FirstNameDataStore
database url = do
    connection <- connectTo url
    query connection "select firstname from users"

which we create and use like this

aDatabaseUnit = database "jdbc:sqlserver://db1:2000;databaseName=prodDb"
-- inside a do block, running the unit gives us the list of names
firstNames <- aDatabaseUnit

This is equivalent to how we defined the Database unit in Java

public class Database implements FirstNameDataStore {
    private final String url;

    public Database(String url) {
        this.url = url;
    }

    @Override
    public Collection<String> getAllFirstNames() {
        DatabaseClient databaseClient = DatabaseClientFactory.newclient(url);
        return databaseClient.runQuery(
                "select firstname from users");
    }
}

which we created and used like this

Database aDatabaseUnit = new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb");
Collection<String> firstNames = aDatabaseUnit.getAllFirstNames();

We want to create the Database unit with the database url so we don't need to worry about it again, however often we run it. In Java we pass the url into the unit's constructor. In Haskell we use currying to close over the url in the database function unit. Exactly the same difficulties arise when testing the database unit in Haskell as in Java, and we'll sidestep them once again in exactly the same way.
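Closing over an argument can be shown without a real database. The sketch below defines a second FirstNameDataStore that captures a fixed list of names instead of a url. The names inMemory, aTestUnit and countNames are hypothetical and not part of the original program.

```haskell
type FirstNameDataStore = IO [String]

-- Hypothetical stand-in for the database unit: the list of names is
-- captured when the unit is created, just as the url is for database
inMemory :: [String] -> FirstNameDataStore
inMemory names = return names

-- Create the unit once...
aTestUnit :: FirstNameDataStore
aTestUnit = inMemory ["steve", "mike"]

-- ...and run it as often as we like without passing the names again
countNames :: IO Int
countNames = do
    firstNames <- aTestUnit
    return (length firstNames)
```

Anything that consumes a FirstNameDataStore cannot tell this unit apart from the database one, which is what lets us swap it in for tests.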

Then we have our ReportWriter unit. This is defined as

type ReportWriter = String -> IO ()

It takes the String representation of the report and returns nothing but causes a side effect (writing to a file). Our FileWriter implementation is simply

writeToFile :: ReportWriter
writeToFile report = writeFile "/home/reports/firstNames.txt" report

which is relatively straightforward to test

writeToFileTest = do
    let report = "a great report"
    writeToFile report
    fileContents <- readFile "/home/reports/firstNames.txt"
    report @=? fileContents
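The test above writes to the same path that production uses. One possible way around that, sketched here on the assumption that we are free to reshape the unit, is to apply the same trick we used for the database url: take the path as a first argument and close over it. The names writeTo and roundTrip are hypothetical.

```haskell
type ReportWriter = String -> IO ()

-- Hypothetical variant: the path is supplied first, leaving a ReportWriter
writeTo :: FilePath -> ReportWriter
writeTo path report = writeFile path report

-- The production unit fixes the real path
writeToFile :: ReportWriter
writeToFile = writeTo "/home/reports/firstNames.txt"

-- A test can fix a scratch path instead and read the file back
roundTrip :: FilePath -> String -> IO String
roundTrip path report = do
    writeTo path report
    contents <- readFile path
    -- force the lazy read to finish before returning
    length contents `seq` return contents
```

A test would then call roundTrip with a scratch file and compare the result with the report it wrote, leaving the production file untouched.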

The remaining unit (apart from the runtime) is the FirstNameReportWriter. We can define its type

type FirstNameReportWriter = String -> IO ()

It takes a title (String), and returns nothing but causes a side effect (reading from the database and writing to a file). We want the implementation to consume each of the units we have just defined as dependencies. In Java we could inject a unit's dependencies into its constructor. In Haskell we only have functions. But that's no problem: as we saw for the Database unit, we can close over the dependencies using currying. All we need is a function that takes the dependencies and returns a FirstNameReportWriter. We define it like this

create :: ReportCreator -> FirstNameDataStore -> ReportWriter -> FirstNameReportWriter
create createReport firstNames writeReport title = do
    userNames <- firstNames
    let report = createReport title userNames
    writeReport report

Compare how the dependencies are composed together with the equivalent Java implementation

public class FirstNameReportWriter {
    private final ReportCreator reportCreator;
    private final FirstNameDataStore firstNameDataStore;
    private final ReportWriter reportWriter;

    public FirstNameReportWriter(
            ReportCreator reportCreator,
            FirstNameDataStore firstNameDataStore,
            ReportWriter reportWriter
    ) {
        this.reportCreator = reportCreator;
        this.firstNameDataStore = firstNameDataStore;
        this.reportWriter = reportWriter;
    }

    public void writeReport(String reportTitle) throws IOException {
        Collection<String> firstNames = firstNameDataStore.getAllFirstNames();
        String report = reportCreator.createReport(reportTitle, firstNames);
        reportWriter.writeReport(report);
    }
}

When we tested this unit in Java, we used a mock FirstNameDataStore unit to return a predefined list of first names. We also used an argument captor to verify that a mock ReportWriter unit was called with what we expected. We can do exactly the same test in Haskell, using an IORef as the captor

runTest :: String -> [String] -> IO String
runTest title names = do
    ioRef <- newIORef ""
    let writer = writeIORef ioRef
    let database = return names
    let writeFirstNameReport = create createReport database writer
    writeFirstNameReport title
    readIORef ioRef

test1 :: Assertion
test1 = do
    report <- runTest "an interesting title" ["steve", "mike"]
    "an interesting title\nSteve:1\nMike:1" @=? report

Finally our main function constructs the units and runs the root one, passing in the program argument.

main :: IO ()
main = do
    (reportTitle:_) <- getArgs
    let prodDatabase = database "jdbc:sqlserver://db1:2000;databaseName=prodDb"
    let writeFirstNameReport = create createReport prodDatabase writeToFile
    writeFirstNameReport reportTitle

Once again here's the Java equivalent for comparison

public class App {
    public static void main(String[] args) throws IOException {
        FirstNameReportWriter firstNameReportWriter = new FirstNameReportWriter(
                new ReportCreatorImpl(),
                new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"),
                new FileWriter()
        );

        firstNameReportWriter.writeReport(args[0]);
    }
}

Dependency inversion

In the SOLID principles articles we discussed how an injection framework can simplify the creation of our units as the dependency tree becomes more complicated. There is a pattern in Haskell which achieves the same goal, called the ReaderT pattern. This part's a little harder to follow, so feel free to skip to the last section. I want to include it because manually creating units doesn't scale forever, not in Java, Haskell or any other language, so a complete solution must include some level of automated dependency injection.

We create a container holding our dependencies, and our units can ask for the dependencies they need. If we only register the units we wish to mock, it would be defined like this

data Container = Container {
    reportWriter :: ReportWriter,
    firstNameDataStore :: FirstNameDataStore
}

Then we can ask for the FirstNameReportWriter's dependencies before we create it.

run :: String -> ReaderT Container IO ()
run title = do
    writer <- asks reportWriter
    database <- asks firstNameDataStore
    let firstNameReportWriter = create createReport database writer
    liftIO $ firstNameReportWriter title
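The run function relies on three pieces: ReaderT carries an environment behind the scenes, asks picks one field out of it, and liftIO runs an ordinary IO action inside it. Here is a minimal sketch of how they fit together, independent of the report program. Env, greeting, sink, greet and greetInto are hypothetical names, and the imports assume the transformers package that ships with GHC.

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical environment with two fields
data Env = Env
    { greeting :: String
    , sink     :: String -> IO ()
    }

-- asks pulls one field out of the environment;
-- liftIO runs a plain IO action inside ReaderT
greet :: String -> ReaderT Env IO ()
greet name = do
    hello <- asks greeting
    out   <- asks sink
    liftIO (out (hello ++ ", " ++ name))

-- runReaderT supplies the environment once, at the edge of the program.
-- Here the sink appends each line to an IORef so we can inspect it.
greetInto :: String -> IO [String]
greetInto name = do
    ref <- newIORef []
    let env = Env "Hello" (\line -> modifyIORef ref (++ [line]))
    runReaderT (greet name) env
    readIORef ref
```

Notice that greet never receives an Env argument explicitly. That is the detail ReaderT hides, at the cost of the changed return type and the call to liftIO.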

Haskell is great at abstracting away details, e.g. passing the Container between functions. It's not completely transparent though: you may have noticed the type signature has changed and a strange liftIO function has appeared. The main function also has a bit more boilerplate when we create the Container.

main :: IO ()
main = do
    (reportTitle:_) <- getArgs
    let prodDatabase = database "jdbc:sqlserver://db1:2000;databaseName=prodDb"
    let container = Container writeToFile prodDatabase
    runReaderT (run reportTitle) container

But the point is we no longer have to create the whole tree of units and their dependencies, we simply register each implementation in exactly the same way as we did with Guice for the Java implementation.

public class App {
    public static void main(String[] args) throws IOException {
        Injector injector = Guice.createInjector(new AbstractModule() {
            @Override
            protected void configure() {
                bind(FirstNameReportWriter.class);
                bind(ReportCreator.class).to(ReportCreatorImpl.class);
                bind(FirstNameDataStore.class).toInstance(
                        new Database("jdbc:sqlserver://db1:2000;databaseName=prodDb"));
                bind(ReportWriter.class).to(FileWriter.class);
            }
        });
        injector.getInstance(FirstNameReportWriter.class).writeReport(args[0]);
    }
}

Finally we can create a Container with the mock units for the tests.

runTest :: String -> [String] -> IO String
runTest title names = do
    ioRef <- newIORef ""
    runReaderT (run title) (Container (writeIORef ioRef) (pure names))
    readIORef ioRef

test1 :: Assertion
test1 = do
    report <- runTest "an interesting title" ["steve", "mike"]
    "an interesting title\nSteve:1\nMike:1" @=? report

Java vs Haskell

Let's finish by comparing the two implementations of our coding guidelines. In Java our units are classes that consume their dependencies through constructor injection. Dependencies are referenced by the most generic interface possible. In Haskell, units are functions; they consume dependencies through currying and reference them by their type. For this purpose Haskell is better. Any unit with the right type can be used as a dependency, whereas in Java the unit class needs to explicitly implement the interface (@FunctionalInterface can help with this). So in Haskell our code is more open-closed, and less inhibited by the Liskov substitution principle, as discussed in the SOLID principles article. Haskell also has the IO type labelling functions that cause side effects. We can be much more confident about a unit's functionality from its public interface, we can create better abstractions, and we can compose our units faster and more easily.
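As a small illustration of "any unit with the right type can be used", the sketch below passes the standard putStrLn function in as the ReportWriter. The create function has the same shape as the one defined earlier; plainReport and printReport are hypothetical names, and plainReport is a deliberately simplified report creator used only here.

```haskell
type ReportCreator = String -> [String] -> String
type FirstNameDataStore = IO [String]
type ReportWriter = String -> IO ()
type FirstNameReportWriter = String -> IO ()

-- Same shape as the create function from earlier in the article
create :: ReportCreator -> FirstNameDataStore -> ReportWriter -> FirstNameReportWriter
create createReport firstNames writeReport title = do
    names <- firstNames
    writeReport (createReport title names)

-- Hypothetical, simplified report creator used only for this sketch
plainReport :: ReportCreator
plainReport title names = unlines (title : names)

-- putStrLn was never written with ReportWriter in mind, yet its type
-- String -> IO () matches, so it can be passed in as the writer
printReport :: FirstNameReportWriter
printReport = create plainReport (return ["Steve", "Mike"]) putStrLn
```

In Java the equivalent would need an adapter class or a lambda targeting the ReportWriter interface. Here the matching type is all that is required.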

In the article on abstraction and composition we decided to limit mutable state and work with services and immutable data. At this point, we were well within the functional programming paradigm. This is the reason it fits so well with Haskell. It's natural to program like this in a functional language and we actually need to bend Java a little bit to fit. Though we said in the previous paragraph that Java units are classes, it's not quite that clear cut. They are classes because that is the most idiomatic way of representing them in such an object orientated language. However in most cases the class doesn't add any functionality beyond the single function it contains. Notice how the class names mirror the function names. We have a FileWriter class with a writeFile function, and a ReportCreator with a createReport function. So even though we wrap them in classes in the object orientated languages, our units are always functions really.