Mocking comes up a lot in discussions of testing effectful code in Haskell.
One of the advantages of mtl type classes or Eff freer monads is that you can swap implementations and run the same program on different underlying interpretations.
This is cool!
However, it’s an extremely heavyweight technique, with a ton of complexity.
I’ve recently gravitated to mostly doing everything in this sort of type:
newtype App a = App { unApp :: ReaderT AppCtx IO a }
It’s simple, has great error messages, and is easy to hook into existing libraries and frameworks by writing instances for either AppCtx or App.
There’s a small cost: I have to call lift manually if I use an App a function inside of a Conduit or MaybeT block or similar.
This is a fairly small cost to pay, all told, and the benefit in getting new developers up to speed on our projects is a big selling point.
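To make the lift cost concrete, here is a minimal sketch of an App action used inside a MaybeT block. The AppCtx field, serviceName, greet, and runApp are hypothetical names invented for this illustration; a real AppCtx would carry connection pools, configuration, and so on.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Control.Monad.Reader (ReaderT, asks, runReaderT)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Maybe (MaybeT (..))

-- Hypothetical context with a single field, for illustration only.
newtype AppCtx = AppCtx { appServiceName :: String }

newtype App a = App { unApp :: ReaderT AppCtx IO a }
    deriving (Functor, Applicative, Monad)

-- Run an App action against a concrete context.
runApp :: AppCtx -> App a -> IO a
runApp ctx app = runReaderT (unApp app) ctx

-- An ordinary App action that reads from the context.
serviceName :: App String
serviceName = App (asks appServiceName)

-- Inside a MaybeT block, the App action needs an explicit lift.
greet :: String -> App (Maybe String)
greet user = runMaybeT $ do
    name <- lift serviceName
    if null user
        then MaybeT (pure Nothing)
        else pure (name <> " greets " <> user)
```

The single lift is the whole price here; everything else reads like ordinary do-notation.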
Now, how would I go about testing this sort of function?
doWork :: App ()
doWork = do
    query <- runHTTP getUserQuery
    users <- runDB (usersSatisfying query)
    for_ users $ \user -> do
        thing <- getSomething user
        let result = compute thing
        runRedis (writeKey (userRedisKey user) result)
If we have our mtl or Eff or OOP mocking hats on, we might think:
I know! We need to mock our HTTP, database, and Redis effects. Then we can control the environment using mock implementations, and verify that the results are sound!
Let’s step back and apply some more elementary techniques to this problem. I bet we can simplify our solution to testing.
The first thing we need to do is recognize that effects and values are separate, and try to keep them as separate as possible. This is a basic principle of purely functional programming, and we would be wise to heed it. Generally speaking, functions that look like:
doWork :: App ()
are not functional (in the “functional programming” sense). The only point of such a function is to run it for the effect it has on the outside world. We can tell just by looking at the type signature! So, let’s look at what it does, and how we might test it:
doWork :: App ()
doWork = do
    query <- runHTTP getUserQuery
    users <- runDB (usersSatisfying query)
    for_ users $ \user -> do
        thing <- getSomething user
        let result = compute thing
        runRedis (writeKey (userRedisKey user) result)
We get a bunch of stuff – inputs – that are acquired as an effect. We can make this a lot easier to test by simply taking those things as inputs.
doWork :: App ()
doWork = do
    query <- runHTTP getUserQuery
    users <- runDB (usersSatisfying query)
    doWorkHelper users

doWorkHelper :: [User] -> App ()
doWorkHelper users =
    for_ users $ \user -> do
        thing <- getSomething user
        let result = compute thing
        runRedis (writeKey (userRedisKey user) result)
Now, the only effects we need to mock for doWorkHelper are getSomething and runRedis.
But I’m not satisfied.
We can get rid of getSomething by factoring out another helper.
doWorkHelper :: [User] -> App ()
doWorkHelper users = do
    things'users <- for users $ \user -> do
        thing <- getSomething user
        pure (thing, user)
    lookMaNoInputs things'users

lookMaNoInputs :: [(Thing, User)] -> App ()
lookMaNoInputs things'users =
    for_ things'users $ \(thing, user) -> do
        let result = compute thing
        runRedis (writeKey (userRedisKey user) result)
We’ve now extracted all of the “input effects.” Can we decompose this further? We can! Let’s inspect our output effect:
runRedis (writeKey (userRedisKey user) result)
It expects two things: a Redis key derived from the user, and a result computed from the thing.
We can prepare the redis key and computed result fairly easily:
businessLogic :: (Thing, User) -> (RedisKey, Result)
businessLogic (thing, user) = (userRedisKey user, compute thing)

lookMaNoInputs :: [(Thing, User)] -> App ()
lookMaNoInputs users = do
    for_ (map businessLogic users) $ \(key, result) -> do
        runRedis (writeKey key result)
Neat! We’ve isolated the core business logic, and now we can write nice unit tests on it. All of the business logic has been excised from the effectful code, and we’ve reduced the amount of code we need to test.
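As a rough sketch of what such a unit test can look like, the following fills in the domain types with hypothetical stand-ins. The User, Thing, RedisKey, and Result definitions, and the bodies of userRedisKey and compute, are assumptions made up for this example; only the shape of businessLogic comes from the code above.

```haskell
-- Hypothetical stand-ins for the domain types, for illustration only.
data User = User { userId :: Int, userName :: String }
newtype Thing = Thing Int
type RedisKey = String
type Result = Int

-- Assumed key scheme: "user:" followed by the numeric id.
userRedisKey :: User -> RedisKey
userRedisKey user = "user:" <> show (userId user)

-- Assumed computation: double the wrapped number.
compute :: Thing -> Result
compute (Thing n) = n * 2

-- Same shape as the businessLogic function above.
businessLogic :: (Thing, User) -> (RedisKey, Result)
businessLogic (thing, user) = (userRedisKey user, compute thing)

-- Exercising the logic needs no HTTP, database, or Redis at all.
businessLogicExamples :: [(RedisKey, Result)]
businessLogicExamples =
    map businessLogic
        [ (Thing 21, User 1 "alice")
        , (Thing 0, User 2 "bob")
        ]
```

Under these assumptions, businessLogicExamples evaluates to [("user:1", 42), ("user:2", 0)], which a test can compare against with plain equality.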
Streaming libraries like Pipes and Conduit are a great way to handle large data sets and interleave effects.
They’re also a great way to decompose functions and provide “inverted mocking” facilities to your programs.
Most conduits look like this:
import Data.Conduit (runConduit, (.|))
import qualified Data.Conduit.List as CL

streamSomeStuff :: IO ()
streamSomeStuff = do
    runConduit
        $ conduitThatGetsStuff
        .| conduitThatProcessesStuff
        .| conduitThatConsumesStuff
You have some Source or Producer that initially provides things.
This can be from a database action, an HTTP request, or from a file handle.
Now, each part of this conduit can itself have many conduits inside of it:
conduitThatGetsStuff :: Producer IO ByteString
conduitThatGetsStuff = ...

conduitThatProcessesStuff :: Conduit ByteString IO RealThing
conduitThatProcessesStuff =
    CL.mapM (\bs ->
        case parseFromByteString bs of
            Left err ->
                throwIO err
            Right yesss ->
                pure yesss
    )
    .| CL.map convertSomeThing
    .| CL.filter someFilterCondition

passThrough :: (a -> IO ()) -> Conduit a IO a
passThrough action = CL.mapM (\a -> do
    action a
    pure a)

conduitThatConsumesStuff :: Consumer RealThing IO ()
conduitThatConsumesStuff =
    passThrough print
    .| passThrough makeHttpPost
    .| CL.mapM_ saveToDatabase
We have a bunch of small, decomposed things.
Our conduitThatProcessesStuff doesn’t care where it gets the ByteStrings that it parses – you can hook it up to anything.
Databases, HTTP calls, file IO, or even just CL.sourceList [example1, example2, example3].
Likewise, the conduitThatConsumesStuff doesn’t care where the RealThings come from.
You can use CL.sourceList to provide a bunch of fake input.
We’re not usually working directly with Conduits here, either – most of the functions are provided to CL.mapM_ or CL.filter or CL.map.
That allows us to write functions that are simple a -> m b or a -> Bool or a -> b, and these are really easy to test.
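Here is a small sketch of feeding a processing stage fake input with CL.sourceList and collecting what comes out. It assumes conduit 1.3 or later (for ConduitT and runConduitPure). The processStuff stage is a simplified, hypothetical stand-in for conduitThatProcessesStuff: it parses Strings into Ints and silently drops parse failures instead of throwing.

```haskell
import Data.Conduit (ConduitT, runConduitPure, (.|))
import qualified Data.Conduit.List as CL
import Text.Read (readMaybe)

-- Hypothetical processing stage: parse, transform, filter.
-- It is polymorphic in the monad, so it can run purely in tests.
processStuff :: Monad m => ConduitT String Int m ()
processStuff =
    CL.mapMaybe readMaybe
        .| CL.map (* 2)
        .| CL.filter (> 10)

-- Fake input goes in the front, CL.consume collects the output as a list.
processedExamples :: [Int]
processedExamples =
    runConduitPure
        $ CL.sourceList ["3", "oops", "7", "12"]
        .| processStuff
        .| CL.consume
```

With this input, "3" becomes 6 and is filtered out, "oops" fails to parse and is dropped, and the remaining values come out as [14, 24]. No database, HTTP call, or file handle is involved.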
Always keep in mind the lightest and most general techniques in functional programming: writing plain functions and abstracting over parameters.
These will get you very, very far.
Let’s revisit the doWork business up top:
doWork :: App ()
doWork = do
    query <- runHTTP getUserQuery
    users <- runDB (usersSatisfying query)
    for_ users $ \user -> do
        thing <- getSomething user
        let result = compute thing
        runRedis (writeKey (userRedisKey user) result)
We can make this abstract by taking concrete terms and making them function parameters. The literal definition of lambda abstraction!
doWorkAbstract
    :: Monad m
    => m Query -- ^ The HTTP getUserQuery
    -> (Query -> m [User]) -- ^ The database action
    -> (User -> m Thing) -- ^ The getSomething function
    -> (RedisKey -> Result -> m ()) -- ^ finally, the redis action
    -> m ()
doWorkAbstract getUserQuery getUsers getSomething redisAction = do
    query <- getUserQuery
    users <- getUsers query
    for_ users $ \user -> do
        thing <- getSomething user
        let result = compute thing
        redisAction (userRedisKey user) result
There are some interesting things to note about this abstract definition. For one, it works in any monad: Identity, State, IO, whatever. You choose!
Given the above abstract definition, we can easily recover the concrete doWork by providing the necessary functions:
doWork :: App ()
doWork =
    doWorkAbstract
        (runHTTP getUserQuery)
        (\query -> runDB (usersSatisfying query))
        (\user -> getSomething user)
        (\key result -> runRedis (writeKey key result))
We can also easily get a testing variant that logs the actions taken:
doWorkScribe :: Writer [String] ()
doWorkScribe =
    doWorkAbstract getQ getUsers getSomething redis
  where
    getQ = do
        tell ["getting users query"]
        pure AnyUserQuery
    getUsers _ = do
        tell ["getting users"]
        pure [exampleUser1, exampleUser2]
    getSomething u = do
        tell ["getting something for " <> show u]
        pure (fakeSomethingFor u)
    redis k v = do
        tell ["wrote k: " <> show k]
        tell ["wrote v: " <> show v]
All without having to fuss about with monad transformers, type classes, or anything else that’s terribly complicated.
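To turn a logging variant like this into an actual assertion, you can run the Writer and inspect what was recorded. The sketch below is self-contained, so it redefines doWorkAbstract alongside hypothetical stand-ins for the domain (User, Query, Thing, compute, userRedisKey are all invented here), and it records only the Redis writes rather than free-form strings.

```haskell
import Control.Monad.Writer (execWriter, tell)
import Data.Foldable (for_)

-- Hypothetical stand-ins for the domain, for illustration only.
newtype User = User String
data Query = AnyUserQuery
type Thing = Int
type RedisKey = String
type Result = Int

compute :: Thing -> Result
compute = (* 2)

userRedisKey :: User -> RedisKey
userRedisKey (User name) = "user:" <> name

-- The same abstract definition as above, parameterized over any monad.
doWorkAbstract
    :: Monad m
    => m Query
    -> (Query -> m [User])
    -> (User -> m Thing)
    -> (RedisKey -> Result -> m ())
    -> m ()
doWorkAbstract getUserQuery getUsers getSomething redisAction = do
    query <- getUserQuery
    users <- getUsers query
    for_ users $ \user -> do
        thing <- getSomething user
        redisAction (userRedisKey user) (compute thing)

-- Run in Writer: inputs are canned values, the "Redis" action just records.
recordedWrites :: [(RedisKey, Result)]
recordedWrites =
    execWriter $
        doWorkAbstract
            (pure AnyUserQuery)
            (\_ -> pure [User "alice", User "bob"])
            (\(User name) -> pure (length name))
            (\key result -> tell [(key, result)])
```

Under these assumptions, recordedWrites is [("user:alice", 10), ("user:bob", 6)], so a test can check both which keys were written and in what order.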
Ultimately, this is all about decomposition of programs into their smallest, most easily testable parts. You then unit or property test these tiny parts to ensure they work together. If all the parts work independently, then they should work together when composed.
Your effects should ideally not be anywhere near your business logic.
Pure functions from a to b are ridiculously easy to test, especially if you can express properties.
If your business logic really needs to perform effects, then try the simplest possible techniques first: functions and abstractions.
Ultimately, I believe that it’s simpler and easier to write and test functions that take pure values.
These are agnostic to where the data comes from, and don’t need to be mocked at all.
This transformation is typically easier than introducing mtl classes, monad transformers, Eff, or similar techniques.
Sometimes, you really just can’t avoid testing effectful code. A common pattern I’ve noticed is that people want to make things abstract at a level that is far too low. You want to make the abstraction as weak as possible, to make it easy to mock.
Consider the common case of wanting to mock out the database. This is reasonable: database calls are extremely slow! Implementing a mock database, however, is an extremely difficult task – you essentially have to implement a database. Wherever the behavior of the database differs from your mock, you’ll have a test/prod mismatch that will blow up at some point.
Instead, go a level up – create a new indirection layer that can be satisfied by either the database or a simple to implement mock. You can do this with a type class, or just by abstracting the relevant functions concretely. Abstracting the relevant functions is the easiest and simplest technique, but it’s not unreasonable to also write:
data UserQuery
    = AllUsers
    | UserById UserId
    | UserByEmail Email

class Monad m => GetUsers m where
    runUserQuery :: UserQuery -> m [User]
This is a vastly more tenable interface to implement than a SQL database!
Let’s write our instances, one for the persistent library and another for a mock that uses QuickCheck’s Gen type:
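instance MonadIO m => GetUsers (SqlPersistT m) where
    -- Assumes convertToQuery :: UserQuery -> [Filter User].
    -- selectList also takes select options (none here) and returns entities.
    runUserQuery query =
        map entityVal <$> selectList (convertToQuery query) []

instance GetUsers Gen where
    runUserQuery query =
        case query of
            AllUsers ->
                arbitrary
            UserById userId ->
                take 1 . fmap (setUserId userId) <$> arbitrary
            UserByEmail userEmail ->
                take 1 . fmap (setUserEmail userEmail) <$> arbitrary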
Alternatively, you can just pass functions around manually instead of using the type class mechanism to pass them for you.
Oh, wait, no!
That GetUsers Gen instance has a bug!
Can you guess what it is?
In the UserById and UserByEmail cases, we’re never testing the “empty list” case – what if that user does not exist?
A fixed variant looks like this:
instance GetUsers Gen where
    runUserQuery query =
        case query of
            AllUsers ->
                arbitrary
            UserById userId -> do
                oneOrZero <- choose (0, 1)
                take oneOrZero . fmap (setUserId userId) <$> arbitrary
            UserByEmail userEmail -> do
                oneOrZero <- choose (0, 1)
                take oneOrZero . fmap (setUserEmail userEmail) <$> arbitrary
I made a mistake writing a super simple generator. Just think about how many mistakes I might have made if I were trying to model something more complex!
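For comparison, here is a minimal sketch of the "just pass functions around manually" alternative: a record of functions takes the place of the GetUsers class, and the test double is an in-memory list. The simplified User type, the UserStore record, inMemoryStore, and emailFor are all hypothetical names made up for this example.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Hypothetical, simplified versions of the types above.
type UserId = Int
type Email = String

data User = User { userId :: UserId, userEmail :: Email }
    deriving (Eq, Show)

data UserQuery
    = AllUsers
    | UserById UserId
    | UserByEmail Email

-- A record of functions plays the role of the GetUsers class.
newtype UserStore m = UserStore
    { runUserQuery :: UserQuery -> m [User]
    }

-- An in-memory store for tests, backed by a plain list.
inMemoryStore :: [User] -> UserStore Identity
inMemoryStore users = UserStore $ \query ->
    pure $ case query of
        AllUsers -> users
        UserById uid -> filter ((== uid) . userId) users
        UserByEmail email -> filter ((== email) . userEmail) users

-- Code under test receives the store as an ordinary argument.
emailFor :: Monad m => UserStore m -> UserId -> m (Maybe Email)
emailFor store uid = do
    found <- runUserQuery store (UserById uid)
    pure $ case found of
        user : _ -> Just (userEmail user)
        [] -> Nothing

testStore :: UserStore Identity
testStore = inMemoryStore [User 1 "a@example.com", User 2 "b@example.com"]
```

With this store, runIdentity (emailFor testStore 2) gives Just "b@example.com", and asking for an id that isn't in the list gives Nothing, so the "user does not exist" case is covered by construction.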