Skip to content

Instantly share code, notes, and snippets.

@piq9117
Created November 27, 2019 20:59
Show Gist options
  • Save piq9117/823dce0f2bada6d29442da1e02972cdb to your computer and use it in GitHub Desktop.
Save piq9117/823dce0f2bada6d29442da1e02972cdb to your computer and use it in GitHub Desktop.
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE QuasiQuotes #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
module Main where
import Database.Persist
import Database.Persist.Postgresql
import Database.Persist.TH
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
name String
age Int Maybe
deriving Show
|]
getPerson :: MonadIO m => PersionId -> ( Maybe ( Entity Person ) )
getPerson pid = selectFirst [ PersionId ==. pid ] [ ]
connStr = "host=localhost dbname=persistent_expirement user=user password=user port=5432"
main :: IO ()
main = runStderrLoggingT $ withPostgresqlPool connStr 10 $ \pool -> liftIO $ do
flip runSqlPersistMPool pool $ do
runMigration migrateAll
@friedbrice
Copy link

friedbrice commented Nov 28, 2019

I don't know that getPerson will compile (even if we change its signature to getPerson :: MonadIO m => PersonId -> m (Maybe (Entity Person)), which I'm guessing you intended). The reason it won't compile is because there's no way for selectFirst to connect to the database.

We're using the function

selectFirst :: (MonadIO m, PersistRecordBackend a b) =>
  [Filter a] -> [SelectOpt a] -> ReaderT b m (Maybe (Entity a))

from Persistent. Let's unpack it. Unpacking the ReaderT b, we see the signature could have equivalently been as follows:

selectFirst :: (MonadIO m, PersistRecordBackend a b) =>
  [Filter a] -> [SelectOpt a] -> b -> m (Maybe (Entity a))

Perhaps surprisingly, we can keep going. MonadIO m => m a unpacks to simply IO a. This is because the only things you can do with a value of type MonadIO m => m a are use it as a MonadIO m => m a and propagate the constraint or instantiate it at some concrete type that has a MonadIO instance. Conversely, if given any IO a, it can already be used as a MonadIO m => m a by calling liftIO on it, and that in turn can be instantiated at any concrete type that has a MonadIO instance.

So, morally, Persistent could have exported the following:

selectFirst :: (PersistRecordBackend a b) =>
  [Filter a] -> [SelectOpt a] -> b -> IO (Maybe (Entity a))

1/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

This is not meant to be a judgement of Persistent, I'm simply unpacking the type signature of selectFirst in order to help understand how we'd use such a function. So, from the unpacked signature, we see that this function does some IO and we see that this function needs a b that represents the database connection. Now, the question persists, how do we fit this function into our application?

2/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

If we let GHC infer the the "most general" signature of getPerson pid = selectFirst [PersonId ==. pid] [], we'd get the following:

getPerson :: (PersistRecordBackend Person b, MonadIO m) => ReaderT b m Person

I put most general in quotes above because MonadIO m => m a is not any more general than simply IO a. In particular, we don't lose any functionality by providing our own signature

getPerson :: (PersistRecordBackend Person b) => ReaderT b IO Person
getPerson = selectFirst [PersonId ==. pid] []

with m instantiated at IO.

We still have a few unanswered questions: (1) If IO a is equivalent to MonadIO m => m a, then why does Persistent use such a roundabout signature? (2) What do we do about the ReaderT b?

3/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

The easiest way to get rid of a ReaderT b m a is to turn it back into a function. We could write the following:

getPerson :: (PersistRecordBackend Person b) =>
  b -> PersonId -> IO Person
getPerson b =
  runReaderT (selectFirst [PersonId ==. pid] []) b

In fact, we probably have one concrete Persistent backend we're using throughout our app, so we can write it like so:

type MyBackend = ...
getPerson :: MyBackend -> PersonId -> IO Person
getPerson b pid = runReaderT (selectFirst [PersonId ==. pid] []) b

4/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

It's annoying that we have to pass around the backend to every function that touches the database. If we're only using one backend throughout our app, why not factor it out as a global value?

theBackend :: MyBackend
theBackend = ...

getPerson :: PersonId -> IO Person
getPerson pid = runReaderT (selectFirst [PersonId ==. pid] []) theBackend

Here is where we press up against the fundamental difference between Haskell and other programming languages. Persistent gives us ways to construct the backend we need, but we have to do IO to do so. Another way of thinking about this is that the database connection is something that we won't have until runtime, and in Haskell that means it's stuck in IO. The best we could do is have a global IO MyBackend.

makeBackend :: IO MyBackend
makeBackend = ...

getPerson :: PersonId -> IO Person
getPerson pid = fmap (runReaderT (selectFirst [PersonId ==. pid] [])) theBackend

But we absolutely don't want the above, because that will create a new backend each time getPerson is used. We want to create one backend at application start and reuse it throughout. It seems that we're stuck explicitly passing around the backend in all our functions. So annoying.

5/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

So now we get to talk about How to Structure Haskell Apps. The most straightforward way is to just bite the bullet: explicitly pass the database connection into functions that need it, as we had above:

getPerson :: MyBackend -> PersonId -> IO Person

This is fine for small applications, but for large applications it quickly gets tedious. For larger applications, there are a number of things we can do.

The next easiest thing is to make a bespoke datatype for your app usually termed an app monad, and usually named App.

data AppEnv =
  AppEnv {
    theBackend :: MyBackend,
    ... -- whatever runtime globals you need
  }

initAppEnv :: IO AppEnv
initAppEnv = ... -- bootstrap your app env

newtype App a = App { runApp :: AppEnv -> IO a }
  deriving (MonadReader AppEnv, MonadIO) via (ReaderT AppEnv IO)

entryPoint :: App ()
entryPoint = ... -- all your program logic

main :: IO ()
main = do
  appEnv <- initAppEnv
  runApp entryPoint appEnv

We derive a MonadIO instance for App so that we can essentially inherit all of the built-in IO functions from Haskell's standard library and any IO functions provided by third-party libraries.

In this universe, getPerson could be written as follows:

getPerson :: PersonId -> App Person
getPerson pid = App $ \AppEnv{ theBackend } ->
  runReaderT (selectFirst [PersonId ==. pid] []) theBackend

6/n

@friedbrice
Copy link

friedbrice commented Nov 28, 2019

As it stands, there's a lot of ceremony surrounding calling selectFirst. We'd like to reduce that and make it more convenient. There are a number of ways we can do this.

Here's a straightforward way to do that:

liftPersist :: ReaderT MyBackend IO a -> App a
liftPersist p = App $ \AppEnv{ theBackend } -> runReaderT p theBackend

getPerson :: PersonId -> App Person
getPerson pid = liftPersist $ selectFirst [PersonId ==. pid] []

This approach has the advantage of being dead simple: write a function that has all the boilerplate. It has the disadvantage that it's rather specific. This is, typically, the approach I find most useful.

7/n

@friedbrice
Copy link

There are a few drawbacks with the above approach. One is that it threads Persistent, a third-party library, throughout your application logic. If, later, you decide to swap out Persistent for something else, you'll have a lot of code to change. Another drawback is that you will tend to write a lot of functions that return an App a. App can do arbitrary IO, since it has a MonadIO instance. If we'd like to either decouple our application from third-party libraries or enforce principle of least privilege, we can add another layer.

class GetPerson m where
  getPerson :: PersonId -> m Person

entryPoint :: GetPerson m => m ()
entryPoint = ... -- all your program logic

instance GetPerson App where
  getPerson pid = liftPersist $ selectFirst [PersonId ==. pid] []

Your initApp and main stay exactly the same. Your entryPoint (and all the functions it calls, all the functions that comprise your application logic) never mention App, IO, or MyBackend directly: they express all their needs in terms of GetPerson and a host of similar bespoke classes you wrote specifically for this application.

8/n

@friedbrice
Copy link

friedbrice commented Nov 29, 2019

When you take this approach, use a module structure analogous to this:

┣ Classes
┃  ┗ GetPerson.hs
┃    ┣ class GetPerson
┃    ┗ getPerson
┃  ...
┣ AppLogic
┃  ┗ EntryPoint.hs
┃     ┣ import Classes.GetPerson
┃     ┗ entryPoint
┃     ...
┃  ...
┗ AppWiring
   ┣ AppEnv.hs
   ┃  ┣ import Database.Persist
   ┃  ┣ data MyBackend
   ┃  ┣ data AppEnv
   ┃  ┗ initAppEnv
   ┣ App.hs
   ┃  ┣ import Database.Persist
   ┃  ┣ import Classes.GetPerson
   ┃  ┣ import AppWiring.AppEnv
   ┃  ┣ newtype App
   ┃  ┣ runApp
   ┃  ┗ instance GetPerson App
   ┗ Main.hs
      ┣ import AppWiring.AppEnv
      ┣ import AppWiring.App
      ┣ import AppLogic.EntryPoing
      ┗ main = fmap (runApp entryPoint) initAppEnv

The important characteristic about the above module structure is that GetPerson doesn't depend on Persist, App, or MyBackend. This extends to every submodule of AppLogic: GetPerson completely abstracts those dependencies. The only place where we couple our application to Persist is in the GetPerson instance for App.

9/n

@friedbrice
Copy link

friedbrice commented Nov 29, 2019

Also when we take this approach, you probably don't need classes as granular as GetPerson. You probably want classes like DatabaseRead, DatabaseWrite, LocationService, FileSystemRead, FileSystemWrite, Log, and similar things. The point of writing such bespoke classes is twofold: (1) hide the details of third-party libraries (e.g. selectFirst) and runtime dependencies (e.g. MyBackend); and (2) get an idea of what each function might be doing (and not doing) from its signature.

Along those lines, you should never need to write a function with a MonadIO constraint: if you need to do some IO, write a bespoke class for it in Classes and write an instance of that class for App in AppWiring. Try not to make your classes general-purpose things like Fetch. Make them represent particular services that your application needs, like LocationService. For example a class like

class Fetch m where
  fetch :: Url -> m Response

doesn't hide the details of a service from application developers. Neither does it guarantee that the correct URL is used or that the response is parsed correctly. Instead, using a bespoke class like

class LocationService where
  findLocation :: Coordinates -> m Location

means you only need to construct the URL and parse the Response in one place, in the LocationService instance for App.

Remember that the point of these bespoke classes isn't for them to be reused in other projects: the point is to abstract the dependencies of this particular application only and to segregate I/O action by class constraints to enforce principle of least privilege.

10/n

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment