Skip to content

Instantly share code, notes, and snippets.

@cdepillabout
Last active June 13, 2022 09:42
Show Gist options
  • Star 2 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save cdepillabout/a570d673b2341f109d30a91d243125dd to your computer and use it in GitHub Desktop.
Save cdepillabout/a570d673b2341f109d30a91d243125dd to your computer and use it in GitHub Desktop.

Haskell DB Library Comparison

We will be comparing the following DB libraries:

  • HRR
  • Opaleye
  • Persistent + Esqueleto
  • Groundhog
  • postgresql-simple (and postgresql-transactional)

HRR

HRR will connect to our database and define our datatypes for us. Let's say we have a table like this:

CREATE TABLE account
 (account_id integer primary key autoincrement not null,
  product_cd varchar(10) not null,
  cust_id integer not null,
  open_date date not null,
  close_date date,
  last_activity_date date,
  status text not null,
  open_branch_id integer,
  open_emp_id integer,
  avail_balance float(10,2),
  pending_balance float(10,2),
  check(status = 'ACTIVE' or status = 'CLOSED' or status = 'FROZEN')
  constraint fk_product_cd foreign key (product_cd)
    references product (product_cd),
  constraint fk_a_cust_id foreign key (cust_id)
    references customer (cust_id),
  constraint fk_a_branch_id foreign key (open_branch_id)
    references branch (branch_id),
  constraint fk_a_emp_id foreign key (open_emp_id)
        references employee (emp_id)
 );

Using Template Haskell, we can get it to create types like this:

data Account
  = Account {accountId :: !GHC.Int.Int64,
             productCd :: !String,
             custId :: !GHC.Int.Int64,
             openDate :: !Day,
             closeDate :: !(Maybe Day),
             lastActivityDate :: !(Maybe Day),
             status :: !String,
             openBranchId :: !(Maybe GHC.Int.Int64),
             openEmpId :: !(Maybe GHC.Int.Int64),
             availBalance :: !(Maybe Double),
             pendingBalance :: !(Maybe Double)}
  deriving (Show)

-- Relation type corresponding to Table
account :: Relation () Account
account =  ...

-- Column selectors for This DSL
accountId' :: Pi Account GHC.Int.Int64
accountId'
  = definePi 0
productCd' :: Pi Account String
productCd'
  = definePi 1
custId' :: Pi Account GHC.Int.Int64
custId'
  = definePi 2
....

Here is a simple query for accounts, using the types we defined above.

account1 :: Relation () Account
account1 = relation $ do
  a <- query account
  wheres $ a ! Account.productCd' `in'` values ["CHK", "SAV", "CD", "MM"]
  return a

Here is a projection (only selecting certain rows):

account_3_7 :: Relation () (Maybe Int64, String)
account_3_7 = relation $ do
  a <- query account
  let proj = (,) |$| a ! Account.openEmpId'
                 |*| a ! Account.productCd'
  asc proj
  return proj

Generated SQL:

SELECT ALL T0.open_emp_id AS f0,
           T0.product_cd AS f1
FROM MAIN.account T0
ORDER BY T0.open_emp_id ASC, T0.product_cd ASC

Subquery:

account_9_1 :: Relation () Account1
account_9_1 = relation $ do
  a  <- query account
  ma <- queryScalar $ aggregatedUnique account Account.accountId' max'
  wheres $ just (a ! Account.accountId') .=. flattenMaybe ma
  return $ Account1 |$| a ! Account.accountId'
                    |*| a ! Account.productCd'
                    |*| a ! Account.custId'
                    |*| a ! Account.availBalance'

Generated SQL:

SELECT ALL T0.account_id AS f0,
           T0.product_cd AS f1,
           T0.cust_id AS f2,
           T0.avail_balance AS f3
FROM MAIN.account T0
WHERE (T0.account_id = (SELECT ALL MAX (T1.account_id) AS f0
                        FROM MAIN.account T1))

Joins:

join_5_1_2aT :: Relation () ((String, String), String)
join_5_1_2aT = relation $ do
  e <- query employee
  d <- query department
  on $ e ! Employee.deptId' .=. just (d ! Department.deptId')
  return $ e ! Employee.fname' >< e ! Employee.lname' >< d ! Department.name'

Generated SQL:

SELECT ALL T0.fname AS f0,
           T0.lname AS f1,
           T1.name AS f2
FROM MAIN.employee T0 INNER JOIN MAIN.department T1
ON (T0.dept_id = T1.dept_id)

Pros:

  • Works with DB2, PostgreSQL, SQLite, MySQL, Microsoft SQL Server and OracleSQL.
  • Able to connect to database at compile time and generate datatypes for tables programatically.
  • Seems to cover almost all of SQL.

Cons:

  • No feature to create tables or do migrations.
  • Seems just a little harder to read than Opaleye.

Opaleye

Completely typesafe SQL library. Similar to HRR.

-- define table
personTable :: Table (Column PGText, Column PGInt4, Column PGText)
                     (Column PGText, Column PGInt4, Column PGText)
personTable = Table "personTable" (p3 ( required "name"
                                      , required "age"
                                      , required "address" ))

-- Here is a standard query of all people in the personTable:
personQuery :: Query (Column PGText, Column PGInt4, Column PGText)
personQuery = queryTable personTable

-- Here is the SQL generated for `personQuery`.
--
-- ghci> printSql personQuery
-- SELECT name0_1 as result1,
--        age1_1 as result2,
--        address2_1 as result3
--   FROM (SELECT *
--      FROM (SELECT name as name0_1,
--                   age as age1_1,
--                   address as address2_1
--            FROM personTable as T1) as T1) as T1
--

-- Data types and convenient type synonyms.
data Birthday' a b = Birthday { bdName :: a, bdDay :: b }
type Birthday = Birthday' String Day
type BirthdayColumn = Birthday' (Column PGText) (Column PGDate)

-- To get user defined types to work with the typeclass magic they must
-- have instances defined for them.  The instances are derivable with
-- Template Haskell.
$(makeAdaptorAndInstance "pBirthday" ''Birthday')

-- Create birthdayTable
birthdayTable :: Table BirthdayColumn BirthdayColumn
birthdayTable = Table "birthdayTable"
                       (pBirthday Birthday { bdName = required "name"
                                           , bdDay  = required "birthday" })

-- Query all birthdays:
birthdayQuery :: Query BirthdayColumn
birthdayQuery = queryTable birthdayTable

-- Produces the following SQL:
--
-- ghci> printSql birthdayQuery
-- SELECT name0_1 as result1,
--        birthday1_1 as result2
-- FROM (SELECT *
--       FROM (SELECT name as name0_1,
--                    birthday as birthday1_1
--             FROM birthdayTable as T1) as T1) as T1

-- Throw away some columns of query:
nameAge :: Query (Column PGText, Column PGInt4)
nameAge = proc () -> do
  (name, age, _) <- personQuery -< ()
  returnA -< (name, age)

-- Where clause (restriction):
youngPeople :: Query (Column PGText, Column PGInt4, Column PGText)
youngPeople = proc () -> do
  row@(_, age, _) <- personQuery -< ()
  restrict -< age .<= 18

  returnA -< row

-- Inner join:
personAndBirthday ::
  Query (Column PGText, Column PGInt4, Column PGText, Column PGDate)
personAndBirthday = proc () -> do
  (name, age, address) <- personQuery -< ()
  birthday             <- birthdayQuery -< ()

  restrict -< name .== bdName birthday

  returnA -< (name, age, address, bdDay birthday)

-- Composability of queries and joins.  We can't generate "the SQL of" these
-- `QueryArr` combinators.  They are not `Query`s so they don't have any SQL.
restrictIsTwenties :: QueryArr (Column PGInt4) ()
restrictIsTwenties = proc age -> do
  restrict -< (20 .<= age) .&& (age .< 30)

restrictAddressIs1MyStreet :: QueryArr (Column PGText) ()
restrictAddressIs1MyStreet = proc address -> do
  restrict -< address .== pgString "1 My Street, My Town"

twentiesAtAddress' :: Query (Column PGText, Column PGInt4, Column PGText)
twentiesAtAddress' = proc () -> do
  row@(_, age, address) <- personQuery -< ()

  restrictIsTwenties -< age
  restrictAddressIs1MyStreet -< address

  returnA -< row

Pros:

Cons:

  • Arrow syntax. (Although this might be appropriate for ARoW?? ;-p)
  • Only supports PostgreSQL.
  • No support for creating tables (or migrations). Assumes tables already exist.

Esqueleto

Esqueleto builds on persistent and gives us a way to write joins and other complex queries.

Example SQL:

SELECT BlogPost.*, Person.*
FROM BlogPost, Person
WHERE BlogPost.authorId = Person.id
ORDER BY BlogPost.title ASC

In esqueleto, we may write the same query above as:

select $
    from $ \(b, p) -> do
        where_ (b ^. BlogPostAuthorId ==. p ^. PersonId)
        orderBy [asc (b ^. BlogPostTitle)]
        return (b, p)

Here's an example using Yesod to just return a few fields of a record:

getHomeR :: Handler Html
getHomeR = do
    blogs <- runDB
    $ E.select
        $ E.from $ \(blog `E.InnerJoin` author) -> do
            E.on $ blog ^. BlogAuthor E.==. author ^. AuthorId
            return ( blog   ^. BlogId
                   , blog   ^. BlogTitle
                   , author ^. AuthorName
                   )
    defaultLayout $ do
        setTitle "Blog posts"
            [whamlet|
                <ul>
                    $forall (E.Value blogid, E.Value title, E.Value name) <- blogs
                        <li>
                            <a href=@{BlogR blogid}>#{title} by #{name}
            |]

Pros:

  • Used with persistent.
  • Joins.
  • Pretty simple if you already know persistent.

Cons:

  • Less typesafe than Opaleye / HRR (with regards to aggregation).

Groundhog

Similar to Persistent. It aims to be an ORM for Haskell. It seems to have a little more power than Persistent (it's projections), but still doesn't have the ability to do joins.

-- define data types to represent our tables
data Customer = Customer {
  customerName :: String,
  phone :: String
} deriving Show
data Product = Product {
  productName :: String,
  quantity :: Int,
  customer :: DefaultKey Customer
}
deriving instance Show Product

-- This creates all instances and definitions for datatypes so they can be
-- mapped to and from sql tables.
mkPersist defaultCodegenConfig [groundhog|
- entity: Customer               # Name of the datatype
  constructors:
    - name: Customer
      fields:
        - name: customerName
          # Set column name to "name" instead of "customerName"
          dbName: name
      uniques:
        - name: NameConstraint
          fields: [customerName] # Inline format of list
- entity: Product
|]

-- Show how to do queries and insertions.
main :: IO ()
main = withSqliteConn ":memory:" $ runDbConn $ do
  runMigration defaultMigrationLogger $ do
    migrate (undefined :: Customer)
    migrate (undefined :: Product)
  johnKey <- insert $ Customer "John Doe" "0123456789"
  get johnKey >>= liftIO . print
  insert $ Product "Oranges" 3 johnKey
  insert $ Product "Apples" 5 johnKey
  janeKey <- insert $ Customer "Jane Doe" "9876543210"
  insert $ Product "Oranges" 4 janeKey
  johnOrders <- select $ (CustomerField ==. johnKey)
    `orderBy` [Asc ProductNameField]
  liftIO $ putStrLn $ "Products for John: " ++ show johnOrders

Groundhog has interesting projections, so you can do something like this:

project ("username: " `append` upper UserNameField)
  $ lower UserNameField `like` "%smith%"

project ((toArith NumberField + 1) * 2)
  $   upper FirstNameField ==. upper UserNameField
  ||. (toArith NumberField * 5) >. (25 :: Int)

update (EmailField =. upper EmailField) (AutoKeyField ==. k)

This would produce the following sql statements:

SELECT 'username: ' || upper(username) FROM mytable WHERE lower(username) LIKE '%smith%'

SELECT (number + 1) * 2 FROM mytable WHERE upper(firstname) = upper(username) OR number * 5 > 25

UPDATE mytable SET email = upper(email) WHERE id = k

Pros:

  • supports migrations
  • supports composite keys
  • Seems to provide more sql functionality than persistent.

Cons:

  • Doesn't do joins.
  • Not as commonly used as Persistent?

postgresql-simple

Very simple. Single class for converting from SQL row to Haskell data types:

-- Some data types
data Present = Present { presentName :: Text }

data Location = Location { locLat :: Double
                         , locLong :: Double
                         }

data Child = Child { childName :: Text
                   , childLocation :: Location
                   }

-- Classes for converting from SQL rows to Haskell datatypes
instance FromRow Present where
  fromRow :: RowParser Present
  fromRow = Present <$> field

instance FromRow Child where
  fromRow :: RowParser Child
  fromRow = do
    childName' <- field
    locLat' <- field
    locLong' <- field
    pure $ Child childName' locLat' locLong'

-- Some functions for querying
allChildren :: Connection -> IO [Child]
allChildren c = query_ c "SELECT name, loc_lat, loc_long FROM child"

allPresents :: Connection -> IO [Present]
allPresents c = query_ c "SELECT name FROM present"

Pros:

  • Relatively simple.
  • Allows arbitrary sql statements.
  • Allows easy use of transactions.

Cons:

  • Not type safe. The SQL statements are not checked in any way.
  • Using postgresql-simple would make it more difficult to switch to a different database.
  • No template Haskell for generating relations/tables for you.
  • No migration support.

Migrations

When using a library like HRR, Opaleye, or Groundhog, it may be necessary to perform migrations with a separate tool. This section lists different tools available. The three most popular tools are Flyway, Liquibase, and MyBatis Migrations.

liquibase vs flyway #1

liquibase vs flyway #2

flyway vs liquibase vs mybatis migrations

Flyway

Relatively simple solution. Seems to be widely used.

Pros:

  • Relatively simple.

Cons:

Can't migrate backwards.

Liquibase

Seems like the more "enterprise" solution.

Liquibase Homepage

Pros:

  • Generate database diffs.
  • Allows easy rollbacks.
  • Seems pretty powerful.

Cons:

  • Can't write in normal SQL. Must use either XML or annotated sql.

MyBatis Migrations

MyBatis Migrations Homepage

Cons:

  • Seems like it is relatively tightly linked to Java.
  • Documentation not too good?

Sqitch

Sqitch Homepage

This records the current state of the database in git commits.

Pros:

  • Pretty powerful.

Cons:

  • Pretty complicated command line tool.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment