We will be comparing the following DB libraries:
- HRR
- Opaleye
- Persistent + Esqueleto
- Groundhog
- postgresql-simple (and postgresql-transactional)
HRR will connect to our database and define our datatypes for us. Let's say we have a table like this:
CREATE TABLE account
(account_id integer primary key autoincrement not null,
product_cd varchar(10) not null,
cust_id integer not null,
open_date date not null,
close_date date,
last_activity_date date,
status text not null,
open_branch_id integer,
open_emp_id integer,
avail_balance float(10,2),
pending_balance float(10,2),
check(status = 'ACTIVE' or status = 'CLOSED' or status = 'FROZEN')
constraint fk_product_cd foreign key (product_cd)
references product (product_cd),
constraint fk_a_cust_id foreign key (cust_id)
references customer (cust_id),
constraint fk_a_branch_id foreign key (open_branch_id)
references branch (branch_id),
constraint fk_a_emp_id foreign key (open_emp_id)
references employee (emp_id)
);
Using Template Haskell, we can get it to create types like this:
data Account
= Account {accountId :: !GHC.Int.Int64,
productCd :: !String,
custId :: !GHC.Int.Int64,
openDate :: !Day,
closeDate :: !(Maybe Day),
lastActivityDate :: !(Maybe Day),
status :: !String,
openBranchId :: !(Maybe GHC.Int.Int64),
openEmpId :: !(Maybe GHC.Int.Int64),
availBalance :: !(Maybe Double),
pendingBalance :: !(Maybe Double)}
deriving (Show)
-- Relation type corresponding to Table
account :: Relation () Account
account = ...
-- Column selectors for This DSL
accountId' :: Pi Account GHC.Int.Int64
accountId'
= definePi 0
productCd' :: Pi Account String
productCd'
= definePi 1
custId' :: Pi Account GHC.Int.Int64
custId'
= definePi 2
....
Here is a simple query for accounts, using the types we defined above.
account1 :: Relation () Account
account1 = relation $ do
a <- query account
wheres $ a ! Account.productCd' `in'` values ["CHK", "SAV", "CD", "MM"]
return a
Here is a projection (only selecting certain rows):
account_3_7 :: Relation () (Maybe Int64, String)
account_3_7 = relation $ do
a <- query account
let proj = (,) |$| a ! Account.openEmpId'
|*| a ! Account.productCd'
asc proj
return proj
Generated SQL:
SELECT ALL T0.open_emp_id AS f0,
T0.product_cd AS f1
FROM MAIN.account T0
ORDER BY T0.open_emp_id ASC, T0.product_cd ASC
Subquery:
account_9_1 :: Relation () Account1
account_9_1 = relation $ do
a <- query account
ma <- queryScalar $ aggregatedUnique account Account.accountId' max'
wheres $ just (a ! Account.accountId') .=. flattenMaybe ma
return $ Account1 |$| a ! Account.accountId'
|*| a ! Account.productCd'
|*| a ! Account.custId'
|*| a ! Account.availBalance'
Generated SQL:
SELECT ALL T0.account_id AS f0,
T0.product_cd AS f1,
T0.cust_id AS f2,
T0.avail_balance AS f3
FROM MAIN.account T0
WHERE (T0.account_id = (SELECT ALL MAX (T1.account_id) AS f0
FROM MAIN.account T1))
Joins:
join_5_1_2aT :: Relation () ((String, String), String)
join_5_1_2aT = relation $ do
e <- query employee
d <- query department
on $ e ! Employee.deptId' .=. just (d ! Department.deptId')
return $ e ! Employee.fname' >< e ! Employee.lname' >< d ! Department.name'
Generated SQL:
SELECT ALL T0.fname AS f0,
T0.lname AS f1,
T1.name AS f2
FROM MAIN.employee T0 INNER JOIN MAIN.department T1
ON (T0.dept_id = T1.dept_id)
Pros:
- Works with DB2, PostgreSQL, SQLite, MySQL, Microsoft SQL Server and OracleSQL.
- Able to connect to database at compile time and generate datatypes for tables programatically.
- Seems to cover almost all of SQL.
Cons:
- No feature to create tables or do migrations.
- Seems just a little harder to read than Opaleye.
Completely typesafe SQL library. Similar to HRR.
-- define table
personTable :: Table (Column PGText, Column PGInt4, Column PGText)
(Column PGText, Column PGInt4, Column PGText)
personTable = Table "personTable" (p3 ( required "name"
, required "age"
, required "address" ))
-- Here is a standard query of all people in the personTable:
personQuery :: Query (Column PGText, Column PGInt4, Column PGText)
personQuery = queryTable personTable
-- Here is the SQL generated for `personQuery`.
--
-- ghci> printSql personQuery
-- SELECT name0_1 as result1,
-- age1_1 as result2,
-- address2_1 as result3
-- FROM (SELECT *
-- FROM (SELECT name as name0_1,
-- age as age1_1,
-- address as address2_1
-- FROM personTable as T1) as T1) as T1
--
-- Data types and convenient type synonyms.
data Birthday' a b = Birthday { bdName :: a, bdDay :: b }
type Birthday = Birthday' String Day
type BirthdayColumn = Birthday' (Column PGText) (Column PGDate)
-- To get user defined types to work with the typeclass magic they must
-- have instances defined for them. The instances are derivable with
-- Template Haskell.
$(makeAdaptorAndInstance "pBirthday" ''Birthday')
-- Create birthdayTable
birthdayTable :: Table BirthdayColumn BirthdayColumn
birthdayTable = Table "birthdayTable"
(pBirthday Birthday { bdName = required "name"
, bdDay = required "birthday" })
-- Query all birthdays:
birthdayQuery :: Query BirthdayColumn
birthdayQuery = queryTable birthdayTable
-- Produces the following SQL:
--
-- ghci> printSql birthdayQuery
-- SELECT name0_1 as result1,
-- birthday1_1 as result2
-- FROM (SELECT *
-- FROM (SELECT name as name0_1,
-- birthday as birthday1_1
-- FROM birthdayTable as T1) as T1) as T1
-- Throw away some columns of query:
nameAge :: Query (Column PGText, Column PGInt4)
nameAge = proc () -> do
(name, age, _) <- personQuery -< ()
returnA -< (name, age)
-- Where clause (restriction):
youngPeople :: Query (Column PGText, Column PGInt4, Column PGText)
youngPeople = proc () -> do
row@(_, age, _) <- personQuery -< ()
restrict -< age .<= 18
returnA -< row
-- Inner join:
personAndBirthday ::
Query (Column PGText, Column PGInt4, Column PGText, Column PGDate)
personAndBirthday = proc () -> do
(name, age, address) <- personQuery -< ()
birthday <- birthdayQuery -< ()
restrict -< name .== bdName birthday
returnA -< (name, age, address, bdDay birthday)
-- Composability of queries and joins. We can't generate "the SQL of" these
-- `QueryArr` combinators. They are not `Query`s so they don't have any SQL.
restrictIsTwenties :: QueryArr (Column PGInt4) ()
restrictIsTwenties = proc age -> do
restrict -< (20 .<= age) .&& (age .< 30)
restrictAddressIs1MyStreet :: QueryArr (Column PGText) ()
restrictAddressIs1MyStreet = proc address -> do
restrict -< address .== pgString "1 My Street, My Town"
twentiesAtAddress' :: Query (Column PGText, Column PGInt4, Column PGText)
twentiesAtAddress' = proc () -> do
row@(_, age, address) <- personQuery -< ()
restrictIsTwenties -< age
restrictAddressIs1MyStreet -< address
returnA -< row
Pros:
- Seems to be more popular than HRR in the western Haskell world: [https://www.reddit.com/r/haskell/comments/3fuq4s/opaleye_or_relationalrecord/](reddit thread)
- Used in production at multiple companies.
- Typesafe (even in aggregation functions).
- Complete composability in queries.
Cons:
- Arrow syntax. (Although this might be appropriate for ARoW?? ;-p)
- Only supports PostgreSQL.
- No support for creating tables (or migrations). Assumes tables already exist.
Esqueleto builds on persistent and gives us a way to write joins and other complex queries.
Example SQL:
SELECT BlogPost.*, Person.*
FROM BlogPost, Person
WHERE BlogPost.authorId = Person.id
ORDER BY BlogPost.title ASC
In esqueleto, we may write the same query above as:
select $
from $ \(b, p) -> do
where_ (b ^. BlogPostAuthorId ==. p ^. PersonId)
orderBy [asc (b ^. BlogPostTitle)]
return (b, p)
Here's an example using Yesod to just return a few fields of a record:
getHomeR :: Handler Html
getHomeR = do
blogs <- runDB
$ E.select
$ E.from $ \(blog `E.InnerJoin` author) -> do
E.on $ blog ^. BlogAuthor E.==. author ^. AuthorId
return ( blog ^. BlogId
, blog ^. BlogTitle
, author ^. AuthorName
)
defaultLayout $ do
setTitle "Blog posts"
[whamlet|
<ul>
$forall (E.Value blogid, E.Value title, E.Value name) <- blogs
<li>
<a href=@{BlogR blogid}>#{title} by #{name}
|]
Pros:
- Used with persistent.
- Joins.
- Pretty simple if you already know persistent.
Cons:
- Less typesafe than Opaleye / HRR (with regards to aggregation).
Similar to Persistent. It aims to be an ORM for Haskell. It seems to have a little more power than Persistent (it's projections), but still doesn't have the ability to do joins.
-- define data types to represent our tables
data Customer = Customer {
customerName :: String,
phone :: String
} deriving Show
data Product = Product {
productName :: String,
quantity :: Int,
customer :: DefaultKey Customer
}
deriving instance Show Product
-- This creates all instances and definitions for datatypes so they can be
-- mapped to and from sql tables.
mkPersist defaultCodegenConfig [groundhog|
- entity: Customer # Name of the datatype
constructors:
- name: Customer
fields:
- name: customerName
# Set column name to "name" instead of "customerName"
dbName: name
uniques:
- name: NameConstraint
fields: [customerName] # Inline format of list
- entity: Product
|]
-- Show how to do queries and insertions.
main :: IO ()
main = withSqliteConn ":memory:" $ runDbConn $ do
runMigration defaultMigrationLogger $ do
migrate (undefined :: Customer)
migrate (undefined :: Product)
johnKey <- insert $ Customer "John Doe" "0123456789"
get johnKey >>= liftIO . print
insert $ Product "Oranges" 3 johnKey
insert $ Product "Apples" 5 johnKey
janeKey <- insert $ Customer "Jane Doe" "9876543210"
insert $ Product "Oranges" 4 janeKey
johnOrders <- select $ (CustomerField ==. johnKey)
`orderBy` [Asc ProductNameField]
liftIO $ putStrLn $ "Products for John: " ++ show johnOrders
Groundhog has interesting projections, so you can do something like this:
project ("username: " `append` upper UserNameField)
$ lower UserNameField `like` "%smith%"
project ((toArith NumberField + 1) * 2)
$ upper FirstNameField ==. upper UserNameField
||. (toArith NumberField * 5) >. (25 :: Int)
update (EmailField =. upper EmailField) (AutoKeyField ==. k)
This would produce the following sql statements:
SELECT 'username: ' || upper(username) FROM mytable WHERE lower(username) LIKE '%smith%'
SELECT (number + 1) * 2 FROM mytable WHERE upper(firstname) = upper(username) OR number * 5 > 25
UPDATE mytable SET email = upper(email) WHERE id = k
Pros:
- supports migrations
- supports composite keys
- Seems to provide more sql functionality than persistent.
Cons:
- Doesn't do joins.
- Not as commonly used as Persistent?
Very simple. Single class for converting from SQL row to Haskell data types:
-- Some data types
data Present = Present { presentName :: Text }
data Location = Location { locLat :: Double
, locLong :: Double
}
data Child = Child { childName :: Text
, childLocation :: Location
}
-- Classes for converting from SQL rows to Haskell datatypes
instance FromRow Present where
fromRow :: RowParser Present
fromRow = Present <$> field
instance FromRow Child where
fromRow :: RowParser Child
fromRow = do
childName' <- field
locLat' <- field
locLong' <- field
pure $ Child childName' locLat' locLong'
-- Some functions for querying
allChildren :: Connection -> IO [Child]
allChildren c = query_ c "SELECT name, loc_lat, loc_long FROM child"
allPresents :: Connection -> IO [Present]
allPresents c = query_ c "SELECT name FROM present"
Pros:
- Relatively simple.
- Allows arbitrary sql statements.
- Allows easy use of transactions.
Cons:
- Not type safe. The SQL statements are not checked in any way.
- Using postgresql-simple would make it more difficult to switch to a different database.
- No template Haskell for generating relations/tables for you.
- No migration support.
When using a library like HRR, Opaleye, or Groundhog, it may be necessary to perform migrations with a separate tool. This section lists different tools available. The three most popular tools are Flyway, Liquibase, and MyBatis Migrations.
flyway vs liquibase vs mybatis migrations
Relatively simple solution. Seems to be widely used.
Pros:
- Relatively simple.
Cons:
Can't migrate backwards.
Seems like the more "enterprise" solution.
Pros:
- Generate database diffs.
- Allows easy rollbacks.
- Seems pretty powerful.
Cons:
- Can't write in normal SQL. Must use either XML or annotated sql.
Cons:
- Seems like it is relatively tightly linked to Java.
- Documentation not too good?
This records the current state of the database in git commits.
Pros:
- Pretty powerful.
Cons:
- Pretty complicated command line tool.