Skip to content

Instantly share code, notes, and snippets.

@tfausak
Last active August 23, 2021 14:13
Show Gist options
  • Star 3 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save tfausak/62fe82b7593f8247da153f9995478d1d to your computer and use it in GitHub Desktop.
Save tfausak/62fe82b7593f8247da153f9995478d1d to your computer and use it in GitHub Desktop.
Cursed Haskell: Deriving via plugin

Cursed Haskell: Deriving via plugin

TL;DR: You can derive instances using a GHC plugin. See Main.hs for an example.

⚠️ Please don't use this! It's a proof of concept. It's meant to show that it's theoretically possible to derive instances using GHC plugins.

Given that both deriving via and Template Haskell exist, why did I make this? As you may already know, using generics to derive instances is slow. Template Haskell is faster, but it causes unnecessary recompilation. And of course manually defining instances is the fastest in terms of compilation speed but it's tedious and error prone.

So that got me thinking: Is there a way to write effectively a macro that generates instances like Template Haskell but avoids unnecessary recompilation? The obvious answer to that is to use CPP, but if you did that you would have to wrap your entire data declaration in some CPP macro. That means it wouldn't be normal Haskell anymore. That means you could derive instances, but the CPP would break a bunch of other tools and workflows.

What if instead you could hijack the existing deriving via clause to generate instances some other way? That's possible using a GHC compiler plugin. That's exactly what I've implemented here.

From a high level, this plugin looks for deriving ToJSON via plugin and replaces it with a generated instance. Check out the Main.hs file for an example of how to use it. This approach gives you the speed of Template Haskell deriving with the recompilation avoidance of generic deriving. Unfortunately it has many downsides:

  • For starters, just look at DerivingViaPlugin.hs, which implements the plugin. It's pretty gross.
  • Since this operates over syntax ...
    • It doesn't work for standalone deriving. It needs to access the actual AST for the data type in order to determine what the generated instance should look like.
    • Supporting things that are syntactically different, like GADTs, would need to be done separately.
    • The generated code relies on the rest of the module. For example, it currently assumes the OverloadedStrings language extension is on and Data.Aeson has been imported unqualified.
  • Since this is a proof of concept ...
    • It only works for one type class (ToJSON). It could support more type classes, but they could become tedious to define.
    • It only works for data types with no type variables that have a single constructor, and that constructor has to be a record.
  • Since this uses the GHC AST ...
    • It's tied to a particular version of the compiler.
    • The GHC AST has a lot of information in it, which makes things more complicated than generic deriving.
-- https://downloads.haskell.org/~ghc/9.0.1/docs/html/users_guide/extending_ghc.html#compiler-plugins
-- https://downloads.haskell.org/~ghc/9.0.1/docs/html/libraries/ghc-9.0.1/GHC-Plugins.html
module DerivingViaPlugin where
import qualified Control.Monad as Monad
import qualified GHC.Data.Bag as G
import qualified GHC.Hs as G
import qualified GHC.Plugins as P
import qualified GHC.Types.Basic as G
import qualified GHC.Utils.Error as G
import qualified GHC.Utils.Outputable as G
plugin :: P.Plugin
plugin = P.defaultPlugin
{ P.pluginRecompile = P.purePlugin
, P.parsedResultAction = handleHsParsedModule
}
handleHsParsedModule
:: [P.CommandLineOption]
-> P.ModSummary
-> P.HsParsedModule
-> P.Hsc P.HsParsedModule
handleHsParsedModule _ _ hsParsedModule = do
lHsModule <- handleLHsModule $ P.hpm_module hsParsedModule
pure hsParsedModule{ P.hpm_module = lHsModule }
handleLHsModule
:: P.Located G.HsModule
-> P.Hsc (P.Located G.HsModule)
handleLHsModule lHsModule = do
lHsDecls <- handleLHsDecls . G.hsmodDecls $ P.unLoc lHsModule
pure $ fmap (\ hsModule -> hsModule{ G.hsmodDecls = lHsDecls }) lHsModule
handleLHsDecls
:: [G.LHsDecl G.GhcPs]
-> P.Hsc [G.LHsDecl G.GhcPs]
handleLHsDecls lHsDecls = do
lHsDeclss <- mapM handleLHsDecl lHsDecls
pure $ concat lHsDeclss
handleLHsDecl
:: G.LHsDecl G.GhcPs
-> P.Hsc [G.LHsDecl G.GhcPs]
handleLHsDecl lHsDecl = case P.unLoc lHsDecl of
G.TyClD xTyClD tyClDecl1 -> do
(tyClDecl2, lHsDecls) <- handleTyClDecl tyClDecl1
pure $ P.L (P.getLoc lHsDecl) (G.TyClD xTyClD tyClDecl2) : lHsDecls
_ -> pure [lHsDecl]
handleTyClDecl
:: G.TyClDecl G.GhcPs
-> P.Hsc (G.TyClDecl G.GhcPs, [G.LHsDecl G.GhcPs])
handleTyClDecl tyClDecl = case tyClDecl of
G.DataDecl tcdDExt tcdLName tcdTyVars tcdFixity tcdDataDefn -> do
(hsDataDefn, lHsDecls) <- handleHsDataDefn tcdLName tcdDataDefn
pure
( G.DataDecl tcdDExt tcdLName tcdTyVars tcdFixity hsDataDefn
, lHsDecls
)
_ -> pure (tyClDecl, [])
handleHsDataDefn
:: G.LIdP G.GhcPs
-> G.HsDataDefn G.GhcPs
-> P.Hsc (G.HsDataDefn G.GhcPs, [G.LHsDecl G.GhcPs])
handleHsDataDefn lIdP hsDataDefn = case hsDataDefn of
G.HsDataDefn dd_ext dd_ND dd_ctxt dd_cType dd_kindSig dd_cons dd_derivs -> do
(hsDeriving, lHsDecls) <- handleHsDeriving lIdP dd_cons dd_derivs
pure
( G.HsDataDefn dd_ext dd_ND dd_ctxt dd_cType dd_kindSig dd_cons hsDeriving
, lHsDecls
)
handleHsDeriving
:: G.LIdP G.GhcPs
-> [G.LConDecl G.GhcPs]
-> G.HsDeriving G.GhcPs
-> P.Hsc (G.HsDeriving G.GhcPs, [G.LHsDecl G.GhcPs])
handleHsDeriving lIdP lConDecls hsDeriving = do
(lHsDerivingClauses, lHsDeclss) <- fmap unzip
. mapM (handleLHsDerivingClause lIdP lConDecls)
$ P.unLoc hsDeriving
pure (P.L (P.getLoc hsDeriving) lHsDerivingClauses, concat lHsDeclss)
handleLHsDerivingClause
:: G.LIdP G.GhcPs
-> [G.LConDecl G.GhcPs]
-> G.LHsDerivingClause G.GhcPs
-> P.Hsc (G.LHsDerivingClause G.GhcPs, [G.LHsDecl G.GhcPs])
handleLHsDerivingClause lIdP lConDecls lHsDerivingClause = case P.unLoc lHsDerivingClause of
G.HsDerivingClause deriv_clause_ext deriv_clause_strategy deriv_clause_tys
| isPlugin deriv_clause_strategy -> do
lHsDecls <- handleLHsSigTypes lIdP lConDecls $ P.unLoc deriv_clause_tys
pure
( P.L (P.getLoc lHsDerivingClause)
. G.HsDerivingClause deriv_clause_ext Nothing
$ fmap (const []) deriv_clause_tys
, lHsDecls
)
_ -> pure (lHsDerivingClause, [])
handleLHsSigTypes
:: G.LIdP G.GhcPs
-> [G.LConDecl G.GhcPs]
-> [G.LHsSigType G.GhcPs]
-> P.Hsc [G.LHsDecl G.GhcPs]
handleLHsSigTypes lIdP lConDecls = fmap concat . mapM (handleLHsSigType lIdP lConDecls)
handleLHsSigType
:: G.LIdP G.GhcPs
-> [G.LConDecl G.GhcPs]
-> G.LHsSigType G.GhcPs
-> P.Hsc [G.LHsDecl G.GhcPs]
handleLHsSigType lIdP lConDecls lHsSigType = do
let
srcSpan :: P.SrcSpan
srcSpan = case lHsSigType of
G.HsIB _ lHsType -> P.getLoc lHsType
l :: a -> P.Located a
l = P.L srcSpan
x :: G.NoExtField
x = G.NoExtField
hsImplicitBndrs :: G.HsImplicitBndrs G.GhcPs (G.LHsType G.GhcPs)
hsImplicitBndrs = G.HsIB x . l $ G.HsAppTy x
(l . G.HsTyVar x P.NotPromoted . l . P.Unqual $ P.mkClsOcc "ToJSON")
(l $ G.HsTyVar x P.NotPromoted lIdP)
toJSON :: G.LIdP G.GhcPs
toJSON = l . P.Unqual $ P.mkVarOcc "toJSON"
it :: G.LIdP G.GhcPs
it = l . P.Unqual $ P.mkVarOcc "it"
Monad.unless (isToJSON lHsSigType) $ throw srcSpan "unsupported type class"
hsConDeclDetails <- case fmap P.unLoc lConDecls of
[G.ConDeclH98 _ _ _ _ _ hsConDeclDetails _] -> pure hsConDeclDetails
_ -> throw srcSpan "unsupported constructors"
lConDeclFields <- case hsConDeclDetails of
G.RecCon lConDeclFields -> pure lConDeclFields
_ -> throw srcSpan "unsupported constructor"
occNames <- fmap concat
. Monad.forM (P.unLoc lConDeclFields)
$ \ lConDeclField -> case P.unLoc lConDeclField of
G.ConDeclField _ lFieldOccs _ _ -> Monad.forM lFieldOccs
$ \ lFieldOcc -> case P.unLoc lFieldOcc of
G.FieldOcc _ lRdrName -> case P.unLoc lRdrName of
P.Unqual occName -> pure occName
_ -> throw srcSpan "unsupported field"
let
lHsExprs :: [G.LHsExpr G.GhcPs]
lHsExprs = fmap (\ occName -> l $ G.OpApp x
(l . G.HsLit x . G.HsString G.NoSourceText $ P.occNameFS occName)
(l . G.HsVar x . l . P.Unqual $ P.mkVarOcc ".=")
(l $ G.HsApp x
(l . G.HsVar x . l . P.Unqual $ occName)
(l $ G.HsVar x it))) occNames
grhs :: G.GRHS G.GhcPs (G.LHsExpr G.GhcPs)
grhs = G.GRHS x [] . l $ G.HsApp x
(l . G.HsVar x . l . P.Unqual $ P.mkVarOcc "object")
(l $ G.ExplicitList x Nothing lHsExprs)
grhss :: G.GRHSs G.GhcPs (G.LHsExpr G.GhcPs)
grhss = G.GRHSs x [l grhs] . l $ G.EmptyLocalBinds x
match :: G.Match G.GhcPs (G.LHsExpr G.GhcPs)
match = G.Match x (G.FunRhs toJSON G.Prefix G.NoSrcStrict)
[l $ G.VarPat x it] grhss
matchGroup :: G.MatchGroup G.GhcPs (G.LHsExpr G.GhcPs)
matchGroup = G.MG x (l [l match]) P.Generated
lHsBindLR :: G.LHsBindLR G.GhcPs G.GhcPs
lHsBindLR = l $ G.FunBind x toJSON matchGroup []
lHsBindLRs :: G.Bag (G.LHsBindLR G.GhcPs G.GhcPs)
lHsBindLRs = G.listToBag [lHsBindLR]
lHsDecl :: G.LHsDecl G.GhcPs
lHsDecl = l . G.InstD x . G.ClsInstD x $
G.ClsInstDecl x hsImplicitBndrs lHsBindLRs [] [] [] Nothing
pure [G.pprTraceIt "-- DerivingViaPlugin:" lHsDecl]
throw
:: P.SrcSpan
-> String
-> P.Hsc a
throw srcSpan string = do
dynFlags <- P.getDynFlags
P.throwOneError . G.mkPlainErrMsg dynFlags srcSpan $ P.text string
isPlugin
:: Maybe (G.LDerivStrategy G.GhcPs)
-> Bool
isPlugin mLDerivStrategy = case mLDerivStrategy of
Just lDerivStrategy -> case P.unLoc lDerivStrategy of
G.ViaStrategy xViaStrategy -> case xViaStrategy of
G.HsIB _ lHsType -> case P.unLoc lHsType of
G.HsTyVar _ _ lIdP -> case P.unLoc lIdP of
P.Unqual occName -> P.occNameString occName == "plugin"
_ -> False
_ -> False
_ -> False
_ -> False
isToJSON
:: G.LHsSigType G.GhcPs
-> Bool
isToJSON lHsSigType = case lHsSigType of
G.HsIB _ lHsType -> case P.unLoc lHsType of
G.HsTyVar _ _ lIdP -> case P.unLoc lIdP of
P.Unqual occName -> P.occNameString occName == "ToJSON"
_ -> False
_ -> False
{-# options_ghc -fplugin=DerivingViaPlugin #-}
{-# language OverloadedStrings #-}
import Data.Aeson
import qualified Data.ByteString.Lazy as L
data Pizza = Pizza
{ size :: String
, toppings :: [String]
} deriving ToJSON via plugin
{-
-- DerivingViaPlugin:
instance ToJSON Pizza where
toJSON it = object ["size" .= size it, "toppings" .= toppings it]
-}
main :: IO ()
main = L.putStr $ encode Pizza
{ size = "medium"
, toppings = ["pepperoni", "pineapple"]
}
{-
{"toppings":["pepperoni","pineapple"],"size":"medium"}
-}
@tfausak
Copy link
Author

tfausak commented Aug 23, 2021

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment