Skip to content

Instantly share code, notes, and snippets.

Created July 1, 2015 09:41
Show Gist options
  • Save anonymous/ae1b0f336a3cd82b7b57 to your computer and use it in GitHub Desktop.
Save anonymous/ae1b0f336a3cd82b7b57 to your computer and use it in GitHub Desktop.
A short spelunking inside GHC to see how hard it would be to produce JSON that can be used for more valuable error reporting.

A quick dive into GHC's codebase (compiler/typecheck/TcUnify.hs) and searching for "but its type" gives me this (inside a where clause):

mk_ctxt :: TidyEnv -> TcM (TidyEnv, MsgDoc)
mk_ctxt env = do { (env', ty) <- zonkTidyTcType env orig_ty
                 ; let (args, _) = tcSplitFunTys ty
                       n_actual = length args
                       (env'', orig_ty') = tidyOpenType env' orig_ty
                 ; return (env'', mk_msg orig_ty' ty n_actual) }

mk_msg orig_ty ty n_args
  = herald <+> speakNOf arity (ptext (sLit "argument")) <> comma $$
    if n_args == arity
      then ptext (sLit "its type is") <+> quotes (pprType orig_ty) <>
           comma $$
           ptext (sLit "it is specialized to") <+> quotes (pprType ty)
      else sep [ptext (sLit "but its type") <+> quotes (pprType ty),
                if n_args == 0 then ptext (sLit "has none")
                else ptext (sLit "has only") <+> speakN n_args]

Seems like the TidyEnvis being updated and some of its elements passed to the actual error message creation function.

TidyEnv itself is:

type TidyEnv = (TidyOccEnv, VarEnv Var)

When tidying up print names, we keep a mapping of in-scope occ-names (the TidyOccEnv) and a Var-to-Var of the current renamings

Hmm. Let's see where mk_ctxtis used as well.

Note [matchExpectedFunTys]
~~~~~~~~~~~~~~~~~~~~~~~~~~
matchExpectedFunTys checks that an (Expected rho) has the form
of an n-ary function.  It passes the decomposed type to the
thing_inside, and returns a wrapper to coerce between the two types

It's used wherever a language construct must have a functional type,
namely:
        A lambda expression
        A function definition
     An operator section

This is not (currently) where deep skolemisation occurs;
matchExpectedFunTys does not skolmise nested foralls in the
expected type, because it expects that to have been done already
-}

matchExpectedFunTys :: SDoc     -- See Note [Herald for matchExpectedFunTys]
                    -> Arity
                    -> TcRhoType
                    -> TcM (TcCoercionN, [TcSigmaType], TcRhoType)

-- If    matchExpectFunTys n ty = (co, [t1,..,tn], ty_r)
-- then  co : ty ~N (t1 -> ... -> tn -> ty_r)
--
-- Does not allocate unnecessary meta variables: if the input already is
-- a function, we just take it apart.  Not only is this efficient,
-- it's important for higher rank: the argument might be of form
--              (forall a. ty) -> other
-- If allocated (fresh-meta-var1 -> fresh-meta-var2) and unified, we'd
-- hide the forall inside a meta-variable

matchExpectedFunTys herald arity orig_ty
  = go arity orig_ty
  where
    -- If     go n ty = (co, [t1,..,tn], ty_r)
    -- then   co : ty ~ t1 -> .. -> tn -> ty_r

    go n_req ty
      | n_req == 0 = return (mkTcNomReflCo ty, [], ty)

    go n_req ty
      | Just ty' <- tcView ty = go n_req ty'

    go n_req (FunTy arg_ty res_ty)
      | not (isPredTy arg_ty)
      = do { (co, tys, ty_r) <- go (n_req-1) res_ty
           ; return (mkTcFunCo Nominal (mkTcNomReflCo arg_ty) co, arg_ty:tys, ty_r) }

    go n_req ty@(TyVarTy tv)
      | ASSERT( isTcTyVar tv) isMetaTyVar tv
      = do { cts <- readMetaTyVar tv
           ; case cts of
               Indirect ty' -> go n_req ty'
               Flexi        -> defer n_req ty (isReturnTyVar tv) }

       -- In all other cases we bale out into ordinary unification
       -- However unlike the meta-tyvar case, we are sure that the
       -- number of arguments doesn't match arity of the original
       -- type, so we can add a bit more context to the error message
       -- (cf Trac #7869).
       --
       -- It is not always an error, because specialized type may have
       -- different arity, for example:
       --
       -- > f1 = f2 'a'
       -- > f2 :: Monad m => m Bool
       -- > f2 = undefined
       --
       -- But in that case we add specialized type into error context
       -- anyway, because it may be useful. See also Trac #9605.
    go n_req ty = addErrCtxtM mk_ctxt $
                  defer n_req ty False

    ------------
    -- If we decide that a ReturnTv (see Note [ReturnTv] in TcType) should
    -- really be a function type, then we need to allow the argument and
    -- result types also to be ReturnTvs.
    defer n_req fun_ty is_return
      = do { arg_tys <- mapM new_ty_var_ty (nOfThem n_req openTypeKind)
                        -- See Note [Foralls to left of arrow]
           ; res_ty  <- new_ty_var_ty openTypeKind
           ; co   <- unifyType fun_ty (mkFunTys arg_tys res_ty)
           ; return (co, arg_tys, res_ty) }
      where
        new_ty_var_ty | is_return = newReturnTyVarTy
                      | otherwise = newFlexiTyVarTy

Ok, so it seems like a lot of the unification code is done alongside the error reporting code. That's pretty sucky.

However, what we could do, is to modify SDoc, which is what all messages end up as:

newtype SDoc = SDoc { runSDoc :: SDocContext -> Doc }

data SDocContext = SDC
  { sdocStyle      :: !PprStyle
  , sdocLastColour :: !PprColour
    -- ^ The most recently used colour.  This allows nesting colours.
  , sdocDynFlags   :: !DynFlags
  }

 data Doc
 = Empty                                -- empty
 | NilAbove Doc                         -- text "" $$ x
 | TextBeside !TextDetails FastInt Doc       -- text s <> x
 | Nest FastInt Doc                         -- nest k x
 | Union Doc Doc                        -- ul `union` ur
 | NoDoc                                -- The empty set of documents
 | Beside Doc Bool Doc                  -- True <=> space between
 | Above  Doc Bool Doc                  -- True <=> never overlap

Perhaps we could have SDoc produce a tuple of (Doc,JSON), or the actual data (probably a list of Outputable things?).

Let's see what we would have to do:

  • Modify all the really low-level stuff like mk_ctxtto not only return an environment with a MsgDoc (= SDoc) in it, but a bunch of JSON as well.

    mk_ctxt :: TidyEnv -> TcM (TidyEnv, MsgDoc) -- Change this to a 3-tuple of TidyEnv, MsgDoc and JSON? mk_ctxt env = do { (env', ty) <- zonkTidyTcType env orig_ty ; let (args, _) = tcSplitFunTys ty n_actual = length args (env'', orig_ty') = tidyOpenType env' orig_ty ; return (env'', mk_msg orig_ty' ty n_actual) }

  • `addErrCtxtM also needs to be updated:

    -- Adds another message to the context. addErrCtxtM :: (TidyEnv -> TcM (TidyEnv, MsgDoc)) -> TcM a -> TcM a addErrCtxtM ctxt = updCtxt (\ ctxts -> (False, ctxt) : ctxts)

    type ErrCtxt = (Bool, TidyEnv -> TcM (TidyEnv, MsgDoc))

    -- Just applies a function to the tcl_ctxt :: [ErrCtxT] value in the local environment. updCtxt :: ([ErrCtxt] -> [ErrCtxt]) -> TcM a -> TcM a updCtxt upd = updLclEnv (\ env@(TcLclEnv { tcl_ctxt = ctxt }) -> env { tcl_ctxt = upd ctxt })

    -- Example use in TcUnify.hs go n_req ty = addErrCtxtM mk_ctxt $ defer n_req ty False

  • How does this environment update bubble upwards? Let's see where updLclEnv takes us:

    -- Boilerplate. updLclEnv :: (lcl -> lcl) -> TcRnIf gbl lcl a -> TcRnIf gbl lcl a updLclEnv upd = updEnv (\ env@(Env { env_lcl = lcl }) -> env { env_lcl = upd lcl })

    -- In IOEnv.hs -- | Perform a computation with an altered environment updEnv :: (env -> env') -> IOEnv env' a -> IOEnv env a {-# INLINE updEnv #-} updEnv upd (IOEnv m) = IOEnv (\ env -> m (upd env))

Aha, interesting! Especially since:

type TcRnIf a b c = IOEnv (Env a b) c

type TcRn a = TcRnIf TcGblEnv TcLclEnv a

type TcM a = TcRn a

So, the error messages are written to an environment that is passed along and that we can read out. In fact, in TcErrors.hs, a whole bunch of text processing is done with the tcl_ctxt value.

Perhaps the most interesting and non-breaking way of adding it is by adding another record element to TcLclEnv:

The Global-Env/Local-Env story

During type checking, we keep in the tcg_type_env

  • All types and classes
  • All Ids derived from types and classes (constructors, selectors)
    At the end of type checking, we zonk the local bindings, and as we do so we add to the tcg_type_env
  • Locally defined top-level Ids Why? Because they are now Ids not TcIds. This final GlobalEnv is a) fed back (via the knot) to typechecking the unfoldings of interface signatures b) used in the ModDetails of this module
data TcLclEnv		-- Changes as we move inside an expression
            -- Discarded after typecheck/rename; not passed on to desugarer
  = TcLclEnv {
    tcl_loc  :: SrcSpan,		-- Source span
    tcl_ctxt :: [ErrCtxt],		-- Error context, innermost on top
    tcl_errs :: TcRef Messages,	-- Place to accumulate errors

    tcl_th_ctxt    :: ThStage,	      -- Template Haskell context
    tcl_arrow_ctxt :: ArrowCtxt,	      -- Arrow-notation context

    tcl_rdr :: LocalRdrEnv,		-- Local name envt
        -- Maintained during renaming, of course, but also during
        -- type checking, solely so that when renaming a Template-Haskell
        -- splice we have the right environment for the renamer.
        -- 
        --   Does *not* include global name envt; may shadow it
        --   Includes both ordinary variables and type variables;
        --   they are kept distinct because tyvar have a different
        --   occurrence contructor (Name.TvOcc)
        -- We still need the unsullied global name env so that
            --   we can look up record field names

    tcl_env  :: TcTypeEnv,    -- The local type environment: Ids and
                      -- TyVars defined in this module
                    
    tcl_tyvars :: TcRef TcTyVarSet,	-- The "global tyvars"
            -- Namely, the in-scope TyVars bound in tcl_env, 
            -- plus the tyvars mentioned in the types of Ids bound
            -- in tcl_lenv. 
                        -- Why mutable? see notes with tcGetGlobalTyVars

    tcl_lie   :: TcRef WantedConstraints,    -- Place to accumulate type constraints

    -- TcMetaTyVars have 
    tcl_meta  :: TcRef Unique,  -- The next free unique for TcMetaTyVars
                        -- Guaranteed to be allocated linearly
    tcl_untch :: Unique	    -- Any TcMetaTyVar with 
                        --     unique >= tcl_untch is touchable
                        --     unique <  tcl_untch is untouchable
    }
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment