tel/indents.md

## indents.md

      
    Raw
  

              indents.md
            
          
    Indentation sensitive parser combinators

An indentation sensitive parser combinator language is one that helps you express
ideas like "this parse only succeeds if it's within the current indentation block".
The concept is somewhat small and elegant and is implemented in a few libraries. In
this writeup, I will use examples from indents.
Background and goal

The direct goal will be to write the sameOrIndented parser combinator with a type
like (but not identical to)
sameOrIndented :: Parser s u () -- aspirational, not accurate

It fails whenever the current parse is at an indentation level less than a local
"reference" level which is set by withPos
withPos :: Parser s u a -> Parser s u a -- aspirational, not accurate

The idea being that you wrap a parse with withPos after parsing the indentation whitespace and
the "inner" parse will have its reference set to the location where withPos was applied. It
makes little sense to call sameOrIndented (or other combinators like it) outside of a withPos
block but the types do not prohibit it.
Context for indentation

Indentation-sensitive parses must carry with them information about the current "reference"
indentation level. To do so, we use a monad transformer with a State SourcePos layer.
SourcePos
is the Parsec type describing a location in the source text during parsing.
The function of withPos is to set this reference context locally. Now that we have State effects
to work with this is easy, we use the Parsec monad to get the "current" location then temporarily
store it in the State monad during the inner computation
withPos run = do
  here <- getPosition -- Parsec effect
  prev <- get -- State effect, stored to restore after `run`ning
  put here
  res <- run 
  put prev
  return res

Operation within the context

Once we've established an indentation reference using withPos, implementation of combinators
like sameOrIndented is easy: we just get the "current position" and compare it with the "reference
position". To make this work with sameOrIndented we need to determine if our column count is
equal-to-or-greater-than the reference one.
sameOrIndented = do
  here <- getPosition
  ref <- get
  if (sourceColumn here) >= (sourceColumn ref)
  then return ()
  else parseFail "bad identation"

Use in practice

With these combinators and ones like them you can annotate your parsers with "indentation block
begins here" markers via withPos. Parses within these blocks may need to respect indentation
and therefore contain options which include whitespace but are guarded by combinators like
sameOrIndented and its ilk.