➖poor (impossible or very hard to achieve) ➕good (possible but requires some dancing) ✔️excellent (very easy to achieve/have)
PC stands for parser combinators and
PG — parser generators.
|Power||PC can't handle left-recursive grammars (though, there's some article¹...).|
|Research/Age||Both approaches have ~50 years of research behind them.|
|Accessibility||PG is a separate framework you need to learn. As well as some formal language theory. PC is just a library in your PL.|
|Debug||PG generates a lot of unreadable code you can't understand and debug most of the time. Though, debugging in Haskell is another story...|
|Visualization||ANTLR is capable of visualizing parsed tree on-the-fly as you change input or grammar.|
|Composability||You can't easily combine HTML grammar and Haskell grammar to get nice templating engine for example. PC are much more reusable.|
|Performance||I didn't see benchmarks. But nobody complained so far.|
|Static analysis||PG can analyze your grammar in advance and warn about ambiguity.
|Flexibility||PC is just code in your programming language. You can perform any computation there. PG achieve this through attributes and inline code.|
|Error messages||In PC you have full control on input stream so you can produce excellent error messages|
|Error recovery||You may want to report multiple parser errors or do partial code analysis of unparsed code. LALR PG can handle this better than LL.|
|Indentation||Usually it's much more difficult to parse layout-sensitive languages with PG.⁶|
|Pretty printing||Some PG allow you to specify formatting rules via attributes. Some Haskell libraries provide you bidirectional parsing/printing.|
- : Parser Combinators for Ambiguous Left-Recursive Grammars
- : megaparsec: How to introduce custom error messages
- : megaparsec: Fun with the recovery feature
- : megaparsec: Indentation-sensitive parsing
- : syntax: Reversible parsing and pretty-printing.
- : GHC and parsing layout it is effectively managed by having the lexer produce a set of «indent, dedent and virtual semicolon» tokens before the parser generator sees the token stream.
- : uu-parsinglib: Fast, online, error-correcting, monadic, applicative, merging, permuting, interleaving, idiomatic parser combinators.
Also see discussion here for more details:
- Parser Generators
- Programming languages
- Complex domain specific languages (like SQL)
- Parser Combinators
- Simple DSLs and markup languages (XML, HTML, Markdown)
- Serialisation and configuration formats (JSON, YAML, TOML)
- Creating custom markup languages (like Haskell-specific markdown)
- Parsing arbitrary non-difficult textual data (logging output, org-mode files)
Existing sources with excellent description: