Response to PEGjs question in the mailing list -- big for a response; blog entry size but still lacking context to be able to stand on its own
Note: I'm not a native English speaker and don't have the natural language processing jargon down pat, so about 'preceding' vs. 'proceeding': I assume the PREceding text before 'x' in 'zzxz' is the leading 'zz' ('history') and the PROceeding text for the same is the trailing 'z'. Hence 'proceeding text' would be similar to 'look-ahead' as it is called in computer language parsing (unambiguous language parsing).
Anyway... I'm sure I don't get everything you say but there's at least one subject which is certainly relevant here:
handling/coping with history (preceding input) vs. look-ahead (future, incoming, proceeding input)
In your sample grammar and sample 'zzxz' I assume each character is a single token. (Yes, PEG always processes characters as it blends lexer and parser into a single specification language; I'm just sufficiently dinosaur to appreciate the difference between 'character stream' and 'token stream': the token stream is where such nice horrors as left recursion are to be ex
Here are some things you can do with Gists in GistBox.
snip&snap extracts from our major JISON grammar file, showcasing 'code sections' a la BISON plus a few other bits & tricks. Note the %{ ... %} sections which are JISON's 'code sections'. Also note the code following that last '%%' marker: that is another 'code section' - and the most important one.
Rod puzzle ('staafjespuzzel') code from Henk, tweaked. (36cube puzzle)
JavaScript: a state-machine-based lexer + recursive descent (LL(k)) grammar parser + abstract syntax tree (AST) + generator
Basic Compiler Technology
Lexers, parsers and code generators are of all ages (old skool: lex/flex, yacc/bison, burg/lburg/...; modern-ish: PCCTS/ANTLR).
Here's an example of a JavaScript-based domain-specific language being parsed and converted into HTML. It is very similar to wiki and Markdown formats; the only purpose of this code is to showcase the parsing technology, so a minimum number of 'shortcuts' and other production code 'smartness' is applied.
The recursive descent parser works by evaluating the grammar rules in order of precedence, where each function call is a grammar rule match attempt and the first one returning success is a 'grammar rule match' which will allow the parser to continue parsing the input.
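To make the 'each function call is a rule match attempt' idea concrete, here is a minimal sketch. It is not the code from the gist itself: the tiny grammar (curly braces marking emphasis) and all function names (parsePlain, parseEmphasis, parseInline, parseInlines, toHtml) are made up for illustration. Every rule function returns a match object on success or null on failure, and the first rule in the list that succeeds wins.

```javascript
// Hypothetical mini-grammar, for illustration only:
//   inlines  := inline*
//   inline   := emphasis | plain      (tried in this order)
//   emphasis := '{' inlines '}'
//   plain    := one or more characters other than '{' and '}'
// Note: no HTML escaping is done, to keep the sketch short.

// Rule: plain text. Returns { html, pos } on success, null on failure.
function parsePlain(src, pos) {
  let end = pos;
  while (end < src.length && src[end] !== "{" && src[end] !== "}") end++;
  return end > pos ? { html: src.slice(pos, end), pos: end } : null;
}

// Rule: emphasis. Recurses into parseInlines for the nested content.
function parseEmphasis(src, pos) {
  if (src[pos] !== "{") return null;
  const body = parseInlines(src, pos + 1);
  if (src[body.pos] !== "}") return null; // unterminated: the rule fails
  return { html: "<em>" + body.html + "</em>", pos: body.pos + 1 };
}

// Rules in order of precedence: the first one that matches wins.
const inlineRules = [parseEmphasis, parsePlain];

function parseInline(src, pos) {
  for (const rule of inlineRules) {
    const match = rule(src, pos);
    if (match !== null) return match;
  }
  return null; // no rule matched at this position
}

// Zero or more inlines; always succeeds, possibly with an empty result.
function parseInlines(src, pos) {
  let html = "";
  for (let m = parseInline(src, pos); m !== null; m = parseInline(src, pos)) {
    html += m.html;
    pos = m.pos;
  }
  return { html: html, pos: pos };
}

// Entry point: the whole input must be consumed, otherwise it is an error.
function toHtml(src) {
  const result = parseInlines(src, 0);
  if (result.pos !== src.length) {
    throw new SyntaxError("Unexpected input at offset " + result.pos);
  }
  return "<p>" + result.html + "</p>";
}

console.log(toHtml("a {b {c}} d")); // <p>a <em>b <em>c</em></em> d</p>
```

A failed rule simply returns null without consuming anything, so the caller can try the next rule from the same position. That only works because the whole input is available as a string.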
Also, the input is assumed to be size limited, i.e. an infinite lookahead grammar is fine with us. (This doesn't work in situations where you need to parse a stream, but that is a problem area that few of us have to deal with in actual reality. Mostly we like limited lookahead grammars becaus