@erezsh
Last active September 1, 2021 18:17
LALR try for vanon (1)
from lark import Lark
grammar = r"""
start: _NL* section+
section: HEADER content*
HEADER: "--- SECTION ---" _NL
content: number_list
       | assign
       | other
number_list: NUMBER+ _NL
assign: CNAME "=" (NUMBER | CNAME) _NL
other: (ANY | CNAME)+ _NL
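// Catch-all for the rest of a line; the low priority makes the lexer try every other terminal first.
ANY.-100: /.+/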
%import common.NEWLINE -> _NL
%import common (NUMBER, CNAME, WS_INLINE)
%ignore WS_INLINE
"""
text = r"""
--- SECTION ---
Structured content I want to capture:
123 123 432 132
564 771 561 153
--- SECTION ---
Unpredictably content
that I do not want to
capture (yet).
--- SECTION ---
content that won't be captured by this parser
--- SECTION ---
Random information
--- SECTION ---
Structured content I want to capture but of a different type
Value1 = 2
String2 = asdasd
NowAFloat = 23123.1
--- SECTION ---
"""
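# parser="lalr" selects Lark's LALR(1) parser, which uses the contextual lexer by default.
p = Lark(grammar, parser="lalr")
# Parse the sample text and print the resulting tree in indented form.
print(p.parse(text).pretty())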