@gideongrinberg
Last active March 28, 2023 21:41
BNF Grammars

I'm using this gist to collect BNF grammars I've written, extracted, or found across the web. All of them represent existing languages. I found them useful; someone else might as well.

Languages:

C

c-grammar.bnf - An incomplete representation of the C language in BNF. It dates from 1988; all the newer versions I could find were EBNF or PEG. Let me know if you have a better one.

Adapted from The C Programming Language, 2nd edition, by Brian W. Kernighan and Dennis M. Ritchie, Prentice Hall, 1988.

Java

java-spec.md - Markdown version of the Java spec, basically just annotated BNF.

Source: Oracle's Java Documentation (I think)

java-grammar.bnf - An actual BNF representation of Java.

I don't remember the source.

Python 3.5

python.g4 - Not a BNF grammar; I'm in the process of converting it. Extracted by the Grammar Zoo.

cpython.gram - The official grammar, which I'm also converting (currently a mixture of PEG, EBNF, and BNF).
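One construct in cpython.gram that plain BNF lacks is the gather form, written `','.NAME+`, which matches one or more elements separated by a delimiter without accepting a trailing delimiter. The following is a minimal sketch of that behaviour, assuming the input is already a list of token strings; the helper names `gather` and `global_stmt` are illustrative and are not part of CPython.

```python
def gather(tokens, pos, is_elem, sep):
    """Match `sep.elem+` starting at `pos`.

    Returns (items, next_pos) on success, or None if no element matches.
    """
    if pos >= len(tokens) or not is_elem(tokens[pos]):
        return None
    items = [tokens[pos]]
    pos += 1
    # Consume a separator only when another element follows it,
    # so a trailing separator is left unconsumed.
    while pos + 1 < len(tokens) and tokens[pos] == sep and is_elem(tokens[pos + 1]):
        items.append(tokens[pos + 1])
        pos += 2
    return items, pos


def global_stmt(tokens):
    """Sketch of: global_stmt: 'global' ','.NAME+

    Assumption: any string accepted by str.isidentifier counts as NAME,
    which is looser than the real tokenizer (it also accepts keywords).
    """
    if not tokens or tokens[0] != "global":
        return None
    return gather(tokens, 1, str.isidentifier, ",")


print(global_stmt(["global", "a", ",", "b"]))
print(global_stmt(["global", "a", ","]))
```

In the second call the trailing comma is not consumed, which is why rules that do allow a trailing comma spell it out separately, as in `','.slice+ [',']`.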

# PEG grammar for Python
file: [statements] ENDMARKER
interactive: statement_newline
eval: expressions NEWLINE* ENDMARKER
func_type: '(' [type_expressions] ')' '->' expression NEWLINE* ENDMARKER
fstring: star_expressions
# type_expressions allow */** but ignore them
type_expressions:
| ','.expression+ ',' '*' expression ',' '**' expression
| ','.expression+ ',' '*' expression
| ','.expression+ ',' '**' expression
| '*' expression ',' '**' expression
| '*' expression
| '**' expression
| ','.expression+
statements: statement+
statement: compound_stmt | simple_stmt
statement_newline:
| compound_stmt NEWLINE
| simple_stmt
| NEWLINE
| ENDMARKER
simple_stmt:
| small_stmt !';' NEWLINE # Not needed, there for speedup
| ';'.small_stmt+ [';'] NEWLINE
# NOTE: assignment MUST precede expression, else parsing a simple assignment
# will throw a SyntaxError.
small_stmt:
| assignment
| star_expressions
| return_stmt
| import_stmt
| raise_stmt
| 'pass'
| del_stmt
| yield_stmt
| assert_stmt
| 'break'
| 'continue'
| global_stmt
| nonlocal_stmt
compound_stmt:
| function_def
| if_stmt
| class_def
| with_stmt
| for_stmt
| try_stmt
| while_stmt
# NOTE: annotated_rhs may start with 'yield'; yield_expr must start with 'yield'
assignment:
| NAME ':' expression ['=' annotated_rhs ]
| ('(' single_target ')'
| single_subscript_attribute_target) ':' expression ['=' annotated_rhs ]
| (star_targets '=' )+ (yield_expr | star_expressions) !'=' [TYPE_COMMENT]
| single_target augassign ~ (yield_expr | star_expressions)
augassign:
| '+='
| '-='
| '*='
| '@='
| '/='
| '%='
| '&='
| '|='
| '^='
| '<<='
| '>>='
| '**='
| '//='
global_stmt: 'global' ','.NAME+
nonlocal_stmt: 'nonlocal' ','.NAME+
yield_stmt: yield_expr
assert_stmt: 'assert' expression [',' expression ]
del_stmt:
| 'del' del_targets &(';' | NEWLINE)
import_stmt: import_name | import_from
import_name: 'import' dotted_as_names
# note below: the ('.' | '...') is necessary because '...' is tokenized as ELLIPSIS
import_from:
| 'from' ('.' | '...')* dotted_name 'import' import_from_targets
| 'from' ('.' | '...')+ 'import' import_from_targets
import_from_targets:
| '(' import_from_as_names [','] ')'
| import_from_as_names !','
| '*'
import_from_as_names:
| ','.import_from_as_name+
import_from_as_name:
| NAME ['as' NAME ]
dotted_as_names:
| ','.dotted_as_name+
dotted_as_name:
| dotted_name ['as' NAME ]
dotted_name:
| dotted_name '.' NAME
| NAME
if_stmt:
| 'if' named_expression ':' block elif_stmt
| 'if' named_expression ':' block [else_block]
elif_stmt:
| 'elif' named_expression ':' block elif_stmt
| 'elif' named_expression ':' block [else_block]
else_block: 'else' ':' block
while_stmt:
| 'while' named_expression ':' block [else_block]
for_stmt:
| 'for' star_targets 'in' ~ star_expressions ':' [TYPE_COMMENT] block [else_block]
| ASYNC 'for' star_targets 'in' ~ star_expressions ':' [TYPE_COMMENT] block [else_block]
with_stmt:
| 'with' '(' ','.with_item+ ','? ')' ':' block
| 'with' ','.with_item+ ':' [TYPE_COMMENT] block
| ASYNC 'with' '(' ','.with_item+ ','? ')' ':' block
| ASYNC 'with' ','.with_item+ ':' [TYPE_COMMENT] block
with_item:
| expression 'as' star_target &(',' | ')' | ':')
| expression
try_stmt:
| 'try' ':' block finally_block
| 'try' ':' block except_block+ [else_block] [finally_block]
except_block:
| 'except' expression ['as' NAME ] ':' block
| 'except' ':' block
finally_block: 'finally' ':' block
return_stmt:
| 'return' [star_expressions]
raise_stmt:
| 'raise' expression ['from' expression ]
| 'raise'
function_def:
| decorators function_def_raw
| function_def_raw
function_def_raw:
| 'def' NAME '(' [params] ')' ['->' expression ] ':' [func_type_comment] block
| ASYNC 'def' NAME '(' [params] ')' ['->' expression ] ':' [func_type_comment] block
func_type_comment:
| NEWLINE TYPE_COMMENT &(NEWLINE INDENT) # Must be followed by indented block
| TYPE_COMMENT
params:
| parameters
parameters:
| slash_no_default param_no_default* param_with_default* [star_etc]
| slash_with_default param_with_default* [star_etc]
| param_no_default+ param_with_default* [star_etc]
| param_with_default+ [star_etc]
| star_etc
# Some duplication here because we can't write (',' | &')'),
# which is because we don't support empty alternatives (yet).
#
slash_no_default:
| param_no_default+ '/' ','
| param_no_default+ '/' &')'
slash_with_default:
| param_no_default* param_with_default+ '/' ','
| param_no_default* param_with_default+ '/' &')'
star_etc:
| '*' param_no_default param_maybe_default* [kwds]
| '*' ',' param_maybe_default+ [kwds]
| kwds
kwds: '**' param_no_default
# One parameter. This *includes* a following comma and type comment.
#
# There are three styles:
# - No default
# - With default
# - Maybe with default
#
# There are two alternative forms of each, to deal with type comments:
# - Ends in a comma followed by an optional type comment
# - No comma, optional type comment, must be followed by close paren
# The latter form is for a final parameter without trailing comma.
#
param_no_default:
| param ',' TYPE_COMMENT?
| param TYPE_COMMENT? &')'
param_with_default:
| param default ',' TYPE_COMMENT?
| param default TYPE_COMMENT? &')'
param_maybe_default:
| param default? ',' TYPE_COMMENT?
| param default? TYPE_COMMENT? &')'
param: NAME annotation?
annotation: ':' expression
default: '=' expression
decorators: ('@' named_expression NEWLINE )+
class_def:
| decorators class_def_raw
| class_def_raw
class_def_raw:
| 'class' NAME ['(' [arguments] ')' ] ':' block
block:
| NEWLINE INDENT statements DEDENT
| simple_stmt
star_expressions:
| star_expression (',' star_expression )+ [',']
| star_expression ','
| star_expression
star_expression:
| '*' bitwise_or
| expression
star_named_expressions: ','.star_named_expression+ [',']
star_named_expression:
| '*' bitwise_or
| named_expression
named_expression:
| NAME ':=' ~ expression
| expression !':='
annotated_rhs: yield_expr | star_expressions
expressions:
| expression (',' expression )+ [',']
| expression ','
| expression
expression:
| disjunction 'if' disjunction 'else' expression
| disjunction
| lambdef
lambdef:
| 'lambda' [lambda_params] ':' expression
lambda_params:
| lambda_parameters
# lambda_parameters etc. duplicates parameters but without annotations
# or type comments, and if there's no comma after a parameter, we expect
# a colon, not a close parenthesis. (For more, see parameters above.)
#
lambda_parameters:
| lambda_slash_no_default lambda_param_no_default* lambda_param_with_default* [lambda_star_etc]
| lambda_slash_with_default lambda_param_with_default* [lambda_star_etc]
| lambda_param_no_default+ lambda_param_with_default* [lambda_star_etc]
| lambda_param_with_default+ [lambda_star_etc]
| lambda_star_etc
lambda_slash_no_default:
| lambda_param_no_default+ '/' ','
| lambda_param_no_default+ '/' &':'
lambda_slash_with_default:
| lambda_param_no_default* lambda_param_with_default+ '/' ','
| lambda_param_no_default* lambda_param_with_default+ '/' &':'
lambda_star_etc:
| '*' lambda_param_no_default lambda_param_maybe_default* [lambda_kwds]
| '*' ',' lambda_param_maybe_default+ [lambda_kwds]
| lambda_kwds
lambda_kwds: '**' lambda_param_no_default
lambda_param_no_default:
| lambda_param ','
| lambda_param &':'
lambda_param_with_default:
| lambda_param default ','
| lambda_param default &':'
lambda_param_maybe_default:
| lambda_param default? ','
| lambda_param default? &':'
lambda_param: NAME
disjunction:
| conjunction ('or' conjunction )+
| conjunction
conjunction:
| inversion ('and' inversion )+
| inversion
inversion:
| 'not' inversion
| comparison
comparison:
| bitwise_or compare_op_bitwise_or_pair+
| bitwise_or
compare_op_bitwise_or_pair:
| eq_bitwise_or
| noteq_bitwise_or
| lte_bitwise_or
| lt_bitwise_or
| gte_bitwise_or
| gt_bitwise_or
| notin_bitwise_or
| in_bitwise_or
| isnot_bitwise_or
| is_bitwise_or
eq_bitwise_or: '==' bitwise_or
noteq_bitwise_or:
| ('!=' ) bitwise_or
lte_bitwise_or: '<=' bitwise_or
lt_bitwise_or: '<' bitwise_or
gte_bitwise_or: '>=' bitwise_or
gt_bitwise_or: '>' bitwise_or
notin_bitwise_or: 'not' 'in' bitwise_or
in_bitwise_or: 'in' bitwise_or
isnot_bitwise_or: 'is' 'not' bitwise_or
is_bitwise_or: 'is' bitwise_or
bitwise_or:
| bitwise_or '|' bitwise_xor
| bitwise_xor
bitwise_xor:
| bitwise_xor '^' bitwise_and
| bitwise_and
bitwise_and:
| bitwise_and '&' shift_expr
| shift_expr
shift_expr:
| shift_expr '<<' sum
| shift_expr '>>' sum
| sum
sum:
| sum '+' term
| sum '-' term
| term
term:
| term '*' factor
| term '/' factor
| term '//' factor
| term '%' factor
| term '@' factor
| factor
factor:
| '+' factor
| '-' factor
| '~' factor
| power
power:
| await_primary '**' factor
| await_primary
await_primary:
| AWAIT primary
| primary
primary:
| invalid_primary # must be before 'primary genexp' because of invalid_genexp
| primary '.' NAME
| primary genexp
| primary '(' [arguments] ')'
| primary '[' slices ']'
| atom
slices:
| slice !','
| ','.slice+ [',']
slice:
| [expression] ':' [expression] [':' [expression] ]
| expression
atom:
| NAME
| 'True'
| 'False'
| 'None'
| '__peg_parser__'
| strings
| NUMBER
| (tuple | group | genexp)
| (list | listcomp)
| (dict | set | dictcomp | setcomp)
| '...'
strings: STRING+
list:
| '[' [star_named_expressions] ']'
listcomp:
| '[' named_expression ~ for_if_clauses ']'
tuple:
| '(' [star_named_expression ',' [star_named_expressions] ] ')'
group:
| '(' (yield_expr | named_expression) ')'
genexp:
| '(' named_expression ~ for_if_clauses ')'
set: '{' star_named_expressions '}'
setcomp:
| '{' named_expression ~ for_if_clauses '}'
dict:
| '{' [double_starred_kvpairs] '}'
dictcomp:
| '{' kvpair for_if_clauses '}'
double_starred_kvpairs: ','.double_starred_kvpair+ [',']
double_starred_kvpair:
| '**' bitwise_or
| kvpair
kvpair: expression ':' expression
for_if_clauses:
| for_if_clause+
for_if_clause:
| ASYNC 'for' star_targets 'in' ~ disjunction ('if' disjunction )*
| 'for' star_targets 'in' ~ disjunction ('if' disjunction )*
yield_expr:
| 'yield' 'from' expression
| 'yield' [star_expressions]
arguments:
| args [','] &')'
args:
| ','.(starred_expression | named_expression !'=')+ [',' kwargs ]
| kwargs
kwargs:
| ','.kwarg_or_starred+ ',' ','.kwarg_or_double_starred+
| ','.kwarg_or_starred+
| ','.kwarg_or_double_starred+
starred_expression:
| '*' expression
kwarg_or_starred:
| NAME '=' expression
| starred_expression
kwarg_or_double_starred:
| NAME '=' expression
| '**' expression
# NOTE: star_targets may contain *bitwise_or, targets may not.
star_targets:
| star_target !','
| star_target (',' star_target )* [',']
star_targets_list_seq: ','.star_target+ [',']
star_targets_tuple_seq:
| star_target (',' star_target )+ [',']
| star_target ','
star_target:
| '*' (!'*' star_target)
| target_with_star_atom
target_with_star_atom:
| t_primary '.' NAME !t_lookahead
| t_primary '[' slices ']' !t_lookahead
| star_atom
star_atom:
| NAME
| '(' target_with_star_atom ')'
| '(' [star_targets_tuple_seq] ')'
| '[' [star_targets_list_seq] ']'
single_target:
| single_subscript_attribute_target
| NAME
| '(' single_target ')'
single_subscript_attribute_target:
| t_primary '.' NAME !t_lookahead
| t_primary '[' slices ']' !t_lookahead
del_targets: ','.del_target+ [',']
del_target:
| t_primary '.' NAME !t_lookahead
| t_primary '[' slices ']' !t_lookahead
| del_t_atom
del_t_atom:
| NAME
| '(' del_target ')'
| '(' [del_targets] ')'
| '[' [del_targets] ']'
targets: ','.target+ [',']
target:
| t_primary '.' NAME !t_lookahead
| t_primary '[' slices ']' !t_lookahead
| t_atom
t_primary:
| t_primary '.' NAME &t_lookahead
| t_primary '[' slices ']' &t_lookahead
| t_primary genexp &t_lookahead
| t_primary '(' [arguments] ')' &t_lookahead
| atom &t_lookahead
t_lookahead: '(' | '[' | '.'
t_atom:
| NAME
| '(' target ')'
| '(' [targets] ')'
| '[' [targets] ']'
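Two of the NOTE comments in the Python grammar above can be observed from Python itself: the alternatives of `assignment` are tried before a statement falls through to `star_expressions`, and a run of dots in `import_from` is counted whether it is tokenized as `'.'` or `'...'`. The sketch below uses the standard `ast` module; it reflects whichever Python version runs it, which may not match the grammar version listed here. The helper names are illustrative.

```python
import ast


def statement_kind(source: str) -> str:
    """Return the AST class name of the first statement parsed from `source`."""
    return type(ast.parse(source).body[0]).__name__


def relative_level(source: str) -> int:
    """Return the number of leading dots recorded for a `from ... import` statement."""
    return ast.parse(source).body[0].level


# One sample per alternative of the `assignment` rule, plus a bare expression.
samples = [
    "x: int = 1",    # NAME ':' expression ['=' annotated_rhs]
    "(x): int = 1",  # '(' single_target ')' ':' expression ['=' annotated_rhs]
    "a = b = 1",     # (star_targets '=')+ (yield_expr | star_expressions)
    "x += 1",        # single_target augassign ~ (yield_expr | star_expressions)
    "x",             # no assignment alternative matches; parsed as an expression
]
for source in samples:
    print(f"{source!r:16} -> {statement_kind(source)}")

# Four dots reach the parser as '...' followed by '.', yet count as four levels.
print(relative_level("from .... import x"))
```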
The syntax of C in Backus-Naur Form
<translation-unit> ::= {<external-declaration>}*
<external-declaration> ::= <function-definition>
| <declaration>
<function-definition> ::= {<declaration-specifier>}* <declarator> {<declaration>}* <compound-statement>
<declaration-specifier> ::= <storage-class-specifier>
| <type-specifier>
| <type-qualifier>
<storage-class-specifier> ::= auto
| register
| static
| extern
| typedef
<type-specifier> ::= void
| char
| short
| int
| long
| float
| double
| signed
| unsigned
| <struct-or-union-specifier>
| <enum-specifier>
| <typedef-name>
<struct-or-union-specifier> ::= <struct-or-union> <identifier> { {<struct-declaration>}+ }
| <struct-or-union> { {<struct-declaration>}+ }
| <struct-or-union> <identifier>
<struct-or-union> ::= struct
| union
<struct-declaration> ::= {<specifier-qualifier>}* <struct-declarator-list>
<specifier-qualifier> ::= <type-specifier>
| <type-qualifier>
<struct-declarator-list> ::= <struct-declarator>
| <struct-declarator-list> , <struct-declarator>
<struct-declarator> ::= <declarator>
| <declarator> : <constant-expression>
| : <constant-expression>
<declarator> ::= {<pointer>}? <direct-declarator>
<pointer> ::= * {<type-qualifier>}* {<pointer>}?
<type-qualifier> ::= const
| volatile
<direct-declarator> ::= <identifier>
| ( <declarator> )
| <direct-declarator> [ {<constant-expression>}? ]
| <direct-declarator> ( <parameter-type-list> )
| <direct-declarator> ( {<identifier>}* )
<constant-expression> ::= <conditional-expression>
<conditional-expression> ::= <logical-or-expression>
| <logical-or-expression> ? <expression> : <conditional-expression>
<logical-or-expression> ::= <logical-and-expression>
| <logical-or-expression> || <logical-and-expression>
<logical-and-expression> ::= <inclusive-or-expression>
| <logical-and-expression> && <inclusive-or-expression>
<inclusive-or-expression> ::= <exclusive-or-expression>
| <inclusive-or-expression> | <exclusive-or-expression>
<exclusive-or-expression> ::= <and-expression>
| <exclusive-or-expression> ^ <and-expression>
<and-expression> ::= <equality-expression>
| <and-expression> & <equality-expression>
<equality-expression> ::= <relational-expression>
| <equality-expression> == <relational-expression>
| <equality-expression> != <relational-expression>
<relational-expression> ::= <shift-expression>
| <relational-expression> < <shift-expression>
| <relational-expression> > <shift-expression>
| <relational-expression> <= <shift-expression>
| <relational-expression> >= <shift-expression>
<shift-expression> ::= <additive-expression>
| <shift-expression> << <additive-expression>
| <shift-expression> >> <additive-expression>
<additive-expression> ::= <multiplicative-expression>
| <additive-expression> + <multiplicative-expression>
| <additive-expression> - <multiplicative-expression>
<multiplicative-expression> ::= <cast-expression>
| <multiplicative-expression> * <cast-expression>
| <multiplicative-expression> / <cast-expression>
| <multiplicative-expression> % <cast-expression>
<cast-expression> ::= <unary-expression>
| ( <type-name> ) <cast-expression>
<unary-expression> ::= <postfix-expression>
| ++ <unary-expression>
| -- <unary-expression>
| <unary-operator> <cast-expression>
| sizeof <unary-expression>
| sizeof <type-name>
<postfix-expression> ::= <primary-expression>
| <postfix-expression> [ <expression> ]
| <postfix-expression> ( {<assignment-expression>}* )
| <postfix-expression> . <identifier>
| <postfix-expression> -> <identifier>
| <postfix-expression> ++
| <postfix-expression> --
<primary-expression> ::= <identifier>
| <constant>
| <string>
| ( <expression> )
<constant> ::= <integer-constant>
| <character-constant>
| <floating-constant>
| <enumeration-constant>
<expression> ::= <assignment-expression>
| <expression> , <assignment-expression>
<assignment-expression> ::= <conditional-expression>
| <unary-expression> <assignment-operator> <assignment-expression>
<assignment-operator> ::= =
| *=
| /=
| %=
| +=
| -=
| <<=
| >>=
| &=
| ^=
| |=
<unary-operator> ::= &
| *
| +
| -
| ~
| !
<type-name> ::= {<specifier-qualifier>}+ {<abstract-declarator>}?
<parameter-type-list> ::= <parameter-list>
| <parameter-list> , ...
<parameter-list> ::= <parameter-declaration>
| <parameter-list> , <parameter-declaration>
<parameter-declaration> ::= {<declaration-specifier>}+ <declarator>
| {<declaration-specifier>}+ <abstract-declarator>
| {<declaration-specifier>}+
<abstract-declarator> ::= <pointer>
| <pointer> <direct-abstract-declarator>
| <direct-abstract-declarator>
<direct-abstract-declarator> ::= ( <abstract-declarator> )
| {<direct-abstract-declarator>}? [ {<constant-expression>}? ]
| {<direct-abstract-declarator>}? ( {<parameter-type-list>}? )
<enum-specifier> ::= enum <identifier> { <enumerator-list> }
| enum { <enumerator-list> }
| enum <identifier>
<enumerator-list> ::= <enumerator>
| <enumerator-list> , <enumerator>
<enumerator> ::= <identifier>
| <identifier> = <constant-expression>
<typedef-name> ::= <identifier>
<declaration> ::= {<declaration-specifier>}+ {<init-declarator>}* ;
<init-declarator> ::= <declarator>
| <declarator> = <initializer>
<initializer> ::= <assignment-expression>
| { <initializer-list> }
| { <initializer-list> , }
<initializer-list> ::= <initializer>
| <initializer-list> , <initializer>
<compound-statement> ::= { {<declaration>}* {<statement>}* }
<statement> ::= <labeled-statement>
| <expression-statement>
| <compound-statement>
| <selection-statement>
| <iteration-statement>
| <jump-statement>
<labeled-statement> ::= <identifier> : <statement>
| case <constant-expression> : <statement>
| default : <statement>
<expression-statement> ::= {<expression>}? ;
<selection-statement> ::= if ( <expression> ) <statement>
| if ( <expression> ) <statement> else <statement>
| switch ( <expression> ) <statement>
<iteration-statement> ::= while ( <expression> ) <statement>
| do <statement> while ( <expression> ) ;
| for ( {<expression>}? ; {<expression>}? ; {<expression>}? ) <statement>
<jump-statement> ::= goto <identifier> ;
| continue ;
| break ;
| return {<expression>}? ;
This grammar was adapted from Section A13 of The C Programming Language, 2nd edition, by Brian W. Kernighan and Dennis M. Ritchie, Prentice Hall, 1988.
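The binary-operator rules in the C grammar are left-recursive (for example, `<additive-expression>` refers to itself on the left), which encodes both precedence and left associativity. A recursive-descent parser cannot call a left-recursive rule directly, so a common workaround is to turn each such rule into a loop. The sketch below does this for only the additive and multiplicative levels, with integer literals and parentheses standing in for `<cast-expression>`; the function names and the tuple-based tree shape are assumptions made for illustration.

```python
import re

# Integers, the five operators handled here, and parentheses.
TOKEN = re.compile(r"\s*(\d+|[-+*/%()])")


def tokenize(text):
    """Split `text` into a list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(text):
    """Parse `text` into nested (operator, left, right) tuples."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def primary():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        token = advance()
        if token == "(":
            node = additive()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    def multiplicative():
        # <multiplicative-expression> with left recursion rewritten as a loop.
        node = primary()
        while peek() in ("*", "/", "%"):
            op = advance()
            node = (op, node, primary())
        return node

    def additive():
        # <additive-expression> with left recursion rewritten as a loop.
        node = multiplicative()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, multiplicative())
        return node

    tree = additive()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


print(parse("1 - 2 - 3"))
print(parse("1 + 2 * 3"))
```

Because each loop folds the result so far into the left operand, `1 - 2 - 3` groups as `(1 - 2) - 3`, matching the left-recursive BNF rule.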

Java BNF

Java Syntax Specification

Programs

<compilation unit> ::= <package declaration>? <import declarations>? <type declarations>?
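The `?` suffix in this specification marks an optional symbol, which strict BNF does not have. One way to convert such a rule to plain BNF is to list every combination of present and absent optional symbols as a separate alternative. The sketch below does that for a right-hand side given as (name, is_optional) pairs; the function name `expand_optionals` and the `<empty>` marker are illustrative choices, not part of the Java specification.

```python
from itertools import product


def expand_optionals(symbols):
    """Expand a right-hand side with optional symbols into plain-BNF alternatives.

    `symbols` is a list of (name, is_optional) pairs. Each returned
    alternative is a list of names with every optional symbol kept or dropped.
    """
    # A required symbol offers one choice; an optional one offers two.
    choices = [([name], []) if optional else ([name],) for name, optional in symbols]
    return [[s for part in combo for s in part] for combo in product(*choices)]


compilation_unit = [
    ("<package declaration>", True),
    ("<import declarations>", True),
    ("<type declarations>", True),
]
for alternative in expand_optionals(compilation_unit):
    print("<compilation unit> ::=", " ".join(alternative) or "<empty>")
```

Three optional symbols yield eight alternatives, so in practice it is often tidier to introduce a helper nonterminal per optional symbol instead of expanding in place.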

Declarations

<package declaration> ::= package <package name> ;

<import declarations> ::= <import declaration> | <import declarations> <import declaration>

<import declaration> ::= <single type import declaration> | <type import on demand declaration>

<single type import declaration> ::= import <type name> ;

<type import on demand declaration> ::= import <package name> . * ;

<type declarations> ::= <type declaration> | <type declarations> <type declaration>

<type declaration> ::= <class declaration> | <interface declaration> | ;

<class declaration> ::= <class modifiers>? class <identifier> <super>? <interfaces>? <class body>

<class modifiers> ::= <class modifier> | <class modifiers> <class modifier>

<class modifier> ::= public | abstract | final

<super> ::= extends <class type>

<interfaces> ::= implements <interface type list>

<interface type list> ::= <interface type> | <interface type list> , <interface type>

<class body> ::= { <class body declarations>? }

<class body declarations> ::= <class body declaration> | <class body declarations> <class body declaration>

<class body declaration> ::= <class member declaration> | <static initializer> | <constructor declaration>

<class member declaration> ::= <field declaration> | <method declaration>

<static initializer> ::= static <block>

<constructor declaration> ::= <constructor modifiers>? <constructor declarator> <throws>? <constructor body>

<constructor modifiers> ::= <constructor modifier> | <constructor modifiers> <constructor modifier>

<constructor modifier> ::= public | protected | private

<constructor declarator> ::= <simple type name> ( <formal parameter list>? )

<formal parameter list> ::= <formal parameter> | <formal parameter list> , <formal parameter>

<formal parameter> ::= <type> <variable declarator id>

<throws> ::= throws <class type list>

<class type list> ::= <class type> | <class type list> , <class type>

<constructor body> ::= { <explicit constructor invocation>? <block statements>? }

<explicit constructor invocation> ::= this ( <argument list>? ) | super ( <argument list>? )

<field declaration> ::= <field modifiers>? <type> <variable declarators> ;

<field modifiers> ::= <field modifier> | <field modifiers> <field modifier>

<field modifier> ::= public | protected | private | static | final | transient | volatile

<variable declarators> ::= <variable declarator> | <variable declarators> , <variable declarator>

<variable declarator> ::= <variable declarator id> | <variable declarator id> = <variable initializer>

<variable declarator id> ::= <identifier> | <variable declarator id> [ ]

<variable initializer> ::= <expression> | <array initializer>

<method declaration> ::= <method header> <method body>

<method header> ::= <method modifiers>? <result type> <method declarator> <throws>?

<result type> ::= <type> | void

<method modifiers> ::= <method modifier> | <method modifiers> <method modifier>

<method modifier> ::= public | protected | private | static | abstract | final | synchronized | native

<method declarator> ::= <identifier> ( <formal parameter list>? )

<method body> ::= <block> | ;

<interface declaration> ::= <interface modifiers>? interface <identifier> <extends interfaces>? <interface body>

<interface modifiers> ::= <interface modifier> | <interface modifiers> <interface modifier>

<interface modifier> ::= public | abstract

<extends interfaces> ::= extends <interface type> | <extends interfaces> , <interface type>

<interface body> ::= { <interface member declarations>? }

<interface member declarations> ::= <interface member declaration> | <interface member declarations> <interface member declaration>

<interface member declaration> ::= <constant declaration> | <abstract method declaration>

<constant declaration> ::= <constant modifiers> <type> <variable declarator>

<constant modifiers> ::= public | static | final

<abstract method declaration> ::= <abstract method modifiers>? <result type> <method declarator> <throws>? ;

<abstract method modifiers> ::= <abstract method modifier> | <abstract method modifiers> <abstract method modifier>

<abstract method modifier> ::= public | abstract

<array initializer> ::= { <variable initializers>? , ? }

<variable initializers> ::= <variable initializer> | <variable initializers> , <variable initializer>

<variable initializer> ::= <expression> | <array initializer>

Types

<type> ::= <primitive type> | <reference type>

<primitive type> ::= <numeric type> | boolean

<numeric type> ::= <integral type> | <floating-point type>

<integral type> ::= byte | short | int | long | char

<floating-point type> ::= float | double

<reference type> ::= <class or interface type> | <array type>

<class or interface type> ::= <class type> | <interface type>

<class type> ::= <type name>

<interface type> ::= <type name>

<array type> ::= <type> [ ]

Blocks and Commands

<block> ::= { <block statements>? }

<block statements> ::= <block statement> | <block statements> <block statement>

<block statement> ::= <local variable declaration statement> | <statement>

<local variable declaration statement> ::= <local variable declaration> ;

<local variable declaration> ::= <type> <variable declarators>

<statement> ::= <statement without trailing substatement> | <labeled statement> | <if then statement> | <if then else statement> | <while statement> | <for statement>

<statement no short if> ::= <statement without trailing substatement> | <labeled statement no short if> | <if then else statement no short if> | <while statement no short if> | <for statement no short if>

<statement without trailing substatement> ::= <block> | <empty statement> | <expression statement> | <switch statement> | <do statement> | <break statement> | <continue statement> | <return statement> | <synchronized statement> | <throws statement> | <try statement>

<empty statement> ::= ;

<labeled statement> ::= <identifier> : <statement>

<labeled statement no short if> ::= <identifier> : <statement no short if>

<expression statement> ::= <statement expression> ;

<statement expression> ::= <assignment> | <preincrement expression> | <postincrement expression> | <predecrement expression> | <postdecrement expression> | <method invocation> | <class instance creation expression>

<if then statement> ::= if ( <expression> ) <statement>

<if then else statement> ::= if ( <expression> ) <statement no short if> else <statement>

<if then else statement no short if> ::= if ( <expression> ) <statement no short if> else <statement no short if>

<switch statement> ::= switch ( <expression> ) <switch block>

<switch block> ::= { <switch block statement groups>? <switch labels>? }

<switch block statement groups> ::= <switch block statement group> | <switch block statement groups> <switch block statement group>

<switch block statement group> ::= <switch labels> <block statements>

<switch labels> ::= <switch label> | <switch labels> <switch label>

<switch label> ::= case <constant expression> : | default :

<while statement> ::= while ( <expression> ) <statement>

<while statement no short if> ::= while ( <expression> ) <statement no short if>

<do statement> ::= do <statement> while ( <expression> ) ;

<for statement> ::= for ( <for init>? ; <expression>? ; <for update>? ) <statement>

<for statement no short if> ::= for ( <for init>? ; <expression>? ; <for update>? ) <statement no short if>

<for init> ::= <statement expression list> | <local variable declaration>

<for update> ::= <statement expression list>

<statement expression list> ::= <statement expression> | <statement expression list> , <statement expression>

<break statement> ::= break <identifier>? ;

<continue statement> ::= continue <identifier>? ;

<return statement> ::= return <expression>? ;

<throws statement> ::= throw <expression> ;

<synchronized statement> ::= synchronized ( <expression> ) <block>

<try statement> ::= try <block> <catches> | try <block> <catches>? <finally>

<catches> ::= <catch clause> | <catches> <catch clause>

<catch clause> ::= catch ( <formal parameter> ) <block>

<finally> ::= finally <block>

Expressions

<constant expression> ::= <expression>

<expression> ::= <assignment expression>

<assignment expression> ::= <conditional expression> | <assignment>

<assignment> ::= <left hand side> <assignment operator> <assignment expression>

<left hand side> ::= <expression name> | <field access> | <array access>

<assignment operator> ::= = | *= | /= | %= | += | -= | <<= | >>= | >>>= | &= | ^= | |=

<conditional expression> ::= <conditional or expression> | <conditional or expression> ? <expression> : <conditional expression>

<conditional or expression> ::= <conditional and expression> | <conditional or expression> || <conditional and expression>

<conditional and expression> ::= <inclusive or expression> | <conditional and expression> && <inclusive or expression>

<inclusive or expression> ::= <exclusive or expression> | <inclusive or expression> | <exclusive or expression>

<exclusive or expression> ::= <and expression> | <exclusive or expression> ^ <and expression>

<and expression> ::= <equality expression> | <and expression> & <equality expression>

<equality expression> ::= <relational expression> | <equality expression> == <relational expression> | <equality expression> != <relational expression>

<relational expression> ::= <shift expression> | <relational expression> < <shift expression> | <relational expression> > <shift expression> | <relational expression> <= <shift expression> | <relational expression> >= <shift expression> | <relational expression> instanceof <reference type>

<shift expression> ::= <additive expression> | <shift expression> << <additive expression> | <shift expression> >> <additive expression> | <shift expression> >>> <additive expression>

<additive expression> ::= <multiplicative expression> | <additive expression> + <multiplicative expression> | <additive expression> - <multiplicative expression>

<multiplicative expression> ::= <unary expression> | <multiplicative expression> * <unary expression> | <multiplicative expression> / <unary expression> | <multiplicative expression> % <unary expression>
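
The left-recursive rules above encode both precedence (each level refers to the next tighter level) and left associativity (the recursion is on the left operand). As a rough illustration only, here is a minimal Python sketch of a recursive-descent parser for the additive, multiplicative, and unary levels, with each left-recursive rule rewritten as a loop. The function names (`tokenize`, `parse_additive`, and so on) are hypothetical, and operands are restricted to integer literals and parenthesized expressions.

```python
import re

# One token per match: an integer, an operator, or a parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*/%()])")


def tokenize(text):
    """Split the input into a flat list of token strings."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_additive(tokens):
    # <additive expression>: left recursion becomes a loop that keeps
    # folding the tree built so far into the left operand.
    node = parse_multiplicative(tokens)
    while tokens and tokens[0] in ("+", "-"):
        operator = tokens.pop(0)
        node = (operator, node, parse_multiplicative(tokens))
    return node


def parse_multiplicative(tokens):
    # <multiplicative expression>: same shape, one precedence level tighter.
    node = parse_unary(tokens)
    while tokens and tokens[0] in ("*", "/", "%"):
        operator = tokens.pop(0)
        node = (operator, node, parse_unary(tokens))
    return node


def parse_unary(tokens):
    # <unary expression>, simplified to sign operators, parentheses, integers.
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token in ("+", "-"):
        return (token, parse_unary(tokens))
    if token == "(":
        node = parse_additive(tokens)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
        return node
    if token.isdigit():
        return int(token)
    raise SyntaxError(f"unexpected token {token!r}")


def parse(text):
    """Parse a whole expression and reject trailing tokens."""
    tokens = tokenize(text)
    tree = parse_additive(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return tree
```

With this sketch, `parse("1 - 2 - 3")` groups to the left and `parse("1 + 2 * 3")` nests the multiplication under the addition, which is the structure the BNF rules prescribe.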

<cast expression> ::= ( <primitive type> ) <unary expression> | ( <reference type> ) <unary expression not plus minus>

<unary expression> ::= <preincrement expression> | <predecrement expression> | + <unary expression> | - <unary expression> | <unary expression not plus minus>

<predecrement expression> ::= -- <unary expression>

<preincrement expression> ::= ++ <unary expression>

<unary expression not plus minus> ::= <postfix expression> | ~ <unary expression> | ! <unary expression> | <cast expression>

<postdecrement expression> ::= <postfix expression> --

<postincrement expression> ::= <postfix expression> ++

<postfix expression> ::= <primary> | <expression name> | <postincrement expression> | <postdecrement expression>

<method invocation> ::= <method name> ( <argument list>? ) | <primary> . <identifier> ( <argument list>? ) | super . <identifier> ( <argument list>? )

<field access> ::= <primary> . <identifier> | super . <identifier>

<primary> ::= <primary no new array> | <array creation expression>

<primary no new array> ::= <literal> | this | ( <expression> ) | <class instance creation expression> | <field access> | <method invocation> | <array access>

<class instance creation expression> ::= new <class type> ( <argument list>? )

<argument list> ::= <expression> | <argument list> , <expression>

<array creation expression> ::= new <primitive type> <dim exprs> <dims>? | new <class or interface type> <dim exprs> <dims>?

<dim exprs> ::= <dim expr> | <dim exprs> <dim expr>

<dim expr> ::= [ <expression> ]

<dims> ::= [ ] | <dims> [ ]

<array access> ::= <expression name> [ <expression> ] | <primary no new array> [ <expression> ]

Tokens

<package name> ::= <identifier> | <package name> . <identifier>

<type name> ::= <identifier> | <package name> . <identifier>

<simple type name> ::= <identifier>

<expression name> ::= <identifier> | <ambiguous name> . <identifier>

<method name> ::= <identifier> | <ambiguous name> . <identifier>

<ambiguous name> ::= <identifier> | <ambiguous name> . <identifier>

<literal> ::= <integer literal> | <floating-point literal> | <boolean literal> | <character literal> | <string literal> | <null literal>

<integer literal> ::= <decimal integer literal> | <hex integer literal> | <octal integer literal>

<decimal integer literal> ::= <decimal numeral> <integer type suffix>?

<hex integer literal> ::= <hex numeral> <integer type suffix>?

<octal integer literal> ::= <octal numeral> <integer type suffix>?

<integer type suffix> ::= l | L

<decimal numeral> ::= 0 | <non zero digit> <digits>?

<digits> ::= <digit> | <digits> <digit>

<digit> ::= 0 | <non zero digit>

<non zero digit> ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

<hex numeral> ::= 0 x <hex digit> | 0 X <hex digit> | <hex numeral> <hex digit>

<hex digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | a | b | c | d | e | f | A | B | C | D | E | F

<octal numeral> ::= 0 <octal digit> | <octal numeral> <octal digit>

<octal digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
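
The integer-literal rules above translate fairly directly into regular expressions. The following Python sketch is one possible transcription, not part of the original specification; the names `DECIMAL`, `HEX`, `OCTAL`, and `classify_integer_literal` are made up for illustration.

```python
import re

# <integer type suffix>? is shared by all three forms.
SUFFIX = r"[lL]?"

# <decimal numeral> ::= 0 | <non zero digit> <digits>?
DECIMAL = re.compile(rf"(?:0|[1-9][0-9]*){SUFFIX}")
# <hex numeral>: 0x or 0X followed by one or more hex digits.
HEX = re.compile(rf"0[xX][0-9a-fA-F]+{SUFFIX}")
# <octal numeral>: 0 followed by one or more octal digits.
OCTAL = re.compile(rf"0[0-7]+{SUFFIX}")


def classify_integer_literal(text):
    """Return 'decimal', 'hex', 'octal', or None if text is not an integer literal."""
    if DECIMAL.fullmatch(text):
        return "decimal"
    if HEX.fullmatch(text):
        return "hex"
    if OCTAL.fullmatch(text):
        return "octal"
    return None
```

Note that under these rules `08` is not an integer literal at all: it is not a valid decimal numeral (leading zero) and `8` is not an octal digit.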

<floating-point literal> ::= <digits> . <digits>? <exponent part>? <float type suffix>? | <digits> <exponent part>? <float type suffix>?

<exponent part> ::= <exponent indicator> <signed integer>

<exponent indicator> ::= e | E

<signed integer> ::= <sign>? <digits>

<sign> ::= + | -

<float type suffix> ::= f | F | d | D
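
As a rough check of the floating-point rules, here is a Python sketch that transcribes them into a regular expression. It reads the `<digits> <exponent part>? <float type suffix>?` form as the second alternative of `<floating-point literal>`; the names `FLOAT_LITERAL` and `is_float_literal` are hypothetical.

```python
import re

# <exponent part>? and <float type suffix>? as optional regex fragments.
EXPONENT = r"(?:[eE][+-]?[0-9]+)?"
SUFFIX = r"[fFdD]?"

FLOAT_LITERAL = re.compile(
    # <digits> . <digits>? <exponent part>? <float type suffix>?
    rf"[0-9]+\.[0-9]*{EXPONENT}{SUFFIX}"
    # <digits> <exponent part>? <float type suffix>?
    rf"|[0-9]+{EXPONENT}{SUFFIX}"
)


def is_float_literal(text):
    """True if text matches the <floating-point literal> rule as written."""
    return FLOAT_LITERAL.fullmatch(text) is not None
```

Two consequences of the rule as written show up here: a form such as `.5` (no leading digits) is not covered, and a bare digit string such as `42` is accepted by the second alternative, so it overlaps with `<integer literal>`.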

<boolean literal> ::= true | false

<character literal> ::= ' <single character> ' | ' <escape sequence> '

<single character> ::= <input character> except ' and \

<string literal> ::= " <string characters>? "

<string characters> ::= <string character> | <string characters> <string character>

<string character> ::= <input character> except " and \ | <escape sequence>

<null literal> ::= null

<keyword> ::= abstract | boolean | break | byte | case | catch | char | class | const | continue | default | do | double | else | extends | final | finally | float | for | goto | if | implements | import | instanceof | int | interface | long | native | new | package | private | protected | public | return | short | static | super | switch | synchronized | this | throw | throws | transient | try | void | volatile | while

The character set for Java is Unicode, a 16-bit character set. This is the set denoted by <input character>. Unicode effectively contains the familiar 7-bit ASCII characters as a subset, and includes "escape code" designations of the form \udddd (where each d is from <hex digit>). In the extended BNF for Java the optional appearance of X is written X?, and the iterative appearance of X is written {X}.
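
To make the `\udddd` form concrete, here is a small Python sketch that replaces each such escape with the character it designates. It is a simplification under stated assumptions: it handles exactly one `u` followed by four hex digits and does not try to model how a real Java compiler treats backslashes that precede the escape. The name `decode_unicode_escapes` is hypothetical.

```python
import re

# A backslash, the letter u, and exactly four characters from <hex digit>.
UNICODE_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def decode_unicode_escapes(source):
    """Replace every \\udddd escape in source with the character it names."""
    return UNICODE_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), source)
```

For example, the six-character sequence `\u0041` decodes to the single character `A`.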

The syntax category <identifier> consists of strings that must start with a letter - including underscore (_) and dollar sign ($) - followed by any number of letters and digits. Characters of numerous international languages are recognized as "letters" in Java. A Java letter is a character for which the method Character.isJavaLetter returns true. A Java letter-or-digit is a character for which the method Character.isJavaLetterOrDigit returns true. Also, <identifier> includes none of the keywords given above - these are reserved words in Java.
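
The identifier rule described above can be sketched in a few lines of Python. This is an approximation, not the Java definition: `str.isalpha()` and `str.isdigit()` stand in for the Java letter and letter-or-digit tests, and the helper names `is_java_letter` and `is_identifier` are invented for this example. The keyword set is copied from the `<keyword>` rule.

```python
# Keywords copied from the <keyword> rule.
JAVA_KEYWORDS = frozenset("""
abstract boolean break byte case catch char class const continue default do
double else extends final finally float for goto if implements import
instanceof int interface long native new package private protected public
return short static super switch synchronized this throw throws transient
try void volatile while
""".split())


def is_java_letter(ch):
    # Assumption: str.isalpha() approximates the Java letter test;
    # underscore and dollar sign count as letters, as the text states.
    return ch in ("_", "$") or ch.isalpha()


def is_identifier(text):
    """True if text starts with a letter, continues with letters or digits,
    and is not one of the reserved keywords."""
    if not text or not is_java_letter(text[0]):
        return False
    if not all(is_java_letter(ch) or ch.isdigit() for ch in text[1:]):
        return False
    return text not in JAVA_KEYWORDS
```

The keyword check comes last so that the function mirrors the prose: the lexical shape is tested first, and the reserved words are then excluded from the result.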

The only BNF extension used here is the optional construct, which is written with '?' added as a suffix to a terminal or non-terminal. Note that '*', '{', and '}' are all terminal symbols. This BNF definition does not address such pragmatic issues as comment conventions and the use of "white space" to delimit tokens. This BNF also does not express numerous "context-sensitive" restrictions on syntax. For instance, type use of identifiers must be consistent with the required declarations, there are size limitations on numerical literals, etc.
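
The size limitation on numerical literals is a good example of a restriction the grammar cannot express. The Python sketch below checks one such case, assuming the input already matches `<decimal integer literal>`: an unsuffixed literal must fit in a 32-bit `int`, and one with an `l` or `L` suffix must fit in a 64-bit `long`. It deliberately ignores the special case in which the literal one past the maximum is allowed as the operand of unary minus. The name `decimal_literal_in_range` is hypothetical.

```python
INT_MAX = 2**31 - 1    # largest value of a Java int
LONG_MAX = 2**63 - 1   # largest value of a Java long


def decimal_literal_in_range(literal):
    """Check the size limit for a literal that already matches
    <decimal integer literal> (digits with an optional l/L suffix)."""
    is_long = literal[-1] in ("l", "L")
    digits = literal[:-1] if is_long else literal
    limit = LONG_MAX if is_long else INT_MAX
    return int(digits) <= limit
```

So `2147483648` passes the grammar but fails this check, while `2147483648L` passes both.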

/*
* The MIT License (MIT)
*
* Copyright (c) 2014 by Bart Kiers
*
* Permission is hereby granted, free of charge, to any person
* obtaining a copy of this software and associated documentation
* files (the "Software"), to deal in the Software without
* restriction, including without limitation the rights to use,
* copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following
* conditions:
*
* The above copyright notice and this permission notice shall be
* included in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
* EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
* OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
* NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
* HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
* WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
* OTHER DEALINGS IN THE SOFTWARE.
*
* Project : python3-parser; an ANTLR4 grammar for Python 3
* https://github.com/bkiers/python3-parser
* Developed by : Bart Kiers, bart@big-o.nl
*/
grammar Python3;
// All comments that start with "///" are copy-pasted from
// The Python Language Reference: https://docs.python.org/3.3/reference/grammar.html
tokens { INDENT, DEDENT }
@lexer::members {
  // A queue where extra tokens are pushed on (see the NEWLINE lexer rule).
  private java.util.Queue<Token> tokens = new java.util.LinkedList<>();
  // The stack that keeps track of the indentation level.
  private java.util.Stack<Integer> indents = new java.util.Stack<>();
  // The number of currently open braces, brackets, and parentheses.
  private int opened = 0;
  @Override
  public void emit(Token t) {
    super.setToken(t);
    tokens.offer(t);
  }
  @Override
  public Token nextToken() {
    // Check if the end-of-file is ahead and there are still some DEDENTS expected.
    if (_input.LA(1) == EOF && !this.indents.isEmpty()) {
      // First emit an extra line break that serves as the end of the statement.
      this.emit(new CommonToken(Python3Parser.NEWLINE, "\n"));
      // Now emit as many DEDENT tokens as needed.
      while (!indents.isEmpty()) {
        this.emit(new CommonToken(Python3Parser.DEDENT, "DEDENT"));
        indents.pop();
      }
    }
    Token next = super.nextToken();
    return tokens.isEmpty() ? next : tokens.poll();
  }
  // Calculates the indentation of the provided spaces, taking the
  // following rules into account:
  //
  // "Tabs are replaced (from left to right) by one to eight spaces
  // such that the total number of characters up to and including
  // the replacement is a multiple of eight [...]"
  //
  // -- https://docs.python.org/3.1/reference/lexical_analysis.html#indentation
  static int getIndentationCount(String spaces) {
    int count = 0;
    for (char ch : spaces.toCharArray()) {
      switch (ch) {
        case '\t':
          // Advance to the next multiple of eight.
          count += 8 - (count % 8);
          break;
        default:
          // A normal space char.
          count++;
      }
    }
    return count;
  }
}
/*
* parser rules
*/
/// single_input: NEWLINE | simple_stmt | compound_stmt NEWLINE
single_input
: NEWLINE
| simple_stmt
| compound_stmt NEWLINE
;
/// file_input: (NEWLINE | stmt)* ENDMARKER
file_input
: ( NEWLINE | stmt )* EOF
;
/// eval_input: testlist NEWLINE* ENDMARKER
eval_input
: testlist NEWLINE* EOF
;
/// decorator: '@' dotted_name [ '(' [arglist] ')' ] NEWLINE
decorator
: '@' dotted_name ( '(' arglist? ')' )? NEWLINE
;
/// decorators: decorator+
decorators
: decorator+
;
/// decorated: decorators (classdef | funcdef)
decorated
: decorators ( classdef | funcdef )
;
/// funcdef: 'def' NAME parameters ['->' test] ':' suite
funcdef
: DEF NAME parameters ( '->' test )? ':' suite
;
/// parameters: '(' [typedargslist] ')'
parameters
: '(' typedargslist? ')'
;
/// typedargslist: (tfpdef ['=' test] (',' tfpdef ['=' test])* [','
/// ['*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef]]
/// | '*' [tfpdef] (',' tfpdef ['=' test])* [',' '**' tfpdef] | '**' tfpdef)
typedargslist
: tfpdef ( '=' test )? ( ',' tfpdef ( '=' test )? )* ( ',' ( '*' tfpdef? ( ',' tfpdef ( '=' test )? )* ( ',' '**' tfpdef )?
| '**' tfpdef
)?
)?
| '*' tfpdef? ( ',' tfpdef ( '=' test )? )* ( ',' '**' tfpdef )?
| '**' tfpdef
;
/// tfpdef: NAME [':' test]
tfpdef
: NAME ( ':' test )?
;
/// varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [','
/// ['*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef]]
/// | '*' [vfpdef] (',' vfpdef ['=' test])* [',' '**' vfpdef] | '**' vfpdef)
varargslist
: vfpdef ( '=' test )? ( ',' vfpdef ( '=' test )? )* ( ',' ( '*' vfpdef? ( ',' vfpdef ( '=' test )? )* ( ',' '**' vfpdef )?
| '**' vfpdef
)?
)?
| '*' vfpdef? ( ',' vfpdef ( '=' test )? )* ( ',' '**' vfpdef )?
| '**' vfpdef
;
/// vfpdef: NAME
vfpdef
: NAME
;
/// stmt: simple_stmt | compound_stmt
stmt
: simple_stmt
| compound_stmt
;
/// simple_stmt: small_stmt (';' small_stmt)* [';'] NEWLINE
simple_stmt
: small_stmt ( ';' small_stmt )* ';'? NEWLINE
;
/// small_stmt: (expr_stmt | del_stmt | pass_stmt | flow_stmt |
/// import_stmt | global_stmt | nonlocal_stmt | assert_stmt)
small_stmt
: expr_stmt
| del_stmt
| pass_stmt
| flow_stmt
| import_stmt
| global_stmt
| nonlocal_stmt
| assert_stmt
;
/// expr_stmt: testlist_star_expr (augassign (yield_expr|testlist) |
/// ('=' (yield_expr|testlist_star_expr))*)
expr_stmt
: testlist_star_expr ( augassign ( yield_expr | testlist)
| ( '=' ( yield_expr| testlist_star_expr ) )*
)
;
/// testlist_star_expr: (test|star_expr) (',' (test|star_expr))* [',']
testlist_star_expr
: ( test | star_expr ) ( ',' ( test | star_expr ) )* ','?
;
/// augassign: ('+=' | '-=' | '*=' | '/=' | '%=' | '&=' | '|=' | '^=' |
/// '<<=' | '>>=' | '**=' | '//=')
augassign
: '+='
| '-='
| '*='
| '@=' // PEP 465
| '/='
| '%='
| '&='
| '|='
| '^='
| '<<='
| '>>='
| '**='
| '//='
;
/// del_stmt: 'del' exprlist
del_stmt
: DEL exprlist
;
/// pass_stmt: 'pass'
pass_stmt
: PASS
;
/// flow_stmt: break_stmt | continue_stmt | return_stmt | raise_stmt | yield_stmt
flow_stmt
: break_stmt
| continue_stmt
| return_stmt
| raise_stmt
| yield_stmt
;
/// break_stmt: 'break'
break_stmt
: BREAK
;
/// continue_stmt: 'continue'
continue_stmt
: CONTINUE
;
/// return_stmt: 'return' [testlist]
return_stmt
: RETURN testlist?
;
/// yield_stmt: yield_expr
yield_stmt
: yield_expr
;
/// raise_stmt: 'raise' [test ['from' test]]
raise_stmt
: RAISE ( test ( FROM test )? )?
;
/// import_stmt: import_name | import_from
import_stmt
: import_name
| import_from
;
/// import_name: 'import' dotted_as_names
import_name
: IMPORT dotted_as_names
;
/// # note below: the ('.' | '...') is necessary because '...' is tokenized as ELLIPSIS
/// import_from: ('from' (('.' | '...')* dotted_name | ('.' | '...')+)
/// 'import' ('*' | '(' import_as_names ')' | import_as_names))
import_from
: FROM ( ( '.' | '...' )* dotted_name
| ('.' | '...')+
)
IMPORT ( '*'
| '(' import_as_names ')'
| import_as_names
)
;
/// import_as_name: NAME ['as' NAME]
import_as_name
: NAME ( AS NAME )?
;
/// dotted_as_name: dotted_name ['as' NAME]
dotted_as_name
: dotted_name ( AS NAME )?
;
/// import_as_names: import_as_name (',' import_as_name)* [',']
import_as_names
: import_as_name ( ',' import_as_name )* ','?
;
/// dotted_as_names: dotted_as_name (',' dotted_as_name)*
dotted_as_names
: dotted_as_name ( ',' dotted_as_name )*
;
/// dotted_name: NAME ('.' NAME)*
dotted_name
: NAME ( '.' NAME )*
;
/// global_stmt: 'global' NAME (',' NAME)*
global_stmt
: GLOBAL NAME ( ',' NAME )*
;
/// nonlocal_stmt: 'nonlocal' NAME (',' NAME)*
nonlocal_stmt
: NONLOCAL NAME ( ',' NAME )*
;
/// assert_stmt: 'assert' test [',' test]
assert_stmt
: ASSERT test ( ',' test )?
;
/// compound_stmt: if_stmt | while_stmt | for_stmt | try_stmt | with_stmt | funcdef | classdef | decorated
compound_stmt
: if_stmt
| while_stmt
| for_stmt
| try_stmt
| with_stmt
| funcdef
| classdef
| decorated
;
/// if_stmt: 'if' test ':' suite ('elif' test ':' suite)* ['else' ':' suite]
if_stmt
: IF test ':' suite ( ELIF test ':' suite )* ( ELSE ':' suite )?
;
/// while_stmt: 'while' test ':' suite ['else' ':' suite]
while_stmt
: WHILE test ':' suite ( ELSE ':' suite )?
;
/// for_stmt: 'for' exprlist 'in' testlist ':' suite ['else' ':' suite]
for_stmt
: FOR exprlist IN testlist ':' suite ( ELSE ':' suite )?
;
/// try_stmt: ('try' ':' suite
/// ((except_clause ':' suite)+
/// ['else' ':' suite]
/// ['finally' ':' suite] |
/// 'finally' ':' suite))
try_stmt
: TRY ':' suite ( ( except_clause ':' suite )+
( ELSE ':' suite )?
( FINALLY ':' suite )?
| FINALLY ':' suite
)
;
/// with_stmt: 'with' with_item (',' with_item)* ':' suite
with_stmt
: WITH with_item ( ',' with_item )* ':' suite
;
/// with_item: test ['as' expr]
with_item
: test ( AS expr )?
;
/// # NB compile.c makes sure that the default except clause is last
/// except_clause: 'except' [test ['as' NAME]]
except_clause
: EXCEPT ( test ( AS NAME )? )?
;
/// suite: simple_stmt | NEWLINE INDENT stmt+ DEDENT
suite
: simple_stmt
| NEWLINE INDENT stmt+ DEDENT
;
/// test: or_test ['if' or_test 'else' test] | lambdef
test
: or_test ( IF or_test ELSE test )?
| lambdef
;
/// test_nocond: or_test | lambdef_nocond
test_nocond
: or_test
| lambdef_nocond
;
/// lambdef: 'lambda' [varargslist] ':' test
lambdef
: LAMBDA varargslist? ':' test
;
/// lambdef_nocond: 'lambda' [varargslist] ':' test_nocond
lambdef_nocond
: LAMBDA varargslist? ':' test_nocond
;
/// or_test: and_test ('or' and_test)*
or_test
: and_test ( OR and_test )*
;
/// and_test: not_test ('and' not_test)*
and_test
: not_test ( AND not_test )*
;
/// not_test: 'not' not_test | comparison
not_test
: NOT not_test
| comparison
;
/// comparison: star_expr (comp_op star_expr)*
comparison
: star_expr ( comp_op star_expr )*
;
/// # <> isn't actually a valid comparison operator in Python. It's here for the
/// # sake of a __future__ import described in PEP 401
/// comp_op: '<'|'>'|'=='|'>='|'<='|'<>'|'!='|'in'|'not' 'in'|'is'|'is' 'not'
comp_op
: '<'
| '>'
| '=='
| '>='
| '<='
| '<>'
| '!='
| IN
| NOT IN
| IS
| IS NOT
;
/// star_expr: ['*'] expr
star_expr
: '*'? expr
;
/// expr: xor_expr ('|' xor_expr)*
expr
: xor_expr ( '|' xor_expr )*
;
/// xor_expr: and_expr ('^' and_expr)*
xor_expr
: and_expr ( '^' and_expr )*
;
/// and_expr: shift_expr ('&' shift_expr)*
and_expr
: shift_expr ( '&' shift_expr )*
;
/// shift_expr: arith_expr (('<<'|'>>') arith_expr)*
shift_expr
: arith_expr ( '<<' arith_expr
| '>>' arith_expr
)*
;
/// arith_expr: term (('+'|'-') term)*
arith_expr
: term ( '+' term
| '-' term
)*
;
/// term: factor (('*'|'/'|'%'|'//') factor)*
term
: factor ( '*' factor
| '/' factor
| '%' factor
| '//' factor
| '@' factor // PEP 465
)*
;
/// factor: ('+'|'-'|'~') factor | power
factor
: '+' factor
| '-' factor
| '~' factor
| power
;
/// power: atom trailer* ['**' factor]
power
: atom trailer* ( '**' factor )?
;
/// atom: ('(' [yield_expr|testlist_comp] ')' |
/// '[' [testlist_comp] ']' |
/// '{' [dictorsetmaker] '}' |
/// NAME | NUMBER | STRING+ | '...' | 'None' | 'True' | 'False')
atom
: '(' ( yield_expr | testlist_comp )? ')'
| '[' testlist_comp? ']'
| '{' dictorsetmaker? '}'
| NAME
| number
| string+
| '...'
| NONE
| TRUE
| FALSE
;
/// testlist_comp: test ( comp_for | (',' test)* [','] )
testlist_comp
: test ( comp_for
| ( ',' test )* ','?
)
;
/// trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' NAME
trailer
: '(' arglist? ')'
| '[' subscriptlist ']'
| '.' NAME
;
/// subscriptlist: subscript (',' subscript)* [',']
subscriptlist
: subscript ( ',' subscript )* ','?
;
/// subscript: test | [test] ':' [test] [sliceop]
subscript
: test
| test? ':' test? sliceop?
;
/// sliceop: ':' [test]
sliceop
: ':' test?
;
/// exprlist: star_expr (',' star_expr)* [',']
exprlist
: star_expr ( ',' star_expr )* ','?
;
/// testlist: test (',' test)* [',']
testlist
: test ( ',' test )* ','?
;
/// dictorsetmaker: ( (test ':' test (comp_for | (',' test ':' test)* [','])) |
/// (test (comp_for | (',' test)* [','])) )
dictorsetmaker
: test ':' test ( comp_for
| ( ',' test ':' test )* ','?
)
| test ( comp_for
| ( ',' test )* ','?
)
;
/// classdef: 'class' NAME ['(' [arglist] ')'] ':' suite
classdef
: CLASS NAME ( '(' arglist? ')' )? ':' suite
;
/// arglist: (argument ',')* (argument [',']
/// |'*' test (',' argument)* [',' '**' test]
/// |'**' test)
arglist
: ( argument ',' )* ( argument ','?
| '*' test ( ',' argument )* ( ',' '**' test )?
| '**' test
)
;
/// # The reason that keywords are test nodes instead of NAME is that using NAME
/// # results in an ambiguity. ast.c makes sure it's a NAME.
/// argument: test [comp_for] | test '=' test # Really [keyword '='] test
argument
: test comp_for?
| test '=' test
;
/// comp_iter: comp_for | comp_if
comp_iter
: comp_for
| comp_if
;
/// comp_for: 'for' exprlist 'in' or_test [comp_iter]
comp_for
: FOR exprlist IN or_test comp_iter?
;
/// comp_if: 'if' test_nocond [comp_iter]
comp_if
: IF test_nocond comp_iter?
;
/// yield_expr: 'yield' [yield_arg]
yield_expr
: YIELD yield_arg?
;
/// yield_arg: 'from' test | testlist
yield_arg
: FROM test
| testlist
;
string
: STRING_LITERAL
| BYTES_LITERAL
;
number
: integer
| FLOAT_NUMBER
| IMAG_NUMBER
;
/// integer ::= decimalinteger | octinteger | hexinteger | bininteger
integer
: DECIMAL_INTEGER
| OCT_INTEGER
| HEX_INTEGER
| BIN_INTEGER
;
/*
* lexer rules
*/
DEF : 'def';
RETURN : 'return';
RAISE : 'raise';
FROM : 'from';
IMPORT : 'import';
AS : 'as';
GLOBAL : 'global';
NONLOCAL : 'nonlocal';
ASSERT : 'assert';
IF : 'if';
ELIF : 'elif';
ELSE : 'else';
WHILE : 'while';
FOR : 'for';
IN : 'in';
TRY : 'try';
FINALLY : 'finally';
WITH : 'with';
EXCEPT : 'except';
LAMBDA : 'lambda';
OR : 'or';
AND : 'and';
NOT : 'not';
IS : 'is';
NONE : 'None';
TRUE : 'True';
FALSE : 'False';
CLASS : 'class';
YIELD : 'yield';
DEL : 'del';
PASS : 'pass';
CONTINUE : 'continue';
BREAK : 'break';
NEWLINE
: ( '\r'? '\n' | '\r' ) SPACES?
{
  String spaces = getText().replaceAll("[\r\n]+", "");
  int next = _input.LA(1);
  if (opened > 0 || next == '\r' || next == '\n' || next == '#') {
    // If we're inside a list or on a blank line, ignore all indents,
    // dedents and line breaks.
    skip();
  }
  else {
    emit(new CommonToken(NEWLINE, "\n"));
    int indent = getIndentationCount(spaces);
    int previous = indents.isEmpty() ? 0 : indents.peek();
    if (indent == previous) {
      // skip indents of the same size as the present indent-size
      skip();
    }
    else if (indent > previous) {
      indents.push(indent);
      emit(new CommonToken(Python3Parser.INDENT, "INDENT"));
    }
    else {
      // Possibly emit more than 1 DEDENT token.
      while(!indents.isEmpty() && indents.peek() > indent) {
        emit(new CommonToken(Python3Parser.DEDENT, "DEDENT"));
        indents.pop();
      }
    }
  }
}
;
/// identifier ::= id_start id_continue*
NAME
: ID_START ID_CONTINUE*
;
/// stringliteral ::= [stringprefix](shortstring | longstring)
/// stringprefix ::= "r" | "R"
STRING_LITERAL
: [uU]? [rR]? ( SHORT_STRING | LONG_STRING )
;
/// bytesliteral ::= bytesprefix(shortbytes | longbytes)
/// bytesprefix ::= "b" | "B" | "br" | "Br" | "bR" | "BR"
BYTES_LITERAL
: [bB] [rR]? ( SHORT_BYTES | LONG_BYTES )
;
/// decimalinteger ::= nonzerodigit digit* | "0"+
DECIMAL_INTEGER
: NON_ZERO_DIGIT DIGIT*
| '0'+
;
/// octinteger ::= "0" ("o" | "O") octdigit+
OCT_INTEGER
: '0' [oO] OCT_DIGIT+
;
/// hexinteger ::= "0" ("x" | "X") hexdigit+
HEX_INTEGER
: '0' [xX] HEX_DIGIT+
;
/// bininteger ::= "0" ("b" | "B") bindigit+
BIN_INTEGER
: '0' [bB] BIN_DIGIT+
;
/// floatnumber ::= pointfloat | exponentfloat
FLOAT_NUMBER
: POINT_FLOAT
| EXPONENT_FLOAT
;
/// imagnumber ::= (floatnumber | intpart) ("j" | "J")
IMAG_NUMBER
: ( FLOAT_NUMBER | INT_PART ) [jJ]
;
DOT : '.';
ELLIPSIS : '...';
STAR : '*';
OPEN_PAREN : '(' {opened++;};
CLOSE_PAREN : ')' {opened--;};
COMMA : ',';
COLON : ':';
SEMI_COLON : ';';
POWER : '**';
ASSIGN : '=';
OPEN_BRACK : '[' {opened++;};
CLOSE_BRACK : ']' {opened--;};
OR_OP : '|';
XOR : '^';
AND_OP : '&';
LEFT_SHIFT : '<<';
RIGHT_SHIFT : '>>';
ADD : '+';
MINUS : '-';
DIV : '/';
MOD : '%';
IDIV : '//';
NOT_OP : '~';
OPEN_BRACE : '{' {opened++;};
CLOSE_BRACE : '}' {opened--;};
LESS_THAN : '<';
GREATER_THAN : '>';
EQUALS : '==';
GT_EQ : '>=';
LT_EQ : '<=';
NOT_EQ_1 : '<>';
NOT_EQ_2 : '!=';
AT : '@';
ARROW : '->';
ADD_ASSIGN : '+=';
SUB_ASSIGN : '-=';
MULT_ASSIGN : '*=';
AT_ASSIGN : '@=';
DIV_ASSIGN : '/=';
MOD_ASSIGN : '%=';
AND_ASSIGN : '&=';
OR_ASSIGN : '|=';
XOR_ASSIGN : '^=';
LEFT_SHIFT_ASSIGN : '<<=';
RIGHT_SHIFT_ASSIGN : '>>=';
POWER_ASSIGN : '**=';
IDIV_ASSIGN : '//=';
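// Named SKIP_ rather than SKIP: newer ANTLR 4 tool versions may reject SKIP
// because it clashes with the Lexer.SKIP constant in the generated code.
SKIP_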
: ( SPACES | COMMENT | LINE_JOINING ) -> skip
;
UNKNOWN_CHAR
: .
;
/*
* fragments
*/
/// shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
/// shortstringitem ::= shortstringchar | stringescapeseq
/// shortstringchar ::= <any source character except "\" or newline or the quote>
fragment SHORT_STRING
: '\'' ( STRING_ESCAPE_SEQ | ~[\\\r\n'] )* '\''
| '"' ( STRING_ESCAPE_SEQ | ~[\\\r\n"] )* '"'
;
/// longstring ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
fragment LONG_STRING
: '\'\'\'' LONG_STRING_ITEM*? '\'\'\''
| '"""' LONG_STRING_ITEM*? '"""'
;
/// longstringitem ::= longstringchar | stringescapeseq
fragment LONG_STRING_ITEM
: LONG_STRING_CHAR
| STRING_ESCAPE_SEQ
;
/// longstringchar ::= <any source character except "\">
fragment LONG_STRING_CHAR
: ~'\\'
;
/// stringescapeseq ::= "\" <any source character>
fragment STRING_ESCAPE_SEQ
: '\\' .
;
/// nonzerodigit ::= "1"..."9"
fragment NON_ZERO_DIGIT
: [1-9]
;
/// digit ::= "0"..."9"
fragment DIGIT
: [0-9]
;
/// octdigit ::= "0"..."7"
fragment OCT_DIGIT
: [0-7]
;
/// hexdigit ::= digit | "a"..."f" | "A"..."F"
fragment HEX_DIGIT
: [0-9a-fA-F]
;
/// bindigit ::= "0" | "1"
fragment BIN_DIGIT
: [01]
;
/// pointfloat ::= [intpart] fraction | intpart "."
fragment POINT_FLOAT
: INT_PART? FRACTION
| INT_PART '.'
;
/// exponentfloat ::= (intpart | pointfloat) exponent
fragment EXPONENT_FLOAT
: ( INT_PART | POINT_FLOAT ) EXPONENT
;
/// intpart ::= digit+
fragment INT_PART
: DIGIT+
;
/// fraction ::= "." digit+
fragment FRACTION
: '.' DIGIT+
;
/// exponent ::= ("e" | "E") ["+" | "-"] digit+
fragment EXPONENT
: [eE] [+-]? DIGIT+
;
/// shortbytes ::= "'" shortbytesitem* "'" | '"' shortbytesitem* '"'
/// shortbytesitem ::= shortbyteschar | bytesescapeseq
fragment SHORT_BYTES
: '\'' ( SHORT_BYTES_CHAR_NO_SINGLE_QUOTE | BYTES_ESCAPE_SEQ )* '\''
| '"' ( SHORT_BYTES_CHAR_NO_DOUBLE_QUOTE | BYTES_ESCAPE_SEQ )* '"'
;
/// longbytes ::= "'''" longbytesitem* "'''" | '"""' longbytesitem* '"""'
fragment LONG_BYTES
: '\'\'\'' LONG_BYTES_ITEM*? '\'\'\''
| '"""' LONG_BYTES_ITEM*? '"""'
;
/// longbytesitem ::= longbyteschar | bytesescapeseq
fragment LONG_BYTES_ITEM
: LONG_BYTES_CHAR
| BYTES_ESCAPE_SEQ
;
/// shortbyteschar ::= <any ASCII character except "\" or newline or the quote>
fragment SHORT_BYTES_CHAR_NO_SINGLE_QUOTE
: [\u0000-\u0009]
| [\u000B-\u000C]
| [\u000E-\u0026]
| [\u0028-\u005B]
| [\u005D-\u007F]
;
fragment SHORT_BYTES_CHAR_NO_DOUBLE_QUOTE
: [\u0000-\u0009]
| [\u000B-\u000C]
| [\u000E-\u0021]
| [\u0023-\u005B]
| [\u005D-\u007F]
;
/// longbyteschar ::= <any ASCII character except "\">
fragment LONG_BYTES_CHAR
: [\u0000-\u005B]
| [\u005D-\u007F]
;
/// bytesescapeseq ::= "\" <any ASCII character>
fragment BYTES_ESCAPE_SEQ
: '\\' [\u0000-\u007F]
;
fragment SPACES
: [ \t]+
;
fragment COMMENT
: '#' ~[\r\n]*
;
fragment LINE_JOINING
: '\\' SPACES? ( '\r'? '\n' | '\r' )
;
/// id_start ::= <all characters in general categories Lu, Ll, Lt, Lm, Lo, Nl, the underscore, and characters with the Other_ID_Start property>
fragment ID_START
: '_'
| [A-Z]
| [a-z]
| '\u00AA'
| '\u00B5'
| '\u00BA'
| [\u00C0-\u00D6]
| [\u00D8-\u00F6]
| [\u00F8-\u01BA]
| '\u01BB'
| [\u01BC-\u01BF]
| [\u01C0-\u01C3]
| [\u01C4-\u0241]
| [\u0250-\u02AF]
| [\u02B0-\u02C1]
| [\u02C6-\u02D1]
| [\u02E0-\u02E4]
| '\u02EE'
| '\u037A'
| '\u0386'
| [\u0388-\u038A]
| '\u038C'
| [\u038E-\u03A1]
| [\u03A3-\u03CE]
| [\u03D0-\u03F5]
| [\u03F7-\u0481]
| [\u048A-\u04CE]
| [\u04D0-\u04F9]
| [\u0500-\u050F]
| [\u0531-\u0556]
| '\u0559'
| [\u0561-\u0587]
| [\u05D0-\u05EA]
| [\u05F0-\u05F2]
| [\u0621-\u063A]
| '\u0640'
| [\u0641-\u064A]
| [\u066E-\u066F]
| [\u0671-\u06D3]
| '\u06D5'
| [\u06E5-\u06E6]
| [\u06EE-\u06EF]
| [\u06FA-\u06FC]
| '\u06FF'
| '\u0710'
| [\u0712-\u072F]
| [\u074D-\u076D]
| [\u0780-\u07A5]
| '\u07B1'
| [\u0904-\u0939]
| '\u093D'
| '\u0950'
| [\u0958-\u0961]
| '\u097D'
| [\u0985-\u098C]
| [\u098F-\u0990]
| [\u0993-\u09A8]
| [\u09AA-\u09B0]
| '\u09B2'
| [\u09B6-\u09B9]
| '\u09BD'
| '\u09CE'
| [\u09DC-\u09DD]
| [\u09DF-\u09E1]
| [\u09F0-\u09F1]
| [\u0A05-\u0A0A]
| [\u0A0F-\u0A10]
| [\u0A13-\u0A28]
| [\u0A2A-\u0A30]
| [\u0A32-\u0A33]
| [\u0A35-\u0A36]
| [\u0A38-\u0A39]
| [\u0A59-\u0A5C]
| '\u0A5E'
| [\u0A72-\u0A74]
| [\u0A85-\u0A8D]
| [\u0A8F-\u0A91]
| [\u0A93-\u0AA8]
| [\u0AAA-\u0AB0]
| [\u0AB2-\u0AB3]
| [\u0AB5-\u0AB9]
| '\u0ABD'
| '\u0AD0'
| [\u0AE0-\u0AE1]
| [\u0B05-\u0B0C]
| [\u0B0F-\u0B10]
| [\u0B13-\u0B28]
| [\u0B2A-\u0B30]
| [\u0B32-\u0B33]
| [\u0B35-\u0B39]
| '\u0B3D'
| [\u0B5C-\u0B5D]
| [\u0B5F-\u0B61]
| '\u0B71'
| '\u0B83'
| [\u0B85-\u0B8A]
| [\u0B8E-\u0B90]
| [\u0B92-\u0B95]
| [\u0B99-\u0B9A]
| '\u0B9C'
| [\u0B9E-\u0B9F]
| [\u0BA3-\u0BA4]
| [\u0BA8-\u0BAA]
| [\u0BAE-\u0BB9]
| [\u0C05-\u0C0C]
| [\u0C0E-\u0C10]
| [\u0C12-\u0C28]
| [\u0C2A-\u0C33]
| [\u0C35-\u0C39]
| [\u0C60-\u0C61]
| [\u0C85-\u0C8C]
| [\u0C8E-\u0C90]
| [\u0C92-\u0CA8]
| [\u0CAA-\u0CB3]
| [\u0CB5-\u0CB9]
| '\u0CBD'
| '\u0CDE'
| [\u0CE0-\u0CE1]
| [\u0D05-\u0D0C]
| [\u0D0E-\u0D10]
| [\u0D12-\u0D28]
| [\u0D2A-\u0D39]
| [\u0D60-\u0D61]
| [\u0D85-\u0D96]
| [\u0D9A-\u0DB1]
| [\u0DB3-\u0DBB]
| '\u0DBD'
| [\u0DC0-\u0DC6]
| [\u0E01-\u0E30]
| [\u0E32-\u0E33]
| [\u0E40-\u0E45]
| '\u0E46'
| [\u0E81-\u0E82]
| '\u0E84'
| [\u0E87-\u0E88]
| '\u0E8A'
| '\u0E8D'
| [\u0E94-\u0E97]
| [\u0E99-\u0E9F]
| [\u0EA1-\u0EA3]
| '\u0EA5'
| '\u0EA7'
| [\u0EAA-\u0EAB]
| [\u0EAD-\u0EB0]
| [\u0EB2-\u0EB3]
| '\u0EBD'
| [\u0EC0-\u0EC4]
| '\u0EC6'
| [\u0EDC-\u0EDD]
| '\u0F00'
| [\u0F40-\u0F47]
| [\u0F49-\u0F6A]
| [\u0F88-\u0F8B]
| [\u1000-\u1021]
| [\u1023-\u1027]
| [\u1029-\u102A]
| [\u1050-\u1055]
| [\u10A0-\u10C5]
| [\u10D0-\u10FA]
| '\u10FC'
| [\u1100-\u1159]
| [\u115F-\u11A2]
| [\u11A8-\u11F9]
| [\u1200-\u1248]
| [\u124A-\u124D]
| [\u1250-\u1256]
| '\u1258'
| [\u125A-\u125D]
| [\u1260-\u1288]
| [\u128A-\u128D]
| [\u1290-\u12B0]
| [\u12B2-\u12B5]
| [\u12B8-\u12BE]
| '\u12C0'
| [\u12C2-\u12C5]
| [\u12C8-\u12D6]
| [\u12D8-\u1310]
| [\u1312-\u1315]
| [\u1318-\u135A]
| [\u1380-\u138F]
| [\u13A0-\u13F4]
| [\u1401-\u166C]
| [\u166F-\u1676]
| [\u1681-\u169A]
| [\u16A0-\u16EA]
| [\u16EE-\u16F0]
| [\u1700-\u170C]
| [\u170E-\u1711]
| [\u1720-\u1731]
| [\u1740-\u1751]
| [\u1760-\u176C]
| [\u176E-\u1770]
| [\u1780-\u17B3]
| '\u17D7'
| '\u17DC'
| [\u1820-\u1842]
| '\u1843'
| [\u1844-\u1877]
| [\u1880-\u18A8]
| [\u1900-\u191C]
| [\u1950-\u196D]
| [\u1970-\u1974]
| [\u1980-\u19A9]
| [\u19C1-\u19C7]
| [\u1A00-\u1A16]
| [\u1D00-\u1D2B]
| [\u1D2C-\u1D61]
| [\u1D62-\u1D77]
| '\u1D78'
| [\u1D79-\u1D9A]
| [\u1D9B-\u1DBF]
| [\u1E00-\u1E9B]
| [\u1EA0-\u1EF9]
| [\u1F00-\u1F15]
| [\u1F18-\u1F1D]
| [\u1F20-\u1F45]
| [\u1F48-\u1F4D]
| [\u1F50-\u1F57]
| '\u1F59'
| '\u1F5B'
| '\u1F5D'
| [\u1F5F-\u1F7D]
| [\u1F80-\u1FB4]
| [\u1FB6-\u1FBC]
| '\u1FBE'
| [\u1FC2-\u1FC4]
| [\u1FC6-\u1FCC]
| [\u1FD0-\u1FD3]
| [\u1FD6-\u1FDB]
| [\u1FE0-\u1FEC]
| [\u1FF2-\u1FF4]
| [\u1FF6-\u1FFC]
| '\u2071'
| '\u207F'
| [\u2090-\u2094]
| '\u2102'
| '\u2107'
| [\u210A-\u2113]
| '\u2115'
| '\u2118'
| [\u2119-\u211D]
| '\u2124'
| '\u2126'
| '\u2128'
| [\u212A-\u212D]
| '\u212E'
| [\u212F-\u2131]
| [\u2133-\u2134]
| [\u2135-\u2138]
| '\u2139'
| [\u213C-\u213F]
| [\u2145-\u2149]
| [\u2160-\u2183]
| [\u2C00-\u2C2E]
| [\u2C30-\u2C5E]
| [\u2C80-\u2CE4]
| [\u2D00-\u2D25]
| [\u2D30-\u2D65]
| '\u2D6F'
| [\u2D80-\u2D96]
| [\u2DA0-\u2DA6]
| [\u2DA8-\u2DAE]
| [\u2DB0-\u2DB6]
| [\u2DB8-\u2DBE]
| [\u2DC0-\u2DC6]
| [\u2DC8-\u2DCE]
| [\u2DD0-\u2DD6]
| [\u2DD8-\u2DDE]
| '\u3005'
| '\u3006'
| '\u3007'
| [\u3021-\u3029]
| [\u3031-\u3035]
| [\u3038-\u303A]
| '\u303B'
| '\u303C'
| [\u3041-\u3096]
| [\u309B-\u309C]
| [\u309D-\u309E]
| '\u309F'
| [\u30A1-\u30FA]
| [\u30FC-\u30FE]
| '\u30FF'
| [\u3105-\u312C]
| [\u3131-\u318E]
| [\u31A0-\u31B7]
| [\u31F0-\u31FF]
| [\u3400-\u4DB5]
| [\u4E00-\u9FBB]
| [\uA000-\uA014]
| '\uA015'
| [\uA016-\uA48C]
| [\uA800-\uA801]
| [\uA803-\uA805]
| [\uA807-\uA80A]
| [\uA80C-\uA822]
| [\uAC00-\uD7A3]
| [\uF900-\uFA2D]
| [\uFA30-\uFA6A]
| [\uFA70-\uFAD9]
| [\uFB00-\uFB06]
| [\uFB13-\uFB17]
| '\uFB1D'
| [\uFB1F-\uFB28]
| [\uFB2A-\uFB36]
| [\uFB38-\uFB3C]
| '\uFB3E'
| [\uFB40-\uFB41]
| [\uFB43-\uFB44]
| [\uFB46-\uFBB1]
| [\uFBD3-\uFD3D]
| [\uFD50-\uFD8F]
| [\uFD92-\uFDC7]
| [\uFDF0-\uFDFB]
| [\uFE70-\uFE74]
| [\uFE76-\uFEFC]
| [\uFF21-\uFF3A]
| [\uFF41-\uFF5A]
| [\uFF66-\uFF6F]
| '\uFF70'
| [\uFF71-\uFF9D]
| [\uFF9E-\uFF9F]
| [\uFFA0-\uFFBE]
| [\uFFC2-\uFFC7]
| [\uFFCA-\uFFCF]
| [\uFFD2-\uFFD7]
| [\uFFDA-\uFFDC]
;
/// id_continue ::= <all characters in id_start, plus characters in the categories Mn, Mc, Nd, Pc and others with the Other_ID_Continue property>
fragment ID_CONTINUE
: ID_START
| [0-9]
| [\u0300-\u036F]
| [\u0483-\u0486]
| [\u0591-\u05B9]
| [\u05BB-\u05BD]
| '\u05BF'
| [\u05C1-\u05C2]
| [\u05C4-\u05C5]
| '\u05C7'
| [\u0610-\u0615]
| [\u064B-\u065E]
| [\u0660-\u0669]
| '\u0670'
| [\u06D6-\u06DC]
| [\u06DF-\u06E4]
| [\u06E7-\u06E8]
| [\u06EA-\u06ED]
| [\u06F0-\u06F9]
| '\u0711'
| [\u0730-\u074A]
| [\u07A6-\u07B0]
| [\u0901-\u0902]
| '\u0903'
| '\u093C'
| [\u093E-\u0940]
| [\u0941-\u0948]
| [\u0949-\u094C]
| '\u094D'
| [\u0951-\u0954]
| [\u0962-\u0963]
| [\u0966-\u096F]
| '\u0981'
| [\u0982-\u0983]
| '\u09BC'
| [\u09BE-\u09C0]
| [\u09C1-\u09C4]
| [\u09C7-\u09C8]
| [\u09CB-\u09CC]
| '\u09CD'
| '\u09D7'
| [\u09E2-\u09E3]
| [\u09E6-\u09EF]
| [\u0A01-\u0A02]
| '\u0A03'
| '\u0A3C'
| [\u0A3E-\u0A40]
| [\u0A41-\u0A42]
| [\u0A47-\u0A48]
| [\u0A4B-\u0A4D]
| [\u0A66-\u0A6F]
| [\u0A70-\u0A71]
| [\u0A81-\u0A82]
| '\u0A83'
| '\u0ABC'
| [\u0ABE-\u0AC0]
| [\u0AC1-\u0AC5]
| [\u0AC7-\u0AC8]
| '\u0AC9'
| [\u0ACB-\u0ACC]
| '\u0ACD'
| [\u0AE2-\u0AE3]
| [\u0AE6-\u0AEF]
| '\u0B01'
| [\u0B02-\u0B03]
| '\u0B3C'
| '\u0B3E'
| '\u0B3F'
| '\u0B40'
| [\u0B41-\u0B43]
| [\u0B47-\u0B48]
| [\u0B4B-\u0B4C]
| '\u0B4D'
| '\u0B56'
| '\u0B57'
| [\u0B66-\u0B6F]
| '\u0B82'
| [\u0BBE-\u0BBF]
| '\u0BC0'
| [\u0BC1-\u0BC2]
| [\u0BC6-\u0BC8]
| [\u0BCA-\u0BCC]
| '\u0BCD'
| '\u0BD7'
| [\u0BE6-\u0BEF]
| [\u0C01-\u0C03]
| [\u0C3E-\u0C40]
| [\u0C41-\u0C44]
| [\u0C46-\u0C48]
| [\u0C4A-\u0C4D]
| [\u0C55-\u0C56]
| [\u0C66-\u0C6F]
| [\u0C82-\u0C83]
| '\u0CBC'
| '\u0CBE'
| '\u0CBF'
| [\u0CC0-\u0CC4]
| '\u0CC6'
| [\u0CC7-\u0CC8]
| [\u0CCA-\u0CCB]
| [\u0CCC-\u0CCD]
| [\u0CD5-\u0CD6]
| [\u0CE6-\u0CEF]
| [\u0D02-\u0D03]
| [\u0D3E-\u0D40]
| [\u0D41-\u0D43]
| [\u0D46-\u0D48]
| [\u0D4A-\u0D4C]
| '\u0D4D'
| '\u0D57'
| [\u0D66-\u0D6F]
| [\u0D82-\u0D83]
| '\u0DCA'
| [\u0DCF-\u0DD1]
| [\u0DD2-\u0DD4]
| '\u0DD6'
| [\u0DD8-\u0DDF]
| [\u0DF2-\u0DF3]
| '\u0E31'
| [\u0E34-\u0E3A]
| [\u0E47-\u0E4E]
| [\u0E50-\u0E59]
| '\u0EB1'
| [\u0EB4-\u0EB9]
| [\u0EBB-\u0EBC]
| [\u0EC8-\u0ECD]
| [\u0ED0-\u0ED9]
| [\u0F18-\u0F19]
| [\u0F20-\u0F29]
| '\u0F35'
| '\u0F37'
| '\u0F39'
| [\u0F3E-\u0F3F]
| [\u0F71-\u0F7E]
| '\u0F7F'
| [\u0F80-\u0F84]
| [\u0F86-\u0F87]
| [\u0F90-\u0F97]
| [\u0F99-\u0FBC]
| '\u0FC6'
| '\u102C'
| [\u102D-\u1030]
| '\u1031'
| '\u1032'
| [\u1036-\u1037]
| '\u1038'
| '\u1039'
| [\u1040-\u1049]
| [\u1056-\u1057]
| [\u1058-\u1059]
| '\u135F'
| [\u1369-\u1371]
| [\u1712-\u1714]
| [\u1732-\u1734]
| [\u1752-\u1753]
| [\u1772-\u1773]
| '\u17B6'
| [\u17B7-\u17BD]
| [\u17BE-\u17C5]
| '\u17C6'
| [\u17C7-\u17C8]
| [\u17C9-\u17D3]
| '\u17DD'
| [\u17E0-\u17E9]
| [\u180B-\u180D]
| [\u1810-\u1819]
| '\u18A9'
| [\u1920-\u1922]
| [\u1923-\u1926]
| [\u1927-\u1928]
| [\u1929-\u192B]
| [\u1930-\u1931]
| '\u1932'
| [\u1933-\u1938]
| [\u1939-\u193B]
| [\u1946-\u194F]
| [\u19B0-\u19C0]
| [\u19C8-\u19C9]
| [\u19D0-\u19D9]
| [\u1A17-\u1A18]
| [\u1A19-\u1A1B]
| [\u1DC0-\u1DC3]
| [\u203F-\u2040]
| '\u2054'
| [\u20D0-\u20DC]
| '\u20E1'
| [\u20E5-\u20EB]
| [\u302A-\u302F]
| [\u3099-\u309A]
| '\uA802'
| '\uA806'
| '\uA80B'
| [\uA823-\uA824]
| [\uA825-\uA826]
| '\uA827'
| '\uFB1E'
| [\uFE00-\uFE0F]
| [\uFE20-\uFE23]
| [\uFE33-\uFE34]
| [\uFE4D-\uFE4F]
| [\uFF10-\uFF19]
| '\uFF3F'
;
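
The range lists in `ID_START` and `ID_CONTINUE` are a frozen snapshot of Unicode general categories. As a rough cross-check, the same classification can be sketched in Python with the standard `unicodedata` module. This is only an approximation under stated assumptions: the helper names `is_id_start` and `is_id_continue` are made up for illustration, the `Other_ID_Start` / `Other_ID_Continue` properties are ignored, and the Unicode version bundled with your interpreter is probably newer than the one these ranges were generated from, so results can differ for recently added characters.

```python
import unicodedata

# Categories the Python reference lists for id_start (plus the underscore).
# Assumption: Other_ID_Start characters are not handled in this sketch.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}

# Extra categories named in the id_continue comment of the grammar.
# Assumption: Other_ID_Continue characters are not handled in this sketch.
CONTINUE_ONLY_CATEGORIES = {"Mn", "Mc", "Nd", "Pc"}


def is_id_start(ch: str) -> bool:
    """Approximate ID_START: a letter-like category or the underscore."""
    return ch == "_" or unicodedata.category(ch) in START_CATEGORIES


def is_id_continue(ch: str) -> bool:
    """Approximate ID_CONTINUE: anything in ID_START, plus marks, digits
    and connector punctuation."""
    return is_id_start(ch) or unicodedata.category(ch) in CONTINUE_ONLY_CATEGORIES


if __name__ == "__main__":
    # A few code points taken from the ranges in the grammar.
    for ch in ["\u0A5E", "0", "\u0300", "\uFF3F", "-"]:
        print(f"U+{ord(ch):04X}", unicodedata.category(ch),
              is_id_start(ch), is_id_continue(ch))
```

For real validation of whole identifiers, the built-in `str.isidentifier()` is the safer choice; the sketch above is only meant to show where the ranges in the two fragments come from.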