@Chubek
Last active March 9, 2024 16:04
EBNF Grammar for C

c.ebnf contains a grammar for C99. Note that this is ANSI C, not ISO C, so there are some omissions. I wrote it because I am currently writing a C compiler in OCaml, with my own backend (and hopefully, frontend), and I needed to collect the grammar in one place.

How to Read EBNF Grammars?

Reading EBNF grammars is pretty simple:

  • Anything enclosed within two ?s is a global capture; it does not mean optional. It means 'I am writing a free-style sentence'.
  • Enclosed within { and } means: repeat zero or more times
  • Enclosed within [ and ] means: this is optional
  • Enclosed within ( and ) means: this is a group
  • | means alternation.

Every left-hand side denotes a rule name. Rules are assigned with ::=.
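To make the notation concrete, here is a minimal Python sketch of how two rules from the grammar below, tyy-enum-field and tyy-enum-lit, might map onto a hand-written recursive-descent recognizer. The `tokenize` helper, the `Parser` class, and `parse_enum_lit` are hypothetical names invented for this illustration, and int-const is simplified to decimal digits only. It is one possible reading of the rules, not the parser used in the compiler mentioned above.

```python
import re

# Hypothetical tokenizer for this sketch: identifiers, decimal integers, and
# the punctuation used by tyy-enum-lit. Leading whitespace is skipped.
TOKEN_RE = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[0-9]+|[{},=])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, literal):
        # A quoted terminal in the grammar becomes an exact token match.
        if self.peek() != literal:
            raise SyntaxError(f"expected {literal!r}, got {self.peek()!r}")
        self.pos += 1

    def identifier(self):
        # identifier ::= ( letter | '_' ) { letter | digit | '_' }
        tok = self.peek()
        if tok is None or not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", tok):
            raise SyntaxError(f"expected identifier, got {tok!r}")
        self.pos += 1
        return tok

    def int_const(self):
        # Simplification for this sketch: decimal digits only.
        tok = self.peek()
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected integer, got {tok!r}")
        self.pos += 1
        return int(tok)

    def enum_field(self):
        # tyy-enum-field ::= identifier [ '=' int-const ]
        name = self.identifier()
        value = None
        if self.peek() == "=":  # [ ... ] becomes an `if`
            self.expect("=")
            value = self.int_const()
        return (name, value)

    def enum_lit(self):
        # tyy-enum-lit ::= '{' tyy-enum-field { ',' tyy-enum-field } '}'
        self.expect("{")
        fields = [self.enum_field()]
        while self.peek() == ",":  # { ... } becomes a `while`
            self.expect(",")
            fields.append(self.enum_field())
        self.expect("}")
        return fields


def parse_enum_lit(text):
    parser = Parser(tokenize(text))
    fields = parser.enum_lit()
    if parser.peek() is not None:
        raise SyntaxError("trailing input after enum literal")
    return fields
```

With these definitions, `parse_enum_lit("{ RED, GREEN = 5, BLUE }")` returns `[("RED", None), ("GREEN", 5), ("BLUE", None)]`, while `parse_enum_lit("{ RED, }")` raises `SyntaxError`, because the rule requires a field after every comma.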

If you want a nice, stylized view of EBNF, you can use my EBNF VimScript: https://gist.github.com/Chubek/37fa48239ac0de78da45475d9533b123

WARNING: This grammar may not be completely correct and compliant. If you see anything missing, please do tell me (chubakbidpaa [at] riseup [dot] net).

Grammar::Syntactic::ANSI-C
+Syntax::TopLevel
top-level ::= { top-level-elem | comment }
top-level-elem ::= decl-stmt | fun-definition
+Syntax::Functions
fun-definition ::= fun-signature compound-stmt
fun-signature ::= [ kw-static ] [ kw-inline ] tyy-decl identifier '(' tyyid-pair-list ')'
+Syntax::Statements
labeled-stmt ::= identifier ':' statement-list
statement-list ::= { statement }
statement ::= decl-stmt
| for-stmt
| break-stmt
| null-stmt
| return-stmt
| compound-stmt
| continue-stmt
| if-stmt
| while-stmt
| do-while-stmt
| labeled-stmt
| goto-stmt
compound-stmt ::= '{' statement-list '}'
goto-stmt ::= kw-goto identifier ';'
null-stmt ::= ';'
return-stmt ::= kw-return expression ';'
continue-stmt ::= kw-continue ';'
break-stmt ::= kw-break ';'
if-stmt ::= kw-if '(' expr-list ')' compound-stmt { kw-else [ kw-if '(' expr-list ')' ] compound-stmt }
while-stmt ::= kw-while '(' expr-list ')' compound-stmt
do-while-stmt ::= kw-do compound-stmt kw-while '(' expr-list ')' ';'
for-stmt ::= kw-for '(' assign-list ';' expr-list ';' expr-list ')' compound-stmt
decl-stmt ::= assign-list ';'
| tyy-defn ';'
| fun-signature ';'
+Syntax::AssignStatement
assign-list ::= assign-stmt { ',' assign-stmt }
assign-stmt ::= dyn-init
| const-init
| comp-init
comp-init ::= [ tyy-decl ] lhs-id [ '=' [ tyy-cast ] comp-literal ]
dyn-init ::= [ tyy-decl ] lhs-id [ '=' [ tyy-cast ] expression ]
const-init ::= [ tyy-decl ] lhs-id [ '=' [ tyy-cast ] const-literal ]
lhs-id ::= [ kw-const ] [ '*' ] ( long-index-opt | index-opt )
+Syntax::Expression
expr-list ::= expression { ',' expression }
expression ::= primary
| unary
| binary
| ternary
| inplace
| synthesized
synthesized ::= '(' assign-list ')'
inplace ::= long-index-opt ( "+=" | "-="
| "*=" | "/="
| "%=" | "<<="
| ">>=" | "&="
| "|=" | "^=" ) expression
ternary ::= binary '?' binary ':' binary
+Syntax::Binary-Expression
binary ::= logor-expr
logor-expr ::= logand-expr "||" logor-expr
logand-expr ::= bitor-expr "&&" logand-expr
bitor-expr ::= bitxor-expr '|' bitor-expr
bitxor-expr ::= bitand-expr '^' bitxor-expr
bitand-expr ::= eqop-expr '&' bitand-expr
eqop-expr ::= relop-expr "==" eqop-expr
| relop-expr "!=" eqop-expr
relop-expr ::= shift-expr '<' relop-expr
| shift-expr '>' relop-expr
| shift-expr "<=" relop-expr
| shift-expr ">=" relop-expr
shift-expr ::= add-expr ">>" shift-expr
| add-expr "<<" shift-expr
add-expr ::= mult-expr '+' add-expr
| mult-expr '-' add-expr
mult-expr ::= unary '*' mult-expr
| unary '/' mult-expr
| unary '%' mult-expr
+Syntax::Unary-Expression
unary ::= unary-plus
| unary-minus
| unary-1scomp
| unary-lnot
| unary-ref
| unary-deref
| unary-preinc
| unary-predec
| unary-postinc
| unary-postdec
| unary-offsetof
| unary-sizeof
unary-offsetof ::= "offsetof" primary
unary-sizeof ::= "sizeof" primary
unary-postdec ::= primary "--"
unary-postinc ::= primary "++"
unary-predec ::= "--" primary
unary-preinc ::= "++" primary
unary-deref ::= '*' primary
unary-ref ::= '&' primary
unary-lnot ::= '!' primary
unary-1scomp ::= '~' primary
unary-minus ::= '-' primary
unary-plus ::= '+' primary
+Syntax::Primary
primary ::= prim-expr
| prim-funcall
| prim-literal
| prim-ident
prim-expr ::= '(' expression ')'
prim-funcall ::= identifier '(' expr-list ')'
prim-literal ::= const-literal
prim-ident ::= long-index-opt
| index-opt
+Syntax::Types
tyyid-pair-list ::= tyyid-pair { ',' tyyid-pair }
tyyid-pair ::= tyy-decl [ kw-const ] [ '*' ] identifier
tyy-lit ::= tyy-enum-lit
| tyy-ext-lit
tyy-enum-lit ::= '{' tyy-enum-field { ',' tyy-enum-field } '}'
tyy-enum-field ::= identifier [ '=' int-const ]
tyy-ext-lit ::= '{' tyy-ext-field { ';' tyy-ext-field } '}'
tyy-ext-field ::= tyy-decl identifier
tyy-decl ::= [ tyy-storage ] [ tyy-qualifier ] [ '*' ] tyy-body
tyy-defn ::= kw-typedef tyy-body tyy-alias
tyy-storage ::= kw-auto
| kw-static
| kw-extern
| kw-register
tyy-qualifier ::= kw-const
| kw-volatile
| kw-restrict
tyy-cast ::= '(' tyy-ref ')'
tyy-ref ::= tyy-body [ '*' ]
tyy-body ::= tyy-base
| tyy-ext
| tyy-lit
tyy-ext ::= tyy-ext-union
| tyy-ext-enum
| tyy-ext-struct
tyy-ext-union ::= kw-union tyy-alias
tyy-ext-enum ::= kw-enum tyy-alias
tyy-ext-struct ::= kw-struct tyy-alias
tyy-alias ::= identifier
tyy-base ::= [ tyy-base-word ] [ tyy-base-sign ] tyy-base-body
tyy-base-word ::= kw-long
| kw-short
tyy-base-sign ::= kw-signed
| kw-unsigned
tyy-base-body ::= kw-char
| kw-int
| kw-float
| kw-double
| kw-void
+Syntax::Idents
long-index-opt ::= long-ident [ index ]
index-opt ::= identifier [ index ]
index ::= '[' [ long-ident | const-literal ] ']'
long-ident ::= identifier { ( dot-ident | arrow-ident ) }
dot-ident ::= identifier { '.' identifier }
arrow-ident ::= identifier { "->" identifier }
+Syntax::CompoundLiteral
comp-literal ::= [ '(' tyy-decl ')' ] list-literal
list-literal ::= '{' designated-init { ',' designated-init } '}'
designated-init ::= [ '.' identifier '=' ] const-literal
Grammar::Lexical::ANSI-C
+Lexical::Keywords
kw-auto ::= "auto"
kw-break ::= "break"
kw-case ::= "case"
kw-char ::= "char"
kw-const ::= "const"
kw-continue ::= "continue"
kw-default ::= "default"
kw-do ::= "do"
kw-double ::= "double"
kw-else ::= "else"
kw-enum ::= "enum"
kw-extern ::= "extern"
kw-float ::= "float"
kw-for ::= "for"
kw-goto ::= "goto"
kw-if ::= "if"
kw-int ::= "int"
kw-long ::= "long"
kw-register ::= "register"
kw-return ::= "return"
kw-short ::= "short"
kw-signed ::= "signed"
kw-sizeof ::= "sizeof"
kw-static ::= "static"
kw-struct ::= "struct"
kw-switch ::= "switch"
kw-typedef ::= "typedef"
kw-union ::= "union"
kw-unsigned ::= "unsigned"
kw-void ::= "void"
kw-volatile ::= "volatile"
kw-while ::= "while"
+Lexical::LiteralTokens
const-literal ::= expr-const | str-const | char-const | num-const | int-const | null-const
null-const ::= "NULL"
expr-const ::= int-const { ( '+' | '-' | '*' | '%' | '/' | '&' | '|' ) int-const }
str-const ::= [ 'L' ] '"' { character } '"'
char-const ::= "'" character "'"
num-const ::= integer | rational
float-const ::= rational
int-const ::= integer | char-const
rational ::= [ integer ] '.' integer
integer ::= dec-integer | hex-integer | oct-integer | bin-integer
bin-integer ::= ( "0b" | "0B" ) bin-digit { bin-digit }
oct-integer ::= ( "0o" | "0O" ) oct-digit { oct-digit }
hex-integer ::= ( "0x" | "0X" ) hex-digit { hex-digit }
dec-integer ::= digit { digit }
identifier ::= ( letter | '_' ) { letter | digit | '_' }
letter ::= upper-case | lower-case
lower-case ::= 'a' | 'b' | 'c' | ... | 'x' | 'y' | 'z'
upper-case ::= 'A' | 'B' | 'C' | ... | 'X' | 'Y' | 'Z'
hex-digit ::= digit | 'a' | 'b' | 'c' | 'd' | 'e' | 'f'
| 'A' | 'B' | 'C' | 'D' | 'E' | 'F'
digit ::= oct-digit | '8' | '9'
oct-digit ::= bin-digit | '2' | '3' | '4' | '5' | '6' | '7'
bin-digit ::= '0' | '1'
character ::= printable | char-escape | hex-escape | oct-escape
printable ::= ' ' | '!' | '"' | ... | '|' | '}' | '~'
char-escape ::= '\' escapable
hex-escape ::= "\x" hex-digit hex-digit
oct-escape ::= '\' oct-digit oct-digit oct-digit
escapable ::= 'n' | 'a' | 'b' | 't' | 'f' | 'r' | 'v' | '\' | "'" | '"' | '?' | '0'
comment ::= "//" ? any-char ? "\n"
| "/*" ? any-char ? "*/"
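
As a rough illustration of the lexical rules, the following Python sketch classifies a lexeme against the four integer rules above (bin-integer, oct-integer, hex-integer, dec-integer) and computes its value. The names `INTEGER_RULES`, `classify_integer`, and `integer_value` are hypothetical and exist only for this example; the regular expressions are a direct transcription of the rules as written, not of the C standard. In particular, this grammar spells octal with a 0o prefix, so a lexeme such as 017 falls under dec-integer here, whereas standard C reads a leading 0 as octal.

```python
import re

# Each entry pairs a rule name from the grammar with a regex transcription.
# Prefixed forms are tried first so that "0x1F" is not cut short as decimal.
INTEGER_RULES = [
    ("bin-integer", re.compile(r"0[bB][01]+")),
    ("oct-integer", re.compile(r"0[oO][0-7]+")),
    ("hex-integer", re.compile(r"0[xX][0-9a-fA-F]+")),
    ("dec-integer", re.compile(r"[0-9]+")),
]

BASES = {"bin-integer": 2, "oct-integer": 8, "hex-integer": 16, "dec-integer": 10}


def classify_integer(lexeme):
    """Return the name of the first rule matching the whole lexeme, or None."""
    for name, pattern in INTEGER_RULES:
        if pattern.fullmatch(lexeme):
            return name
    return None


def integer_value(lexeme):
    """Convert a lexeme to an int according to the rule it matches."""
    name = classify_integer(lexeme)
    if name is None:
        raise ValueError(f"not an integer literal: {lexeme!r}")
    # Prefixed forms carry a two-character prefix that is not part of the digits.
    digits = lexeme if name == "dec-integer" else lexeme[2:]
    return int(digits, BASES[name])
```

For example, `classify_integer("0x1F")` returns `"hex-integer"` and `integer_value("0x1F")` returns `31`, while `classify_integer("0xG")` returns `None`.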