What it looks like in BNF
<myword> ::= "hello" "world" ;
What it looks like in Lua, as data structures
-- either as a flat list of tokens:
production_rules = { "<myword>", "::=", "hello", "world", ";" }
-- or keyed by the non-terminal being defined:
production_rules = { ["<myword>"]={"hello", "world"} }
terminals = { "hello", "world" }
non_terminals = { "<myword>" }
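To make the link between the BNF line and these tables concrete, here is a minimal sketch that builds them from the rule text. The helpers tokenize and parse_rule and the name flat_rule are made up for this illustration, and the sketch assumes a single rule of the shape <name> ::= "word" ... ; with whitespace between all tokens.

```lua
-- Hypothetical helper: split one BNF rule into whitespace-separated tokens.
local function tokenize(rule)
  local tokens = {}
  for token in rule:gmatch("%S+") do
    tokens[#tokens + 1] = token
  end
  return tokens
end

-- Hypothetical helper: turn the flat token list into a name and a body.
-- Assumes the shape: <name> ::= "word" ... ;
local function parse_rule(tokens)
  local name = tokens[1]
  assert(tokens[2] == "::=", "expected ::= after the non-terminal")
  local body = {}
  for i = 3, #tokens do
    if tokens[i] == ";" then break end
    -- strip the surrounding double quotes from each terminal
    body[#body + 1] = (tokens[i]:gsub('^"(.*)"$', "%1"))
  end
  return name, body
end

flat_rule = tokenize('<myword> ::= "hello" "world" ;')
local name, body = parse_rule(flat_rule)
production_rules = { [name] = body }
non_terminals = { name }
terminals = body
```

The body is kept as an array rather than a set of keys, so the order of the words survives.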
ASIDE:
This would actually be a bit easier if BNF were defined in postfix (since Forth is a postfix language):
"world" "hello" ; <myword> ::=
with:
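-- add words to definition
-- (assuming push and pop are the usual table-backed stack helpers)
local push, pop = table.insert, table.remove
BNF[";"] = function (stack)
  local non_terminal = pop(stack)
  BNF[non_terminal] = {}
  -- move everything left on the stack into the definition
  while #stack > 0 do push(BNF[non_terminal], pop(stack)) end
end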
In this case the termination symbol ; wouldn't be as necessary. Since ::= would pop the string at the top of the stack (myword), we could choose either to treat whatever remains on the stack as the definition (of myword), or to specify the number of words within the definition, for example:
"world" "hello" 2 <myword> ::=
"everything" "is" "how" "world" "hello" 5 <anotherword> ::=
The reason stack-based languages don't define their words like this (in postfix) is that the interpreter would execute the words that are part of a definition as soon as it reads them, for example:
: add5andthenprint ( num -- ) 5 + . ; (1)
5 + . 3 add5andthenprint ::= (2)
5 + . ; add5andthenprint ::= (3)
(1) a word defined as it is usually defined in Forth; the : tells the interpreter to stop executing and enter compilation mode
(2) a word defined with our weird postfix notation
(3) another word defined with our weird postfix notation; here let's assume that ; pushes the number 3 onto the stack
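To see the problem in action, here is a rough sketch of a Forth-like interpreter loop in Lua. Everything in it (new_interpreter, run, the compiling field) is invented for this illustration and is not how any real Forth is implemented. It only knows numbers, +, ., and the : ... ; pair, and it does not handle stack comments like ( num -- ).

```lua
-- Minimal sketch of a Forth-like interpreter (all names hypothetical).
local function new_interpreter()
  local self = { stack = {}, words = {}, output = {}, compiling = nil }
  self.words["+"] = function(s)
    local b = table.remove(s.stack)
    local a = table.remove(s.stack)
    assert(a and b, "stack underflow in +")
    table.insert(s.stack, a + b)
  end
  self.words["."] = function(s)
    local top = table.remove(s.stack)
    assert(top, "stack underflow in .")
    table.insert(s.output, top)  -- collect "printed" values instead of printing
  end
  return self
end

local function run(interp, line)
  local tokens = {}
  for t in line:gmatch("%S+") do tokens[#tokens + 1] = t end
  local i = 1
  while i <= #tokens do
    local token = tokens[i]
    if interp.compiling then
      if token == ";" then
        -- end of definition: store the collected body as a new word
        local body = interp.compiling.body
        interp.words[interp.compiling.name] = function(s)
          run(s, table.concat(body, " "))
        end
        interp.compiling = nil
      else
        table.insert(interp.compiling.body, token)  -- collect, do not execute
      end
    elseif token == ":" then
      i = i + 1  -- the next token is the name of the new word
      interp.compiling = { name = tokens[i], body = {} }
    elseif tonumber(token) then
      table.insert(interp.stack, tonumber(token))
    else
      local word = interp.words[token]
      assert(word, "unknown word: " .. token)
      word(interp)  -- executed immediately
    end
    i = i + 1
  end
end

forth = new_interpreter()
run(forth, ": add5andthenprint 5 + . ;")  -- compiled, nothing runs yet
run(forth, "3 add5andthenprint")          -- now 5 + . runs against the 3

-- The postfix form has no leading : so + runs right away and underflows.
postfix_ok, postfix_err = pcall(run, new_interpreter(), "5 + . 3 add5andthenprint ::=")
```

In the prefix form the leading : is what switches the loop into collecting mode before any word of the body is seen. The postfix form gives the interpreter no such warning, so it fails on + long before it reaches ::=.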
Unfortunately, since formal grammars are defined in terms of sets (𝐍, 𝚺, 𝐏, 𝐒), this code falls a bit short. Elements of a set are unordered and unique, and while that may be fine for 𝐍 and 𝚺 (as they are defined), it becomes problematic for 𝐏. 𝐏 is defined as the set of production rules, but what a production rule actually is tends to be defined rather informally, as simple string rewriting.
consider:
<generic> ::= hello world bye world ;
(or, if you prefer, written as S → hello world bye world), we see that world is used twice, and the order matters here.
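A small sketch of the difference, assuming we model a set as the keys of a Lua table and a rule body as a Lua array (as_set and the generic_* names are made up for this example):

```lua
-- Right-hand side of: <generic> ::= hello world bye world ;
local rhs = { "hello", "world", "bye", "world" }

-- Treating the right-hand side as a set (keys of a table) drops the
-- repeated word and forgets the order.
local function as_set(words)
  local set, size = {}, 0
  for _, word in ipairs(words) do
    if not set[word] then
      set[word] = true
      size = size + 1
    end
  end
  return set, size
end

generic_set, generic_set_size = as_set(rhs)   -- only 3 distinct words remain
production_rules = { ["<generic>"] = rhs }    -- all 4 words, order kept
generic_sentence = table.concat(production_rules["<generic>"], " ")
```

One way to reconcile this with the set-based definition is to keep 𝐏 itself a set of rules while letting each rule's right-hand side be an ordered sequence of symbols, which is what the array stands in for here.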