A very simple example showing how to use Racket's lexing and parsing utilities
#lang racket
(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre)
         parser-tools/yacc)
(provide (all-defined-out))

;; Tokens that carry a value (a) and tokens that are only markers (b).
(define-tokens a (NUM VAR))
(define-empty-tokens b (+ - EOF LET IN))

;; An optionally signed decimal number built from the given digit class.
(define-lex-trans number
  (syntax-rules ()
    ((_ digit)
     (re-: (re-? (re-or "-" "+")) (uinteger digit)
           (re-? (re-: "." (re-? (uinteger digit))))))))

;; One or more digits.
(define-lex-trans uinteger
  (syntax-rules ()
    ((_ digit) (re-+ digit))))

(define-lex-abbrevs
  (digit10 (char-range "0" "9"))
  (number10 (number digit10))
  (identifier-characters (re-or (char-range "A" "z")
                                "?" "!" ":" "$" "%" "^" "&"))
  (identifier (re-+ identifier-characters)))

(define simple-math-lexer
  (lexer
   ("-" (token--))
   ("+" (token-+))
   ("let" (token-LET))
   ("in" (token-IN))
   ((re-+ number10) (token-NUM (string->number lexeme)))
   (identifier (token-VAR lexeme))
   ;; recursively calls the lexer which effectively skips whitespace
   (whitespace (simple-math-lexer input-port))
   ((eof) (token-EOF))))

;; AST node types produced by the parser.
(define-struct let-exp (var num exp))
(define-struct arith-exp (op e1 e2))
(define-struct num-exp (n))
(define-struct var-exp (i))

(define simple-math-parser
  (parser
   (start exp)
   (end EOF)
   (error void)
   (tokens a b)
   (precs (left - +))
   (grammar
    (exp ((LET VAR NUM IN exp)
          (make-let-exp $2 (num-exp $3) $5))
         ((NUM) (num-exp $1))
         ((VAR) (var-exp $1))
         ;; + and - in the actions are the Racket procedures, stored as the op
         ((exp + exp) (make-arith-exp + $1 $3))
         ((exp - exp) (make-arith-exp - $1 $3))))))

;; Evaluate an AST; let is handled by substituting the bound number.
(define (eval parsed-exp)
  (match parsed-exp
    ((let-exp var num exp)
     (eval (subst var num exp)))
    ((arith-exp op e1 e2)
     (op (eval e1)
         (eval e2)))
    ((num-exp n) n)
    ((var-exp i) (error 'eval "undefined identifier ~a" i))))

;; Replace free occurrences of var in exp with num.
(define (subst var num exp)
  (match exp
    ((let-exp var2 num2 exp2)
     ;; variable names are strings, so compare them with equal?
     (if (equal? var var2)
         exp
         (let-exp var2 num2
                  (subst var num exp2))))
    ((arith-exp op e1 e2)
     (arith-exp op
                (subst var num e1)
                (subst var num e2)))
    ((var-exp id)
     (if (equal? id var)
         num
         exp))
    ((num-exp n) exp)))

;; Wrap a lexer and a port as the zero-argument thunk the parser expects.
(define (lex-this lexer input) (lambda () (lexer input)))

(let ((input (open-input-string "3 - 3.3 + 6")))
  (eval (simple-math-parser (lex-this simple-math-lexer input))))

(let ((input (open-input-string "let foo 6 in 3 - 3.3 + foo")))
  (eval (simple-math-parser (lex-this simple-math-lexer input))))
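
A smaller sketch of the same lexer-plus-parser pattern may make the effect of `(precs (left - +))` easier to see. Two things here are assumptions of this sketch rather than part of the gist: the grammar actions build plain lists instead of structs, and the `error` clause raises an error instead of using `void`, which silently ignores the callback. The names `tiny-lexer`, `tiny-parser`, `parse-string`, `value-tokens`, and `op-tokens` are hypothetical.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre)
         parser-tools/yacc)

;; Hypothetical token groups for this sketch.
(define-tokens value-tokens (NUM))
(define-empty-tokens op-tokens (+ - EOF))

(define tiny-lexer
  (lexer
   ("-" (token--))
   ("+" (token-+))
   ((re-+ (char-range "0" "9")) (token-NUM (string->number lexeme)))
   ;; skip whitespace by calling the lexer again on the same port
   (whitespace (tiny-lexer input-port))
   ((eof) (token-EOF))))

(define tiny-parser
  (parser
   (start exp)
   (end EOF)
   ;; without src-pos, the error procedure receives three arguments
   (error (lambda (tok-ok? tok-name tok-value)
            (raise-user-error 'tiny-parser "unexpected token ~a" tok-name)))
   (tokens value-tokens op-tokens)
   ;; + and - share one precedence level and associate to the left
   (precs (left - +))
   (grammar
    (exp ((NUM) $1)
         ((exp + exp) (list '+ $1 $3))
         ((exp - exp) (list '- $1 $3))))))

;; The parser wants a thunk that returns the next token on each call.
(define (parse-string str)
  (define in (open-input-string str))
  (tiny-parser (lambda () (tiny-lexer in))))

(parse-string "1 - 2 + 3") ; → '(+ (- 1 2) 3)
```

Because both operators are declared left-associative at the same level, `1 - 2 + 3` groups as `(1 - 2) + 3`, which is the grouping the gist's `eval` relies on.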
legmar commented May 15, 2012

Awesome example! I have extended this grammar quite a bit in my own project. One question though... any pointers on how to include a new token such as WOR that could recognize strings for names of variables? (i.e. just like NUM matches numbers, I'm trying to find a way to get WOR to match words such as "x" or "y".)

So far, I tried creating a new token WOR and a char-range from "a" to "z", but I couldn't quite get it to work. I keep getting an unbound identifier error when I try ((re-+ word10) (token-WOR (string->word lexeme))). (Note: I am using word10 just to stay consistent with the format used for decimal numbers.)

Anyway, hope that makes some sense as to what I'm attempting to accomplish.

Thanks so much for any advice! :D

legmar commented May 15, 2012

An hour later, I may have made some progress...

Something like the following seems to fix at least some of my problems.

((re-+ word10) (token-WOR (string->symbol lexeme)))

string->symbol works but string->word does not. Guess "symbol" is some keyword about which I am uneducated. O_o
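
For reference, a minimal sketch of a lexer with a WOR token along these lines. `string->symbol` is a built-in Racket procedure that turns a string into a symbol, while `string->word` is not defined anywhere, which would explain the unbound identifier error. `WOR` and `word10` follow the comment above; `word-lexer`, `all-tokens`, `value-tokens`, and `marker-tokens` are hypothetical names, and the lowercase-only definition of `word10` is an assumption.

```racket
#lang racket
(require parser-tools/lex
         (prefix-in re- parser-tools/lex-sre))

;; Hypothetical token groups for this sketch.
(define-tokens value-tokens (NUM WOR))
(define-empty-tokens marker-tokens (EOF))

;; Assumption: a word is one or more lowercase ASCII letters.
(define-lex-abbrevs
  (word10 (re-+ (char-range "a" "z"))))

(define word-lexer
  (lexer
   ((re-+ (char-range "0" "9")) (token-NUM (string->number lexeme)))
   ;; string->symbol exists in Racket; string->word does not
   (word10 (token-WOR (string->symbol lexeme)))
   ;; skip whitespace by calling the lexer again on the same port
   (whitespace (word-lexer input-port))
   ((eof) (token-EOF))))

;; Collect (name value) pairs for every token before EOF.
(define (all-tokens in)
  (let loop ((acc '()))
    (define tok (word-lexer in))
    (if (eq? (token-name tok) 'EOF)
        (reverse acc)
        (loop (cons (list (token-name tok) (token-value tok)) acc)))))

(all-tokens (open-input-string "x 42 y")) ; → '((WOR x) (NUM 42) (WOR y))
```

Passing `lexeme` through unchanged, as the gist does for `VAR`, would also work if the variable names should stay strings.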

Owner
danking commented May 15, 2012
legmar commented May 15, 2012

Thanks! That helps a lot! That was a very clear, helpful, concise, and awesome explanation! I'm travelling today, but I will post my project soon just for reference.

@wedesoft

Hi! Thanks for your work. Very concise example :)
