Created
June 25, 2011 10:21
Reds lexer (drives separator + grammar)
REBOL [
    Title: "Red/System lexical analysis"
    Date: 1-Jul-2011
    Name: "Reds lexer"
    Type: none
    Version: 1.0.0
    File: %/G/Projects/Common/RED/red-system/sources/reds-lexer/reds-lexer.r
    Home: http://users.telenet.be/rwmeijer
    Author: "Rudolf W. Meijer"
    Rights: "Copyright (C) 2011 Rudolf W. Meijer. All Rights Reserved"
    History: [
        0.0.0 [19-Jun-2011 {Start of project} "RM"]
        0.5.0 [24-Jun-2011 {First working version} "RM"]
        0.7.0 [27-Jun-2011 {Added file! and tuple! literals} "RM"]
        0.8.0 [27-Jun-2011 {Simplified the separator} "RM"]
        0.9.0 [29-Jun-2011 {
            Separator reduced to stripping comments only,
            Grammar takes care of whitespace
        } "RM" ]
        1.0.0 [1-Jul-2011 {Grammar takes care of comments also} "RM"]
    ]
]
;---|----1----|----2----|----3----|----4----|----5----|----6----|----7----|-
do %reds-lex-grammar.r
reds-lexer: func [
    inp [file! url! binary! string!]
][
    unless string? inp [
        unless binary? inp [inp: read/binary inp]
        inp: to-string inp
    ]
    either empty? inp
    [
        print "nothing to analyse: empty input"
    ][
        print "start"
        ; diagnostic: print evaluates the block, so dt times the parse run
        print ["parse" dt [
            parse/all inp lex-grammar/program
        ]]
        ; parsed source is in lex-grammar/source
    ]
]
test-text:
    %../../tests/source/units/exit-test.reds
print "call"
reds-lexer
    ;copy
    test-text
; diagnostic
print mold/all head lex-grammar/source
ask ""
halt
I do have a Red/System syntax grammar ready, as a Word file (BNF productions - albeit with ambiguity - and semantic comments). I will send it to your nr@red-lang.org address shortly.
This grammar does not consider the shift operators that were added just today.
A few comments from my first review session:

- It is sometimes up to 50x slower than LOAD (using %tests/source/units/auto-tests/byte-auto-test.reds, for example, I get 11ms with LOAD and 504ms with reds-lexer). It seems that a lot of time is spent in decode-string; could it be rewritten using parse rules instead? This is not a big issue at this point, it just shows how slow REBOL is...
- Using %tests/source/units/auto-tests/integer-auto-test.reds as input, I get the following error:
- The previous point makes me realize that reds-lexer is currently lacking error catching support, including proper reporting of the input position where the scanning failed (required to be able to generate accurate syntax error messages in Red/System). This point is really important for using it as a LOAD replacement.
- It will require significant work (maybe a day or two) to integrate reds-lexer into the compiler: mainly re-wiring properly, removing or replacing deferred syntax checking in the compiler, and extensive testing/fixing for regressions.
- Related to this work, I would be very interested in a formal Red/System syntax grammar specification (ideally using BNF format). Let me know if you are interested.

I am still interested in replacing LOAD with reds-lexer, but I guess it is not doable before going beta (announcement planned for tomorrow). I guess that reds-lexer integration could probably be achieved later this week or next week.

Thank you for your nice work!