REBOL [
    Title: "Red/System lexical analysis"
    Date: 1-Jul-2011
    Name: "Reds lexer"
    Type: none
    Version: 1.0.0
    File: %/G/Projects/Common/RED/red-system/sources/reds-lexer/reds-lexer.r
    Home: http://users.telenet.be/rwmeijer
    Author: "Rudolf W. Meijer"
    Rights: "Copyright (C) 2011 Rudolf W. Meijer. All Rights Reserved"
    History: [
        0.0.0 [19-Jun-2011 {Start of project} "RM"]
        0.5.0 [24-Jun-2011 {First working version} "RM"]
        0.7.0 [27-Jun-2011 {Added file! and tuple! literals} "RM"]
        0.8.0 [27-Jun-2011 {Simplified the separator} "RM"]
        0.9.0 [29-Jun-2011 {
            Separator reduced to stripping comments only,
            Grammar takes care of whitespace
        } "RM" ]
        1.0.0 [1-Jul-2011 {Grammar takes care of comments also} "RM"]
    ]
]
;---|----1----|----2----|----3----|----4----|----5----|----6----|----7----|-
do %reds-lex-grammar.r
reds-lexer: func [
    inp [file! url! binary! string!]
][
    unless string? inp [
        unless binary? inp [inp: read/binary inp]
        inp: to-string inp
    ]
    either empty? inp
    [
        print "nothing to analyse: empty input"
    ][
        print "start"
        ; diagnostic
        print ["parse" dt [
            parse/all inp lex-grammar/program
        ]]
        ; parsed source is in lex-grammar/source
    ]
]
test-text:
    %../../tests/source/units/exit-test.reds
print "call"
reds-lexer
    ;copy
    test-text
; diagnostic
print mold/all head lex-grammar/source
ask ""
halt
I realize it is not geared to #include and #define, so that part will have to be (re-)done anyway.
A few comments from my first review session:
- It is sometimes up to 50x slower than `LOAD`: using `%tests/source/units/auto-tests/byte-auto-test.reds`, for example, I get 11ms with `LOAD` and 504ms with `reds-lexer`. It seems that a lot of time is spent in `decode-string`; could it be rewritten using `parse` rules instead? This is not a big issue at this point, it just shows how slow REBOL is...
- Using `%tests/source/units/auto-tests/integer-auto-test.reds` as input, I get the following error: `** Math Error: Math or number overflow ** Where: store-integer ** Near: pow: pow * 10`
- The previous point makes me realize that `reds-lexer` is currently lacking error-catching support, including proper reporting of the input position where the scanning failed (required to be able to generate accurate syntax error messages in Red/System). This point is really important for using it as a LOAD replacement.
- It will require significant work (maybe a day or two) to integrate `reds-lexer` into the compiler: mainly re-wiring properly, removing or replacing deferred syntax checking in the compiler, and extensive testing/fixing for regressions.
- Related to this work, I would be very interested in a formal Red/System syntax grammar specification (ideally using BNF format). Let me know if you are interested.
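On the error-catching point in the list above, here is a minimal sketch of one way a `parse`-based scanner could report where it stopped. It assumes REBOL 2 (`parse/all`, `disarm`, 32-bit integers). The names `toy-program`, `line-col-of`, `to-int-checked` and `scan-demo` are hypothetical and are not part of `reds-lexer`; the toy grammar only stands in for `lex-grammar/program`.

```rebol
REBOL [Title: "Sketch: reporting the input position of a scan failure"]

; Hypothetical helper: convert a series position into [line column]
; by counting the newlines that precede it.
line-col-of: func [pos [string!] /local line start p][
    line: 1
    start: p: head pos
    while [p: find p newline][
        if (index? p) >= (index? pos) [break]
        line: line + 1
        start: p: next p
    ]
    reduce [line (index? pos) - (index? start) + 1]
]

; Accumulate decimal digits one at a time. In REBOL 2 the multiplication
; raises a Math Error once the value no longer fits in 32 bits.
to-int-checked: func [txt [string!] /local n][
    n: 0
    foreach ch txt [n: n * 10 + (to-integer ch) - 48]
    n
]

; Toy grammar (an assumption, standing in for lex-grammar/program):
; whitespace-separated decimal integers. `last-pos` is updated before
; every token, so it marks the token being scanned when things go wrong.
digit: charset "0123456789"
ws: charset " ^-^/"
tokens: copy []
last-pos: none

toy-program: [
    any [
        last-pos:
        [some ws | copy txt some digit (append tokens to-int-checked txt)]
    ]
]

scan-demo: func [inp [string!] /local ok err][
    clear tokens
    last-pos: inp
    ; Errors raised inside parse actions (such as the overflow) are
    ; caught here instead of aborting the whole script.
    if error? err: try [ok: parse/all inp toy-program][
        err: disarm err
        return reduce ['error err/id line-col-of last-pos]
    ]
    either ok [
        reduce ['ok copy tokens]
    ][
        reduce ['error 'syntax line-col-of last-pos]
    ]
]

probe scan-demo "12 34^/56"       ; every token is accepted
probe scan-demo "12 34^/5x 6"     ; stops at the unexpected character
probe scan-demo "1^/99999999999"  ; digit accumulation overflows
```

The result is a block tagged `ok` or `error`, so a caller such as the compiler could turn the line and column into a syntax error message. Whether the real grammar can carry a single `last-pos:` marker this cheaply is an open question.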
I am still interested in replacing LOAD with `reds-lexer`, but I guess it is not doable before going beta (announcement planned for tomorrow). I guess that `reds-lexer` integration could probably be achieved later this week or next week.
Thank you for your nice work!
I do have a Red/System syntax grammar ready, as a Word file (BNF productions - albeit with ambiguity - and semantic comments). I will send it to your nr@red-lang.org address shortly.
This grammar does not consider the shift operators that were added just today.
Lexer and grammar files downloaded, having just a quick look now, will do all the testing tomorrow. Hope it could be used as a drop-in replacement to LOAD (or at least would not require too much work to do so). It looks very exciting anyway. :-)