Skip to content

Instantly share code, notes, and snippets.

@ruv

ruv/string-literal-via-recognizer.fth Secret

Last active Jun 24, 2020
Embed
What would you like to do?
Recognizer for string literal that does not make additional parsing on resolving phase
\ String literals like "abc def"
\ via Recognizer API v4
\ http://amforth.sourceforge.net/pr/Recognizer-rfc-D.html
\ NB: additional parsing (if any) takes place on further interpreting or compiling
\ NB: POSTPONE is not applicable to a string literal containing blanks.
' noop \ actually we need to copy to a buffer here - carbon-slit
' slit, dup
rectype: rectype-slit
: parse-slit-end ( c-addr u1 -- c-addr u2 )
over + char+ source drop >in @ + <>
\ ensure that (c-addr u1) is inside SOURCE and before PARSE-AREA
if 0<> -11 and throw drop '"' parse exit then
\ -11 "result out of range"
'"' parse + over -
;
\ NB: parse-slit-end may accept the first part of a string from any location (even outside SOURCE)
\ and concatanate it with the second part in a special buffer. And it may even parse multiple
\ input lines to support multiline string literals.
\ For simplicity of the example, just a basic variant is implemented.
[: parse-slit-end ;] \ carbon-slit is missed for simplicity
[: parse-slit-end slit, ;]
[: -32 throw ;] \ -32 invalid name argument
rectype: rectype-slit-parsing
: rec-sliteral ( c-addr1 u1 -- c-addr2 u2 rectype-slit|rectype-slit-parsing | 0 )
dup 0= if 2drop rectype-null exit then
over c@ '"' <> if 2drop rectype-null exit then
1 /string dup 0= if rectype-slit-parsing exit then
\ todo: fail if '"' is found in the middle of the string (c-addr1 u1)
2dup + char- c@ '"' = if char- rectype-slit exit then
rectype-slit-parsing
;
\ Appending at the top for testing
forth-recognizer get-recognizer
'rec-markup swap 1+ forth-recognizer set-recognizer
\ String literals like "abc def"
\ via Resolver API v1
\ https://github.com/ruv/forth-design-exp/blob/master/docs/resolver-api.md
\ NB: additional parsing (if any) takes place not on resolving but on further token translating
: parse-slit-end ( c-addr u1 -- c-addr u2 )
over + char+ source drop >in @ + <>
\ ensure that (c-addr u1) is inside SOURCE and before PARSE-AREA
if 0<> -11 and throw drop '"' parse exit then
\ -11 "result out of range"
'"' parse + over -
;
: tt-slit-parsing ( c-addr1 u1 -- c-addr2 u2 | ) parse-slit-end tt-slit ;
\ NB: tt-slit is a provided token translator
: resolve-sliteral ( c-addr1 u1 -- c-addr2 u2 tt-slit|tt-slit-parsing | c-addr1 u1 0 )
`" match-head ?E0 `" match-tail if 'tt-slit exit then
\ todo: fail if '"' is found in the middle of the string (c-addr1 u1)
'tt-slit-parsing
;
\ Appending at the top for testing
' resolve-sliteral preempt-resolver
@ruv

This comment has been minimized.

Copy link
Owner Author

@ruv ruv commented Jun 24, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment