@paniq
Last active March 4, 2024 09:15
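using import struct enum print itertools

enum Token plain
    None
    Word
    Open
    Close

@@ finite-state-machine
struct Tokenizer
    state = Token.None
    # [x0, x1) is the half-open range of the token currently being accumulated
    x0 = 0:usize
    x1 = 0:usize

    inline insert (self send ch)
        # x2 is the end offset once `ch` has been consumed
        x2 := self.x1 + 1
        # classify ch: the token kind, and whether runs of that kind merge into one token
        newstate group? := switch ch
            pass c" "
            pass c"\t"
            pass c"\n"
            pass c"\r"
            pass c"\x00"
            do
                pass Token.None true
            case c"("
                pass Token.Open false
            case c")"
                pass Token.Close false
            default
                pass Token.Word true
        if ((newstate == self.state) and group?)
            # same groupable kind: extend the current token
            self.x1 = x2
        else
            # kind changed (or kind never groups): emit the finished range, then start a new token
            send self.x0 self.x1 self.state
            self.state = newstate
            self.x0 = self.x1
            self.x1 = x2
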
# Output:
    Token.None ""
    Token.Word the
    Token.None " "
    Token.Word quick
    Token.Open "("
    Token.Word brown
    Token.None " "
    Token.Open "("
    Token.Word fox
    Token.None " "
    Token.Close ")"
    Token.Close ")"
    Token.None " "
    Token.Word jumped
    Token.None " "
    Token.Word over
    Token.None " "
    Token.Word the
    Token.None 2314:i16
    Token.Word lazy
    Token.None 0x9:u8
    Token.Word dog

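# the input must end with an explicit zero terminator: a token is only
# emitted when the state changes, so the final word needs one more character after it
S := "the quick(brown (fox )) jumped over the\n\tlazy\tdog\x00"
print
    ->> S
        Tokenizer
        map
            inline (x0 x1 state)
                print2 state
                    /prettydata
                        & (S @ x0)
                        x1 - x0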