Here's how the Chasm lexer is meant to work:
The primary function, lex(), is called by main and processes the input source file into a super-list of tokens. It does this by iterating over each line of the source file and calling tokenize_line() on it. tokenize_line() then turns that line of code into a list of tokens, usually with the help of other functions for specific types of arguments.
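As a rough illustration of that structure, here is a minimal Python sketch. It is not Chasm's actual code: the Token type, the "WORD" kind, the ';' comment rule, and the reading of "super-list" as one token list per line are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # hypothetical token category; real Chasm kinds are not shown here
    text: str


def tokenize_line(line: str) -> List[Token]:
    """Turn a single line into tokens. It sees nothing but this one line."""
    # Assumed rules: ';' starts a comment and whitespace separates tokens.
    code = line.split(";", 1)[0]
    return [Token("WORD", word) for word in code.split()]


def lex(source: str) -> List[List[Token]]:
    """Build the super-list: one list of tokens per source line."""
    return [tokenize_line(line) for line in source.splitlines()]
```

The point of the sketch is the shape of the call chain: lex() owns the whole file, while tokenize_line() only ever receives one line at a time.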
There is a context issue when it comes to identifier tokens, however. See, tokenize_line() doesn't have any context besides the line it is processing. When a line defines an identifier, the function can easily set the identifier type. When a line only uses an identifier, though, the type cannot be determined from that line alone.
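To make the problem concrete, here is a small Python sketch of a per-line tokenizer that tries to type identifiers. Everything about the syntax is invented for illustration: the "define" keyword for symbols, the trailing ':' for labels, the "jump" instruction, and the token kind names are assumptions, not Chasm's real grammar.

```python
from typing import List, Tuple

# Hypothetical keywords, used only for this illustration.
KEYWORDS = {"define", "jump"}


def tokenize_line_typed(line: str) -> List[Tuple[str, str]]:
    """Tokenize one line and type identifiers using only that line."""
    words = line.split()
    tokens = []
    for i, word in enumerate(words):
        if word.endswith(":"):
            # A definition: the line itself says this is a label.
            tokens.append(("LABEL_IDENT", word[:-1]))
        elif i > 0 and words[i - 1] == "define":
            # A definition: the line itself says this is a symbol.
            tokens.append(("SYMBOL_IDENT", word))
        elif word.isidentifier() and word not in KEYWORDS:
            # A use: nothing on this line says whether it is a symbol or a label.
            tokens.append(("UNKNOWN_IDENT", word))
        else:
            tokens.append(("OTHER", word))
    return tokens


print(tokenize_line_typed("loop:"))      # definition, type is known
print(tokenize_line_typed("jump loop"))  # use, type is unknown
```

In this sketch the definition line yields a typed token, while the line that uses the same name can only mark it as unknown, which is exactly the gap described above.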
There are two ways to handle this issue: either give tokenize_line() more context, or have lex() do a final run through the tokens. The first option is great for handling symbol identifiers (as they must be defined before they are used) but not so great for handling label identi