@gnat
Last active July 4, 2024 16:56

Programming Language Design

Do I need a tokenizer?

Many "languages" can get away without one. However, if your language has any of:

  • strings
  • comments
  • code blocks (as in Markdown; BBCode can get away without one because there's an explicit closing [/code] tag)

then your design will naturally lead to a simple tokenizer: those blocks need to be excluded from the rest of the conversion in some way, so that any syntax appearing inside them is not processed.
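As a rough illustration of excluding those blocks, here is a minimal sketch in Python. The toy syntax (double-quoted strings with backslash escapes, `#` comments running to end of line) and the names `TOKEN_RE`, `tokenize`, and `convert` are assumptions made for this example, not part of any particular language.

```python
import re

# Hypothetical toy syntax: double-quoted strings with backslash escapes,
# and "#" comments that run to the end of the line.
TOKEN_RE = re.compile(r'"(?:\\.|[^"\\])*"|#[^\n]*')


def tokenize(source):
    """Split source into (kind, text) pairs: "string", "comment", or "code"."""
    tokens = []
    pos = 0
    for match in TOKEN_RE.finditer(source):
        # Anything between the previous match and this one is plain code.
        if match.start() > pos:
            tokens.append(("code", source[pos:match.start()]))
        kind = "string" if match.group().startswith('"') else "comment"
        tokens.append((kind, match.group()))
        pos = match.end()
    # Trailing code after the last string or comment.
    if pos < len(source):
        tokens.append(("code", source[pos:]))
    return tokens


def convert(source, transform):
    """Apply transform to code tokens only; strings and comments pass through."""
    return "".join(
        transform(text) if kind == "code" else text
        for kind, text in tokenize(source)
    )
```

For example, `convert('say "hi # there" # say bye\nsay x', str.upper)` uppercases only the two `say` statements; the `#` inside the string is not mistaken for a comment, and the real comment is left alone.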

In the simplest implementations, this may be nothing more than an alias (a placeholder that stands in for each protected block), or an actual list of tokens kept as a sidecar to the code.
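One possible reading of the alias-plus-sidecar idea, sketched in Python. This assumes double-quoted strings are the only protected blocks and that the source never contains a NUL character, which is used here to mark aliases; `STRING_RE`, `protect`, and `restore` are hypothetical names.

```python
import re

# Assumption: double-quoted strings with backslash escapes are the only
# blocks that need protecting, and "\x00" never appears in the source.
STRING_RE = re.compile(r'"(?:\\.|[^"\\])*"')
ALIAS_RE = re.compile(r"\x00(\d+)\x00")


def protect(source):
    """Swap each string for a numbered alias; return (text, sidecar list)."""
    sidecar = []

    def stash(match):
        sidecar.append(match.group())
        return f"\x00{len(sidecar) - 1}\x00"

    return STRING_RE.sub(stash, source), sidecar


def restore(text, sidecar):
    """Replace each alias with the original string from the sidecar."""
    return ALIAS_RE.sub(lambda m: sidecar[int(m.group(1))], text)


# Usage: convert freely between protect() and restore().
protected, sidecar = protect('print "a + b" + "c"')
converted = protected.replace("+", "plus")
result = restore(converted, sidecar)
```

Here the `+` inside `"a + b"` is never seen by the conversion step, because the string lives in the sidecar list while the conversion runs.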
