@MilesCranmer
Created September 29, 2023 06:25
Easily count the number of tokens in text from the command line
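function num_tokens {
    # Read all of stdin and print its token count under the cl100k_base encoding.
    # `prun` is kept from the original; it is presumably a local wrapper, so drop it if plain `python` already has tiktoken installed.
    prun python -c 'import sys, tiktoken; s = sys.stdin.read(); encoding = tiktoken.get_encoding("cl100k_base"); print(len(encoding.encode(s)))'
}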
# e.g., `echo "Hello World!" | num_tokens`
# this should give 3 tokens.
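
A note on how stdin is read: iterating over `sys.stdin` yields lines that keep their trailing newline, so joining those lines with "\n" inserts an extra blank line between every pair of lines and can inflate the count for multi-line input. Reading the stream in one call avoids this. The sketch below uses only the standard library, with `io.StringIO` standing in for stdin; the helper names `read_joined` and `read_whole` are hypothetical and exist only for this illustration.

```python
import io


def read_joined(stream):
    # Each line from the stream still ends in "\n", so joining with "\n"
    # doubles the newline between consecutive lines.
    return "\n".join([line for line in stream])


def read_whole(stream):
    # Reading everything at once returns the text unchanged.
    return stream.read()


text = "first line\nsecond line\n"
joined = read_joined(io.StringIO(text))
whole = read_whole(io.StringIO(text))

print(repr(joined))
print(repr(whole))
```

For single-line input such as the `echo` example the two approaches return the same string, so the difference only shows up when piping in files or other multi-line text.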