@iTrooz
Created June 25, 2024 11:40
CLI to count number of tokens in files using tiktoken
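#!/usr/bin/env python3
"""Count the number of tokens in one or more files using tiktoken."""
import sys

import tiktoken

# Require at least one file argument.
if len(sys.argv) == 1:
    print(f"Syntax: {sys.argv[0]} <file> [<file> ...]")
    sys.exit(1)

# cl100k_base is the encoding used by the GPT-4 and GPT-3.5-turbo models.
enc = tiktoken.get_encoding("cl100k_base")
total = 0
for path in sys.argv[1:]:
    # Read as UTF-8 explicitly so the result does not depend on the locale.
    with open(path, "r", encoding="utf-8") as f:
        count = len(enc.encode(f.read()))
    print(count, "\t", path)
    total += count

# Only print a total when more than one file was given.
if len(sys.argv) > 2:
    print("Total:", total)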
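
The counting loop above can also be factored into a reusable function. What follows is only a sketch, not part of the original gist: the names `count_tokens` and `tiktoken_encoder` are hypothetical, and the encoder is passed in as a plain callable so the counting logic can be exercised without tiktoken installed. The `disallowed_special=()` argument is an assumption about the desired behavior: by default `enc.encode` raises `ValueError` if a file contains text such as `<|endoftext|>`, and passing an empty tuple makes such text encode as ordinary text instead.

```python
from pathlib import Path
from typing import Callable, Iterable, Sized


def count_tokens(
    paths: Iterable[str],
    encode: Callable[[str], Sized],
) -> list[tuple[str, int]]:
    """Return (path, token count) pairs in input order, using the given encoder."""
    results = []
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        results.append((path, len(encode(text))))
    return results


def tiktoken_encoder(name: str = "cl100k_base") -> Callable[[str], Sized]:
    """Build an encoder callable backed by tiktoken (imported lazily)."""
    import tiktoken

    enc = tiktoken.get_encoding(name)
    # Assumption: special-token text in a file should be counted as plain text
    # rather than raising ValueError, hence disallowed_special=().
    return lambda text: enc.encode(text, disallowed_special=())
```

With tiktoken installed, `count_tokens(sys.argv[1:], tiktoken_encoder())` would yield the same per-file counts as the script for files without special-token text, and `sum(n for _, n in results)` gives the total.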