@gamingflexer
Created July 19, 2023 09:02
Anthropic's tokenizer for Claude
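from transformers import PreTrainedTokenizerFast
# Load the tokenizer definition from a local tokenizers-format JSON file.
fast_tokenizer = PreTrainedTokenizerFast(tokenizer_file="/home/ubuntu/LLM/module/claude-v1-tokenization.json")
text = "Hello, this is a test input."
# Split the text into its token strings.
tokens = fast_tokenizer.tokenize(text)
# Print explicitly so the result also shows when run as a script, not only in a notebook.
print(tokens)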
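A common follow-up is counting tokens or checking that text survives an encode/decode round trip. The sketch below is one way to do that, not part of the original gist: the helper names `load_tokenizer`, `count_tokens`, and `round_trip` are hypothetical, and it assumes the tokenizer JSON file already exists locally at whatever path you pass in.

```python
from pathlib import Path


def load_tokenizer(tokenizer_file):
    """Load a tokenizers-format JSON file (hypothetical helper, not from the gist)."""
    path = Path(tokenizer_file)
    if not path.is_file():
        # Fail early with a clear message instead of a deep library error.
        raise FileNotFoundError(f"tokenizer file not found: {path}")
    # Imported lazily so the other helpers work without transformers installed.
    from transformers import PreTrainedTokenizerFast
    return PreTrainedTokenizerFast(tokenizer_file=str(path))


def count_tokens(tokenizer, text):
    """Return the number of token ids the text encodes to."""
    # add_special_tokens=False so only tokens from the text itself are counted.
    return len(tokenizer.encode(text, add_special_tokens=False))


def round_trip(tokenizer, text):
    """Encode the text to ids, then decode the ids back to a string."""
    ids = tokenizer.encode(text, add_special_tokens=False)
    return ids, tokenizer.decode(ids)
```

With the gist's file in place, `count_tokens(load_tokenizer("/home/ubuntu/LLM/module/claude-v1-tokenization.json"), text)` would give the token count for `text`. The resulting numbers depend on that file and are not shown here.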
@danikhan632

@ayansengupta17

Could you provide the link for Claude 3?
