import torch
from pytorch_transformers import BertTokenizer, BertModel, BertForMaskedLM
# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Tokenize a sentence pair; [CLS] marks the start and [SEP] closes each sentence
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = tokenizer.tokenize(text)
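
For a sentence pair like the one above, BERT also expects a segment id for each token, marking which sentence it belongs to. The following is a minimal sketch, not part of the original snippet: `segment_ids_from_tokens` is a hypothetical helper, and `example_tokens` is hard-coded to what `tokenizer.tokenize(text)` is assumed to return for `bert-base-uncased`, so that the block runs without downloading the vocabulary.

```python
# Assumed result of tokenizer.tokenize(text) for 'bert-base-uncased';
# hard-coded here so the sketch runs offline.
example_tokens = ['[CLS]', 'who', 'was', 'jim', 'henson', '?', '[SEP]',
                  'jim', 'henson', 'was', 'a', 'puppet', '##eer', '[SEP]']


def segment_ids_from_tokens(tokens, sep_token='[SEP]'):
    """Hypothetical helper: label each token with its sentence index.

    Tokens up to and including the first [SEP] get 0; tokens after it get 1.
    """
    segment_ids = []
    current = 0
    for token in tokens:
        segment_ids.append(current)
        # The separator belongs to the sentence it closes, so bump afterwards.
        if token == sep_token:
            current += 1
    return segment_ids


segments_ids = segment_ids_from_tokens(example_tokens)
print(segments_ids)
```

The resulting list has the same length as the token list, so it can be turned into a tensor alongside the token ids before being passed to the model.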