@mohdsanadzakirizvi
Created July 18, 2019 09:48
import torch
from pytorch_transformers import BertTokenizer, BertModel, BertForMaskedLM
# Load pre-trained model tokenizer (vocabulary)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
# Tokenize input; BERT's special tokens [CLS] and [SEP] are written into the text by hand
text = "[CLS] Who was Jim Henson ? [SEP] Jim Henson was a puppeteer [SEP]"
tokenized_text = tokenizer.tokenize(text)
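
# Illustration only, not part of the library: a minimal sketch of the greedy
# longest-match-first WordPiece splitting that tokenizer.tokenize applies to
# each word. The helper wordpiece_tokenize and the tiny toy_vocab below are
# hypothetical; the real BertTokenizer also lowercases, splits punctuation,
# and uses the full bert-base-uncased vocabulary.
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one lowercase word into subword pieces from vocab."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        current = None
        # Try the longest remaining substring first, shrinking from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                current = candidate
                break
            end -= 1
        if current is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(current)
        start = end
    return pieces

# Toy vocabulary (an assumption for this sketch, not BERT's real vocabulary)
toy_vocab = {"who", "was", "jim", "henson", "a", "puppet", "##eer", "?"}
toy_pieces = wordpiece_tokenize("puppeteer", toy_vocab)
print(toy_pieces)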