Cristian Arteaga arteagac

arteagac / expand_bert_beyond_512_tokens.py
Created November 16, 2023 23:38
Expand BERT beyond 512 tokens
# Load the tokenizer and the SQuAD-finetuned BERT question-answering model
from transformers import AutoTokenizer, AutoModelForQuestionAnswering
import torch

model_name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Expand position embeddings to 1024 tokens
max_length = 1024
tokenizer.model_max_length = max_length
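
Raising `tokenizer.model_max_length` only changes how much text the tokenizer lets through; the model's learned position table still has 512 rows. The snippet above stops before showing how that table is enlarged, so what follows is only a minimal sketch of one common approach, not necessarily the author's: build a larger table and fill it by repeating the pretrained rows. The helper name `expand_position_embeddings` and the toy sizes are assumptions made for illustration.

```python
import torch
from torch import nn


def expand_position_embeddings(old_emb: nn.Embedding, new_len: int) -> nn.Embedding:
    """Return a larger embedding table whose rows repeat the pretrained ones.

    Hypothetical helper for illustration; it is not part of transformers.
    """
    old_len, _ = old_emb.weight.shape
    if new_len < old_len:
        raise ValueError("new_len must be at least the current number of positions")
    # Tile the pretrained rows until the new table is full, then trim.
    repeats = -(-new_len // old_len)  # ceiling division
    tiled = old_emb.weight.detach().repeat(repeats, 1)[:new_len].clone()
    # freeze=False keeps the new table trainable for later fine-tuning.
    return nn.Embedding.from_pretrained(tiled, freeze=False)


# Toy stand-in for BERT's position table (sizes chosen only for the demo).
toy_old = nn.Embedding(512, 16)
toy_new = expand_position_embeddings(toy_old, 1024)
```

In a Hugging Face BERT model the table to replace would typically be `model.bert.embeddings.position_embeddings`, and `model.config.max_position_embeddings` plus any cached position-id buffers would likely need updating to the new length as well; check these against the installed transformers version. Positions beyond 512 start as copies of earlier ones, so some fine-tuning on long inputs is usually needed before the longer context is useful.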