@michelkana
Created August 5, 2019 14:26
# Load BertForSequenceClassification, the pretrained BERT model with a single linear classification layer on top.
# (Import assumed from pytorch_pretrained_bert, whose module names match the summary below.)
from pytorch_pretrained_bert import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=nb_labels)
model.cuda()  # move the model to the GPU
# BERT model summary
BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): BertLayerNorm()
      (dropout): Dropout(p=0.1)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): BertLayerNorm()
              (dropout): Dropout(p=0.1)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): BertLayerNorm()
            (dropout): Dropout(p=0.1)
          )
        )
        ... layers (1) through (11): eleven more BertLayer blocks identical to (0) ...
      )
    )
    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dropout): Dropout(p=0.1)
  (classifier): Linear(in_features=768, out_features=2, bias=True)
)
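
The snippet assumes nb_labels is already defined. A minimal sketch of one way to derive it, assuming intent_data_label_train holds the raw intent labels of the ATIS training split (the variable name follows the comment thread below and is otherwise hypothetical):

# Encode raw intent strings as integer class ids and count the distinct classes.
from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
labels = encoder.fit_transform(intent_data_label_train)  # integer ids in [0, nb_labels)
nb_labels = len(encoder.classes_)                        # number of distinct intents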
@TriptiAgrawal

TriptiAgrawal commented Mar 27, 2020

I am trying to run your code in a Google Colab notebook, but I have no idea what value the variable 'nb_labels' should hold. The default is 2, but 'nb_labels' is never initialized beforehand, so what should it store? Also, in https://gist.github.com/michelkana/a201156d1876fc444470007a267acc80 you use a 'labels' variable; which intent values does it store? I am also using the ATIS dataset and have initialized labels = intent_data_label_train. Please correct me where I am wrong, as I am getting the following runtime error:

Epoch: 0%| | 0/4 [00:00<?, ?it/s]

RuntimeError Traceback (most recent call last)
in ()
34 # Forward pass
35 loss = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=b_labels)
---> 36 train_loss_set.append(loss.item())
37 # Backward pass
38 loss.backward()

RuntimeError: CUDA error: device-side assert triggered

I am a research scholar and would appreciate a timely response. Thanks!
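
A common cause of this device-side assert is label values outside the range [0, nb_labels): BertForSequenceClassification computes CrossEntropyLoss internally, and an out-of-range target surfaces only as an opaque CUDA assert. A quick sanity check, sketched under the assumption that b_labels and nb_labels are the variables from the training loop above:

import torch
# Run this on the CPU tensors before moving batches to the GPU,
# so a bad label fails as a readable assertion instead of a CUDA assert.
assert b_labels.dtype == torch.long, "labels must be integer class indices"
assert b_labels.min() >= 0 and b_labels.max() < nb_labels, \
    "every label must lie in [0, nb_labels)"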

@dlukose

dlukose commented Aug 22, 2020

When I tried to run the above code, I got the following NameError:


NameError Traceback (most recent call last)
in ()
----> 1 model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=nb_labels)
2 model.cuda()

NameError: name 'nb_labels' is not defined

Where do you define "nb_labels"?

DL

@teddy-f-47

This is a little late, I suppose, but in case somebody else encounters the same issue: I use nb_labels = len(train_labels) and it works. I think it should equal the number of labels in the dataset. It can be 2 if we are doing binary classification, e.g. when the only labels are "flight" and "others".
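
One caveat: if train_labels holds one label per training example, len(train_labels) returns the number of examples, not the number of classes, and num_labels expects the latter. A short sketch of the distinction (data hypothetical):

train_labels = [0, 1, 0, 1, 1]      # one integer label per example
print(len(train_labels))            # 5 -> number of examples
nb_labels = len(set(train_labels))  # 2 -> number of distinct classes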
