# Load BertForSequenceClassification, the pretrained BERT model with a
# single linear classification layer on top.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=nb_labels)
model.cuda()
# BERT model summary
BertForSequenceClassification(
  (bert): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(30522, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): BertLayerNorm()
      (dropout): Dropout(p=0.1)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): BertLayerNorm()
              (dropout): Dropout(p=0.1)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): BertLayerNorm()
            (dropout): Dropout(p=0.1)
          )
        )
        ...  # layers (1) through (11) are identical to layer (0) and omitted here
      )
    )
    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dropout): Dropout(p=0.1)
  (classifier): Linear(in_features=768, out_features=2, bias=True)
)
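For reference, the summary above is simply the module's string representation; printing any PyTorch nn.Module produces this nested view:

```python
# Printing a torch.nn.Module renders the nested layer summary shown above.
print(model)
```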
When I tried to run the above code, I got the following NameError:
NameError                                 Traceback (most recent call last)
<ipython-input> in <module>()
----> 1 model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=nb_labels)
      2 model.cuda()

NameError: name 'nb_labels' is not defined
Where do you define "nb_labels"?
This is a bit late, I suppose, but in case somebody else runs into the same issue: I use nb_labels = len(train_labels) and it works. It should equal the number of distinct labels in the dataset, which is 2 for binary classification, e.g. when we only have "flight" and "others" labels. (See the sketch below.)
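One caveat: len(train_labels) only equals the class count if train_labels holds the distinct label names rather than one label per example. A minimal sketch (the variable contents here are placeholders, not taken from the gist) that works in both cases:

```python
# Hypothetical example: derive nb_labels from the per-example labels.
train_labels = ["flight", "others", "flight", "flight"]  # placeholder data

# Counting distinct values gives the number of classes (here 2),
# regardless of how many training examples there are.
nb_labels = len(set(train_labels))
```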
I am trying to run your code in a Google Colab notebook, but I have no idea what value the variable 'nb_labels' should hold. The default is 2, but what does 'nb_labels' store here, given that it is never initialized beforehand? Also, in https://gist.github.com/michelkana/a201156d1876fc444470007a267acc80 you use a 'labels' variable; which intent values does it store? I am also using the ATIS dataset and have initialized labels = intent_data_label_train. Please correct me where I am wrong, as I am getting the following runtime error:
Epoch:   0%|          | 0/4 [00:00<?, ?it/s]

RuntimeError                              Traceback (most recent call last)
<ipython-input> in <module>()
     34 # Forward pass
     35 loss = model(b_input_ids, token_type_ids=None, attention_mask=b_input_mask, labels=b_labels)
---> 36 train_loss_set.append(loss.item())
     37 # Backward pass
     38 loss.backward()

RuntimeError: CUDA error: device-side assert triggered
I am a research scholar and would appreciate a timely response. Thanks!
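A frequent cause of this particular device-side assert (an educated guess, not confirmed from the thread) is label ids falling outside the range [0, nb_labels) that the classification head expects. A sanity check along these lines, reusing the variable name from the comment above, may help:

```python
import torch

# intent_data_label_train is assumed to be the list of ATIS intent ids
# mentioned above; they must be zero-indexed integers below nb_labels.
labels = torch.tensor(intent_data_label_train)

print(labels.min().item(), labels.max().item())  # expect 0 and nb_labels - 1
assert labels.min() >= 0, "labels must be zero-indexed integers"
assert labels.max() < nb_labels, "nb_labels is too small for these intent ids"
```

Re-running the training loop on the CPU (model.cpu() with CPU tensors) usually turns the opaque CUDA assert into a readable indexing error as well.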