
@nialloriordan
Forked from nbertagnolli/bert_emotions.ipynb
Last active February 7, 2022 12:41
@subratac

subratac commented Sep 6, 2021

When I run this code, I get the following error:
'BertTransformer' object has no attribute 'bert_model'

@nialloriordan

nialloriordan commented Sep 6, 2021

@subratac This error occurs because the parameter names in BertTransformer's __init__ don't match the attribute names they are stored under, which breaks scikit-learn's parameter introspection. I haven't looked into what changed to cause this yet, but updating BertTransformer to initialise as follows fixes the issue:

from typing import Callable, Optional

import torch
from sklearn.base import BaseEstimator, TransformerMixin


class BertTransformer(BaseEstimator, TransformerMixin):
    def __init__(
        self,
        tokenizer,
        model,
        max_length: int = 60,
        embedding_func: Optional[Callable[[torch.Tensor], torch.Tensor]] = None,
    ):
        # Each __init__ parameter is stored under the same name, so
        # sklearn's get_params/set_params (used by clone) work correctly.
        self.tokenizer = tokenizer
        self.model = model
        self.model.eval()
        self.max_length = max_length
        self.embedding_func = embedding_func

        # Default: use the [CLS] token's embedding from the last hidden layer.
        if self.embedding_func is None:
            self.embedding_func = lambda x: x[0][:, 0, :].squeeze()
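
For reference, a minimal end-to-end sketch of using the fixed class inside a scikit-learn pipeline (assuming the fit/transform methods from the gist and the Hugging Face transformers package; "bert-base-uncased" is just an example checkpoint):

from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# BERT embeddings become features for a standard sklearn classifier.
clf = Pipeline([
    ("vectorizer", BertTransformer(tokenizer, model)),
    ("classifier", LogisticRegression(max_iter=1000)),
])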

@subratac

subratac commented Sep 6, 2021

Thank you so much! Let me run it again with your modifications. I appreciate your quick response.

@teddddddy

Thanks so much for sharing. That bug was so annoying.

@MVreijling

Did anyone run this and get results anywhere near the ones reported in Nicolas Bertagnolli's Medium post (90%+ F1 score on 7 labels)? We don't get any useful results...

@oriordanniall

Did anyone run this and get results anywhere near the ones reported in Nicolas Bertagnolli's Medium post (90%+ F1 score on 7 labels)? We don't get any useful results...

He left the following response in the comment section of the article:

There was a small bug in my original code that another reader pointed out. This is the fixed version, but the results are worse than presented in the article. I've been meaning to go through and fix it but haven't had time. Sorry!

I would recommend fine-tuning some of the methods, such as how you extract the embeddings, and also looking at alternative models that now achieve significantly higher results than BERT on benchmarks: https://super.gluebenchmark.com/leaderboard
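
For example, an embedding_func that mean-pools the token embeddings instead of taking only the [CLS] token could look like this (a rough sketch; for simplicity it averages over padding tokens too):

# x[0] is the last hidden state of shape (batch, seq_len, hidden_dim);
# averaging over the sequence dimension gives one vector per text.
mean_pool = lambda x: x[0].mean(dim=1).squeeze()

transformer = BertTransformer(tokenizer, model, embedding_func=mean_pool)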

@MVreijling

Thanks for your response. It's not so much that I'm looking for a better model; I was just wondering if we implemented it wrong. The idea of using embeddings without any fine-tuning is very interesting to me, but we can't get better than an overall accuracy of 25% on this dataset, which is very poor compared to the originally reported results.

@oriordanniall

You might also be interested in zero-shot classification if you don't want to fine-tune your embeddings.
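
For example, with the Hugging Face pipeline API (a minimal sketch; the model and labels are just examples):

from transformers import pipeline

# Zero-shot classification scores a text against arbitrary candidate
# labels without any task-specific fine-tuning.
classifier = pipeline(
    "zero-shot-classification", model="facebook/bart-large-mnli"
)
result = classifier(
    "I am so happy everything worked out!",
    candidate_labels=["joy", "anger", "sadness", "fear"],
)
print(result["labels"][0])  # highest-scoring label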

@MVreijling

You might also be interested in zero-shot classification if you don't want to fine-tune your embeddings.

Thanks for the tip. We're looking into that. But am I correct to deduce from your replies that you weren't able to get any useful results from this code either?

@oriordanniall

oriordanniall commented Feb 7, 2022

Thanks for the tip. We're looking into that. But am I correct to deduce from your replies that you weren't able to get any useful results from this code either?

For my specific use case, using embeddings as features was very valuable. This code example shows the simplest method of extracting embeddings as features, but you might be able to extract more value from the embeddings by:

  • using another Transformer model than BERT, e.g. XLNet, RoBERTa, T5, etc. (see the sketch after this list)
  • exploring alternative methods for extracting embeddings (embedding_func) rather than only using the last layer; there is more information in this GitHub issue discussion
  • fine-tuning your model before creating embeddings as features
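
For the first point, swapping in another encoder is mostly a matter of changing the checkpoint name, since the Auto classes load the matching tokenizer and model (a sketch; "roberta-base" is just one example):

from transformers import AutoModel, AutoTokenizer

# Any encoder checkpoint from the Hugging Face hub whose output starts
# with the last hidden state should work with BertTransformer unchanged.
tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")
transformer = BertTransformer(tokenizer, model)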
