
@avidale
Created April 30, 2021 21:51
create_rut5-base.ipynb
Nehc commented Mar 2, 2024

Should this work with XLMRobertaModel, like e5-large? Or is something fundamentally different used there? It didn't work for me.

avidale (Author) commented Mar 2, 2024

@Nehc:

> Should this work with XLMRobertaModel, like e5-large? Or is something fundamentally different used there? It didn't work for me.

As far as I can judge from the HF documentation, XLMRobertaTokenizer is based on SentencePiece, just like T5Tokenizer. So, in principle, the approach should work; I don't see any fundamental reason why it wouldn't.

Nevertheless, specific details such as model parameter names, tokenizer parameter names, and special tokens may differ between T5 and XLMRoberta, so my code will surely need some adaptation to work with E5.
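To make the adaptation concrete, here is a minimal sketch of the embedding-pruning idea at the core of the notebook, applied to an XLM-R-sized vocabulary. Everything here is a toy assumption (random matrix, made-up kept ids, small hidden size); a real adaptation would slice the `embeddings.word_embeddings.weight` tensor of an XLMRobertaModel (and any tied output head), then rebuild the SentencePiece vocab and special-token map so that ids line up with the new matrix:

```python
import numpy as np

# Toy stand-in for an XLM-R embedding matrix (real hidden size is much larger).
old_vocab_size, hidden = 250_002, 16
rng = np.random.default_rng(0)
old_embeddings = rng.random((old_vocab_size, hidden))

# Ids we decide to keep: XLM-R special tokens first (<s>=0, <pad>=1, </s>=2,
# <unk>=3), then whatever ids a target-language corpus actually produces.
# The corpus ids below are arbitrary examples.
kept_ids = [0, 1, 2, 3] + sorted({42, 1000, 99_999, 123_456})

# New, much smaller embedding matrix: rows copied over in the new order.
new_embeddings = old_embeddings[kept_ids]

# Mapping old id -> new id, needed to re-encode the tokenizer vocabulary
# so that token ids match rows of the pruned matrix.
old2new = {old: new for new, old in enumerate(kept_ids)}

assert new_embeddings.shape == (len(kept_ids), hidden)
assert np.allclose(new_embeddings[old2new[1000]], old_embeddings[1000])
```

The T5 and XLM-R cases differ mainly in which special tokens must be preserved and in where the embedding tensors live inside the model, not in this core remapping step.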
