@mohammedkhalilia
Last active April 24, 2024 21:20
train_flat_arabic_ner.ipynb
@ahmedoumar

Hello Dr. Mohammed, thanks for the notebook.

I just noticed that it does not work with other models such as MARBERTv2 and ARBERTv2.
Maybe you can look into that when you have some time.

Thanks

@mohammedkhalilia
Author

I have not tried MARBERTv2 and ARBERTv2.
Can you let me know what error you are getting with either of those two models?

@ahmedoumar


Screenshot from 2023-08-16 17-26-51

@mohammedkhalilia
Author

Can you change the batch size? Try a higher or lower value and let me know.

@ahmedoumar

I tried both a smaller and a larger batch size, and neither worked.
Maybe the tokenizer ignores some non-Arabic words?
Yet AraBERTv2, which has a smaller vocabulary, works fine on the data.
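
One way to test the tokenizer guess is to look for words that produce no subword tokens at all, since a word that vanishes during tokenization can leave the token sequence and the tag sequence misaligned. The snippet below is only a minimal sketch: find_dropped_words and toy_arabic_only_tokenize are hypothetical names that are not part of the notebook, and the toy tokenizer only imitates the suspected behavior. It does not confirm that this is what causes the error in the screenshot.

```python
from typing import Callable, List, Tuple


def find_dropped_words(
    words: List[str], tokenize: Callable[[str], List[str]]
) -> List[Tuple[int, str]]:
    """Return (position, word) pairs for words that yield no subword tokens."""
    return [(i, w) for i, w in enumerate(words) if not tokenize(w)]


# Stand-in tokenizer for illustration only (hypothetical): it keeps characters
# from the basic Arabic Unicode block and silently drops everything else,
# imitating a tokenizer that ignores non-Arabic words.
def toy_arabic_only_tokenize(word: str) -> List[str]:
    return [ch for ch in word if "\u0600" <= ch <= "\u06FF"]


sentence = ["ذهب", "محمد", "إلى", "Google", "أمس"]
dropped = find_dropped_words(sentence, toy_arabic_only_tokenize)
print(dropped)  # [(3, 'Google')]
```

To check a real model, the same function can be given the tokenize method of a Hugging Face tokenizer loaded with AutoTokenizer.from_pretrained, assuming the notebook loads its tokenizer that way. If the list comes back empty for MARBERTv2 and ARBERTv2, the cause is probably elsewhere.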

@mohammedkhalilia
Author

mohammedkhalilia commented Apr 24, 2024 via email
