This gist is part of my blogpost on BERT. Find the complete blogpost, covering both theory and hands-on part, here: https://towardsml.com/2019/09/17/bert-explained-a-complete-guide-with-theory-and-tutorial/
@TaylorHawkes

For some reason, when the DataFrame is saved, the "alpha" column (the one filled with 'a') ends up as the first column, and that was messing up the training.
I renamed that "alpha" column to "poop" and it fixed it. (I think the columns are just being saved in alphabetical order; maybe there is a better fix here, haha.)

df_bert = pd.DataFrame({
    'id': range(len(train_df)),
    'label': train_df[0],
    # renamed from 'alpha' so it sorts after 'label' and before 'text'
    'poop': ['a'] * train_df.shape[0],
    'text': train_df[1].replace(r'\n', ' ', regex=True)
})
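A possible alternative to renaming the column: older pandas releases (before 0.23), or any pandas running on Python older than 3.6, sort dict keys alphabetically when building a DataFrame, which would explain "alpha" landing first. Passing the column order explicitly through the `columns` argument of `pd.DataFrame` should avoid depending on that behaviour. This is only a sketch: the tiny `train_df` and the `BERT_COLUMNS` name are made up for illustration and are not part of the original gist.

```python
import pandas as pd

# Hypothetical stand-in for the real training data:
# column 0 holds the label, column 1 holds the raw text.
train_df = pd.DataFrame({0: [1, 0], 1: ["good\nmovie", "bad film"]})

# Assumed column order: id, label, throwaway letter column, text.
BERT_COLUMNS = ["id", "label", "alpha", "text"]

df_bert = pd.DataFrame(
    {
        "id": range(len(train_df)),
        "label": train_df[0],
        "alpha": ["a"] * train_df.shape[0],
        "text": train_df[1].replace(r"\n", " ", regex=True),
    },
    columns=BERT_COLUMNS,  # fixes the order regardless of pandas version
)

# With no path given, to_csv returns the TSV content as a string.
tsv = df_bert.to_csv(sep="\t", index=False, header=False)
```

With the order pinned this way, the "alpha" name can stay as it is, and the saved file keeps the same layout whichever pandas version is installed.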
