@Ab1992ao
Created May 17, 2021 08:40
prepare toxic data for multitask learning
import pandas as pd

def load_toxic_data(tox_path):
    """Load the toxic-comment CSV and return (texts, binary labels) as arrays."""
    tox = pd.read_csv(tox_path)
    # strip whitespace before and after the text
    tox['text'] = tox['text'].map(lambda x: str(x).strip())
    # toxic = 1, other ('positive' or 'neutral') = 0
    tox['sentiment'] = tox['sentiment'].map(lambda x: 0 if x in ['positive', 'neutral'] else 1)
    toxic_text, toxic_labels = tox.text.values, tox.sentiment.values
    return toxic_text, toxic_labels
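
The same preprocessing can also be written without per-row lambdas. The sketch below is not part of the gist: the name `load_toxic_data_vectorized` and the sample rows are made up for illustration, and it assumes the CSV has the same `text` and `sentiment` columns as above.

```python
import io

import pandas as pd


def load_toxic_data_vectorized(tox_source):
    """Hypothetical vectorized variant of load_toxic_data (not from the gist)."""
    tox = pd.read_csv(tox_source)
    # Cast to str so non-string cells can be stripped too, then trim whitespace.
    texts = tox["text"].astype(str).str.strip()
    # Anything that is not 'positive' or 'neutral' is labelled toxic (1).
    labels = (~tox["sentiment"].isin(["positive", "neutral"])).astype(int)
    return texts.values, labels.values


# Made-up sample rows, only to exercise the function.
sample_csv = io.StringIO(
    "text,sentiment\n"
    "  have a nice day  ,positive\n"
    "it is raining,neutral\n"
    "you are an idiot ,toxic\n"
)
texts, labels = load_toxic_data_vectorized(sample_csv)
print(texts.tolist())   # ['have a nice day', 'it is raining', 'you are an idiot']
print(labels.tolist())  # [0, 0, 1]
```

`pd.read_csv` accepts a file path as well as a file-like object, so the function can be called with the same `tox_path` argument as the original.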