Denise SAGBO VictoireSagbo

VictoireSagbo / text_preprocessing.py
Created June 4, 2021 17:53 — forked from jiahao87/text_preprocessing.py
Full code for preprocessing text
from bs4 import BeautifulSoup
import spacy
import unidecode
from word2number import w2n
import contractions
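# Load the medium English spaCy model
# (install it first with: python -m spacy download en_core_web_md)
nlp = spacy.load('en_core_web_md')
# Words to exclude from spaCy's stop-word list: negations such as
# 'no' and 'not' carry meaning and should survive stop-word removal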
deselect_stop_words = ['no', 'not']