@jinhangjiang
Last active December 6, 2022 01:02
Demo of MoreThanSentiments
df['Boilerplate'] = mts.Boilerplate(df.sent_tok, n = 4, min_doc = 5, get_ngram = False)
# Sentence-level cleaning: clean every tokenized sentence, keeping one list per document
df['cleaned_data'] = df['sent_tok'].apply(
    lambda sents: [mts.clean_data(x,
                                  lower = True,
                                  punctuations = True,
                                  number = False,
                                  unicode = True,
                                  stop_words = False) for x in sents])
# Document-level alternative: clean each raw text as a whole (overwrites the column above)
df['cleaned_data'] = df.text.apply(mts.clean_data, args=(True, True, False, True, False))
import MoreThanSentiments as mts
pip install MoreThanSentiments
my_dir_path = "D:/YourDataFolder"
df = mts.read_txt_files(PATH = my_dir_path)
df['Redundancy'] = mts.Redundancy(df.cleaned_data, n = 10)
df['Relative_prevalence'] = mts.Relative_prevalence(df.text)
df['sent_tok'] = df.text.apply(mts.sent_tok)
df['Specificity'] = mts.Specificity(df.text)
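The snippets above appear in alphabetical order by file name, which is not the order they need to run in. The sketch below strings the same calls together in dependency order. `run_pipeline` is a hypothetical helper, not part of MoreThanSentiments. Passing `df.sent_tok` to `mts.Boilerplate` is an assumption, and so is the step order, which follows from which columns each call reads.

```python
def run_pipeline(mts, data_dir):
    """Run the gist's snippets in dependency order.

    `mts` is the imported MoreThanSentiments module; it is passed in so the
    helper itself (hypothetical, not part of the library) has no hard import.
    """
    # 1. Load the .txt files; later steps read the resulting `text` column.
    df = mts.read_txt_files(PATH=data_dir)

    # 2. Split every document into sentences.
    df['sent_tok'] = df.text.apply(mts.sent_tok)

    # 3. Clean each sentence, keeping one list of sentences per document.
    df['cleaned_data'] = df['sent_tok'].apply(
        lambda sents: [mts.clean_data(s,
                                      lower=True,
                                      punctuations=True,
                                      number=False,
                                      unicode=True,
                                      stop_words=False) for s in sents])

    # 4. Compute the four measures with the parameters used in the gist.
    #    Assumption: Boilerplate takes the sentence-tokenized column.
    df['Boilerplate'] = mts.Boilerplate(df.sent_tok, n=4, min_doc=5, get_ngram=False)
    df['Redundancy'] = mts.Redundancy(df.cleaned_data, n=10)
    df['Specificity'] = mts.Specificity(df.text)
    df['Relative_prevalence'] = mts.Relative_prevalence(df.text)
    return df

# Usage, after `pip install MoreThanSentiments`:
#   import MoreThanSentiments as mts
#   df = run_pipeline(mts, "D:/YourDataFolder")
```

Taking the module as an argument is only a convenience for the sketch; calling the functions directly on an imported `mts` works the same way.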