Last active December 6, 2022 01:02
Demo of MoreThanSentiments
df['Boilerplate'] = mts.Boilerplate(sent_tok, n = 4, min_doc = 5, get_ngram = False)
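# Clean every tokenized sentence, keeping one list of cleaned sentences per document.
# Using apply avoids the chained assignment df['cleaned_data'][i] = ..., which
# recent pandas versions warn about or ignore.
df['cleaned_data'] = df['sent_tok'].apply(
    lambda sentences: [mts.clean_data(x,
                                      lower = True,
                                      punctuations = True,
                                      number = False,
                                      unicode = True,
                                      stop_words = False) for x in sentences])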
df['cleaned_data'] = df.text.apply(mts.clean_data, args=(True, True, False, True, False))
import MoreThanSentiments as mts
pip install MoreThanSentiments
my_dir_path = "D:/YourDataFolder"
df = mts.read_txt_files(PATH = my_dir_path)
df['Redundancy'] = mts.Redundancy(df.cleaned_data, n = 10)
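For intuition about the `n` argument, here is a minimal pure-Python sketch of one common way to score textual redundancy: the share of n-grams in a document that occur more than once. It is an illustration under stated assumptions, not the MoreThanSentiments implementation. The gist does not show how `mts.Redundancy` is computed, and the function name `ngram_redundancy`, the whitespace tokenization, and the toy sentences are all hypothetical.

```python
from collections import Counter


def ngram_redundancy(sentences, n=10):
    """Share of n-grams that occur more than once in one document.

    Illustrative only: assumes whitespace tokenization and n-grams that
    do not cross sentence boundaries. Not taken from MoreThanSentiments.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        # Slide a window of n tokens over the sentence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1

    total = sum(counts.values())
    if total == 0:
        # Document too short to contain any n-gram.
        return 0.0

    # Count every occurrence of an n-gram that appears at least twice.
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / total


# Toy document: two already-cleaned sentences (hypothetical data).
doc = ["the cat sat on the mat", "the cat sat on the sofa"]
score = ngram_redundancy(doc, n=3)
print(score)
```

With a small `n`, short repeated phrases already count as redundant; a larger value such as the `n = 10` used above only flags long verbatim repeats.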
df['Relative_prevalence'] = mts.Relative_prevalence(df.text)
df['sent_tok'] = df.text.apply(mts.sent_tok)
df['Specificity'] = mts.Specificity(df.text)