@yuyasugano
Created September 25, 2020 06:54
Sentiment analysis with NLTK and Scikit-learn TfidfVectorizer
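from imblearn.over_sampling import SMOTE

# Features and labels prepared in earlier steps
X = train_vec
y = train_tweets['label']
print('X shape: {}, y shape: {}'.format(X.shape, y.shape))

# 'auto' oversamples every class except the majority class
sm = SMOTE(sampling_strategy='auto', random_state=39)
# fit_sample is the old method name; current imbalanced-learn uses fit_resample
X_resampled, y_resampled = sm.fit_resample(X, y)
print('Resampled label balance:\n{}'.format(y_resampled.value_counts()))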
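
For readers who want to see what the oversampling step does conceptually, here is a minimal standard-library sketch of the SMOTE idea: pick a minority-class sample, pick one of its nearest minority-class neighbours, and create a new point somewhere on the line segment between them. The function name `smote_like_oversample`, its parameters, and the toy data are hypothetical and chosen for illustration only; this is not imbalanced-learn's implementation, which should be preferred in practice.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=39):
    """Hypothetical helper: build n_new synthetic minority points by
    interpolating between a sample and one of its k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Take a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class: four points on the corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

Because every synthetic point lies between two existing minority samples, the new points stay inside the region the minority class already occupies rather than being exact duplicates.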