@lakshaychhabra
Last active December 7, 2019 15:09
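from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# `processed` is assumed to be a pandas DataFrame with "Title" and "Tags" columns.
# Encode each tag as an integer class id; labels_map is the inverse mapping.
labels = {"c#": 0, "java": 1, "c++": 2, "c": 3, "ios": 4}
labels_map = {0: "c#", 1: "java", 2: "c++", 3: "c", 4: "ios"}
processed["Tags"] = processed["Tags"].map(labels)
X = processed.Title.values
y = processed.Tags.values
# Hold out 20% for test, then 25% of the remainder for cross-validation (60/20/20 overall).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, stratify=y)
X_train, X_cv, y_train, y_cv = train_test_split(X_train, y_train, test_size=0.25, stratify=y_train)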
# Fit the TF-IDF vocabulary on the training titles only, then reuse it for CV and test.
tfidf = TfidfVectorizer()
X_train = tfidf.fit_transform(X_train)
X_cv = tfidf.transform(X_cv)
X_test = tfidf.transform(X_test)
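The snippet above depends on a `processed` DataFrame that is not shown. As a rough, self-contained sketch of the same steps, the block below builds a hypothetical toy frame (the titles are made up purely for illustration), passes `random_state=0` so the split is repeatable (an assumption; the original leaves it unset), and stores the vectorized matrices under new `*_vec` names so the raw titles stay available.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

labels = {"c#": 0, "java": 1, "c++": 2, "c": 3, "ios": 4}

# Hypothetical toy data: 10 made-up titles per tag (50 rows in total).
rows = [(f"question {i} about {tag} topic{i}", tag) for tag in labels for i in range(10)]
processed = pd.DataFrame(rows, columns=["Title", "Tags"])
processed["Tags"] = processed["Tags"].map(labels)

X = processed.Title.values
y = processed.Tags.values

# Same two-stage stratified split as above; random_state is added for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)
X_train, X_cv, y_train, y_cv = train_test_split(
    X_train, y_train, test_size=0.25, stratify=y_train, random_state=0
)

# The vocabulary is learned from the training titles only.
tfidf = TfidfVectorizer()
X_train_vec = tfidf.fit_transform(X_train)
X_cv_vec = tfidf.transform(X_cv)
X_test_vec = tfidf.transform(X_test)

# Rows per split, and the shared number of TF-IDF features.
print(X_train_vec.shape[0], X_cv_vec.shape[0], X_test_vec.shape[0])
print(X_train_vec.shape[1] == X_cv_vec.shape[1] == X_test_vec.shape[1])
```

Because `transform` reuses the fitted vocabulary, all three matrices share the same column count, and words that appear only in the CV or test titles are ignored rather than leaking into the features.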