View Faker_names.ipynb
View English_Egyptian_name.ipynb
View NameClassifier_train_test_split.py
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(word_mat, y, test_size=0.3)
View NameClassifier_model_instantiate.py
from sklearn.naive_bayes import MultinomialNB
# instantiate the model as clf(classifier) and train it
clf = MultinomialNB()
clf.fit(x_train, y_train)
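The snippets on this page are separate pieces of one pipeline, so a self-contained sketch of how they might fit together may help. The names and labels below are made-up sample data (the original dataset is not shown here), and `random_state` and `stratify` are additions for reproducibility rather than part of the original snippets.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data standing in for df['name'] and df['code']
names = ["John Smith", "Mary Jones", "Peter Brown", "Anna Smith", "James Jones",
         "Ahmed Hassan", "Mona Ali", "Omar Hassan", "Sara Ali", "Youssef Mostafa"]
codes = ["en"] * 5 + ["eg"] * 5

# Turn names into sparse word-count vectors and labels into integers
word_mat = CountVectorizer().fit_transform(names)
y = LabelEncoder().fit_transform(codes)

# Hold out 30% of the rows; random_state and stratify are assumptions added here
x_train, x_test, y_train, y_test = train_test_split(
    word_mat, y, test_size=0.3, random_state=0, stratify=y
)

# Train the classifier, then predict and score on the held-out rows
clf = MultinomialNB().fit(x_train, y_train)
predictions = clf.predict(x_test)
accuracy = clf.score(x_test, y_test)
print(predictions, accuracy)
```

With a dataset this small the accuracy value is not meaningful; the point is only the order of the steps.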
View NameClassifier_CountVectorizer.py
from sklearn.feature_extraction.text import CountVectorizer
# Initialize and fit CountVectorizer with given text documents
vectorizer = CountVectorizer().fit(df['name'])
# use the vectorizer to transform the document into word count vectors (Sparse)
word_mat = vectorizer.transform(df['name'])
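To make the word-count matrix more concrete, here is a minimal sketch on three made-up names (hypothetical data, not the original `df['name']`). It shows the learned vocabulary and the dense form of the sparse matrix.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample names standing in for df['name']
names = ["John Smith", "Ahmed Hassan", "John Hassan"]

vectorizer = CountVectorizer().fit(names)
word_mat = vectorizer.transform(names)

# vocabulary_ maps each lowercased token to its column index;
# sorting by that index gives the column order of the matrix
vocab = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
dense = word_mat.toarray()

print(vocab)  # ['ahmed', 'hassan', 'john', 'smith']
print(dense)  # one row per name, one column per vocabulary word
```

Each row counts how often each vocabulary word occurs in that name, so "John Hassan" has a 1 in the `hassan` and `john` columns and 0 elsewhere.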
View NameClassifer_label_encode.py
from sklearn.preprocessing import LabelEncoder
# create a mapping from unique label texts to unique integers
# note: the fitted encoder can be reused later to encode and decode labels
encoder = LabelEncoder().fit(df['code'])
# use the encoder to encode the labels for the entire dataset
y = encoder.transform(df['code'])
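The comment about reusing the encoder to decode labels can be illustrated with a short round trip. This is a sketch using scikit-learn's `LabelEncoder` on made-up label codes; the values `"en"` and `"eg"` are assumptions, not taken from the original data.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label codes standing in for df['code']
codes = ["en", "eg", "en", "eg"]

encoder = LabelEncoder().fit(codes)

# Classes are stored in sorted order, so "eg" -> 0 and "en" -> 1
y = encoder.transform(codes)

# inverse_transform maps the integers back to the original label texts
decoded = encoder.inverse_transform(y)

print(list(encoder.classes_))  # ['eg', 'en']
print(y.tolist())              # [1, 0, 1, 0]
print(list(decoded))           # ['en', 'eg', 'en', 'eg']
```

The same `inverse_transform` call can be applied to a classifier's integer predictions to get readable labels back.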
@wtberry
wtberry / NameClassifier_data_load.ipynb
Created Jun 23, 2019
medium/NameClassifier/dataload
View NameClassifier_data_load.ipynb
View NameClassifier1.py
import pandas as pd
import os
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
# setting up path to the data file
PATH = os.path.dirname(os.path.realpath(__file__))
PATH = os.path.join(PATH, 'data')
print(PATH)