| # Import the required libraries | |
| import warnings | |
| warnings.filterwarnings("ignore", category=FutureWarning) | |
| warnings.filterwarnings("ignore", category=UserWarning) | |
| from sklearn.metrics import classification_report, f1_score, confusion_matrix, accuracy_score, roc_auc_score, roc_curve, precision_score, recall_score | |
| from sklearn.model_selection import cross_val_score | |
| from sklearn.model_selection import train_test_split | |
| from sklearn.linear_model import LogisticRegression | |
| from sklearn.tree import DecisionTreeClassifier | |
| from sklearn.naive_bayes import MultinomialNB |
| # Load the data | |
| ogrenme_seti = pd.read_csv('set/ogrenme_seti.csv', sep=',') | |
| corpus = pd.read_csv('set/hepsiburada_corpus.csv', sep=';') | |
| # The files use different delimiters, so the sep argument varies from file to file. |
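Since the delimiter differs between the two files, it can help to check it before loading. The sketch below is one possible approach using only the standard library; `detect_delimiter` and the two sample strings are hypothetical names and data introduced for illustration, not part of the original gist.

```python
import csv

def detect_delimiter(sample, candidates=",;"):
    """Guess which of the candidate characters separates the columns."""
    # csv.Sniffer inspects the sample text and returns a dialect object.
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter

# Made-up samples standing in for the first lines of each CSV file.
comma_sample = "text,label\nfoo,1\nbar,0\n"
semicolon_sample = "text;label\nfoo;1\nbar;0\n"

print(detect_delimiter(comma_sample))
print(detect_delimiter(semicolon_sample))
```

The detected character could then be passed as `sep` to `pd.read_csv`. pandas can also sniff the delimiter itself when called with `sep=None` and `engine='python'`.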
| # Feature normalization | |
| # Min-max normalization of the word count (kelsay) | |
| # Compute the bounds once instead of on every loop iteration | |
| kelsay_min = ogrenme_seti['kelsay'].min() | |
| kelsay_max = ogrenme_seti['kelsay'].max() | |
| kelsay_normal = (ogrenme_seti['kelsay'] - kelsay_min) / (kelsay_max - kelsay_min) | |
| # Write the scaled values back to the data set | |
| ogrenme_seti['kelsay'] = kelsay_normal |
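The normalization above is plain min-max scaling, which maps the smallest value to 0 and the largest to 1. A minimal standalone sketch follows; the `min_max_scale` helper, the sample values, and the choice to return zeros for a constant column are all assumptions made for illustration.

```python
def min_max_scale(values):
    """Scale numbers linearly into [0, 1]."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # Assumption: a column with no spread maps to 0.0 instead of
        # dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

word_counts = [2, 4, 6]  # made-up word counts
scaled = min_max_scale(word_counts)
print(scaled)
```

The zero-span guard matters because the formula divides by `max - min`, which is zero whenever every row has the same word count.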
| # Model definitions | |
| clf = MultinomialNB() | |
| lr = LogisticRegression() | |
| dtc = DecisionTreeClassifier() | |
| rfc = RandomForestClassifier() | |
| gradient = GradientBoostingClassifier() | |
| xgb = XGBClassifier() |
| # Model training | |
| # MultinomialNB: best random_state=3305 | |
| X_train, X_test, y_train, y_test = train_test_split(ogrenme_seti.iloc[:, 0:4], ogrenme_seti.iloc[:, -1:], test_size=0.25, random_state=3305) | |
| clf.fit(X_train, y_train.values.ravel()) | |
| y_pred_test = clf.predict(X_test) | |
| print("Naive Bayes:\n", confusion_matrix(y_test, y_pred_test), "\n") | |
| f1_1 = f1_score(y_test, y_pred_test, average='macro') | |
| print(classification_report(y_test, y_pred_test)) | |
| print("Accuracy: ", accuracy_score(y_test, y_pred_test)) |
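To make the `average='macro'` setting concrete, the sketch below recomputes a macro F1 score by hand: per-class precision and recall, then the unweighted mean of the per-class F1 values. The `macro_f1` helper and the label lists are hypothetical and only meant to illustrate the arithmetic, not to replace `f1_score`.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_values = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumption: undefined ratios are treated as 0.0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1_values.append(2 * precision * recall / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)

y_true_demo = [0, 0, 1, 1, 1]  # made-up ground truth
y_pred_demo = [0, 1, 1, 1, 0]  # made-up predictions
score_demo = macro_f1(y_true_demo, y_pred_demo)
print(score_demo)
```

Here class 0 has an F1 of 1/2 and class 1 an F1 of 2/3, so the macro average is 7/12. Each class counts equally regardless of how many samples it has.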
| # Compute per-algorithm accuracy scores with cross-validation. | |
| score = cross_val_score(clf, X_train, y_train.values.ravel(), cv=5) | |
| score2 = cross_val_score(lr, X_train, y_train.values.ravel(), cv=5) | |
| score3 = cross_val_score(dtc, X_train, y_train.values.ravel(), cv=5) | |
| score4 = cross_val_score(rfc, X_train, y_train.values.ravel(), cv=5) | |
| score5 = cross_val_score(gradient, X_train, y_train.values.ravel(), cv=5) | |
| score6 = cross_val_score(xgb, X_train, y_train.values.ravel(), cv=5) | |
| # Visualization |
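With `cv=5`, the training data is split into five folds and each fold serves once as the validation set. The sketch below shows that splitting idea with plain, unshuffled folds; `k_fold_indices` is a hypothetical helper. For classifiers and an integer `cv`, scikit-learn uses stratified folds, which this simplified version does not reproduce.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) pairs for k contiguous, unshuffled folds."""
    # The first (n_samples % k) folds get one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop

folds = list(k_fold_indices(12, k=5))  # 12 is an arbitrary sample count
for train_idx, val_idx in folds:
    print(len(train_idx), val_idx)
```

`cross_val_score` fits a fresh copy of the estimator on each training part and scores it on the matching validation part, so it returns one score per fold.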
| # Compute per-algorithm ROC-AUC scores. | |
| rf_predictions = clf.predict(X_test) | |
| rf_probs = clf.predict_proba(X_test)[:, 1] | |
| score = roc_auc_score(y_test, rf_probs) | |
| rf_predictions = lr.predict(X_test) | |
| rf_probs = lr.predict_proba(X_test)[:, 1] | |
| score2 = roc_auc_score(y_test, rf_probs) | |
| rf_predictions = dtc.predict(X_test) |
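The ROC-AUC score can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it that way for a tiny made-up example; `pairwise_auc` is a hypothetical helper, and it assumes binary labels 0 and 1 with both classes present.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))

labels_demo = [0, 0, 1, 1]           # made-up labels
probs_demo = [0.1, 0.4, 0.35, 0.8]   # made-up positive-class probabilities
auc_demo = pairwise_auc(labels_demo, probs_demo)
print(auc_demo)
```

This is why the snippet above passes `predict_proba(X_test)[:, 1]` rather than hard predictions. The score depends on how samples are ranked, and that information is lost once probabilities are thresholded into class labels.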
| # Plot the ROC-AUC scores computed earlier as ROC curves. | |
| plt = reload(plt) | |
| plt.style.use('seaborn')  # Matplotlib 3.6+ renames this style to 'seaborn-v0_8' | |
| # dtc, rfc, gradient, xgb, clf, lr <-- names of the models | |
| pred_prob_dtc = dtc.predict_proba(X_test) | |
| pred_prob_rfc = rfc.predict_proba(X_test) | |
| pred_prob_gradient = gradient.predict_proba(X_test) | |
| pred_prob_xgb = xgb.predict_proba(X_test) |
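A ROC curve is the set of (false positive rate, true positive rate) points obtained by sweeping a decision threshold over the predicted probabilities. The sketch below builds those points by hand for a made-up example; `roc_points` is a hypothetical helper, not the `roc_curve` function imported earlier, and it assumes binary 0/1 labels with both classes present.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) points, one per distinct threshold, from (0, 0)."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # Everything scored at or above the threshold is predicted positive.
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

curve_demo = roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])

# Area under the curve via the trapezoidal rule.
area_demo = sum((x2 - x1) * (y1 + y2) / 2
                for (x1, y1), (x2, y2) in zip(curve_demo, curve_demo[1:]))
print(curve_demo, area_demo)
```

Plotting the first element of each pair on the x-axis and the second on the y-axis gives the curve. The trapezoidal area under it is the ROC-AUC score.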
| # Per-algorithm feature weights | |
| # I compute these for one algorithm only, but also show how to obtain them for the other algorithms so they can be visualized. | |
| plt = reload(plt) | |
| agirliklar = lr.coef_[0].tolist() # <-- This differs between linear and non-linear algorithms. See the comment lines below for reference. | |
| #predict = rfc.predict(X_test) | |
| x_label = list(range(len(agirliklar))) | |
| # clf, lr ---> .coef_[0].tolist() | |
| # dtc, rfc, gradient, xgb ---> .feature_importances_.tolist() |
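The two comment lines above describe a per-model choice between `coef_` and `feature_importances_`. One way to avoid editing the code for each model is a small dispatch helper, sketched below. `feature_weights` is a hypothetical name, and the `SimpleNamespace` objects are stand-ins for fitted models so the example runs without scikit-learn. Whether a given estimator exposes `coef_` can depend on the installed library version.

```python
from types import SimpleNamespace

def feature_weights(model):
    """Return one weight per feature from a fitted model as a plain list."""
    if hasattr(model, "coef_"):
        # Linear-style models: use the first row of the coefficient matrix.
        return [float(w) for w in model.coef_[0]]
    if hasattr(model, "feature_importances_"):
        # Tree-based models: one importance value per feature.
        return [float(w) for w in model.feature_importances_]
    raise AttributeError("model exposes neither coef_ nor feature_importances_")

# Stand-ins for fitted models, with made-up numbers.
linear_stub = SimpleNamespace(coef_=[[0.5, -1.0, 0.25, 2.0]])
tree_stub = SimpleNamespace(feature_importances_=[0.4, 0.3, 0.2, 0.1])

print(feature_weights(linear_stub))
print(feature_weights(tree_stub))
```

Note that coefficients can be negative while importances are non-negative, so the two kinds of weights are not directly comparable on one plot.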
| from pydrive.auth import GoogleAuth | |
| def create_user_auth(): | |
|     gauth = GoogleAuth() | |
|     gauth.LoadCredentialsFile('mycreds.txt') | |
|     if gauth.credentials is None: | |
|         # If mycreds.txt does not exist, the user is asked to authenticate. | |
|         gauth.LocalWebserverAuth() | |
|     elif gauth.access_token_expired: |