| from sklearn.feature_selection import SelectKBest | |
| from sklearn.feature_selection import f_classif | |
| from sklearn.pipeline import Pipeline | |
| from sklearn.linear_model import LogisticRegression | |
| pipe = Pipeline([('kBest', SelectKBest(f_classif, k = 2)), | |
| ('lr', LogisticRegression())]) | |
| pipe.fit(x_train, y_train) |
| from sklearn.dummy import DummyClassifier | |
| dc = DummyClassifier() | |
| dc.fit(x_train, y_train) | |
| print(u'The performance of the model is: %0.5f' % dc.score(x_test, y_test)) |
| from sklearn.model_selection import train_test_split | |
| target = 'Target' | |
| features = list(transf_selection.columns) | |
| features.remove(target) | |
| x_train, x_test, y_train, y_test = train_test_split(transf_selection[features], transf_selection[target], random_state = 0) |
| calculateVIF(transfusion) | |
| # Build a new dataset that keeps only the variables selected by the VIF function | |
| transf_selection = selectDataUsingVIF(transfusion, 5) |
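`calculateVIF` and `selectDataUsingVIF` are not defined in the snippets shown here. The following is only a minimal sketch of what such helpers might look like, assuming the variance inflation factor is computed as 1 / (1 - R²) from regressing each feature on the others, and that selection drops the highest-VIF feature until all remaining values are at or below the threshold. The names `vif_table` and `select_by_vif` are hypothetical and are not the original functions.

```python
import numpy as np
import pandas as pd


def vif_table(data: pd.DataFrame) -> pd.Series:
    """Return the VIF of each column, regressing it on all other columns."""
    values = data.to_numpy(dtype=float)
    vifs = {}
    for i, name in enumerate(data.columns):
        y = values[:, i]
        # Design matrix: every other column plus an intercept term.
        others = np.delete(values, i, axis=1)
        X = np.column_stack([np.ones(len(y)), others])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        residual = y - X @ coef
        ss_res = float(residual @ residual)
        # Assumes the column is not constant (ss_tot > 0).
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r_squared = 1.0 - ss_res / ss_tot
        # Perfect collinearity gives R^2 = 1, which means an infinite VIF.
        if r_squared >= 1.0 - 1e-12:
            vifs[name] = np.inf
        else:
            vifs[name] = 1.0 / (1.0 - r_squared)
    return pd.Series(vifs)


def select_by_vif(data: pd.DataFrame, threshold: float,
                  target: str = "Target") -> pd.DataFrame:
    """Drop the highest-VIF feature until every VIF is at or below threshold."""
    features = [c for c in data.columns if c != target]
    while len(features) > 1:
        vifs = vif_table(data[features])
        worst = vifs.idxmax()
        if vifs[worst] <= threshold:
            break
        features.remove(worst)
    # Keep the target column (assumed to be named "Target") alongside the survivors.
    kept = features + ([target] if target in data.columns else [])
    return data[kept]
```

Under these assumptions, `select_by_vif(transfusion, 5)` would play the role of `selectDataUsingVIF(transfusion, 5)`. A column that is an exact multiple of another one gets an infinite VIF, so one of the pair is dropped first.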
| transfusion['Average (c.c./months)'] = transfusion['Volume'] / transfusion['Time'] | |
| # Despite its name, this column holds the average number of months elapsed per donation | |
| transfusion['Donations per Month'] = (transfusion['Time'] - transfusion['Recency']) / transfusion['Frequency'] | |
| transfusion['Frequent Donor'] = transfusion['Frequency'] > transfusion['Frequency'].median() | |
| plt.figure(1, figsize=(10, 10)) | |
| plt.subplot(221) | |
| plot_distribution(transfusion, 'Average (c.c./months)', 'Target') | |
| plt.subplot(222) | |
| plot_distribution(transfusion, 'Donations per Month', 'Target') | |
| plt.subplot(223) |
| plt.figure(1, figsize=(10, 10)) | |
| plt.subplot(221) | |
| plot_distribution(transfusion, 'Recency', 'Target') | |
| plt.subplot(222) | |
| plot_distribution(transfusion, 'Frequency', 'Target') | |
| plt.subplot(223) | |
| plot_distribution(transfusion, 'Volume', 'Target') | |
| plt.subplot(224) | |
| plot_distribution(transfusion, 'Time', 'Target') |
| from math import ceil, floor | |
| from seaborn import distplot | |
| def plot_distribution(data, feature, target): | |
| min_value = floor(min(data[feature]) - 1) | |
| max_value = ceil(max(data[feature]) + 1) | |
| bins = range(int(min_value), int(max_value), int(max(1, round((max_value - min_value) / 20)))) | |
| distplot(data[data[target] == 0][feature], | |
| bins = bins, |
| from seaborn import pairplot | |
| pairplot(transfusion.iloc[:, 0:4], diag_kind = 'kde'); |
| from seaborn import heatmap | |
| heatmap(transfusion.corr(), annot = True); |
| import pandas as pd | |
| transfusion = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/blood-transfusion/transfusion.data') | |
| # Rename the columns to shorter names | |
| transfusion = transfusion.rename(columns={'Recency (months)':'Recency', | |
| 'Frequency (times)':'Frequency','Monetary (c.c. blood)':'Volume', | |
| 'Time (months)':'Time', 'whether he/she donated blood in March 2007':'Target'}) | |
| transfusion.head() |