@debonx
Created December 15, 2018 10:56
Using Python and scikit-learn to build a k-nearest neighbours breast cancer classifier that predicts whether a tumour is malignant or benign from a list of features.
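# Import the breast cancer dataset, the train/validation splitter,
# the k-nearest neighbours classifier and matplotlib.
# codecademylib3_seaborn is a Codecademy helper module and is usually
# unavailable elsewhere, so the import is skipped when it is missing.
try:
    import codecademylib3_seaborn
except ImportError:
    pass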
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt
# Load the data, then hold out 20% of it as a validation set
breast_cancer_data = load_breast_cancer()
training_data, validation_data, training_labels, validation_labels = train_test_split(
    breast_cancer_data.data,
    breast_cancer_data.target,
    test_size=0.2,
    random_state=500,
)
# Fit one classifier per value of k and keep track of the best validation accuracy.
best_score = 0
best_k = 0
k_list = range(1,100)
accuracies = []
for k in k_list:
    classifier = KNeighborsClassifier(n_neighbors = k)
    classifier.fit(training_data, training_labels)
    the_score = classifier.score(validation_data, validation_labels)
    accuracies.append(the_score)
    if the_score > best_score:
        best_score = the_score
        best_k = k
# Plot validation accuracy against k
plt.plot(k_list, accuracies)
plt.xlabel("k")
plt.ylabel("Validation Accuracy")
plt.title("Breast Cancer Classifier Accuracy")
plt.show()
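The script above tracks `best_k` and `best_score` but never reports them or makes a prediction. The following is a minimal sketch of one way to do both, assuming the same dataset, split and range of k as above. `find_best_k`, `final_classifier` and `predicted_names` are hypothetical names introduced for illustration and are not part of the original gist.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Same data and split as the script above.
breast_cancer_data = load_breast_cancer()
training_data, validation_data, training_labels, validation_labels = train_test_split(
    breast_cancer_data.data,
    breast_cancer_data.target,
    test_size=0.2,
    random_state=500,
)


def find_best_k(k_values):
    """Hypothetical helper: return (best_k, best_score) by validation accuracy."""
    best_k, best_score = 0, 0.0
    for k in k_values:
        classifier = KNeighborsClassifier(n_neighbors=k)
        classifier.fit(training_data, training_labels)
        score = classifier.score(validation_data, validation_labels)
        # Strict ">" keeps the smallest k when several values tie.
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


best_k, best_score = find_best_k(range(1, 100))
print(f"Best k: {best_k}, validation accuracy: {best_score:.3f}")

# Refit with the chosen k and label a few validation samples.
final_classifier = KNeighborsClassifier(n_neighbors=best_k)
final_classifier.fit(training_data, training_labels)
predicted = final_classifier.predict(validation_data[:5])

# In this dataset, target 0 maps to "malignant" and 1 to "benign".
predicted_names = [str(breast_cancer_data.target_names[i]) for i in predicted]
print(predicted_names)
```

Because k is chosen on the validation set, the reported accuracy is likely to be somewhat optimistic; a separate held-out test set would give a fairer estimate.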