Haydar Ali Ismail haydarai

@haydarai
haydarai / decision-tree-classification.py
Created February 1, 2017 13:33
Classification using Decision Tree Classifier
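# Fit a decision tree with default hyperparameters on the training split
dtc = DecisionTreeClassifier()
dtc.fit(train_inputs, train_classes)
# Mean accuracy on the held-out test split
dtc.score(test_inputs, test_classes)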
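For a classifier, `score` reports mean accuracy, that is, the fraction of test rows whose predicted class matches the true class. The sketch below checks this by computing the same number by hand. It is an illustration under assumptions, not the author's notebook: it loads scikit-learn's bundled Iris data instead of the `df` used elsewhere, and it passes `random_state=1` to the classifier (not in the original) so repeated runs agree.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled Iris data stands in for the original dataset.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=1
)

# random_state here is an addition for repeatability, not part of the original snippet.
dtc = DecisionTreeClassifier(random_state=1)
dtc.fit(X_train, y_train)

# score() is the mean accuracy on the given data ...
accuracy = dtc.score(X_test, y_test)
# ... which equals the fraction of correct predictions computed manually.
manual_accuracy = np.mean(dtc.predict(X_test) == y_test)

print(accuracy, manual_accuracy)
```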
haydarai / train-test-split.py
Created February 1, 2017 13:28
Split the data into inputs and output classes, then into train and test sets, with 70% of the dataset used for training and the random state set to 1
all_inputs = df[['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']].values
all_classes = df['Species'].values
(train_inputs, test_inputs, train_classes, test_classes) = train_test_split(all_inputs, all_classes, train_size=0.7, random_state=1)
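Fixing `random_state` makes the shuffle repeatable: two calls with the same seed return the same rows in the same order. A minimal sketch of that behaviour, using hypothetical random stand-in data (`demo_inputs`, `demo_classes`) rather than the Iris columns:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 150 rows, 4 features, 3 balanced classes.
rng = np.random.default_rng(0)
demo_inputs = rng.normal(size=(150, 4))
demo_classes = np.repeat(['a', 'b', 'c'], 50)

def split(seed):
    """Return (train_inputs, test_inputs, train_classes, test_classes)."""
    return train_test_split(
        demo_inputs, demo_classes, train_size=0.7, random_state=seed
    )

first = split(1)
second = split(1)

# With the same seed, every returned array is identical between calls.
same_split = all(np.array_equal(a, b) for a, b in zip(first, second))

print(len(first[0]), len(first[1]), same_split)
```

Leaving `random_state` unset would give a different partition on each run, which makes accuracy figures hard to compare between runs.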
haydarai / pair-plot.py
Created February 1, 2017 13:16
Plot pairwise relationships in a dataset
sns.pairplot(df, hue='Species')
haydarai / plot-petal-width.py
Created February 1, 2017 12:57
Plot PetalWidthCm column
df['PetalWidthCm'].plot.hist()
plt.show()
haydarai / df-describe.py
Created February 1, 2017 12:44
Give a short summary of the dataset
df.describe()
haydarai / check-dtypes.py
Created February 1, 2017 12:37
Check the data types of each column
df.dtypes
haydarai / check-na-values.py
Created February 1, 2017 12:31
Check whether there are null values in the dataset
df.isnull().any()
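`isnull().any()` returns one boolean per column, which says whether a column has missing values but not how many. A small sketch on a hypothetical three-row frame (`demo`, not the real dataset) shows both that check and the per-column count from `isnull().sum()`:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing measurement, for illustration only.
demo = pd.DataFrame({
    'PetalWidthCm': [0.2, np.nan, 1.8],
    'Species': ['setosa', 'versicolor', 'virginica'],
})

# One boolean per column: True if the column contains any null.
has_nulls = demo.isnull().any()

# One integer per column: how many nulls the column contains.
null_counts = demo.isnull().sum()

print(has_nulls)
print(null_counts)
```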
haydarai / iris-imports.py
Last active February 1, 2017 12:56
Importing Dataset and Dependencies
%matplotlib inline
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
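The imports above do not load any data, yet the other snippets expect a DataFrame named `df` with the columns `SepalLengthCm`, `SepalWidthCm`, `PetalLengthCm`, `PetalWidthCm` and `Species`. The original loading step is not shown, so the following is only one possible way to get a compatible frame. It assumes scikit-learn's bundled copy of Iris, and the helper name `load_iris_frame` is hypothetical.

```python
import pandas as pd
from sklearn.datasets import load_iris

def load_iris_frame():
    """Build a DataFrame with the column names the other snippets expect.

    Assumption: uses scikit-learn's bundled Iris data, not the author's file.
    """
    iris = load_iris()
    columns = ['SepalLengthCm', 'SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']
    frame = pd.DataFrame(iris.data, columns=columns)
    # Translate integer targets (0, 1, 2) into species names.
    frame['Species'] = iris.target_names[iris.target]
    return frame

df = load_iris_frame()
print(df.head())
```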
haydarai / naive-bayes-example.py
Created January 20, 2017 13:02
Gaussian Naive Bayes Example
import numpy as np
from sklearn.naive_bayes import GaussianNB
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([1, 1, 1, 2, 2, 2])
clf = GaussianNB()
clf.fit(X, Y)
print(clf.predict([[-0.8, -1]]))
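`predict` returns only the winning label. To see how confident the model is, `predict_proba` gives the posterior probability of each class, in the order listed in `clf.classes_`. A short sketch on the same six training points (the variable names `query`, `predicted` and `probabilities` are introduced here for illustration):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Same toy data as the example above: class 1 in the lower-left, class 2 in the upper-right.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
Y = np.array([1, 1, 1, 2, 2, 2])

clf = GaussianNB().fit(X, Y)

query = [[-0.8, -1]]
predicted = clf.predict(query)            # label with the highest posterior
probabilities = clf.predict_proba(query)  # one column per entry in clf.classes_

print(clf.classes_, predicted, probabilities)
```

The query point sits next to the class 1 samples, so nearly all of the probability mass falls on class 1.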
haydarai / dataframe.py
Created January 17, 2017 13:22
Converting a Spark DataFrame to a pandas DataFrame and back
# Create a Spark DataFrame from Pandas
# (assumes `sc` is a SparkSession or SQLContext; a plain SparkContext has no createDataFrame)
spark_df = sc.createDataFrame(pandas_df)
# Create a Pandas DataFrame from Spark
pandas_df = spark_df.toPandas()