@srishtis
Created October 5, 2018 09:44
Loading iris dataset in Python
from sklearn import datasets
import pandas as pd
# load iris dataset
iris = datasets.load_iris()
# load_iris() returns a Bunch (a dict-like object), so build a DataFrame from it
iris_df = pd.DataFrame(iris.data, columns=['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid'])
iris_df['class'] = iris.target  # integer labels: 0 = setosa, 1 = versicolor, 2 = virginica
iris_df.dropna(how="all", inplace=True)  # drop rows in which every value is missing
# Select only the first 4 columns, as they are the independent (X) variables.
# Any feature selection or correlation analysis should be done on these first.
iris_X = iris_df.iloc[:, :4]
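
The comments above mention running a correlation analysis on the four feature columns. A minimal sketch of what that could look like follows, assuming a plain Pearson correlation is what is wanted. The names `feature_names` and `corr` are illustrative and not part of the original gist, and the block rebuilds `iris_X` so it can run on its own.

```python
from sklearn import datasets
import pandas as pd

# Rebuild the feature frame so this sketch is self-contained
iris = datasets.load_iris()
feature_names = ['sepal_len', 'sepal_wid', 'petal_len', 'petal_wid']  # illustrative name
iris_X = pd.DataFrame(iris.data, columns=feature_names)

# Pairwise Pearson correlation between the four features (pandas default method)
corr = iris_X.corr()
print(corr.round(2))

# Strongly correlated pairs (for example petal_len and petal_wid) carry
# overlapping information, which is worth knowing before feature selection.
```

The class column is left out on purpose here, since it is the target rather than a feature.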