oliviac12/Python Note

## Python Note
#Python interactive data viz
http://walkerke.github.io/geog30323/slides/interactive/#/

https://s3.amazonaws.com/quandl-static-content/Documents/Quandl+-+Pandas,+SciPy,+NumPy+Cheat+Sheet.pdf

## Python_airbnbkaggle
dataframe.shape
#works on a DataFrame (a Series or NumPy array has .shape too); it gives you the dimensions of your data set as (rows, columns)

dataframe.isnull().sum()
gives you the number of missing values in each column of your data set
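
A minimal sketch of the two calls above on a toy table. The column names and values here are made up for illustration and are not the real Airbnb data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real data set.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "gender": ["F", "M", None, "F"],
})

# .shape is an attribute (no parentheses) holding a (rows, columns) tuple.
n_rows, n_cols = df.shape

# isnull() marks every missing cell as True; sum() then counts them per column.
missing_per_column = df.isnull().sum()

print(n_rows, n_cols)
print(missing_per_column)
```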

dataframe.info()
gives you the data type of the variables
series/variable/column.describe()
gives you a statistical summary of that series/variable/column
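
A small sketch of both calls, assuming a made-up two-column table (the names `age` and `country` are just placeholders):

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "country": ["US", "FR", "US", "DE"],
})

# info() prints column names, non-null counts and dtypes; it returns None.
df.info()

# describe() on a numeric column returns a Series with
# count, mean, std, min, quartiles and max.
summary = df["age"].describe()
print(summary)
```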

dataframe.loc[] is pandas label-based slicing (note the square brackets, it is not a function call); use it because plain Python-style slicing does not select by row and column label
there's also conditional slicing, see example:
dataframe.loc[dataframe.age > 95, 'age'] = np.nan  (when age is larger than 95, the value is set to NaN)
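
A runnable version of the conditional example, as a sketch on made-up ages. It also shows that a label slice with `.loc` includes both endpoints, unlike a normal Python slice.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including two implausible values.
df = pd.DataFrame({"age": [25.0, 104.0, 40.0, 2014.0]})

# Label-based slice: rows labelled 0 through 2, endpoint included.
first_three = df.loc[0:2, "age"]

# Conditional assignment: rows where the mask is True, column 'age'.
df.loc[df["age"] > 95, "age"] = np.nan

print(df)
```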

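For a long integer, you can get rid of the last n digits by doing
number // 10**n (floor division by 10 to the power n); this way the last n digits are gone and there is no decimal point. Careful: number // 10*n is not the same thing, it is evaluated as (number // 10) * n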
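
A quick sketch of the digit-dropping trick. The helper name `drop_last_digits` and the sample timestamp are made up for illustration.

```python
def drop_last_digits(number: int, n: int) -> int:
    """Drop the last n digits of a non-negative integer using floor division."""
    return number // 10 ** n


# Hypothetical timestamp in YYYYMMDDhhmmss form; keep only the date part.
timestamp = 20140101123456
date_part = drop_last_digits(timestamp, 6)

# Operator precedence: * and // bind equally and run left to right,
# so 10*n does NOT mean "10 to the power n".
wrong = 12345 // 10 * 2    # (12345 // 10) * 2
right = 12345 // 10 ** 2   # 12345 // 100

print(date_part, wrong, right)
```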

dataframe.series.value_counts() gives you the frequency of each distinct value in that series
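
A small sketch on a made-up `gender` column. By default missing values are left out of the counts; passing `dropna=False` includes them.

```python
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"gender": ["F", "M", "F", "F", None]})

# Frequency of each distinct value, most common first; NaN/None excluded.
counts = df["gender"].value_counts()

# Same, but missing values get their own row.
counts_with_nan = df["gender"].value_counts(dropna=False)

print(counts)
print(counts_with_nan)
```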

dataframe.fillna('-1')  #fills every NaN with the string '-1'; pass -1 (no quotes) for a numeric fill, could be other values too
np.vstack  #stacks a sequence of lists/arrays vertically (row-wise) into one array
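
A minimal sketch of both calls on toy inputs. It uses the numeric `-1` rather than the string `'-1'`, which is an assumption about what is wanted for a numeric column.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing value.
df = pd.DataFrame({"age": [25.0, np.nan, 40.0]})

# fillna returns a new DataFrame; df itself is left unchanged
# unless you assign the result back.
filled = df.fillna(-1)

# vstack takes a sequence of equal-length rows and stacks them
# into a 2-D array, one input per row.
a = [1, 2, 3]
b = [4, 5, 6]
stacked = np.vstack([a, b])

print(filled)
print(stacked)
```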