@oliviac12
Last active February 19, 2016 19:24
It's Python Time
#Python interactive data viz
http://walkerke.github.io/geog30323/slides/interactive/#/
https://s3.amazonaws.com/quandl-static-content/Documents/Quandl+-+Pandas,+SciPy,+NumPy+Cheat+Sheet.pdf
dataframe.shape
#returns the dimensions of the data set as (rows, columns); it does not have to be a data frame, a Series or a NumPy array has .shape too
dataframe.isnull().sum()
gives you the number of missing values in each column of your data set
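A quick sketch of the two calls above on a toy data frame. The column names and values are made up for illustration, they are not from any real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, only here to have something to inspect.
df = pd.DataFrame({
    "age": [23, np.nan, 41, 97],
    "city": ["Austin", "Dallas", None, "Waco"],
})

shape = df.shape              # tuple of (number of rows, number of columns)
missing = df.isnull().sum()   # Series with the missing-value count per column

print(shape)
print(missing)
```

Both NaN and None count as missing here.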
dataframe.info()
gives you the data type of each variable, plus the non-null count per column
series/variable/column.describe()
gives you a statistical summary (count, mean, std, min, quartiles, max) of that series/variable/column
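For example, assuming a small made-up data frame with one numeric and one text column:

```python
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "name": ["a", "b", "c", "d"],
})

df.info()                        # prints dtypes and non-null counts to stdout

summary = df["age"].describe()   # Series indexed by count, mean, std, min, ...
print(summary)
```

On a text column, describe() reports count, unique, top and freq instead of the numeric statistics.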
dataframe.loc[] is pandas label-based slicing (square brackets, not parentheses); it selects rows and columns by label together, which plain [] slicing does not do
there's also conditional slicing, see example:
dataframe.loc[dataframe.age > 95, 'age'] = np.nan (when age is larger than 95, the value is set to NaN)
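A minimal sketch of both uses of .loc, assuming a made-up data frame with string row labels:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the index labels "a".."d" are arbitrary.
df = pd.DataFrame(
    {"age": [34.0, 102.0, 58.0, 96.0], "name": ["Ann", "Bob", "Cy", "Di"]},
    index=["a", "b", "c", "d"],
)

# Label-based slice: rows "a" through "c" (both ends included), column "age".
subset = df.loc["a":"c", ["age"]]

# Conditional assignment: wherever age is above 95, overwrite age with NaN.
df.loc[df.age > 95, "age"] = np.nan

print(subset)
print(df)
```

Note that a label slice includes its end label, unlike a normal Python slice.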
For a long integer, you can get rid of the last n digits by doing
number // 10**n (floor division by ten to the power n); this way, the last n digits will be gone, no decimal points
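A quick check of the digit-dropping trick, written with `10**n` (ten to the power n). Operator precedence matters here: `number // 10 * n` is read by Python as `(number // 10) * n`, which is a different calculation. The sample number is arbitrary.

```python
number = 123456789   # arbitrary example value
n = 3

truncated = number // 10**n   # floor-divide by 1000, dropping the last 3 digits
not_this = number // 10 * n   # parsed as (number // 10) * n, not the same thing

print(truncated)
print(not_this)
```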
dataframe.series.value_counts() gives you the frequency of each distinct value in that series
dataframe.fillna('-1') #fills all the NaN with the string '-1' (use -1 without quotes for the number); could be other values too
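A small sketch of both calls, assuming a made-up column of colors with one missing entry:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data for illustration.
df = pd.DataFrame({"color": ["red", "blue", "red", np.nan, "red"]})

counts = df["color"].value_counts()   # NaN is left out of the counts by default
filled = df.fillna("-1")              # returns a new data frame, df is unchanged

print(counts)
print(filled)
```

Since fillna returns a copy, assign the result back (df = df.fillna('-1')) if you want to keep it.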
np.vstack #stacks a sequence of lists/arrays vertically (row by row) into one array
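For instance, with two made-up lists of the same length:

```python
import numpy as np

row_a = [1, 2, 3]   # arbitrary example rows
row_b = [4, 5, 6]

# vstack takes ONE argument, a sequence of the things to stack.
stacked = np.vstack([row_a, row_b])

print(stacked)
print(stacked.shape)
```

The inputs need the same length, since each one becomes a row of the result.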