Howard Wetsman MD hwetsman

np.random.choice(array, size=n, replace=True, p=p)
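
A minimal sketch of the call above, assuming NumPy is installed. The names `array`, `n`, and `p` are illustrative placeholders, not values from the original note. `p` has to be passed by keyword (or after `replace` positionally), needs one probability per element, and must sum to 1.

```python
import numpy as np

array = np.array([10, 20, 30])  # hypothetical population to draw from
p = [0.2, 0.5, 0.3]             # one probability per element, summing to 1
n = 5                           # number of draws

# Weighted sampling with replacement: values may repeat.
draws = np.random.choice(array, size=n, replace=True, p=p)

# Sampling without replacement: size cannot exceed len(array).
shuffled = np.random.choice(array, size=len(array), replace=False)

print(draws)
print(shuffled)
```

With `replace=False` and `size=len(array)`, the result is a random permutation of the input.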

Pandas

Data Inspection

df.info() - prints a column-by-column summary of dtypes and the number of non-null entries

df.sample(n) - returns a random sample of n rows from the df. Useful for spotting data-quality problems. Default is n=1.

df.head(n) - returns the first n rows of the dataframe. Default is n=5.

df.tail(n) - returns the last n rows of the dataframe. Default is n=5.
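
The inspection calls above can be tried on a small frame. This is only a sketch, assuming pandas is installed; the frame `df` and its columns are made up for illustration.

```python
import pandas as pd

# Hypothetical data with one missing value so info() has something to report.
df = pd.DataFrame({
    "a": [1, 2, None, 4, 5, 6],
    "b": list("uvwxyz"),
})

df.info()                                # prints dtypes and non-null counts

sampled = df.sample(n=2, random_state=0)  # random_state makes the sample repeatable
first = df.head(3)                        # first 3 rows
last = df.tail(3)                         # last 3 rows

print(sampled)
print(first)
print(last)
```

Note that `df.info()` prints its report and returns `None`, while the other three return new DataFrames.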

hwetsman / Python_Cheatsheet.md
Last active January 9, 2022 19:07
A listing of python syntax

datetime

create datetime object

from datetime import datetime

dt1 = datetime.strptime('20091031', '%Y%m%d')

get 'date' object

dt1.date()

get 'time' object

dt1.time()
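
Put together, the three datetime steps above might look like the sketch below. It uses only the standard library; the variable names `d` and `t` are illustrative additions.

```python
from datetime import datetime, date, time

# Parse a compact YYYYMMDD string into a datetime object.
dt1 = datetime.strptime('20091031', '%Y%m%d')

d = dt1.date()  # calendar date only
t = dt1.time()  # time of day only; midnight here, since the string had no time part

print(dt1)  # 2009-10-31 00:00:00
print(d)    # 2009-10-31
print(t)    # 00:00:00
```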

hwetsman / readme.md
Last active July 15, 2022 19:59 — forked from AlexMercedCoder/readme.md
Data Terms/Concepts Cheatsheet

Data Analytics/Science Terms and Concepts Cheatsheet

Structured Data

Data is organized to conform to a schema. Think of tables, which organize data into rows and columns.

Unstructured Data

Data is unorganized and lacks a schema. Imagine a collection of HTML documents containing text and images that are not organized in any consistent way.
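
One way to picture the difference, as a rough sketch: the `schema`, `rows`, `documents`, and `conforms` names below are hypothetical and exist only for this illustration.

```python
# Structured: every record has the same fields with the same types.
schema = {"id": int, "name": str}
rows = [
    {"id": 1, "name": "Ada"},
    {"id": 2, "name": "Grace"},
]

def conforms(record, schema):
    """Return True if the record has exactly the schema's fields and types."""
    return record.keys() == schema.keys() and all(
        isinstance(record[key], expected) for key, expected in schema.items()
    )

structured_ok = all(conforms(row, schema) for row in rows)

# Unstructured: raw documents with no shared layout to validate against.
documents = [
    "<html><body><p>Some text</p><img src='a.png'></body></html>",
    "<div>Another page, laid out differently</div>",
]

print(structured_ok)
```

The structured rows can be checked mechanically against the schema; the HTML strings offer nothing comparable to check without first parsing and interpreting each one.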