Steven Obadja stevenobadja

Keybase proof

I hereby claim:

  • I am stevenobadja on github.
  • I am obadja (https://keybase.io/obadja) on keybase.
  • I have a public key ASCPhv4R1upTN17S0-bC-aa9awZdMQrqXe558_eQCNKEcAo

To claim this, I am signing this object:

import numpy as np
import pandas as pd
import scipy
from scipy.stats import ttest_ind
import matplotlib.pyplot as plt
%matplotlib inline
pop1 = np.random.binomial(10, 0.2, 10000)
pop2 = np.random.binomial(10,0.5, 10000)
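The preview above stops after building the two populations. As a minimal sketch of where the `ttest_ind` import could lead, the block below draws a sample from each population and compares their means with Welch's t-test. The sample size of 100, the seed, and the names `rng`, `sample1`, `sample2`, and `result` are assumptions for illustration, not part of the original gist.

```python
import numpy as np
from scipy.stats import ttest_ind

# Seeded generator so the sketch is reproducible (the seed is an arbitrary choice).
rng = np.random.default_rng(0)

# Same two binomial populations as in the gist: n=10 trials, p=0.2 and p=0.5.
pop1 = rng.binomial(10, 0.2, 10000)
pop2 = rng.binomial(10, 0.5, 10000)

# Assumed sample size of 100 per population.
sample1 = rng.choice(pop1, 100, replace=True)
sample2 = rng.choice(pop2, 100, replace=True)

# Welch's t-test (equal_var=False) does not assume equal variances.
result = ttest_ind(sample2, sample1, equal_var=False)
print('t-statistic: {:.2f}'.format(result.statistic))
print('p-value: {:.3g}'.format(result.pvalue))
```

Because the population means are 2 and 5, the t-statistic should come out clearly positive and the p-value very small.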
stevenobadja / distributions
Created August 30, 2017 05:41
Distributions
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# Generate a Bernoulli distribution
bernoulli = np.random.binomial(1, .5, 100)
plt.hist(bernoulli)
plt.axvline(bernoulli.mean(), color='g', linestyle='solid', linewidth=2)
plt.axvline(bernoulli.mean() + bernoulli.std(), color='y', linestyle='dashed', linewidth=2)
stevenobadja / monty_hall
Last active August 30, 2017 02:28
Monty Hall
P(Car in Door 1 | Goat in Door 2)
P(A and B) = P(A | B) * P(B)
P(B and A) = P(B | A) * P(A)
P(A | B) * P(B) = P(B | A) * P(A)
P(A | B) = P(B | A) * P(A) / P(B)
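The last identity above can be checked numerically. The sketch below assumes the standard Monty Hall rules, which the preview does not spell out: the contestant picks door 1, and the host, who knows where the car is, opens door 2 to show a goat. Under that assumption B is the event "host opens door 2". The helper `bayes` and the dictionaries are hypothetical names used only for illustration.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Prior: the car is equally likely to be behind each door.
p_car = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

# Assumed host behavior after the contestant picks door 1:
#   car behind door 1 -> host opens door 2 or 3 at random (1/2 each)
#   car behind door 2 -> host never opens door 2
#   car behind door 3 -> host must open door 2
p_open2_given_car = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1)}

# P(B) by the law of total probability.
p_open2 = sum(p_open2_given_car[d] * p_car[d] for d in p_car)

p_stay = bayes(p_open2_given_car[1], p_car[1], p_open2)
p_switch = bayes(p_open2_given_car[3], p_car[3], p_open2)

print('P(car in door 1 | host opens door 2) =', p_stay)
print('P(car in door 3 | host opens door 2) =', p_switch)
```

Using `Fraction` keeps the arithmetic exact, so the two posteriors sum to exactly 1.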
stevenobadja / data_source
Created August 25, 2017 05:47
Evaluating Data Sources Exercise
#Question 1
#Because the dataset was collected close to a holiday,
#certain locations may not have been available. To reframe the
#question I would ask "What are the popular Amsterdam
#neighborhoods during Christmas?"
#Question 2
#Because September 12, 2001 was the day after the
#9/11 attacks, I would say that the data is biased and will
#show services being used more in New York. I
stevenobadja / bayes_ex
Last active August 28, 2017 00:37
Bayes Exercise
#Total Population
suff_pop = .005
nonsuff_pop = .995  #complement of suff_pop
#Testing for Sufferer
sufftest_pos = .98
sufftest_neg = .02
#Testing for Non-Sufferer
nonsufftest_pos = .1
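The preview is cut off here, so the rest of the exercise is not visible. As a hedged sketch of how these values could feed Bayes' rule, the block below computes the probability that someone who tests positive is actually a sufferer. It assumes the non-sufferer share is the complement of `suff_pop` (that is, .995) and that this posterior is the quantity the exercise asks for. The names `p_pos` and `p_suff_given_pos` are hypothetical.

```python
#Total Population
suff_pop = .005
nonsuff_pop = 1 - suff_pop  #assumed complement, .995

#Test results, taken from the preview
sufftest_pos = .98      #P(positive | sufferer)
nonsufftest_pos = .1    #P(positive | non-sufferer)

#P(positive) by the law of total probability
p_pos = sufftest_pos * suff_pop + nonsufftest_pos * nonsuff_pop

#Bayes' rule: P(sufferer | positive)
p_suff_given_pos = sufftest_pos * suff_pop / p_pos

print('P(sufferer | positive test): {:.2%}'.format(p_suff_given_pos))
```

With a condition this rare, most positive results come from the much larger non-sufferer group, which is why the posterior stays below 5% despite the 98% detection rate.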
stevenobadja / prob_excercise
Created August 23, 2017 06:26
Exercises in Probability
print('Question #1')
print('Probability of either of those patterns is: {:.02%} or .5**4'.format(.5**4))
print('\n')
print('Question #2')
print('Probability of not choosing a man is: {:.02%} or 24/45'.format(24/45))
print('\n')
print('Question #3')
print('Probability that Bernice will be in a plane crash sometime in the next year is: {:%} or .1*.00005'.format(.1*.00005))
stevenobadja / describing_data
Last active August 23, 2017 05:34
Drill Describing Data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.DataFrame()
df['age'] = [14,12,11,10,8,6,8]
print('Question #1:')
print('Mean: {}'.format(np.mean(df['age'])))
stevenobadja / powerball_histogram
Created August 20, 2017 23:45
Powerball Histogram
import matplotlib.pyplot as plt
import pandas as pd
df = pd.read_csv("http://bit.ly/2ifPlDC")
df.drop(['Format: Draw Number', ' Draw Date (yyyymmdd)', '6', ' Division 1', '2.1', '3.1', '4.1', '5.1', '6.1', '7', '8'], axis=1, inplace=True)
df_cols = ['first','second','third','fourth','fifth','powerball']
df.columns = df_cols
df['powerball'] = df['powerball'].str.replace('-', '0')
df['powerball'] = df.powerball.astype('int64')
plt.hist(x = (df['first'],df['second'], df['third'],df['fourth'],df['fifth'],df['powerball']), bins = 50, stacked = True, histtype = 'bar', rwidth = .8)