@phobson
Last active June 5, 2018 00:13
The split-apply-combine strategy applied to linear regression
# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: win-64
cycler=0.10.0=py35_0
decorator=4.0.9=py35_0
ipykernel=4.3.1=py35_0
ipython=4.1.2=py35_0
ipython_genutils=0.1.0=py35_0
ipywidgets=4.1.1=py35_0
jinja2=2.8=py35_0
jpeg=8d=vc14_0
jsonschema=2.4.0=py35_0
jupyter=1.0.0=py35_1
jupyter_client=4.1.1=py35_0
jupyter_console=4.1.0=py35_0
jupyter_core=4.0.6=py35_0
libpng=1.6.17=vc14_1
libtiff=4.0.6=vc14_1
markupsafe=0.23=py35_0
matplotlib=1.5.1=np110py35_0
mistune=0.7.1=py35_0
mkl=11.3.1=0
msvc_runtime=1.0.1=vc14_0
nbconvert=4.1.0=py35_0
nbformat=4.0.1=py35_0
nose=1.3.7=py35_0
notebook=4.1.0=py35_0
numpy=1.10.4=py35_0
pandas=0.17.1=np110py35_0
path.py=8.1.2=py35_1
patsy=0.4.0=np110py35_0
pickleshare=0.5=py35_0
pip=8.0.3=py35_0
pygments=2.1.1=py35_0
pyparsing=2.0.3=py35_0
pyqt=4.11.4=py35_5
pyreadline=2.1=py35_0
python=3.5.1=2
python-dateutil=2.4.2=py35_0
pytz=2015.7=py35_0
pyzmq=15.2.0=py35_0
qt=4.8.7=vc14_6
qtconsole=4.1.1=py35_0
scipy=0.17.0=np110py35_0
seaborn=0.7.0=py35_0
setuptools=20.1.1=py35_0
simplegeneric=0.8.1=py35_0
sip=4.16.9=py35_2
six=1.10.0=py35_0
statsmodels=0.6.1=np110py35_0
tk=8.5.18=vc14_0
tornado=4.3=py35_0
traitlets=4.1.0=py35_0
wheel=0.29.0=py35_0
zlib=1.2.8=vc14_2
conc date era location parameter days
123.52914203946618 2013-01-06 pre MW-1 TCE 0
119.73210230460235 2013-01-27 pre MW-1 TCE 21
126.0641465456202 2013-02-17 pre MW-1 TCE 42
121.76220112945919 2013-03-10 pre MW-1 TCE 63
125.53509980828159 2013-03-31 pre MW-1 TCE 84
117.18050337977171 2013-04-21 pre MW-1 TCE 105
116.61796215874692 2013-05-12 pre MW-1 TCE 126
144.36771295593925 2013-06-02 pre MW-1 TCE 147
114.72394061067713 2013-06-23 pre MW-1 TCE 168
114.96248415858344 2013-07-14 pre MW-1 TCE 189
119.68570549605978 2013-08-04 pre MW-1 TCE 210
114.00579774768289 2013-08-25 pre MW-1 TCE 231
114.84532926260337 2013-09-15 pre MW-1 TCE 252
111.71497466492465 2013-10-06 pre MW-1 TCE 273
112.61841679867288 2013-10-27 pre MW-1 TCE 294
110.32201454467196 2013-11-01 during MW-1 TCE 0
117.38638396586569 2013-11-04 during MW-1 TCE 3
99.99375653980398 2013-11-07 during MW-1 TCE 6
93.56167399155086 2013-11-10 during MW-1 TCE 9
90.93518049345657 2013-11-13 during MW-1 TCE 12
82.35548056772504 2013-11-16 during MW-1 TCE 15
76.02532903262016 2013-11-19 during MW-1 TCE 18
70.30623671971881 2013-11-22 during MW-1 TCE 21
68.21012593160663 2013-11-25 during MW-1 TCE 24
59.92200731876834 2013-11-28 during MW-1 TCE 27
54.25211305348309 2013-12-01 during MW-1 TCE 30
53.48476570046454 2013-12-04 during MW-1 TCE 33
41.83168688691159 2013-12-07 during MW-1 TCE 36
36.37571831374462 2013-12-10 during MW-1 TCE 39
31.01907997862268 2013-12-13 during MW-1 TCE 42
31.855101639323195 2014-01-01 post MW-1 TCE 0
31.206918208495612 2014-01-09 post MW-1 TCE 8
30.437589239720683 2014-01-17 post MW-1 TCE 16
30.45505534573859 2014-01-25 post MW-1 TCE 24
30.013194071942973 2014-02-02 post MW-1 TCE 32
28.884937325151594 2014-02-10 post MW-1 TCE 40
28.45390227278077 2014-02-18 post MW-1 TCE 48
29.82366445760144 2014-02-26 post MW-1 TCE 56
27.68423172367223 2014-03-06 post MW-1 TCE 64
27.173735962083434 2014-03-14 post MW-1 TCE 72
27.544441781603908 2014-03-22 post MW-1 TCE 80
27.425229622929308 2014-03-30 post MW-1 TCE 88
53.39966282386878 2014-04-07 post MW-1 TCE 96
26.0404059137335 2014-04-15 post MW-1 TCE 104
30.360599984430127 2014-04-23 post MW-1 TCE 112
254.19538628430846 2013-01-06 pre MW-1 DCE 0
246.80156981174437 2013-01-27 pre MW-1 DCE 21
247.19146930041813 2013-02-17 pre MW-1 DCE 42
239.66108138506846 2013-03-10 pre MW-1 DCE 63
235.77365576969004 2013-03-31 pre MW-1 DCE 84
235.66088092157042 2013-04-21 pre MW-1 DCE 105
228.75505291316918 2013-05-12 pre MW-1 DCE 126
227.8969778625727 2013-06-02 pre MW-1 DCE 147
222.12973532034013 2013-06-23 pre MW-1 DCE 168
230.98108878819173 2013-07-14 pre MW-1 DCE 189
220.55483902032776 2013-08-04 pre MW-1 DCE 210
211.6497313431593 2013-08-25 pre MW-1 DCE 231
207.8406751019786 2013-09-15 pre MW-1 DCE 252
203.93289115215626 2013-10-06 pre MW-1 DCE 273
200.95013301313168 2013-10-27 pre MW-1 DCE 294
201.16131807432328 2013-11-01 during MW-1 DCE 0
193.90226427657407 2013-11-04 during MW-1 DCE 3
187.71975846546655 2013-11-07 during MW-1 DCE 6
181.9693091301286 2013-11-10 during MW-1 DCE 9
173.98105815607758 2013-11-13 during MW-1 DCE 12
164.9200481236416 2013-11-16 during MW-1 DCE 15
157.4161423423674 2013-11-19 during MW-1 DCE 18
159.07628356780822 2013-11-22 during MW-1 DCE 21
143.55355650003153 2013-11-25 during MW-1 DCE 24
136.44380680523844 2013-11-28 during MW-1 DCE 27
129.4308579352776 2013-12-01 during MW-1 DCE 30
121.59323729047561 2013-12-04 during MW-1 DCE 33
115.19284610399369 2013-12-07 during MW-1 DCE 36
108.04808393510328 2013-12-10 during MW-1 DCE 39
103.58030457256527 2013-12-13 during MW-1 DCE 42
103.94745645630128 2014-01-01 post MW-1 DCE 0
100.94222533228707 2014-01-09 post MW-1 DCE 8
103.85610011001067 2014-01-17 post MW-1 DCE 16
100.37890073711647 2014-01-25 post MW-1 DCE 24
98.7595738367796 2014-02-02 post MW-1 DCE 32
98.44466186868992 2014-02-10 post MW-1 DCE 40
97.01340054490078 2014-02-18 post MW-1 DCE 48
96.00645887661945 2014-02-26 post MW-1 DCE 56
95.46550030079685 2014-03-06 post MW-1 DCE 64
93.97394709401362 2014-03-14 post MW-1 DCE 72
95.19808185882034 2014-03-22 post MW-1 DCE 80
92.6193014157367 2014-03-30 post MW-1 DCE 88
96.14965847766415 2014-04-07 post MW-1 DCE 96
91.92133258624467 2014-04-15 post MW-1 DCE 104
91.75394708807764 2014-04-23 post MW-1 DCE 112
33.05657794132975 2013-01-06 pre MW-1 PCE 0
34.04167325477549 2013-01-27 pre MW-1 PCE 21
30.945761673023867 2013-02-17 pre MW-1 PCE 42
32.344421041935604 2013-03-10 pre MW-1 PCE 63
36.38271167351001 2013-03-31 pre MW-1 PCE 84
31.983092746146017 2013-04-21 pre MW-1 PCE 105
34.818261670612905 2013-05-12 pre MW-1 PCE 126
32.598327131110445 2013-06-02 pre MW-1 PCE 147
33.316405257217504 2013-06-23 pre MW-1 PCE 168
34.94457426998054 2013-07-14 pre MW-1 PCE 189
33.7792878698855 2013-08-04 pre MW-1 PCE 210
34.409571229808414 2013-08-25 pre MW-1 PCE 231
34.547613963090086 2013-09-15 pre MW-1 PCE 252
47.580856465393154 2013-10-06 pre MW-1 PCE 273
37.02442294468243 2013-10-27 pre MW-1 PCE 294
38.35051091891114 2013-11-01 during MW-1 PCE 0
34.462427812677035 2013-11-04 during MW-1 PCE 3
32.48850932067632 2013-11-07 during MW-1 PCE 6
29.86143942985412 2013-11-10 during MW-1 PCE 9
29.00191234554431 2013-11-13 during MW-1 PCE 12
26.70786848625887 2013-11-16 during MW-1 PCE 15
29.10393592446233 2013-11-19 during MW-1 PCE 18
29.36459699583993 2013-11-22 during MW-1 PCE 21
31.025410559927504 2013-11-25 during MW-1 PCE 24
23.714547186778777 2013-11-28 during MW-1 PCE 27
17.922969422690542 2013-12-01 during MW-1 PCE 30
15.765803074570616 2013-12-04 during MW-1 PCE 33
17.870198751063747 2013-12-07 during MW-1 PCE 36
14.359469542553837 2013-12-10 during MW-1 PCE 39
10.180744099439718 2013-12-13 during MW-1 PCE 42
12.571147355035373 2014-01-01 post MW-1 PCE 0
15.635201956700826 2014-01-09 post MW-1 PCE 8
13.917602561416352 2014-01-17 post MW-1 PCE 16
14.522525961276642 2014-01-25 post MW-1 PCE 24
14.886473774542424 2014-02-02 post MW-1 PCE 32
15.627457575715235 2014-02-10 post MW-1 PCE 40
20.911929955149457 2014-02-18 post MW-1 PCE 48
18.62031169313942 2014-02-26 post MW-1 PCE 56
19.80284696815915 2014-03-06 post MW-1 PCE 64
34.02376500041348 2014-03-14 post MW-1 PCE 72
24.59347577099718 2014-03-22 post MW-1 PCE 80
24.406649619260783 2014-03-30 post MW-1 PCE 88
24.82624417824214 2014-04-07 post MW-1 PCE 96
34.951676996083805 2014-04-15 post MW-1 PCE 104
39.146774311787105 2014-04-23 post MW-1 PCE 112
105.26921496969135 2013-01-06 pre MW-2 TCE 0
101.27078821174815 2013-01-27 pre MW-2 TCE 21
95.83452712044419 2013-02-17 pre MW-2 TCE 42
95.495902011684 2013-03-10 pre MW-2 TCE 63
92.82982368718316 2013-03-31 pre MW-2 TCE 84
91.1398924072869 2013-04-21 pre MW-2 TCE 105
92.50201908699808 2013-05-12 pre MW-2 TCE 126
85.05368635218699 2013-06-02 pre MW-2 TCE 147
83.66639878033885 2013-06-23 pre MW-2 TCE 168
83.02459277791118 2013-07-14 pre MW-2 TCE 189
79.49126728783065 2013-08-04 pre MW-2 TCE 210
76.70908514889452 2013-08-25 pre MW-2 TCE 231
79.17088562094061 2013-09-15 pre MW-2 TCE 252
74.47337356239913 2013-10-06 pre MW-2 TCE 273
70.48973996468547 2013-10-27 pre MW-2 TCE 294
72.4437200908659 2013-11-01 during MW-2 TCE 0
71.74711796317351 2013-11-04 during MW-2 TCE 3
65.94537934065 2013-11-07 during MW-2 TCE 6
65.91342570780803 2013-11-10 during MW-2 TCE 9
60.38958896827559 2013-11-13 during MW-2 TCE 12
56.7202673046587 2013-11-16 during MW-2 TCE 15
53.24416411968205 2013-11-19 during MW-2 TCE 18
51.48080601059174 2013-11-22 during MW-2 TCE 21
49.24480858743331 2013-11-25 during MW-2 TCE 24
49.21301320809862 2013-11-28 during MW-2 TCE 27
42.8563441097427 2013-12-01 during MW-2 TCE 30
41.4919714250514 2013-12-04 during MW-2 TCE 33
36.20801643155208 2013-12-07 during MW-2 TCE 36
33.47698747743448 2013-12-10 during MW-2 TCE 39
31.436218152996663 2013-12-13 during MW-2 TCE 42
30.44118738698771 2014-01-01 post MW-2 TCE 0
34.525138778518894 2014-01-09 post MW-2 TCE 8
32.79862302069233 2014-01-17 post MW-2 TCE 16
32.122718141612864 2014-01-25 post MW-2 TCE 24
32.23270177487632 2014-02-02 post MW-2 TCE 32
50.263743270666225 2014-02-10 post MW-2 TCE 40
30.488949182467163 2014-02-18 post MW-2 TCE 48
29.605209507543258 2014-02-26 post MW-2 TCE 56
31.543720400928947 2014-03-06 post MW-2 TCE 64
30.90480511827157 2014-03-14 post MW-2 TCE 72
30.14909563019266 2014-03-22 post MW-2 TCE 80
29.699319211407825 2014-03-30 post MW-2 TCE 88
30.095050744065155 2014-04-07 post MW-2 TCE 96
31.139616933190187 2014-04-15 post MW-2 TCE 104
29.879395204145602 2014-04-23 post MW-2 TCE 112
333.6349969039696 2013-01-06 pre MW-2 DCE 0
298.353415173977 2013-01-27 pre MW-2 DCE 21
296.3260570483836 2013-02-17 pre MW-2 DCE 42
293.6657756425811 2013-03-10 pre MW-2 DCE 63
293.68629157370106 2013-03-31 pre MW-2 DCE 84
289.41830234117765 2013-04-21 pre MW-2 DCE 105
287.98288662353286 2013-05-12 pre MW-2 DCE 126
286.1681389388184 2013-06-02 pre MW-2 DCE 147
283.53612533517287 2013-06-23 pre MW-2 DCE 168
295.2061998318369 2013-07-14 pre MW-2 DCE 189
284.28584962494983 2013-08-04 pre MW-2 DCE 210
285.00417257707875 2013-08-25 pre MW-2 DCE 231
274.8328690831084 2013-09-15 pre MW-2 DCE 252
278.9009625167875 2013-10-06 pre MW-2 DCE 273
273.064857828803 2013-10-27 pre MW-2 DCE 294
274.3243309546276 2013-11-01 during MW-2 DCE 0
260.2560223968513 2013-11-04 during MW-2 DCE 3
239.10085871194988 2013-11-07 during MW-2 DCE 6
223.82720317484262 2013-11-10 during MW-2 DCE 9
211.9893000475949 2013-11-13 during MW-2 DCE 12
191.48820575844078 2013-11-16 during MW-2 DCE 15
186.64591223053304 2013-11-19 during MW-2 DCE 18
163.29173598377534 2013-11-22 during MW-2 DCE 21
145.8428842513852 2013-11-25 during MW-2 DCE 24
130.74339852771737 2013-11-28 during MW-2 DCE 27
113.31216049141679 2013-12-01 during MW-2 DCE 30
98.20742696454394 2013-12-04 during MW-2 DCE 33
87.9517086084528 2013-12-07 during MW-2 DCE 36
74.06650649620592 2013-12-10 during MW-2 DCE 39
97.95581497935035 2013-12-13 during MW-2 DCE 42
51.50319699064526 2014-01-01 post MW-2 DCE 0
51.009549723356585 2014-01-09 post MW-2 DCE 8
51.43836254736396 2014-01-17 post MW-2 DCE 16
51.318975663634575 2014-01-25 post MW-2 DCE 24
52.63868761904332 2014-02-02 post MW-2 DCE 32
54.088809282364885 2014-02-10 post MW-2 DCE 40
53.293925992869106 2014-02-18 post MW-2 DCE 48
58.49111005603868 2014-02-26 post MW-2 DCE 56
53.515980091520824 2014-03-06 post MW-2 DCE 64
53.63932064890161 2014-03-14 post MW-2 DCE 72
53.65439395497007 2014-03-22 post MW-2 DCE 80
53.357068273658506 2014-03-30 post MW-2 DCE 88
53.66087514164254 2014-04-07 post MW-2 DCE 96
54.091875304531015 2014-04-15 post MW-2 DCE 104
62.38785720844364 2014-04-23 post MW-2 DCE 112
65.04842009120831 2013-01-06 pre MW-2 PCE 0
68.83827299553742 2013-01-27 pre MW-2 PCE 21
59.59755443383201 2013-02-17 pre MW-2 PCE 42
59.069319901404185 2013-03-10 pre MW-2 PCE 63
59.293905060814296 2013-03-31 pre MW-2 PCE 84
60.268254042083385 2013-04-21 pre MW-2 PCE 105
60.93232103511965 2013-05-12 pre MW-2 PCE 126
63.61023476156474 2013-06-02 pre MW-2 PCE 147
59.49505368059587 2013-06-23 pre MW-2 PCE 168
71.34655781537072 2013-07-14 pre MW-2 PCE 189
57.676821838516794 2013-08-04 pre MW-2 PCE 210
56.59801014568396 2013-08-25 pre MW-2 PCE 231
55.915875024409985 2013-09-15 pre MW-2 PCE 252
55.89983439124815 2013-10-06 pre MW-2 PCE 273
57.23144074585753 2013-10-27 pre MW-2 PCE 294
55.54286189864905 2013-11-01 during MW-2 PCE 0
57.81710019926078 2013-11-04 during MW-2 PCE 3
59.63507063136525 2013-11-07 during MW-2 PCE 6
47.585265647477925 2013-11-10 during MW-2 PCE 9
49.85527321736887 2013-11-13 during MW-2 PCE 12
42.6003084767801 2013-11-16 during MW-2 PCE 15
42.72338017141761 2013-11-19 during MW-2 PCE 18
45.12237655736989 2013-11-22 during MW-2 PCE 21
36.78795075232477 2013-11-25 during MW-2 PCE 24
32.833000440391594 2013-11-28 during MW-2 PCE 27
30.793113448350372 2013-12-01 during MW-2 PCE 30
28.68879857092062 2013-12-04 during MW-2 PCE 33
26.31284169449277 2013-12-07 during MW-2 PCE 36
23.77938430314763 2013-12-10 during MW-2 PCE 39
21.43712778194844 2013-12-13 during MW-2 PCE 42
22.152973825611664 2014-01-01 post MW-2 PCE 0
20.721100760858352 2014-01-09 post MW-2 PCE 8
21.932608579900688 2014-01-17 post MW-2 PCE 16
32.070671689078225 2014-01-25 post MW-2 PCE 24
22.44795002358588 2014-02-02 post MW-2 PCE 32
22.732393372777373 2014-02-10 post MW-2 PCE 40
28.488361771106085 2014-02-18 post MW-2 PCE 48
22.567274041300344 2014-02-26 post MW-2 PCE 56
30.076192036410987 2014-03-06 post MW-2 PCE 64
23.962639305252196 2014-03-14 post MW-2 PCE 72
25.595094566411134 2014-03-22 post MW-2 PCE 80
25.788246342331707 2014-03-30 post MW-2 PCE 88
29.64100584729982 2014-04-07 post MW-2 PCE 96
25.82284785479302 2014-04-15 post MW-2 PCE 104
25.70642924749875 2014-04-23 post MW-2 PCE 112
import numpy
import pandas
import statsmodels.api as sm
import seaborn
from matplotlib import pyplot
from statsmodels.sandbox.regression.predstd import wls_prediction_std


def generate_fake_data(dates, startconc, stopconc, **othercols):
    """Build a noisy linear concentration trend over the given dates."""
    N = len(dates)
    # log-normal noise is always positive and occasionally produces large spikes
    noise = numpy.random.lognormal(0.5, 1.25, size=N)
    conc = noise + numpy.linspace(startconc, stopconc, num=N)
    data = pandas.DataFrame(dict(date=dates, conc=conc, **othercols))
    # days elapsed since the first date in this data set
    data['days'] = (data['date'] - data['date'].min()).dt.days
    return data


def plot_raw_data(data, **fgopts):
    """Scatter conc vs. date on a FacetGrid; fgopts (row, col, hue, ...) go to FacetGrid."""
    fg = (
        seaborn.FacetGrid(aspect=2, data=data, **fgopts)
            .map(pyplot.scatter, 'date', 'conc')
            .set_xticklabels(rotation=30, rotation_mode='anchor', ha='right')
            .add_legend()
    )
    return fg


def plot_modeled_data(modeled, **fgopts):
    """Plot raw data, best-fit line, and interval band; expects 'fit', 'ci_upper', 'ci_lower' columns."""
    fg = (
        seaborn.FacetGrid(aspect=2, data=modeled, **fgopts)
            .map(pyplot.fill_between, 'date', 'ci_upper', 'ci_lower', zorder=0, alpha=0.2, label='95% CI')
            .map(pyplot.plot, 'date', 'fit', label='best-fit')
            .map(pyplot.scatter, 'date', 'conc', label='raw data')
            .set_xticklabels(rotation=30, rotation_mode='anchor', ha='right')
    )
    return fg