@RottenFruits
Created June 6, 2018 13:16
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# data load"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"WARNING: Method definition unix2zdt(Real) in module TimeZones at /home/sonics_jr/anaconda3/share/julia/site/v0.6/TimeZones/src/conversions.jl:122 overwritten in module RData at /home/sonics_jr/anaconda3/share/julia/site/v0.6/RData/src/convert.jl:201.\n"
]
},
{
"data": {
"text/html": [
"<table class=\"data-frame\"><thead><tr><th></th><th>SepalLength</th><th>SepalWidth</th><th>PetalLength</th><th>PetalWidth</th><th>Species</th></tr></thead><tbody><tr><th>1</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.2</td><td>setosa</td></tr><tr><th>2</th><td>4.9</td><td>3.0</td><td>1.4</td><td>0.2</td><td>setosa</td></tr><tr><th>3</th><td>4.7</td><td>3.2</td><td>1.3</td><td>0.2</td><td>setosa</td></tr><tr><th>4</th><td>4.6</td><td>3.1</td><td>1.5</td><td>0.2</td><td>setosa</td></tr><tr><th>5</th><td>5.0</td><td>3.6</td><td>1.4</td><td>0.2</td><td>setosa</td></tr><tr><th>6</th><td>5.4</td><td>3.9</td><td>1.7</td><td>0.4</td><td>setosa</td></tr><tr><th>7</th><td>4.6</td><td>3.4</td><td>1.4</td><td>0.3</td><td>setosa</td></tr><tr><th>8</th><td>5.0</td><td>3.4</td><td>1.5</td><td>0.2</td><td>setosa</td></tr><tr><th>9</th><td>4.4</td><td>2.9</td><td>1.4</td><td>0.2</td><td>setosa</td></tr><tr><th>10</th><td>4.9</td><td>3.1</td><td>1.5</td><td>0.1</td><td>setosa</td></tr><tr><th>11</th><td>5.4</td><td>3.7</td><td>1.5</td><td>0.2</td><td>setosa</td></tr><tr><th>12</th><td>4.8</td><td>3.4</td><td>1.6</td><td>0.2</td><td>setosa</td></tr><tr><th>13</th><td>4.8</td><td>3.0</td><td>1.4</td><td>0.1</td><td>setosa</td></tr><tr><th>14</th><td>4.3</td><td>3.0</td><td>1.1</td><td>0.1</td><td>setosa</td></tr><tr><th>15</th><td>5.8</td><td>4.0</td><td>1.2</td><td>0.2</td><td>setosa</td></tr><tr><th>16</th><td>5.7</td><td>4.4</td><td>1.5</td><td>0.4</td><td>setosa</td></tr><tr><th>17</th><td>5.4</td><td>3.9</td><td>1.3</td><td>0.4</td><td>setosa</td></tr><tr><th>18</th><td>5.1</td><td>3.5</td><td>1.4</td><td>0.3</td><td>setosa</td></tr><tr><th>19</th><td>5.7</td><td>3.8</td><td>1.7</td><td>0.3</td><td>setosa</td></tr><tr><th>20</th><td>5.1</td><td>3.8</td><td>1.5</td><td>0.3</td><td>setosa</td></tr><tr><th>21</th><td>5.4</td><td>3.4</td><td>1.7</td><td>0.2</td><td>setosa</td></tr><tr><th>22</th><td>5.1</td><td>3.7</td><td>1.5</td><td>0.4</td><td>setosa</td></tr><tr><th>23</th><td>4.6</td><td>3.6</td><td>1.0</td><td>0.2</td><td>setosa</td></tr><tr><th>24</th><td>5.1</td><td>3.3</td><td>1.7</td><td>0.5</td><td>setosa</td></tr><tr><th>25</th><td>4.8</td><td>3.4</td><td>1.9</td><td>0.2</td><td>setosa</td></tr><tr><th>26</th><td>5.0</td><td>3.0</td><td>1.6</td><td>0.2</td><td>setosa</td></tr><tr><th>27</th><td>5.0</td><td>3.4</td><td>1.6</td><td>0.4</td><td>setosa</td></tr><tr><th>28</th><td>5.2</td><td>3.5</td><td>1.5</td><td>0.2</td><td>setosa</td></tr><tr><th>29</th><td>5.2</td><td>3.4</td><td>1.4</td><td>0.2</td><td>setosa</td></tr><tr><th>30</th><td>4.7</td><td>3.2</td><td>1.6</td><td>0.2</td><td>setosa</td></tr><tr><th>&vellip;</th><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td><td>&vellip;</td></tr></tbody></table>"
],
"text/plain": [
"150×5 DataFrames.DataFrame\n",
"│ Row │ SepalLength │ SepalWidth │ PetalLength │ PetalWidth │ Species   │\n",
"├─────┼─────────────┼────────────┼─────────────┼────────────┼───────────┤\n",
"│ 1   │ 5.1         │ 3.5        │ 1.4         │ 0.2        │ setosa    │\n",
"│ 2   │ 4.9         │ 3.0        │ 1.4         │ 0.2        │ setosa    │\n",
"│ 3   │ 4.7         │ 3.2        │ 1.3         │ 0.2        │ setosa    │\n",
"│ 4   │ 4.6         │ 3.1        │ 1.5         │ 0.2        │ setosa    │\n",
"│ 5   │ 5.0         │ 3.6        │ 1.4         │ 0.2        │ setosa    │\n",
"│ 6   │ 5.4         │ 3.9        │ 1.7         │ 0.4        │ setosa    │\n",
"│ 7   │ 4.6         │ 3.4        │ 1.4         │ 0.3        │ setosa    │\n",
"│ 8   │ 5.0         │ 3.4        │ 1.5         │ 0.2        │ setosa    │\n",
"│ 9   │ 4.4         │ 2.9        │ 1.4         │ 0.2        │ setosa    │\n",
"│ 10  │ 4.9         │ 3.1        │ 1.5         │ 0.1        │ setosa    │\n",
"│ 11  │ 5.4         │ 3.7        │ 1.5         │ 0.2        │ setosa    │\n",
"⋮\n",
"│ 139 │ 6.0         │ 3.0        │ 4.8         │ 1.8        │ virginica │\n",
"│ 140 │ 6.9         │ 3.1        │ 5.4         │ 2.1        │ virginica │\n",
"│ 141 │ 6.7         │ 3.1        │ 5.6         │ 2.4        │ virginica │\n",
"│ 142 │ 6.9         │ 3.1        │ 5.1         │ 2.3        │ virginica │\n",
"│ 143 │ 5.8         │ 2.7        │ 5.1         │ 1.9        │ virginica │\n",
"│ 144 │ 6.8         │ 3.2        │ 5.9         │ 2.3        │ virginica │\n",
"│ 145 │ 6.7         │ 3.3        │ 5.7         │ 2.5        │ virginica │\n",
"│ 146 │ 6.7         │ 3.0        │ 5.2         │ 2.3        │ virginica │\n",
"│ 147 │ 6.3         │ 2.5        │ 5.0         │ 1.9        │ virginica │\n",
"│ 148 │ 6.5         │ 3.0        │ 5.2         │ 2.0        │ virginica │\n",
"│ 149 │ 6.2         │ 3.4        │ 5.4         │ 2.3        │ virginica │\n",
"│ 150 │ 5.9         │ 3.0        │ 5.1         │ 1.8        │ virginica │"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Load the iris dataset (150 rows, 5 columns) shipped with RDatasets\n",
"using RDatasets, DataFrames\n",
"\n",
"iris_dataframe = dataset(\"datasets\", \"iris\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# linear regression"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"150-element Array{Float64,1}:\n",
" 3.5\n",
" 3.0\n",
" 3.2\n",
" 3.1\n",
" 3.6\n",
" 3.9\n",
" 3.4\n",
" 3.4\n",
" 2.9\n",
" 3.1\n",
" 3.7\n",
" 3.4\n",
" 3.0\n",
" ⋮ \n",
" 3.0\n",
" 3.1\n",
" 3.1\n",
" 3.1\n",
" 2.7\n",
" 3.2\n",
" 3.3\n",
" 3.0\n",
" 2.5\n",
" 3.0\n",
" 3.4\n",
" 3.0"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Predictor x: SepalLength (column 1); response y: SepalWidth (column 2)\n",
"x = iris_dataframe[:, 1]\n",
"y = iris_dataframe[:, 2]"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(3.4189468361038156, -0.06188479796414413)"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Simple least-squares fit; returns (intercept, slope) so that y ≈ intercept + slope * x\n",
"linreg(x, y)"
]
},
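{
"cell_type": "markdown",
"metadata": {},
"source": [
"`linreg(x, y)` returns the pair `(intercept, slope)` of the least-squares line. The sketch below is not part of the original notebook: it shows the same kind of fit written with the backslash operator on toy data, which is one way to get the result on Julia 1.0 and later, where `linreg` is no longer in Base. The names `xs`, `ys`, `X` and `coef` are illustrative assumptions.\n",
"\n",
"```julia\n",
"# Hypothetical toy data lying exactly on y = 1 + 2x\n",
"xs = [1.0, 2.0, 3.0, 4.0]\n",
"ys = [3.0, 5.0, 7.0, 9.0]\n",
"\n",
"# Design matrix: a column of ones (intercept) next to the predictor\n",
"X = [ones(length(xs)) xs]\n",
"\n",
"# Least-squares solution of X * coef ≈ ys\n",
"coef = X \\ ys\n",
"intercept, slope = coef[1], coef[2]\n",
"```\n",
"\n",
"On this data `intercept` and `slope` should come out close to 1 and 2. Replacing `xs` and `ys` with the `x` and `y` columns used above should give values matching the `linreg` output up to rounding."
]
},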
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# multiple linear regression"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"using GLM"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"StatsModels.DataFrameRegressionModel{GLM.GeneralizedLinearModel{GLM.GlmResp{Array{Float64,1},Distributions.Normal{Float64},GLM.IdentityLink},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}\n",
"\n",
"Formula: SepalLength ~ 1 + SepalWidth + PetalLength + Species\n",
"\n",
"Coefficients:\n",
"                      Estimate Std.Error  z value Pr(>|z|)\n",
"(Intercept)            2.39039  0.262268  9.11429   <1e-19\n",
"SepalWidth            0.432217 0.0813898  5.31046    <1e-6\n",
"PetalLength           0.775629 0.0642457  12.0729   <1e-32\n",
"Species: versicolor  -0.955812  0.215199 -4.44154    <1e-5\n",
"Species: virginica     -1.3941  0.285661 -4.88026    <1e-5\n"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
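],
"source": [
"# Normal family with identity link: ordinary linear regression with a categorical Species term\n",
"glm(@formula(SepalLength ~ SepalWidth + PetalLength + Species),\n",
" iris_dataframe, Normal(), IdentityLink())"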
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# logistic regression"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"150-element Array{Int64,1}:\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" ⋮\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1\n",
" 1"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Binary target: 1 for setosa or virginica, 0 for versicolor\n",
"iris_dataframe[:y] = map(x -> Int64(x), (iris_dataframe[:Species] .== \"setosa\") .| (iris_dataframe[:Species] .== \"virginica\"))"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"StatsModels.DataFrameRegressionModel{GLM.GeneralizedLinearModel{GLM.GlmResp{Array{Float64,1},Distributions.Binomial{Float64},GLM.LogitLink},GLM.DensePredChol{Float64,Base.LinAlg.Cholesky{Float64,Array{Float64,2}}}},Array{Float64,2}}\n",
"\n",
"Formula: y ~ 1 + SepalLength + SepalWidth\n",
"\n",
"Coefficients:\n",
"              Estimate Std.Error   z value Pr(>|z|)\n",
"(Intercept)   -8.09277   2.38853  -3.38818   0.0007\n",
"SepalLength  -0.129425  0.246911 -0.524177   0.6002\n",
"SepalWidth     3.21276  0.638106   5.03484    <1e-6\n"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Binomial family with logit link: logistic regression on the 0/1 column y\n",
"glm(@formula(y ~ SepalLength + SepalWidth), iris_dataframe, Binomial(), LogitLink())"
]
},
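{
"cell_type": "markdown",
"metadata": {},
"source": [
"The logistic coefficients above act on the log-odds scale. As a rough sketch (not part of the original notebook), a predicted probability for one flower can be obtained by applying the inverse logit to the linear predictor. The helper `inv_logit`, the vector names and the example measurements are illustrative assumptions; the coefficient values are copied from the printed table.\n",
"\n",
"```julia\n",
"# Inverse of the logit link: maps log-odds to a probability in (0, 1)\n",
"inv_logit(eta) = 1 / (1 + exp(-eta))\n",
"\n",
"# Coefficients in the order (Intercept), SepalLength, SepalWidth\n",
"beta = [-8.09277, -0.129425, 3.21276]\n",
"\n",
"# One flower: the leading 1.0 multiplies the intercept\n",
"flower = [1.0, 5.1, 3.5]\n",
"\n",
"# Linear predictor (log-odds), then probability that y == 1\n",
"eta = sum(beta .* flower)\n",
"p = inv_logit(eta)\n",
"```\n",
"\n",
"For these measurements `p` comes out at roughly 0.92, that is, the model considers the flower very likely to be setosa or virginica rather than versicolor."
]
},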
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Julia 0.6.1",
"language": "julia",
"name": "julia-0.6"
},
"language_info": {
"file_extension": ".jl",
"mimetype": "application/julia",
"name": "julia",
"version": "0.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 0
}