@joelmartinez
Last active August 29, 2015 14:20
A look at how to load a CSV file and calculate the mean of a column named "Ozone", which contains NA values, first in R and then in F#. Functionally, the task can be completed in very similar-looking code in both languages, except that the F# version relies on a few helper functions, types, and operators defined in support of it.
setwd("~/dev/datascience/r")
data <- read.csv("hw1_data.csv")
filtered <- data[!is.na(data$Ozone),]$Ozone
mean(filtered)
let wd = ~~ "dev/datascience/r"
let data = hw1_data.Load(wd @@ "hw1_data.csv")
let filtered = data.Rows |> Seq.map (fun f -> f.Ozone) |> clearNA
printfn "%A" (filtered.Mean())
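
The reason for the `clearNA` step can be seen on a small in-memory sample. The following is only a minimal sketch, not the gist's own script: it assumes the NA entries arrive as `Double.NaN`, it hard-codes the first six Ozone values from the data instead of loading the CSV, and it uses `Seq.average` from the F# core library in place of the MathNet `Mean()` extension so that no NuGet packages are needed.

```fsharp
open System

// Same idea as the gist's clearNA helper: drop NaN entries,
// which stand in for R's NA in this sketch.
let clearNA (values: seq<float>) =
    values |> Seq.filter (fun v -> not (Double.IsNaN v))

// First six Ozone values from hw1_data.csv; nan marks the NA in the fifth row.
let ozone = [ 41.0; 36.0; 12.0; 18.0; nan; 28.0 ]

// Averaging without filtering propagates the NaN.
let unfiltered = Seq.average ozone

// Filtering first gives the mean of the five real values.
let mean = ozone |> clearNA |> Seq.average

printfn "%A" unfiltered // prints nan
printfn "%A" mean       // prints 27.0
```

This mirrors what `mean(filtered)` does on the R side once the `is.na` rows have been dropped.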
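open System
open FSharp.Data // "FSharp.Data" NuGet package
open MathNet.Numerics.Statistics // "MathNet.Numerics.FSharp" NuGet package
// Infix operator: x @@ y joins two path segments.
let (@@) x y = System.IO.Path.Combine(x, y)
// Prefix operator: ~~ p resolves p relative to the user's profile (home) directory.
let (~~) p = System.Environment.GetFolderPath(System.Environment.SpecialFolder.UserProfile) @@ p
// Drops NaN entries, the representation this script assumes for NA values.
let clearNA values = values |> Seq.filter (fun f -> not (Double.IsNaN f))
// Type provider: infers the row type from the sample CSV at compile time.
type hw1_data = CsvProvider<"/Users/joelmartinez/dev/datascience/r/hw1_data.csv">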
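
The two path operators can be tried on their own, without the type provider or the data file. This is a small sketch using only the .NET base library; the name `csvPath` is introduced here for illustration, and the printed path depends on the machine and user running it.

```fsharp
open System
open System.IO

// Infix operator: join two path segments with the platform's separator.
let (@@) x y = Path.Combine(x, y)

// Prefix operator: resolve a relative path against the user's profile directory.
let (~~) p = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile) @@ p

// Roughly what setwd("~/dev/datascience/r") plus a relative file name does in R.
let wd = ~~ "dev/datascience/r"
let csvPath = wd @@ "hw1_data.csv" // hypothetical name, for illustration only

printfn "%s" csvPath
```

Unlike `setwd` in R, nothing here changes the process's working directory; the operators only build an absolute path string that is then passed to `Load`.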
Ozone Solar.R Wind Temp Month Day
41 190 7.4 67 5 1
36 118 8 72 5 2
12 149 12.6 74 5 3
18 313 11.5 62 5 4
NA NA 14.3 56 5 5
28 NA 14.9 66 5 6
23 299 8.6 65 5 7
19 99 13.8 59 5 8
8 19 20.1 61 5 9
NA 194 8.6 69 5 10
7 NA 6.9 74 5 11
16 256 9.7 69 5 12
11 290 9.2 66 5 13
14 274 10.9 68 5 14
18 65 13.2 58 5 15
14 334 11.5 64 5 16
34 307 12 66 5 17
6 78 18.4 57 5 18
30 322 11.5 68 5 19
11 44 9.7 62 5 20
1 8 9.7 59 5 21
11 320 16.6 73 5 22
4 25 9.7 61 5 23
32 92 12 61 5 24
NA 66 16.6 57 5 25
NA 266 14.9 58 5 26
NA NA 8 57 5 27
23 13 12 67 5 28
45 252 14.9 81 5 29
115 223 5.7 79 5 30
37 279 7.4 76 5 31
NA 286 8.6 78 6 1
NA 287 9.7 74 6 2
NA 242 16.1 67 6 3
NA 186 9.2 84 6 4
NA 220 8.6 85 6 5
NA 264 14.3 79 6 6
29 127 9.7 82 6 7
NA 273 6.9 87 6 8
71 291 13.8 90 6 9
39 323 11.5 87 6 10
NA 259 10.9 93 6 11
NA 250 9.2 92 6 12
23 148 8 82 6 13
NA 332 13.8 80 6 14
NA 322 11.5 79 6 15
21 191 14.9 77 6 16
37 284 20.7 72 6 17
20 37 9.2 65 6 18
12 120 11.5 73 6 19
13 137 10.3 76 6 20
NA 150 6.3 77 6 21
NA 59 1.7 76 6 22
NA 91 4.6 76 6 23
NA 250 6.3 76 6 24
NA 135 8 75 6 25
NA 127 8 78 6 26
NA 47 10.3 73 6 27
NA 98 11.5 80 6 28
NA 31 14.9 77 6 29
NA 138 8 83 6 30
135 269 4.1 84 7 1
49 248 9.2 85 7 2
32 236 9.2 81 7 3
NA 101 10.9 84 7 4
64 175 4.6 83 7 5
40 314 10.9 83 7 6
77 276 5.1 88 7 7
97 267 6.3 92 7 8
97 272 5.7 92 7 9
85 175 7.4 89 7 10
NA 139 8.6 82 7 11
10 264 14.3 73 7 12
27 175 14.9 81 7 13
NA 291 14.9 91 7 14
7 48 14.3 80 7 15
48 260 6.9 81 7 16
35 274 10.3 82 7 17
61 285 6.3 84 7 18
79 187 5.1 87 7 19
63 220 11.5 85 7 20
16 7 6.9 74 7 21
NA 258 9.7 81 7 22
NA 295 11.5 82 7 23
80 294 8.6 86 7 24
108 223 8 85 7 25
20 81 8.6 82 7 26
52 82 12 86 7 27
82 213 7.4 88 7 28
50 275 7.4 86 7 29
64 253 7.4 83 7 30
59 254 9.2 81 7 31
39 83 6.9 81 8 1
9 24 13.8 81 8 2
16 77 7.4 82 8 3
78 NA 6.9 86 8 4
35 NA 7.4 85 8 5
66 NA 4.6 87 8 6
122 255 4 89 8 7
89 229 10.3 90 8 8
110 207 8 90 8 9
NA 222 8.6 92 8 10
NA 137 11.5 86 8 11
44 192 11.5 86 8 12
28 273 11.5 82 8 13
65 157 9.7 80 8 14
NA 64 11.5 79 8 15
22 71 10.3 77 8 16
59 51 6.3 79 8 17
23 115 7.4 76 8 18
31 244 10.9 78 8 19
44 190 10.3 78 8 20
21 259 15.5 77 8 21
9 36 14.3 72 8 22
NA 255 12.6 75 8 23
45 212 9.7 79 8 24
168 238 3.4 81 8 25
73 215 8 86 8 26
NA 153 5.7 88 8 27
76 203 9.7 97 8 28
118 225 2.3 94 8 29
84 237 6.3 96 8 30
85 188 6.3 94 8 31
96 167 6.9 91 9 1
78 197 5.1 92 9 2
73 183 2.8 93 9 3
91 189 4.6 93 9 4
47 95 7.4 87 9 5
32 92 15.5 84 9 6
20 252 10.9 80 9 7
23 220 10.3 78 9 8
21 230 10.9 75 9 9
24 259 9.7 73 9 10
44 236 14.9 81 9 11
21 259 15.5 76 9 12
28 238 6.3 77 9 13
9 24 10.9 71 9 14
13 112 11.5 71 9 15
46 237 6.9 78 9 16
18 224 13.8 67 9 17
13 27 10.3 76 9 18
24 238 10.3 68 9 19
16 201 8 82 9 20
13 238 12.6 64 9 21
23 14 9.2 71 9 22
36 139 10.3 81 9 23
7 49 10.3 69 9 24
14 20 16.6 63 9 25
30 193 6.9 70 9 26
NA 145 13.2 77 9 27
14 191 14.3 75 9 28
18 131 8 76 9 29
20 223 11.5 68 9 30