It appears that read.csv() reads a numeric column as a factor if any value in it has a large number of decimal places.
test_works.csv:
a,b
1.255,2.993
1.834,2.555
test_fails.csv:
a,b
1.2555555555555555555555,2.993
1.834,2.555
> str(read.csv("test_works.csv"))
'data.frame': 2 obs. of 2 variables:
$ a: num 1.25 1.83
$ b: num 2.99 2.56
> str(read.csv("test_fails.csv"))
'data.frame': 2 obs. of 2 variables:
$ a: Factor w/ 2 levels "1.2555555555555555555555",..: 1 2
$ b: num 2.99 2.56
Solution thanks to @blahah:
str(read.csv("test_fails.csv", colClasses=c("a"="numeric", "b"="numeric")))
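For anyone who wants to try this without creating the files, here is a minimal sketch that feeds the contents of test_fails.csv to read.csv() through its text argument. The names csv_text, forced, and lenient are only illustrative. The last call assumes R 3.1.1 or later, where, as far as I can tell, read.table() accepts a numerals argument; on older versions that call would fail with an unused-argument error.

```r
# Inline copy of test_fails.csv so the example needs no file on disk.
csv_text <- "a,b
1.2555555555555555555555,2.993
1.834,2.555"

# Option 1: force both columns to numeric instead of letting
# read.csv() guess the column type.
forced <- read.csv(text = csv_text,
                   colClasses = c(a = "numeric", b = "numeric"))
str(forced)

# Option 2 (assumes R >= 3.1.1): tell the type guesser that losing
# precision beyond what a double can hold is acceptable.
lenient <- read.csv(text = csv_text, numerals = "allow.loss")
str(lenient)
```

Either way the extra digits are dropped, since a double holds only about 15 to 17 significant digits. If they matter, reading the column with colClasses set to "character" keeps the original strings.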
I would love to hear the justification for this default behavior if someone knows it.