@ethanwhite
Last active August 29, 2015 14:02
read.csv() with too many decimal places

It appears that read.csv() reads a numeric column as a factor if any value in it has a very large number of decimal places.

Data files

test_works.csv:

a,b
1.255,2.993
1.834,2.555

test_fails.csv:

a,b
1.2555555555555555555555,2.993
1.834,2.555

Import them into R using read.csv() and examine them

> str(read.csv("test_works.csv"))
'data.frame':	2 obs. of  2 variables:
 $ a: num  1.25 1.83
 $ b: num  2.99 2.56
> str(read.csv("test_fails.csv"))
'data.frame':	2 obs. of  2 variables:
 $ a: Factor w/ 2 levels "1.2555555555555555555555",..: 1 2
 $ b: num  2.99 2.56
@ethanwhite

Solution, thanks to @blahah: set the column classes explicitly with colClasses.

str(read.csv("test_fails.csv", colClasses=c("a"="numeric", "b"="numeric")))
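
For anyone who wants to try this without creating the files by hand, here is a minimal, self-contained sketch of the same workaround. It writes the contents of test_fails.csv to a temporary file (the temporary path and the variable names `path` and `df` are just for illustration, not part of the original report) and reads it back with explicit column classes. Note that the extra digits do not survive: the value is rounded to the nearest double.

```r
# Recreate test_fails.csv in a temporary location (illustrative path).
path <- tempfile(fileext = ".csv")
writeLines(c("a,b",
             "1.2555555555555555555555,2.993",
             "1.834,2.555"),
           path)

# Declare both columns as numeric so read.csv() skips type inference.
df <- read.csv(path, colClasses = c(a = "numeric", b = "numeric"))
str(df)

# Show the stored value with more digits; it is the nearest double,
# not the full 22-decimal string from the file.
print(df$a[1], digits = 17)
```

If the full digit string matters, reading the column with colClasses = "character" instead keeps it intact as text.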

I would love to hear the justification for this default behavior if someone knows it.

@ethanwhite

Also, this is new behavior and there is some debate over whether it should remain this way or change back:
http://r.789695.n4.nabble.com/type-convert-and-doubles-td4688616.html
