When using rio::import() to read a CSV file, the data gets truncated at the first line with an error (in the example case described here, a line with a different number of columns). This does not happen when using readr::read_csv().
Dataset:
- Source: Data on Dengue surveillance in Peru, https://www.datosabiertos.gob.pe/dataset/vigilancia-epidemiol%C3%B3gica-de-dengue
- This dataset has 501,692 rows (as of 2024-04-24)
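
The truncation described above can probably be reproduced without downloading the full dataset. The following is only a minimal sketch: it assumes the rio and readr packages are installed, and the file contents (column names and values) are invented for illustration rather than taken from the Peru dataset. It writes a small CSV with one malformed line in the middle and compares how many rows each reader returns.

```r
# Minimal sketch; assumes rio and readr are installed.
# The rows below are made up for illustration (hypothetical data).
lines <- c(
  "id,district,cases",
  "1,Lima,10",
  "2,Piura,7",
  "3,Loreto",      # malformed line: 2 fields instead of 3
  "4,Tumbes,3",
  "5,Ica,12"
)

# Write the example to a temporary CSV file
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# Read the same file with both functions; each may emit warnings
# about the malformed line
with_rio   <- rio::import(path)
with_readr <- readr::read_csv(path, show_col_types = FALSE)

# Compare the number of rows each reader kept
cat("rows via rio::import():     ", nrow(with_rio), "\n")
cat("rows via readr::read_csv(): ", nrow(with_readr), "\n")
```

If the behaviour matches what is reported in this issue, rio::import() should return fewer rows than readr::read_csv(), with a warning pointing at the malformed line. The exact counts and messages may depend on the installed package versions.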
Thanks for your comments, @chainsawriot. In the thread https://masto.machlis.com/@smach/112325794467872813 I mentioned that {rio} has this specific requirement for well-formed input data. So even though it provides a very uniform API for reading, this requirement is counterproductive when teaching beginners in R (and programming), in particular if the participants are using real-life data (warts and all). The instructors have to spend time explaining why the issues appear and why {rio} cannot be used, even though the aim was to teach them how to use {rio} to easily read any input data they will encounter.
This happened to us when we were teaching people who work on the analysis of health surveillance data. Previously we had used readr::read_csv(), vroom::vroom(), base R's read.csv(), etc., but some of the instructors wanted to make it easy for the participants, so we changed all the material to use rio::import(). The session was OK when showing an example with "clean" data; the problems started when the participants had to practice in class with the data they brought from their daily work. We had to hold an extra session to explain the issues and the workarounds. Bottom line: it distracted from the aim of teaching how to get data into R, and was confusing to the participants.