I'm going to start off by describing a pretty common data analysis scenario, and then talk about how using R can help:
- You have a lot of individual spreadsheet files containing your data, and you need it all together, so you copy and paste each one into a master file.
- Next you do a bunch of data cleaning in the master spreadsheet: fixing date formats, converting units, applying transformations, etc.
- You then import the data into your favourite statistics program and run your analysis.
- You copy the outputs back into a spreadsheet or other graphing program to plot your results.
- You give the results to a colleague to review, and she comes back concerned that something doesn't look quite right. She also suggests that a different modelling technique would be more appropriate.
- You comb through the original data and realize that in some of the files one column was misaligned, and so in copying and pasting these into the master dataset this error was compounded over many rows.
- In ad