@chrislkeller
Last active August 29, 2015 14:01
Adding statistics & analysis skills to the beginning data journo's toolbelt

Code allows us to make all kinds of visuals and tools that display data for analysis.

But when you're starting to mix code, data and journalism - and you lack a deep statistics background to draw upon - everything looks like a nail you can whack with your shiny hammer. And everything - from scatterplots to nearest neighbor to regression - seems equally important.

So how do you move from citing only the average, median & percent change in all of your work and begin to build skills and knowledge that can lead to a deeper analysis of datasets?

I propose a discussion that helps beginning data journalists/news apps developers better understand which analytical and statistical methods are best suited to different data situations.

For example:

  • What kind of data lends itself to a scatterplot, and what does the resulting graph tell you? (See the sketch after this list.)
  • When does it make sense to use distribution graphs?
  • Why are correlations "always meaningful but not necessarily useful"?
  • Are regression -- both linear and logistic -- and nearest neighbor that scary?
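
To ground the discussion, here is a minimal sketch of the first two bullets in Python, using pandas, SciPy and matplotlib. The dataset and column names ("counties.csv", "income", "life_expectancy") are hypothetical, just stand-ins for whatever two continuous variables you're exploring:

```python
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats

df = pd.read_csv("counties.csv")  # hypothetical dataset
x, y = df["income"], df["life_expectancy"]

# Scatterplot: a first look at whether two continuous variables move together.
plt.scatter(x, y, alpha=0.5)
plt.xlabel("income")
plt.ylabel("life_expectancy")

# Pearson r: strength and direction of a *linear* relationship.
r, p_value = stats.pearsonr(x, y)
print(f"r = {r:.2f}, p = {p_value:.3f}")

# Simple linear regression: slope and intercept of the best-fit line.
result = stats.linregress(x, y)
print(f"slope = {result.slope:.3f}, R^2 = {result.rvalue ** 2:.2f}")

plt.plot(x, result.intercept + result.slope * x, color="red")
plt.show()
```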
@gotoplanb

Super good pitch.

Would be great to showcase (not perform) different regression/modeling techniques to show how things go haywire when the wrong data types are used (like categorical instead of continuous), or when models that assume independence, like GLMs (linear and logistic regression), are applied to data that obviously has dependence (particularly time series stuff).
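
To make the first failure mode concrete, here's a hedged sketch using statsmodels, with data fabricated purely for illustration. Coding a four-level category as the numbers 1-4 and regressing on it fits a single slope through arbitrary labels; wrapping the variable in `C()` dummy-codes it properly:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# "region" is a category coded 1-4; the codes carry no numeric meaning.
df = pd.DataFrame({"region": rng.integers(1, 5, 200)})
df["outcome"] = df["region"].map({1: 10, 2: 30, 3: 15, 4: 25}) + rng.normal(0, 2, 200)

# Wrong: treats the region codes as a continuous quantity.
wrong = smf.ols("outcome ~ region", data=df).fit()

# Right: C() tells statsmodels to dummy-code the categories.
right = smf.ols("outcome ~ C(region)", data=df).fit()

# The miscoded model fits far worse, because the group means
# aren't ordered by their arbitrary labels.
print(wrong.rsquared, right.rsquared)
```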

@gotoplanb

There are a decent number of conference presentations showing foo method to solve bar problem, but there is less emphasis on what happens when the assumptions are violated. The model will still give you something that looks useful and can tell a story, but the way it explains error/variance is no longer valid. The person performing the analysis wouldn't know from the model output that the results are flawed unless they ran the appropriate diagnostics along the way. Most modeling techniques have specific diagnostics for their assumptions.
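
As one concrete example of such a diagnostic, here's a minimal sketch on simulated data: OLS is fit to a time series with autocorrelated errors, the coefficients come out looking perfectly reasonable, but the Durbin-Watson statistic on the residuals flags the violated independence assumption (values near 2 suggest independence; values near 0 or 4 suggest positive/negative autocorrelation):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(1)
n = 200
t = np.arange(n)

# Build a series whose errors follow an AR(1) process - the kind of
# dependence time-series data often carries.
errors = np.zeros(n)
for i in range(1, n):
    errors[i] = 0.9 * errors[i - 1] + rng.normal()
y = 0.5 * t + errors

# An ordinary OLS fit will happily return coefficients and p-values...
model = sm.OLS(y, sm.add_constant(t)).fit()

# ...but the diagnostic shows the independence assumption doesn't hold.
print(f"Durbin-Watson: {durbin_watson(model.resid):.2f}")  # well below 2
```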
