@TDahlberg
Last active January 22, 2020 00:45
GeoData Science Learning Resources

#“A data scientist is someone who knows more statistics than a computer scientist and more computer science than a statistician.” -Josh Blumenstock

#"A geo-data scientist is someone who knows more about GIS than either of those guys." -Tyler Dahlberg

#Going Geo Open Source

In all likelihood a list like this has been written somewhere, by someone, for some reason. I know I'm not breaking any ground here; I'm just trying to organize on paper what's been going through my head ever since I got out of grad school.

As someone trained in traditional GIS (i.e., ArcGIS), I have a lot of hurdles to clear when it comes to 'open-sourcing' myself. Don't get me wrong, I love (and love-hate) ESRI's products. Using ArcGIS is like curling up by a nice warm fire in the middle of a snowstorm or looking out the window from a cozy couch on a rainy day. It'll probably always be there, and years from now it'll be just like you remember. But today's world is changing faster than ESRI can keep up with. GIS is expanding into web mapping, spatial apps, and new applications of location information that spatial analysts didn't even dream of two or three years ago, let alone five.

##The Geo Data Scientist's Toolkit:

###QGIS

QGIS doesn't replicate everything in ArcGIS, and its unfamiliar interface is hard to get used to, but it's the only FOSS product that comes close to a full desktop GIS solution. Bonus: Many of QGIS' operations run on GDAL, and the tool dialogs even show the equivalent GDAL command below the tool as you set it up. Two birds!

###GDAL

GDAL is a library, with a set of command-line tools, that's become ubiquitous in the open source geo world. It's great for reprojecting, clipping, transforming, and converting spatial files.
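As a rough sketch of the kind of conversion described above, the Python snippet below assembles a call to `ogr2ogr` (one of GDAL's command-line utilities) that reprojects a vector file to WGS 84 and writes it out as GeoJSON. The helper `build_ogr2ogr_command` and the file names are hypothetical, chosen only for illustration, and the command is run only if `ogr2ogr` is installed and the input file exists.

```python
import os
import shutil
import subprocess


def build_ogr2ogr_command(src, dst, target_srs="EPSG:4326", out_format="GeoJSON"):
    """Assemble an ogr2ogr call that reprojects src and writes it as out_format."""
    # ogr2ogr expects the destination path first, then the source path.
    return ["ogr2ogr", "-f", out_format, "-t_srs", target_srs, dst, src]


# Hypothetical file names, for illustration only.
command = build_ogr2ogr_command("parcels.shp", "parcels.geojson")
print(" ".join(command))

# Run the conversion only when GDAL is on the PATH and the input file exists.
if shutil.which("ogr2ogr") and os.path.exists("parcels.shp"):
    subprocess.run(command, check=True)
```

Building the argument list separately from running it makes it easy to print the command, compare it with what QGIS shows, and paste it into a terminal.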

###Python

Python is incredibly easy (well, for a programming language anyway) to pick up. Its syntax is natural and easy to read, and it's already widely used in the geospatial and statistics communities for processing spatial data.

  • Tools:
    • IPython Notebook: A way to run Python analysis interactively and share it with others.
    • Pandas: A Python data processing library that makes manipulating tabular data easier.
    • Anaconda: A Python distribution that bundles most of the common data analysis packages.
  • Learning Resources:

###JavaScript

If you want to be able to share your maps on the web, and you want to be free to innovate beyond the rails present in ArcGIS Online, CartoDB, and Mapbox, you NEED to know JavaScript.

###R

A statistical programming language and command-line environment with an incredible array of packages that can do bonkers stuff.

###Postgres+PostGIS

This one is interesting. For me it represents an entirely different paradigm for thinking about spatial data (as databases, rather than as a bunch of flat proprietary files). The ability to feed in data and run rapid spatial queries with near-instant results is essential.
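To make the "data as a database" idea a bit more concrete, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for Postgres+PostGIS (an assumption made so it runs without a database server), and a plain bounding-box filter instead of real spatial functions. The table, the helper `places_in_bbox`, and the three sample rows (approximate city coordinates) are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres+PostGIS (an assumption
# made so this sketch runs without a database server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE places (name TEXT, lon REAL, lat REAL)")
conn.executemany(
    "INSERT INTO places VALUES (?, ?, ?)",
    [
        # Approximate coordinates, for illustration only.
        ("Boston", -71.06, 42.36),
        ("Worcester", -71.80, 42.26),
        ("Providence", -71.41, 41.82),
    ],
)


def places_in_bbox(conn, min_lon, min_lat, max_lon, max_lat):
    """Return the names of places whose coordinates fall inside the box."""
    rows = conn.execute(
        "SELECT name FROM places "
        "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
        "ORDER BY name",
        (min_lon, max_lon, min_lat, max_lat),
    )
    return [name for (name,) in rows]


# An arbitrary bounding box that includes two of the three sample points.
print(places_in_bbox(conn, -73.5, 41.9, -69.9, 42.9))
```

In PostGIS the same question would typically be asked of a geometry column, using functions such as `ST_MakeEnvelope` and `ST_Intersects` with a spatial index behind them, rather than comparing raw longitude and latitude columns.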

###Miscellaneous

###Courses & MOOCs:

###Data Tools

  • Enigma.io: Great place to search for public data sets from all over the world
  • Import.io: Data-scraping & API-making website. Desktop app too.
  • Data Science Toolbox: A cool toolbox of data science libraries in Python and R that you can install on any machine with Amazon EC2 or Vagrant.
  • Geojson.io: Create GeoJSON point, line, and polygon files from scratch. Great web mapping tool.
  • Geomancer.io: Upload a CSV with some sort of geographic identifier column, get out census data. It's magic.
  • A Paragraph: Literate Mapping. Similar to Geojson.io, but lets you use natural language.
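
For reference, the kind of file that Geojson.io produces can also be written by hand. The sketch below builds a one-point GeoJSON FeatureCollection with Python's standard `json` module. The helper `point_feature` and the sample point (approximate coordinates) are illustrative assumptions, not part of any of the tools listed above.

```python
import json


def point_feature(lon, lat, **properties):
    """Build a GeoJSON Feature for a single point (coordinate order is lon, lat)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# A FeatureCollection holding one illustrative point.
collection = {
    "type": "FeatureCollection",
    "features": [point_feature(-71.06, 42.36, name="Boston")],
}

# Serialize to text that can be saved as a .geojson file.
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

Note that GeoJSON stores coordinates as longitude first, then latitude, which is the reverse of how many people write them.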