Geospatial privacy initial cheatsheet

There's more to it than this but this is a decent starting point.

  1. If there are personal identifiers in the dataset, the safest approach is to remove them entirely.
     1. If you need to keep them to enable analytics on an 'anonymous user over time', make sure that you don't use a technique that is easy to reverse, such as unsalted MD5 hashing of a small set of possible identifiers. See the NYC Taxicab debacle for an example.
     2. If you use hashes, use a long salt value and a cryptographically sound hash such as SHA-512.
     3. Or randomize the data order and assign serial (increasing integer) numbers to identifiers.
     4. But really, just removing any kind of identifiable ID is better than trying to obscure it.
  2. Set lower bounds for aggregation: if someone filters an API down to a single record, you may want to return nothing. Otherwise, someone can craft a narrowly filtered query and see just one user's data. See
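A minimal Python sketch of the points above, under some assumptions: the function names, the `min_count=5` threshold, and the record layout are hypothetical and only for illustration. It shows a salted SHA-512 hash, serial numbers assigned after shuffling, and an aggregation that drops groups below a lower bound.

```python
import hashlib
import random
import secrets
from collections import Counter


def salted_hash(identifier: str, salt: bytes) -> str:
    """Hash an identifier with a secret salt using SHA-512.

    The salt must stay secret (or be discarded after processing);
    otherwise a small identifier space can still be brute-forced.
    """
    return hashlib.sha512(salt + identifier.encode("utf-8")).hexdigest()


def serial_pseudonyms(identifiers):
    """Map each distinct identifier to an integer assigned in random order."""
    unique = sorted(set(identifiers))
    # Shuffle so the serial numbers reveal nothing about the original order.
    random.SystemRandom().shuffle(unique)
    return {ident: n for n, ident in enumerate(unique, start=1)}


def counts_with_floor(records, key, min_count=5):
    """Count records per group, dropping groups smaller than min_count."""
    counts = Counter(record[key] for record in records)
    return {group: n for group, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    salt = secrets.token_bytes(32)  # long random salt (assumed 32 bytes)
    records = [{"user": "u1", "zone": "A"}] * 6 + [{"user": "u2", "zone": "B"}]
    print(salted_hash("u1", salt)[:16])
    print(serial_pseudonyms(r["user"] for r in records))
    print(counts_with_floor(records, "zone"))
```

Zone `B` holds a single record here, so the aggregation leaves it out rather than exposing one user's data. Removing the identifier entirely is still the safer option when the analysis allows it.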
@jepio
jepio / minted.py
Last active September 18, 2023 13:58
Pandoc filter to use minted for syntax highlighting
#!/usr/bin/env python3
'''
Filter to wrap Pandoc's CodeBlocks into minted blocks when using latex.
Pandoc's `fence_code_attributes` can be used to provide:
- the language (first class)
- minted's argumentless options (following classes)
- minted's options with arguments (attributes)
'''
@jsmits
jsmits / flake8.xml
Last active August 2, 2018 16:52
PyCharm 3.x Flake8 Configuration XML
<toolSet name="Code Checking">
<tool name="Flake8" showInMainMenu="true" showInEditor="true" showInProject="true" showInSearchPopup="true" disabled="false" useConsole="true" showConsoleOnStdOut="false" showConsoleOnStdErr="false" synchronizeAfterRun="true">
<exec>
<option name="COMMAND" value="/usr/local/bin/flake8" />
<option name="PARAMETERS" value="--max-complexity 10 $FilePath$" />
<option name="WORKING_DIRECTORY" value="$ProjectFileDir$" />
</exec>
<filter>
<option name="NAME" value="Filter 1" />
<option name="DESCRIPTION" />
@nicerobot
nicerobot / install.sh
Last active January 16, 2024 07:34
Whew! QGIS 2 on Mavericks built using only Homebrew packages (i.e. without KyngChaos) \o/ -- WARNING: A _major_ assumption here is that your Homebrew PREFIX is /usr/local . -- DISCLAIMER: It worked for me. YMMV
#!/bin/bash
# Run this:
#
# curl https://gist.github.com/nicerobot/7664605/raw/install.sh | bash -s do-sudo
#
# which will run:
[ -f qgis2-homebrew-build.sh ] || {
curl -O https://gist.github.com/nicerobot/7664605/raw/qgis2-homebrew-build.sh
@fge
fge / geometry.json
Created January 23, 2013 14:28
JSON Schema (v4) for a geometry as defined by GeoJSON
{
"$schema": "http://json-schema.org/draft-04/schema#",
"id": "http://json-schema.org/geojson/geometry.json#",
"title": "geometry",
"description": "One geometry as defined by GeoJSON",
"type": "object",
"required": [ "type", "coordinates" ],
"oneOf": [
{
"title": "Point",
@catawbasam
catawbasam / pandas_dbms.py
Last active February 17, 2024 15:16
Python PANDAS : load and save Dataframes to sqlite, MySQL, Oracle, Postgres
# -*- coding: utf-8 -*-
"""
LICENSE: BSD (same as pandas)
example use of pandas with oracle mysql postgresql sqlite
- updated 9/18/2012 with better column name handling; couple of bug fixes.
- used ~20 times for various ETL jobs. Mostly MySQL, but some Oracle.
to do:
save/restore index (how to check table existence? just do select count(*)?),
finish odbc,
@sgillies
sgillies / geo_interface.rst
Last active April 10, 2024 00:26
A Python Protocol for Geospatial Data

Author: Sean Gillies

Version: 1.0

Abstract

This document describes a GeoJSON-like protocol for geo-spatial (GIS) vector data.
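As a rough illustration of what a GeoJSON-like protocol can look like in Python, the sketch below has an object expose its geometry as a mapping through a `__geo_interface__` attribute. The `Point` class and the `to_mapping` helper are hypothetical names chosen for this example; they are not taken from the document.

```python
class Point:
    """A hypothetical point type that exposes a GeoJSON-like mapping."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @property
    def __geo_interface__(self):
        # Same shape as a GeoJSON Point geometry object.
        return {"type": "Point", "coordinates": (self.x, self.y)}


def to_mapping(obj):
    """Return a GeoJSON-like dict from a dict or a protocol provider."""
    if isinstance(obj, dict):
        return obj
    return obj.__geo_interface__


if __name__ == "__main__":
    print(to_mapping(Point(-105.0, 40.0)))
```

A consumer written this way does not need to know the concrete class, only that the object provides the mapping.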

Introduction