Stephen Richard (smrgeoinfo), Tucson

smrgeoinfo / ERAV for field data.md
Last active Aug 29, 2015
GIST of the ERAV data model

I've tried to extract the 'gist' of this interesting paper about managing crowd-sourced geospatial data; I think it's directly applicable to designing a data system for geologic field observations.

from S. Andrew Sheppard, Andrea Wiggins, and Loren Terveen, 2014, "Capturing Quality: Retaining Provenance for Curated Volunteer Monitoring Data"; accessed at http://wq.io/media/papers/provenance_cscw14.pdf, 2014-03-01.

from the abstract:

a general outline of the workflow tasks common to field-based data collection, and a novel data model for preserving provenance metadata that allows for ongoing data exchange between disparate technical systems and participant skill levels.

Some key points (my take):

[observers] head out into the field to collect data. This step is referred to as the event in our proposed data model. As noted above, this is "the intersection of a person, a bird, a time, and a place" for eBird. Similarly, a River Watch sampling event is the combination of a sampling te…
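To make the event concept concrete, here is a sketch of my own (not from the paper; the field names are assumptions) of an event record as the intersection of an observer, a time, and a place:

// Hypothetical ERAV-style "event" record (my sketch, not the paper's schema):
// the intersection of an observer, a time, and a place; the observed
// attribute/value pairs are attached separately, so later corrections can be
// recorded without overwriting the original field data.
var samplingEvent = {
  observer: 'volunteer-042',                   // who
  timestamp: '2014-05-01T09:30:00Z',           // when
  location: { lat: 32.2217, lon: -110.9265 },  // where
  reports: []                                  // observed attribute/value pairs
};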

smrgeoinfo / hkey-algorithm.js
var _ = require('underscore'),
    assert = require('assert');

// This is the input array of hkeys ("|"-delimited hierarchical keys)
var input = [
  "0001",
  "0001|0001",
  "0001|0002",
  "0001|0002|0001",
  "0001|0003",
  // …
];
smrgeoinfo / ngds-ckan-pacakge.json
{
  "id": "12340kjfha1092412",
  "metadata_modified": "2013-09-13T20:35:03.757357",
  "metadata_created": "2013-09-12T20:35:03.757357",
  "title": "Whatever the title is",
  "notes": "The abstract / description of this thing"
}
smrgeoinfo / WFS-bigData
[originally by Ryan Clark to Christoph and Raj, NGDS project]
You've got 1M points in your system. Here are the places where that's going to bottleneck you:
- Asking Geoserver for WMS or WFS results is going to mean pulling those 1M points from the database and then running them through a pipeline that results in either a) a large XML document (WFS) or b) an image (WMS). In both cases, query results will be held in memory on the server until the processing is complete.
- There is probably much less processing involved in building the XML doc than there is to render the image. However, the next step in the process is sending the result to the client. The XML doc will be huge (either in the "shortened" content model form, or the enormous flat form you mention). It will take forever to get across the wire, and you'll probably start timing out in various parts of the pipeline.
- The data arrives at the client's web browser. In the case of the WMS request, you're fine, because it's a little file and a minimal…
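One standard mitigation (my note, not part of the original email) is to page WFS results rather than requesting all 1M features in one GetFeature call; WFS 2.0 supports count and startIndex parameters for this. A sketch, with a placeholder endpoint and type name:

// Sketch: page through a large WFS layer instead of one giant GetFeature.
var base = 'https://geoserver.example.org/wfs';
var pageSize = 1000;

function pageUrl(startIndex) {
  return base + '?service=WFS&version=2.0.0&request=GetFeature' +
    '&typeNames=ngds:points&outputFormat=application/json' +
    '&count=' + pageSize + '&startIndex=' + startIndex;
}

console.log(pageUrl(0));     // first 1000 features
console.log(pageUrl(1000));  // next page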
smrgeoinfo / schema.js
var _ = require('underscore');

// Schema factory: "additionals" names optional fields to enable,
// "overrides" replaces default field definitions.
module.exports = function (additionals, overrides) {
  additionals = _.isArray(additionals) ? additionals : [];
  overrides = _.isObject(overrides) ? overrides : {};

  // True when the named optional field was requested.
  function conditional (field) {
    return _.contains(additionals, field);
  }
  // …
};
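A usage sketch (my assumption; the preview cuts off before the module's return value, so the second argument's shape is illustrative only):

// Hypothetical invocation of the factory above.
var buildSchema = require('./schema');
var schema = buildSchema(['elevation'], { title: { required: true } });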
smrgeoinfo / color-map.csv
unit colors
Cba #ffebeb,#99b380
Cm #ebebcc,#ffccde
Ct #ffb3cc,#ccde66
Dtb #664dff
H20 #ebffff
IPMs #99ebde
Jc #ccffde
Je #cccc00
Jk #99de99
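A sketch of my own for consuming rows like these: split on the space to get the geologic unit symbol, then on commas to get its one or more hex colors.

// Sketch: parse "unit color1[,color2]" rows into { unit: [colors…] }.
var rows = ['Cba #ffebeb,#99b380', 'Dtb #664dff'];
var colorMap = {};
rows.forEach(function (row) {
  var parts = row.split(' ');                 // unit symbol, then color list
  colorMap[parts[0]] = parts[1].split(',');   // one or more hex colors
});
// colorMap.Cba -> ['#ffebeb', '#99b380']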
smrgeoinfo / owslib-metadata-mapping.json
{
  "identifier": "gmd:fileIdentifier/gco:CharacterString",
  "parentidentifier": "gmd:parentIdentifier/gco:CharacterString",
  "language": "gmd:language/gco:CharacterString",
  "dataseturi": "gmd:dataSetURI/gco:CharacterString",
  "languagecode": "gmd:language/gmd:LanguageCode",
  "datestamp": "gmd:dateStamp/gco:Date or gmd:dateStamp/gco:DateTime",
  "charset": "gmd:characterSet/gmd:MD_CharacterSetCode/@codeListValue",
  "hierarchy": "gmd:hierarchyLevel/gmd:MD_ScopeCode/@codeListValue",
  "contact": {