Markus Döring mdoering

@mdoering
mdoering / ChecklistValidationReport.java
Last active August 29, 2015 14:06
ValidationReport for both checklists and occurrences
package org.gbif.api.model.crawler;
import java.util.List;
import com.google.common.base.Objects;
public class ChecklistValidationReport {
// the number of records checked in the validation
private final int checkedRecords;
mdoering / gist:e19c813d04466b637cf9
Created October 28, 2014 14:34
Minutes of Darwin Core (DwC) Nomenclature - TDWG 2014

issue 1, Nomen

If nothing is decided, NomenclaturalChecklist would be removed. Further discussion is needed, as not all consequences have been fully discussed. Decision: remove term!

issue 2 typifiedName

  • move to Identification group (Rich)
  • typeStatus definition needs to be updated
  • we should also have a new term to declare the typeSpecies/typeGenus (Walter)
package org.gbif.registry.doi;
import org.gbif.api.model.common.DOI;
import org.gbif.doi.metadata.datacite.DataCiteMetadata;
import java.util.UUID;
/**
* Registry internal service that guarantees to issue unique new DOIs and deals with scheduling
* DOI metadata updates and registration via RabbitMQ.
package org.gbif.registry.doi;
import org.gbif.api.model.common.DOI;
import org.gbif.doi.metadata.datacite.DataCiteMetadata;
import org.gbif.doi.service.DoiStatus;
import java.net.URI;
/**
* A message to request an update to a DOI's metadata and target URL in DataCite.
mdoering / gbif-download-all.xml
Created February 5, 2015 15:39
Large DataCite Metadata document
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<resource xsi:schemaLocation="http://datacite.org/schema/kernel-3 http://schema.datacite.org/meta/kernel-3/metadata.xsd" xmlns="http://datacite.org/schema/kernel-3" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<identifier identifierType="DOI">10.15468/dl.ykyay0</identifier>
<creators>
<creator>
<creatorName>occdownload gbif.org</creatorName>
</creator>
</creators>
<titles>
<title>GBIF Occurrence Download</title>
package org.gbif.markus.udf;
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import com.google.common.base.Strings;
import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.hive.ql.exec.Description;
curl -i -u occdownload.gbif.org:occdownload1 -H "Content-Type: application/json" -X PUT -d @download.json http://apps2.gbif-uat.org:8084/occurrence/download/0000478-150805223722583
mdoering / JdbcArrayTest.java
Last active February 18, 2016 09:24
Postgres JDBC text array behavior
import java.sql.Array;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
/**
*
*/
mdoering / datum.md
Last active February 25, 2016 16:19

Reprojecting coordinates according to their geodetic datum

Darwin Core has long had a term to declare the exact geodetic datum used for a given coordinate. Quite a few data publishers have used dwc:geodeticDatum http://rs.tdwg.org/dwc/terms/index.htm#geodeticDatum for some time to publish the datum of their location coordinates.

Until now GBIF has treated all coordinates as if they were in WGS84 http://en.wikipedia.org/wiki/World_Geodetic_System, the widespread global standard datum used by the Global Positioning System (GPS). Accordingly, locations given in a different datum, for example NAD27 or AGD66, were slightly displaced on GBIF maps. This so-called "datum shift" is not dramatic, but it can reach a few hundred metres depending on the location and datum. The University of Colorado has a nice visualization of the impact: http://www.colorado.edu/geography/gcraft/notes/datum/datum_f.html
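
To make the size of such a shift concrete, here is a minimal Java sketch of one common way to move a NAD27 coordinate into WGS84. It is not GBIF's implementation. It assumes a simple three-parameter geocentric translation, the Clarke 1866 ellipsoid for NAD27, and the widely published mean NAD27 to WGS84 shift for the contiguous US (dx = -8 m, dy = 160 m, dz = 176 m). The class and method names are hypothetical, and a production system would normally rely on a geodesy library with region-specific parameters.

```java
/** Hypothetical sketch: NAD27 to WGS84 via a three-parameter geocentric translation. */
final class DatumShiftSketch {
    // Clarke 1866 ellipsoid, the reference ellipsoid of NAD27.
    static final double CLARKE_A = 6378206.4;
    static final double CLARKE_F = 1.0 / 294.9786982;
    // WGS84 ellipsoid.
    static final double WGS84_A = 6378137.0;
    static final double WGS84_F = 1.0 / 298.257223563;
    // Assumed mean NAD27 -> WGS84 shift for the contiguous US, in metres.
    static final double DX = -8.0, DY = 160.0, DZ = 176.0;

    /** Geodetic degrees plus ellipsoidal height to Earth-centred Cartesian metres. */
    static double[] toEcef(double latDeg, double lonDeg, double h, double a, double f) {
        double e2 = f * (2 - f); // first eccentricity squared
        double lat = Math.toRadians(latDeg);
        double lon = Math.toRadians(lonDeg);
        double n = a / Math.sqrt(1 - e2 * Math.sin(lat) * Math.sin(lat));
        return new double[] {
            (n + h) * Math.cos(lat) * Math.cos(lon),
            (n + h) * Math.cos(lat) * Math.sin(lon),
            (n * (1 - e2) + h) * Math.sin(lat)
        };
    }

    /** Earth-centred Cartesian metres to {latDeg, lonDeg, height}; not valid at the poles. */
    static double[] toGeodetic(double x, double y, double z, double a, double f) {
        double e2 = f * (2 - f);
        double p = Math.hypot(x, y);
        double lat = Math.atan2(z, p * (1 - e2)); // initial guess assuming zero height
        double h = 0.0;
        for (int i = 0; i < 10; i++) { // fixed-point iteration, converges quickly
            double n = a / Math.sqrt(1 - e2 * Math.sin(lat) * Math.sin(lat));
            h = p / Math.cos(lat) - n;
            lat = Math.atan2(z, p * (1 - e2 * n / (n + h)));
        }
        return new double[] {Math.toDegrees(lat), Math.toDegrees(Math.atan2(y, x)), h};
    }

    /** Reprojects a NAD27 latitude/longitude (height assumed 0) to WGS84. */
    static double[] nad27ToWgs84(double latDeg, double lonDeg) {
        double[] c = toEcef(latDeg, lonDeg, 0.0, CLARKE_A, CLARKE_F);
        return toGeodetic(c[0] + DX, c[1] + DY, c[2] + DZ, WGS84_A, WGS84_F);
    }
}
```

Calling `DatumShiftSketch.nad27ToWgs84(40.0, -105.0)` for a point in Colorado returns a WGS84 position a few tens of metres away from the input, mostly in longitude. A single mean translation is only a rough approximation; grid-based transformations are more accurate where they are available.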

At GBIF we thought it is about time now to interpret the geodeticDatum and reproject all coordinates as goo