Tim Robertson timrobertson100
View delme.txt
1) Verify the correct mailing list is in place (TR / DM)
2) Ensure the participants in the Kilkenny Accord are happy for it to be finalised (DM mail to list)
- with the change that the 5,000€ is not a "hard limit"
3) Draft a communication to be sent to mailing list covering (DM)
- The Kilkenny Accord status and share it
View csv-duplicate.sql
ADD JAR /tmp/hadoop-compress-1.3-SNAPSHOT.jar;
ADD JAR /tmp/occurrence-hive-0.89-20181017.084448-7.jar;
ADD JAR /tmp/brickhouse-0.6.0.jar;
ADD JAR /tmp/occurrence-common-0.89-20181017.084442-7.jar;
ADD JAR /tmp/gbif-api-0.72-20181012.105547-3.jar;
SET io.seqfile.compression.type=BLOCK;
SET mapred.output.compression.codec=org.gbif.hadoop.compress.d2.D2Codec;
SET io.compression.codecs=org.gbif.hadoop.compress.d2.D2Codec;
timrobertson100 / taxstatus-top100.csv
Created Oct 1, 2018
Top 100 by occurrence count of taxonomicStatus in occurrence data
View taxstatus-top100.csv
taxonomicstatus c
NULL 987614012
accepted 21892669
Aceptado 3513846
accepted name 1213782
Accepted 956810
valid 675813
Temporal 317255
válido 277952
timrobertson100 / top.csv
Created Sep 19, 2018
Top 100 dynamic properties
View top.csv
v_dynamicproperties count
NULL 967996798
{"Activity":"Forage"} 2922013
"{'coverScaleCode':'+'}" 2440492
"{'coverScaleCode':'r'}" 1456845
"{'coverScaleCode':'1'}" 1428278
{} 1075352
{"Activity":"Display/Song"} 870676
{"Activity":"Resting"} 674730
"{'coverScaleCode':'3'}" 648481
timrobertson100 /
Last active Aug 13, 2018
ElasticsearchIOIT notes to self


Notes to self while reviewing the PR for BEAM-5107.

Maven instructions

The Javadoc in ElasticsearchIOITcommon refers to mvn. As a quick workaround I used the following hack(!).

I added this to elasticsearch-tests-common/build.gradle:

timrobertson100 /
Last active Jun 13, 2018
Decoder of kudu operations (UNTESTED!!!)
* Decodes the protobuf bytes into {@link Operation} instances.
* <p>The encoded format is defined as follows:
* <ol>
* <li>"rows" is a byte array encoding of:
* <ol>
* <li>The operation type (e.g. Upsert) encoded as a byte
* <li>The "isSet" bitSet encoded as one or more bytes
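The first two fields described above (a one-byte operation type followed by the "isSet" bitset) could be read roughly as in the sketch below. It is illustration only and, like the gist, untested against real Kudu payloads. The class name `OperationHeaderSketch`, the `columnCount` parameter, and the least-significant-bit-first bit order are all assumptions, not something the preview states.

```java
import java.nio.ByteBuffer;
import java.util.BitSet;

/** Hypothetical holder for the first two fields of one encoded operation. */
class OperationHeaderSketch {
  final byte typeCode; // raw operation type byte; mapping to Upsert etc. is not shown here
  final BitSet isSet;  // bit i set => column i has a value (assumed meaning)

  OperationHeaderSketch(byte typeCode, BitSet isSet) {
    this.typeCode = typeCode;
    this.isSet = isSet;
  }

  /**
   * Reads the type byte, then ceil(columnCount / 8) bytes of bitset.
   * Assumes bit i lives in byte i / 8 at mask (1 << (i % 8)), which is
   * the layout BitSet.valueOf(byte[]) expects.
   */
  static OperationHeaderSketch read(ByteBuffer rows, int columnCount) {
    byte typeCode = rows.get();
    byte[] bits = new byte[(columnCount + 7) / 8];
    rows.get(bits);
    return new OperationHeaderSketch(typeCode, BitSet.valueOf(bits));
  }
}
```

After `read` returns, the buffer is positioned at whatever follows the bitset, so a caller could continue decoding the remaining fields from there.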
timrobertson100 /
Created May 21, 2018
Example for Tim Cook of the apache/beam channel
package com.opencore.demo;
import org.apache.beam.sdk.testing.PAssert;
import org.apache.beam.sdk.testing.TestPipeline;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.PCollection;
import org.junit.Rule;
void processWithRetry(
String collection, UpdateRequest request, int numberAttempts, Duration maxDuration)
throws IOException, InterruptedException {
// TODO: sanitize those params
request.setBasicAuthCredentials(username, password);
Sleeper sleeper = Sleeper.DEFAULT;
// Note: FluentBackoff counts retries excluding the original while we count attempts
// to remove any notion of ambiguity (hence the -1)
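The "-1" mentioned in the comment above can be shown with a plain-Java sketch that does not depend on Beam. It only illustrates the relationship attempts = retries + 1. The class `AttemptCountingSketch`, its method names, and the fixed pause are hypothetical stand-ins, not the code from the gist and not the FluentBackoff API.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: the caller counts attempts, the loop counts retries. */
class AttemptCountingSketch {

  /** An attempt count includes the original call, so retries = attempts - 1. */
  static int retriesFor(int numberAttempts) {
    if (numberAttempts < 1) {
      throw new IllegalArgumentException("numberAttempts must be >= 1");
    }
    return numberAttempts - 1;
  }

  /** Calls the action up to numberAttempts times, pausing between failures. */
  static <T> T callWithAttempts(Callable<T> action, int numberAttempts, long pauseMillis)
      throws Exception {
    int retriesLeft = retriesFor(numberAttempts);
    while (true) {
      try {
        return action.call();
      } catch (Exception e) {
        if (retriesLeft-- == 0) {
          throw e; // attempts exhausted: surface the last failure
        }
        Thread.sleep(pauseMillis); // fixed pause; a real backoff would grow this
      }
    }
  }
}
```

With `numberAttempts` set to 3, the action runs at most three times: the original call plus two retries.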
View DatasetProcessStatusMapper.xml
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE mapper PUBLIC "-// Mapper 3.0//EN" "" >
<mapper namespace="org.gbif.registry.persistence.mapper.DatasetProcessStatusMapper">
<resultMap id="CRAWL_JOB_MAP" type="CrawlJob">
<idArg column="dataset_key" javaType="java.util.UUID" jdbcType="OTHER"/>
<idArg column="attempt" javaType="int"/>
<arg column="endpoint_type" javaType="org.gbif.api.vocabulary.EndpointType" jdbcType="OTHER"/>
<arg column="target_url" javaType=""/>
View lineage.txt
interpretedOccurrence: {
id: 123,
decimalLatitude: 12.3445,
lineage: [
field: decimalLatitude,
source: rawOccurrence,
fields: [verbatimLatitude, verbatimLongitude, decimalLatitude, decimalLongitude, geodeticDatum, country, stateProvince],
services: [