John Wieczorek tucotuco

  • Museum of Vertebrate Zoology
  • UC Berkeley
tucotuco / KUFish-Sequences.txt
Created September 11, 2014 18:42
Custom tabular export from OpenRefine for the KU Fish Tissues Darwin Core Archive, processed to create a MIMARKS Specimen extension from the GenBank associatedSequences.
id isol_growth_condt biotic_relationship alt_elev submitted_to_insdc materialSampleID url seq_meth investigation_type target gene project_name ref_biomaterial collection_date geo_loc_name lat_lon
497cc487-eb99-421b-9591-23f52ea5246b not recorded native 1 497cc487-eb99-421b-9591-23f52ea5246b http://www.ncbi.nlm.nih.gov/nuccore/HQ168587 Sanger mimarks-specimen cytochrome oxidase subunit 1 University of Kansas Biodiversity Institute Fish Tissue Collection 2011. Evolution of a Neotropical marine fish lineage (Subfamily Chaenopsinae, Suborder Blennioidei) based on phylogenetic analysis of combined molecular and morphological data. Mol. Phylogenet. Evol.. 60(2): 236-248; 2013. Phylogeny and biogeography of a shallow water fish clade (Teleostei: Blenniiformes). BMC Evo. Biol.. 13:210: 1-18 2002-01-27 Fiji, Viti Levu -18.1483333, 178.3985
tucotuco / MimarksSpecimenFromDwCAExportSpecification.json
Created September 11, 2014 18:34
OpenRefine Custom Tabular Exporter specification for MIMARKS Specimen data integration in a Darwin Core Archive
{
  "format": "tsv",
  "separator": "\t",
  "lineSeparator": "\n",
  "encoding": "UTF-8",
  "outputColumnHeaders": true,
  "outputBlankRows": false,
  "columns": [
    {
      "name": "id",
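The options visible in this preview (tab separator, "\n" line separator, a header row, blank rows skipped) can be mimicked outside OpenRefine. The following is a minimal Python sketch, assuming rows are plain dictionaries. The function name `export_tsv` and the sample rows are hypothetical, and this is not how OpenRefine implements its exporter.

```python
import csv
import io
from typing import Dict, Iterable, List


def export_tsv(rows: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Serialize rows as tab-separated text with a header row, skipping blank rows."""
    buffer = io.StringIO()
    # "separator": "\t" and "lineSeparator": "\n"
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)  # "outputColumnHeaders": true
    for row in rows:
        values = [row.get(name, "") for name in columns]
        if not any(value.strip() for value in values):
            continue  # "outputBlankRows": false
        writer.writerow(values)
    return buffer.getvalue()


# Hypothetical sample data: one real-looking row and one blank row.
sample_rows = [
    {"id": "497cc487-eb99-421b-9591-23f52ea5246b", "seq_meth": "Sanger"},
    {"id": "", "seq_meth": ""},
]
tsv_text = export_tsv(sample_rows, ["id", "seq_meth"])
```

The `"encoding": "UTF-8"` option would apply when the returned text is written to a file, for example with `open(path, "w", encoding="utf-8")`.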
tucotuco / MimarksSpecimenFromDwCA.json
Created September 11, 2014 18:19
OpenRefine operation history for MIMARKS Specimen data integration in a Darwin Core Archive
[
  {
    "op": "core/column-move",
    "description": "Move column associatedSequences to position 0",
    "columnName": "associatedSequences",
    "index": 0
  },
  {
    "op": "core/column-move",
    "description": "Move column catalogNumber to position 0",
SELECT institutioncode, basisofrecord, COUNT(institutioncode) AS reps
FROM [dumps.full]
WHERE basisofrecord NOT IN ('PreservedSpecimen', 'FossilSpecimen')
GROUP BY institutioncode, basisofrecord
ORDER BY reps DESC
LIMIT 1000
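The query above counts records per (institutioncode, basisofrecord) pair, leaving out two basisofrecord values, and lists the largest groups first. The same aggregation can be sketched in Python over in-memory rows. The sample records and the name `count_other_bases` are made up for illustration, the table `dumps.full` is not queried, and the sketch assumes neither field is null.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# Values excluded by the WHERE clause.
EXCLUDED = {"PreservedSpecimen", "FossilSpecimen"}


def count_other_bases(
    records: Iterable[Dict[str, str]], limit: int = 1000
) -> List[Tuple[str, str, int]]:
    """Count records per (institutioncode, basisofrecord), largest groups first."""
    counts = Counter(
        (record["institutioncode"], record["basisofrecord"])
        for record in records
        if record["basisofrecord"] not in EXCLUDED
    )
    # most_common() sorts by count, descending, and applies the row limit.
    return [(inst, basis, reps) for (inst, basis), reps in counts.most_common(limit)]


# Hypothetical sample rows, not taken from any real dump.
sample_records = [
    {"institutioncode": "MVZ", "basisofrecord": "HumanObservation"},
    {"institutioncode": "MVZ", "basisofrecord": "HumanObservation"},
    {"institutioncode": "KU", "basisofrecord": "MachineObservation"},
    {"institutioncode": "KU", "basisofrecord": "PreservedSpecimen"},
]
summary = count_other_bases(sample_records)
```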
tucotuco / gist:bd68c868c2aed03c8bee
Created July 3, 2014 01:32
Errors deleting nmnh/amphibians&reptiles documents - VertNet indexer in AppEngine
Partial failure....
2014-07-02 20:05:08.126 /index-delete-resource 200 19020ms 0kb AppEngine-Google; (+http://code.google.com/appengine) module=default version=indexer
0.1.0.2 - - [02/Jul/2014:16:05:08 -0700] "POST /index-delete-resource HTTP/1.1" 200 457 "http://indexer.vertnet-portal.appspot.com/index-delete-resource" "AppEngine-Google; (+http://code.google.com/appengine)" "indexer.vertnet-portal.appspot.com" ms=19020 cpu_ms=93 cpm_usd=0.100051 queue_name=index-delete-resource task_name=8128078467183889543 app_engine_release=1.9.6 trace_id=9692b3bd02d15e1f816494e545edfd85 instance=00c61b117cbe5b1f090bf6a6aebe2d9fb29b27
I 2014-07-02 20:04:49.109
Deleting resource:<br>Namespace: index-2014-03-12<br>Index_name: dwc<br>Resource: nmnh/nmnh-amphibians-reptiles<br>Batch size: <br>Max delete: <br>Dry run: <br>
I 2014-07-02 20:04:49.109
Query: resource:nmnh/nmnh-amphibians-reptiles namespace: index-2014-03-12 index: dwc
E 2014-07-02 20:04:59.067
Search ERROR on query: resource:nmnh/nmnh-amphibians-reptiles limit: 2
tucotuco / gist:44fa6620976b5b048c15
Created July 2, 2014 21:57
Normal initiation of new resource deletion - VertNet indexer on AppEngine
2014-07-02 18:49:25.530 /index-delete-resource?resource=nmnh/nmnh-amphibians-reptiles&index_name=dwc&namespace=index-2014-03-12 200 4270ms 0kb Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36 module=default version=indexer
181.166.225.49 - gtuco.btuco [02/Jul/2014:14:49:25 -0700] "GET /index-delete-resource?resource=nmnh/nmnh-amphibians-reptiles&index_name=dwc&namespace=index-2014-03-12 HTTP/1.1" 200 311 - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36" "indexer.vertnet-portal.appspot.com" ms=4270 cpu_ms=70 cpm_usd=0.050035 app_engine_release=1.9.6 trace_id=991582a286527b3a279820da4b75e7da instance=00c61b117c8e3f396df83f6cbbdf11e095208f
I 2014-07-02 18:49:21.284
Deleting resource:<br>Namespace: index-2014-03-12<br>Index_name: dwc<br>Resource: nmnh/nmnh-amphibians-reptiles<br>Batch size: <br>Max delete: <br>Dry run: <br>
I 2014-07-02 18:49:21.284
Query: resource:nmn
tucotuco / gist:4b681fdfb0740355f0b2
Last active August 29, 2015 14:03
App Engine Error Log deleting NMNH Mammals - VertNet indexer on AppEngine
2014-07-02 18:35:33.248 /index-delete-resource?resource=nmnh/nmnh-mammals&index_name=dwc&namespace=index-2014-03-12 200 29903ms 0kb Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36 module=default version=indexer
181.166.225.49 - gtuco.btuco [02/Jul/2014:14:35:33 -0700] "GET /index-delete-resource?resource=nmnh/nmnh-mammals&index_name=dwc&namespace=index-2014-03-12 HTTP/1.1" 200 220 - "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_8) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36" "indexer.vertnet-portal.appspot.com" ms=29904 cpu_ms=47 cpm_usd=0.150025 app_engine_release=1.9.6 trace_id=0b919ff2c264c1300e9f5ee336ba219f instance=00c61b117c8e3f396df83f6cbbdf11e095208f
I 2014-07-02 18:35:03.367
Deleting resource:<br>Namespace: index-2014-03-12<br>Index_name: dwc<br>Resource: nmnh/nmnh-mammals<br>Batch size: <br>Max delete: <br>Dry run: <br>
I 2014-07-02 18:35:03.367
Query: resource:nmnh/nmnh-mammals namespace: index-20
if len(ids) < 1:  # Didn't find any matches in this batch.
    if len(docs) == 100:
        next_id = docs[-1].doc_id
    else:
        index.delete(id)
        return
else:  # Matches found, delete them.
    blast, next_id = ids[:-1], ids[-1]
    index.delete(blast)
params = dict(index_name=index_name, namespace=namespace, id=next_id,
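The fragment above appears to walk an index in batches of 100, deleting matching documents and carrying the last id forward as the starting point for the next batch. A self-contained sketch of that batching pattern follows. It is a simplification under stated assumptions: `InMemoryIndex`, `get_range`, and `delete_resource` are hypothetical names, a plain loop stands in for whatever schedules the next batch, and none of this is the App Engine Search API or the VertNet indexer's actual code.

```python
from typing import Dict, List, Optional


class InMemoryIndex:
    """Hypothetical stand-in for a search index: maps doc id to resource name."""

    def __init__(self, docs: Dict[str, str]) -> None:
        self.docs = dict(docs)

    def get_range(self, after_id: Optional[str], limit: int) -> List[str]:
        """Return up to `limit` doc ids in sorted order, strictly after `after_id`."""
        ids = sorted(self.docs)
        if after_id is not None:
            ids = [doc_id for doc_id in ids if doc_id > after_id]
        return ids[:limit]

    def delete(self, doc_ids: List[str]) -> None:
        for doc_id in doc_ids:
            self.docs.pop(doc_id, None)


def delete_resource(index: InMemoryIndex, resource: str, batch_size: int = 100) -> int:
    """Delete every document belonging to `resource`, one batch at a time."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    deleted = 0
    cursor: Optional[str] = None
    while True:
        batch = index.get_range(cursor, batch_size)
        matches = [doc_id for doc_id in batch if index.docs[doc_id] == resource]
        index.delete(matches)
        deleted += len(matches)
        if len(batch) < batch_size:
            return deleted  # A short batch means the end of the index was reached.
        cursor = batch[-1]  # Resume after the last id seen, matched or not.


# Hypothetical data: five documents, three of them from the target resource.
index = InMemoryIndex({
    "a1": "nmnh/nmnh-mammals",
    "a2": "other/resource",
    "a3": "nmnh/nmnh-mammals",
    "a4": "nmnh/nmnh-mammals",
    "a5": "other/resource",
})
removed = delete_resource(index, "nmnh/nmnh-mammals", batch_size=2)
```

Using the last id seen as an exclusive cursor keeps the walk stable even though deletions shrink the index between batches.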
user=> (use 'gulo.harvest)
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Warning: *default-encoding* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *default-encoding* or change the name. (clojure/contrib/io.clj:73)
Warning: *buffer-size* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *buffer-size* or change the name. (clojure/contrib/io.clj:79)
Warning: *byte-array-type* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please either indicate ^:dynamic *byte-array-type* or change the name. (clojure/contrib/io.clj:84)
Warning: *char-array-type* not declared dynamic and thus is not dynamically rebindable, but its name suggests otherwise. Please eithe
gulo.harvest=> (sync-resource-table)
java.io.IOException: Server returned HTTP response code: 400 for URL: https://vertnet.cartodb.com/api/v1/sql/
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1625)
at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:254)
at com.cartodb.impl.ApiKeyCartoDBClient.executeQuery(ApiKeyCartoDBClient.java:106)
at cartodb.core$apikey_execute.doInvoke(core.clj:25)
at clojure.lang.RestFn.invoke(RestFn.java:445)
at cartodb.core$query.doInvoke(core.clj:47)
at clojure.lang.RestFn.invoke(RestFn.java:464)
at gulo.harvest$sync_resource_table.invoke(harvest.clj:204)