@timrobertson100
Created November 18, 2019 12:22
Diagnosing id duplication

Using the lookup tool on c5gateway-vh.gbif.org we can get the keys for the id 1668748136:

12:06:39 UTC c5gateway-vh /usr/local/bin $ ./lookup-occurrence-key 1668748136
Lookup 1668748136 with dataset key from API 97bd086a-cf43-11e2-a9b3-00145eb45e9a
 27:97bd086a-cf43-11e2-a9b3-00145eb45e9a|JMRC|JMRCfungicoll|JMRC:FSU:02570 / 14837 / 750|null column=o:i, timestamp=1553909664771, value=\x00\x00\x00\x00cw\x13h
 73:97bd086a-cf43-11e2-a9b3-00145eb45e9a|http://id.snsb.info/ext/14837/14837/5004 column=o:i, timestamp=1563244584180, value=\x00\x00\x00\x00cw\x13h
 74:97bd086a-cf43-11e2-a9b3-00145eb45e9a|http://id.snsb.info/ext/14837/14837/5005 column=o:i, timestamp=1563244586420, value=\x00\x00\x00\x00cw\x13h
 75:97bd086a-cf43-11e2-a9b3-00145eb45e9a|http://id.snsb.info/ext/14837/14837/5006 column=o:i, timestamp=1553909265952, value=\x00\x00\x00\x00cw\x13h
 76:97bd086a-cf43-11e2-a9b3-00145eb45e9a|http://id.snsb.info/ext/14837/14837/5007 column=o:i, timestamp=1563244589868, value=\x00\x00\x00\x00cw\x13h
 77:97bd086a-cf43-11e2-a9b3-00145eb45e9a|http://id.snsb.info/ext/14837/14837/5008 column=o:i, timestamp=1563244566807, value=\x00\x00\x00\x00cw\x13h
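Every row carries the same `value`, printed as escaped bytes. A minimal sketch, assuming the value is an 8-byte big-endian integer, shows that it decodes to the id that was looked up:

```python
# Bytes exactly as printed in the `value` field of the lookup output.
# "c", "w" and "h" are the printable ASCII forms of 0x63, 0x77 and 0x68.
raw = b"\x00\x00\x00\x00cw\x13h"

# Assumption: the cell stores the occurrence key as an 8-byte big-endian integer.
occurrence_key = int.from_bytes(raw, byteorder="big", signed=True)

print(occurrence_key)  # 1668748136
```

So all six lookup keys resolve to the single occurrence key 1668748136.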

The timestamps are in milliseconds since the Unix epoch, so strip the last 3 digits to get seconds and convert them using:

 date -d @1563244589
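The same conversion can be done for all six rows at once. The sketch below treats the 13-digit cell timestamps as milliseconds since the Unix epoch (dropping three digits gives seconds) and prints them in UTC, oldest first. The dictionary is copied by hand from the lookup output, and `to_utc` is a helper name chosen for this example.

```python
from datetime import datetime, timezone

# Cell timestamps copied from the lookup output (milliseconds since the epoch).
timestamps_ms = {
    "JMRC|JMRCfungicoll|JMRC:FSU:02570 / 14837 / 750": 1553909664771,
    "http://id.snsb.info/ext/14837/14837/5004": 1563244584180,
    "http://id.snsb.info/ext/14837/14837/5005": 1563244586420,
    "http://id.snsb.info/ext/14837/14837/5006": 1553909265952,
    "http://id.snsb.info/ext/14837/14837/5007": 1563244589868,
    "http://id.snsb.info/ext/14837/14837/5008": 1563244566807,
}


def to_utc(ts_ms: int) -> str:
    """Format a millisecond epoch timestamp as a UTC date-time string."""
    seconds = ts_ms // 1000  # drop the millisecond part
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# The fixed-width date format sorts chronologically as plain text.
rows = sorted((to_utc(ts), key) for key, ts in timestamps_ms.items())
for when, key in rows:
    print(when, key)
```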

This shows us that for 1668748136 we have:

| Timestamp (UTC)     | Type         | Key                                               |
|---------------------|--------------|---------------------------------------------------|
| 2019-03-30 01:27:45 | occurrenceID | http://id.snsb.info/ext/14837/14837/5006          |
| 2019-03-30 01:34:24 | Triplet      | JMRC\|JMRCfungicoll\|JMRC:FSU:02570 / 14837 / 750 |
| 2019-07-16 02:36:06 | occurrenceID | http://id.snsb.info/ext/14837/14837/5008          |
| 2019-07-16 02:36:24 | occurrenceID | http://id.snsb.info/ext/14837/14837/5004          |
| 2019-07-16 02:36:26 | occurrenceID | http://id.snsb.info/ext/14837/14837/5005          |
| 2019-07-16 02:36:29 | occurrenceID | http://id.snsb.info/ext/14837/14837/5007          |

@MattBlissett

Attempt 125

            <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/5004</abcd:UnitGUID>
            <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
            <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
            <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>

            <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/5005</abcd:UnitGUID>
            <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
            <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
            <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>

            <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/5006</abcd:UnitGUID>
            <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
            <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
            <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>

            <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/5007</abcd:UnitGUID>
            <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
            <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
            <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>

            <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/5008</abcd:UnitGUID>
            <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
            <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
            <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>
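The five units above carry distinct `UnitGUID` values but the same `SourceInstitutionID`, `SourceID` and `UnitID`. A rough sketch of how such collisions could be spotted in a response follows. It is not part of any existing GBIF tooling: the namespace URI, the `abcd:Units`/`abcd:Unit` wrapper elements and the function name `guids_by_triplet` are assumptions made for the example, and the sample document is generated from the values shown above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed namespace URI for the abcd: prefix; adjust to match the real response.
ABCD = "http://www.tdwg.org/schemas/abcd/2.06"
NS = {"abcd": ABCD}

# Hypothetical wrapper around the five units shown above.
UNIT = """  <abcd:Unit>
    <abcd:UnitGUID>http://id.snsb.info/ext/14837/14837/{n}</abcd:UnitGUID>
    <abcd:SourceInstitutionID>JMRC</abcd:SourceInstitutionID>
    <abcd:SourceID>JMRCfungicoll</abcd:SourceID>
    <abcd:UnitID>JMRC:FSU:02570 / 14837 / 750</abcd:UnitID>
  </abcd:Unit>
"""
SAMPLE = (
    f'<abcd:Units xmlns:abcd="{ABCD}">\n'
    + "".join(UNIT.format(n=n) for n in range(5004, 5009))
    + "</abcd:Units>"
)


def guids_by_triplet(xml_text):
    """Group UnitGUID values by (institution, source, unit id) triplet."""
    groups = defaultdict(list)
    for unit in ET.fromstring(xml_text).iter(f"{{{ABCD}}}Unit"):
        triplet = tuple(
            unit.findtext(f"abcd:{tag}", default="", namespaces=NS)
            for tag in ("SourceInstitutionID", "SourceID", "UnitID")
        )
        groups[triplet].append(
            unit.findtext("abcd:UnitGUID", default="", namespaces=NS)
        )
    return groups


# Keep only triplets that are shared by more than one distinct GUID.
collisions = {
    triplet: guids
    for triplet, guids in guids_by_triplet(SAMPLE).items()
    if len(set(guids)) > 1
}
for triplet, guids in collisions.items():
    print("|".join(triplet), "->", len(guids), "GUIDs")
```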

@MattBlissett

So we can't and don't validate BioCASe for unique GUIDs.
