SIAD UNSW_ENT import into ALA sandbox
# Original header:
# Name Country State or province Secondary subdivision Locality Latitude Longitude Elevation meters Coll. start date Coll. end date Collector UID USI Number of specimens Sex Type status DeterminationVB DetBy DetDate Host Family Host name Host authority Host det HerbID Depository Host photo Locality photo _end
# Becomes:
# scientificName country stateProvince county verbatimLocality decimalLatitude decimalLongitude minimumElevationInMeters Coll. start date eventDate recordedBy catalogNumber recordNumber individualCount lifeStage sex typeStatus DeterminationVB identifiedBy dateIdentified Host Family Host name Host authority Host det HerbID Depository Host photo Locality photo _end associatedTaxa eventDate associatedMedia
# Break up the life stage/sex column
# %s/^\(\(\([^\t]*\)\t\)\{14\}\)\([^\ ]*\) \([^\t]*\)/\1\4\t\5/g
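# For checking the split outside Vim, the same step can be sketched in Python.
# This is a minimal sketch, not the original method: the function name is
# hypothetical, and the pattern is slightly stricter than the Vim one because
# it never lets the life-stage part run across a tab into a later column.

```python
import re

# Skip 14 tab-terminated fields, then split the 15th field at its first space.
# Assumption: the 15th field holds "<life stage> <sex>", as in the Vim command.
SPLIT_STAGE_SEX = re.compile(r"^((?:[^\t]*\t){14})([^ \t]*) ([^\t]*)")


def split_life_stage_sex(line: str) -> str:
    """Return the line with the 15th field split into two tab-separated fields.

    Lines whose 15th field contains no space are returned unchanged.
    """
    return SPLIT_STAGE_SEX.sub(
        lambda m: f"{m.group(1)}{m.group(2)}\t{m.group(3)}", line, count=1
    )


if __name__ == "__main__":
    # Made-up row: 14 placeholder fields, then a combined life stage/sex field.
    row = "\t".join([f"c{i}" for i in range(1, 15)] + ["adult male", "Holotype"])
    print(split_life_stage_sex(row).split("\t")[14:])
```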
# Append an associatedTaxa column, built from: "Host Family Host name (Host authority)" columns
# %s/^\(\(\([^\t]*\)\t\)\{20\}\)\([^\t]*\)\t\([^\t]*\)\t\([^\t]*\)\(.*\)$/\1\4\t\5\t\6\7\t\4 \5 (\6)/g
# Remove the empty " ()" left behind when a row has no host data
# %s/\t* *()$//g
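# The associatedTaxa step can also be sketched in Python by splitting on tabs.
# This is an illustration under assumptions, not the original method: the
# function name is hypothetical, the host columns are assumed to be fields
# 21-23 (after the life stage/sex split), and unlike the Vim cleanup it keeps
# an empty trailing column when a row has no host data.

```python
def append_associated_taxa(line: str) -> str:
    """Append "Host Family Host name (Host authority)" as a new last column.

    Assumes a tab-separated line without a trailing newline. Lines with fewer
    than 23 fields are returned unchanged, as the Vim pattern would not match.
    """
    fields = line.split("\t")
    if len(fields) < 23:
        return line
    family, name, authority = fields[20:23]
    # Only keep the parts that are present, so no stray spaces or "()" appear.
    parts = [p for p in (family, name) if p]
    if authority:
        parts.append(f"({authority})")
    return line + "\t" + " ".join(parts)


if __name__ == "__main__":
    # Made-up row with 29 placeholder fields and illustrative host values.
    fields = [f"c{i}" for i in range(1, 30)]
    fields[20:23] = ["Myrtaceae", "Eucalyptus sp.", "Smith"]
    print(append_associated_taxa("\t".join(fields)).split("\t")[-1])
```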
# Append an eventDate column, built from: "Coll. Start Date" & "Coll. End Date" columns
# %s/^\(\(\([^\t]*\)\t\)\{8\}\)\([^\t]*\)\t\([^\t]*\)\(.*\)$/\1\4\t\5\6\t\4\/\5/g
# Drop the trailing slash left when the end date is empty
# %s/\/$//g
# Drop the leading slash left when the start date is empty
# %s/\t\//\t/g
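# The eventDate step, including the slash cleanup, can be sketched in Python.
# This is a hedged sketch: the function name and index parameters are
# hypothetical, the dates are assumed to sit in fields 9 and 10, and the date
# values in the example are made up because the source does not show any.

```python
def append_event_date(line: str, start_idx: int = 8, end_idx: int = 9) -> str:
    """Append "start/end" as a new last column of a tab-separated line.

    If only one of the two dates is present, that date is appended on its own,
    so no dangling slash is produced. Lines with too few fields are returned
    unchanged. The line is assumed to carry no trailing newline.
    """
    fields = line.split("\t")
    if len(fields) <= end_idx:
        return line
    dates = [d for d in (fields[start_idx], fields[end_idx]) if d]
    return line + "\t" + "/".join(dates)


if __name__ == "__main__":
    fields = [f"c{i}" for i in range(1, 29)]
    fields[8], fields[9] = "2001-01-05", "2001-01-09"
    print(append_event_date("\t".join(fields)).split("\t")[-1])
```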
# Append an associatedMedia column, built from: "Host photo; Locality photo"
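# No command is recorded for this step. One possible way to build the column
# is sketched below in Python. Everything here is an assumption: the function
# name is hypothetical, the columns are looked up by the header names
# "Host photo" and "Locality photo", and "; " is used as the separator because
# that is how the comment above writes the pair.

```python
from typing import List, Tuple


def append_associated_media(
    header: List[str], row: List[str], sep: str = "; "
) -> Tuple[List[str], List[str]]:
    """Return (header, row) with an associatedMedia column appended.

    The new value joins the non-empty "Host photo" and "Locality photo"
    values with `sep`. The input lists are not modified.
    """
    host = row[header.index("Host photo")]
    locality = row[header.index("Locality photo")]
    media = sep.join(v for v in (host, locality) if v)
    return header + ["associatedMedia"], row + [media]


if __name__ == "__main__":
    # Made-up, shortened header and row for illustration only.
    header = ["scientificName", "Host photo", "Locality photo", "_end"]
    row = ["Aus bus", "host.jpg", "locality.jpg", ""]
    print(append_associated_media(header, row))
```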