@cl4rk3
cl4rk3 / GUIDCleanup.sql
Last active August 29, 2015 14:23
Cleanup Multiple MAVIS records per AAPB GUID
-- Shows that there are 1819 titles with duplicated GUIDs
SELECT COUNT(title_no)
FROM aims.TITLE_OTHER_ID_FQV
WHERE identifier IN
  (SELECT identifier
   FROM
     (SELECT identifier,
             COUNT(title_no) AS tcount
      FROM aims.TITLE_OTHER_ID_FQV
      WHERE IDENTIFIER_TYPE = 'AMARCGUID'
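The query above is cut off in this listing, so the following is only a sketch of how a count like this could be finished, not the author's actual statement. It assumes the missing part groups by `identifier` and keeps groups where `tcount > 1`. The table `title_other_id`, its three columns, and every sample row are invented for illustration and run against an in-memory SQLite database rather than the real `aims` schema.

```python
import sqlite3

# Hypothetical stand-in for aims.TITLE_OTHER_ID_FQV; the column names are assumptions.
SCHEMA = "CREATE TABLE title_other_id (title_no INTEGER, identifier TEXT, identifier_type TEXT)"

# Assumed completion: group by identifier, keep identifiers used by more than one title.
QUERY = """
SELECT COUNT(title_no)
FROM title_other_id
WHERE identifier IN
  (SELECT identifier
   FROM
     (SELECT identifier,
             COUNT(title_no) AS tcount
      FROM title_other_id
      WHERE identifier_type = 'AMARCGUID'
      GROUP BY identifier)
   WHERE tcount > 1)
"""

def count_titles_with_duplicated_guids(conn):
    """Return the number of title rows whose GUID is shared with another title."""
    return conn.execute(QUERY).fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO title_other_id VALUES (?, ?, ?)",
    [
        (1, "guid-a", "AMARCGUID"),  # shares guid-a with title 2
        (2, "guid-a", "AMARCGUID"),
        (3, "guid-b", "AMARCGUID"),  # unique GUID, not counted
        (4, "local-1", "OTHER"),     # different identifier type, not counted
    ],
)
duplicated_title_count = count_titles_with_duplicated_guids(conn)
```

With these invented rows only titles 1 and 2 share a GUID, so the count is 2.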
cl4rk3 / filemarkToTapeToGUIDMaker
Last active August 29, 2015 14:22
bash one liners for parsing, counting, and analyzing AA tape manifest
# There are 73940 GUIDs (assuming one bagit.txt per "package", i.e. per GUID)
grep even_num cpb-aa.global_lto_report.01.26.2014.csv | grep bagit.txt | sed 's/\//,/g' | awk -F "," '{ print $1","$3","$11 }' | wc -l
# Create a 3-column list (filemark,tape id,AA GUID)
grep even_num cpb-aa.global_lto_report.01.26.2014.csv | grep bagit.txt | sed 's/\//,/g' | awk -F "," '{ print $1","$3","$11 }' > filemarkToTapeToGUID.csv
# Create a 3-column list (tape id,filemark (TAR#),count of packages in TAR) from the file written above
awk -F "," '{ print $1"\t"$2 }' filemarkToTapeToGUID.csv | sort | uniq -c | sort -n | awk '{ print $3","$2","$1 }' | sort -u > tapeTofilemarkTocount.csv
# count all files that have more than one file per GUID
http://sourceforge.net/projects/loc-xferutils/files/loc-bagger/2.1.2/bagger-2.1.2.zip/download
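For readers who find the pipelines hard to follow, here is a rough Python sketch of the same two steps: pick the `even_num` lines that mention `bagit.txt`, turn `/` into `,`, keep fields 1, 3 and 11, then count packages per (tape id, filemark). The real layout of the LTO report is not shown in this gist, so the sample lines are made up purely so that the GUID lands in field 11; the function names are hypothetical as well.

```python
from collections import Counter

def extract_rows(lines):
    """Mimic: grep even_num | grep bagit.txt | sed 's/\\//,/g' | awk -F, '{print $1,$3,$11}'."""
    rows = []
    for line in lines:
        if "even_num" in line and "bagit.txt" in line:
            fields = line.rstrip("\n").replace("/", ",").split(",")
            # awk fields are 1-based, Python lists are 0-based
            rows.append((fields[0], fields[2], fields[10]))
    return rows

def packages_per_tar(rows):
    """Return sorted (tape id, filemark, package count) tuples."""
    counts = Counter((tape, filemark) for filemark, tape, _guid in rows)
    return sorted((tape, filemark, n) for (tape, filemark), n in counts.items())

# Invented manifest lines; the real report format is an assumption.
sample_lines = [
    "101,even_num,AA0001L5,a,b,c,d,/e/f/guid-one/bagit.txt",
    "101,even_num,AA0001L5,a,b,c,d,/e/f/guid-two/bagit.txt",
    "102,even_num,AA0001L5,a,b,c,d,/e/f/guid-three/bagit.txt",
    "102,even_num,AA0001L5,a,b,c,d,/e/f/guid-three/data/file.mov",  # no bagit.txt
    "103,odd_num,AA0001L5,a,b,c,d,/e/f/guid-four/bagit.txt",        # not even_num
]

rows = extract_rows(sample_lines)
tar_counts = packages_per_tar(rows)
```

`len(rows)` plays the role of `wc -l` in the first one-liner, and `tar_counts` corresponds to the contents of `tapeTofilemarkTocount.csv`.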
cl4rk3 / testTapeList.csv
Created June 1, 2015 21:14
American Archive Test Tape List
2025415 cpb-aacip/55-10jsz865 AA0006L5
2025468 cpb-aacip/55-10jsz865 AA0006L5
2025416 cpb-aacip/55-10jsz8j2 AA0006L5
2025469 cpb-aacip/55-10jsz8j2 AA0006L5
2025417 cpb-aacip/55-10jsz9x9 AA0006L5
2025470 cpb-aacip/55-10jsz9x9 AA0006L5
2025418 cpb-aacip/55-10wq05kr AA0006L5
2025471 cpb-aacip/55-10wq05kr AA0006L5
2025419 cpb-aacip/55-10wq07nf AA0006L5
2025472 cpb-aacip/55-10wq07nf AA0006L5
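In the rows above every GUID appears twice with a different first-column number. A small sketch for grouping such rows by GUID follows. The list has no header, so treating the columns as (record id, GUID, tape id) is an assumption, and `group_ids_by_guid` is a hypothetical helper name. The sample text reuses the first four rows of the list.

```python
from collections import defaultdict

def group_ids_by_guid(text):
    """Map each GUID to the (record id, tape id) pairs listed for it."""
    groups = defaultdict(list)
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        record_id, guid, tape = parts  # column meaning is assumed
        groups[guid].append((record_id, tape))
    return dict(groups)

sample = """\
2025415 cpb-aacip/55-10jsz865 AA0006L5
2025468 cpb-aacip/55-10jsz865 AA0006L5
2025416 cpb-aacip/55-10jsz8j2 AA0006L5
2025469 cpb-aacip/55-10jsz8j2 AA0006L5
"""

groups = group_ids_by_guid(sample)
# GUIDs that occur on more than one row
repeated = {guid: ids for guid, ids in groups.items() if len(ids) > 1}
```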
cl4rk3 / mavisImportHints.txt
Created June 1, 2015 13:24
MAVIS import hints
mavisXml -xmlImport:T:\ILS2MAVIS\Titles\MovingImage\20150430_VAG_Titles_spec.xml
<mavisImport>
<fileList>
<inputFile>T:\ILS2MAVIS\Titles\MovingImage\VhsDigiBetaBetaCam_Ils2Mavis_titles_VAG.20150401.bib_conversion.xml</inputFile>
</fileList>
</mavisImport>
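If the spec file has to be produced for many input files, it could be generated rather than hand-edited. The sketch below only reproduces the three element names visible above (`mavisImport`, `fileList`, `inputFile`); the path in the example is invented, and whether the importer accepts more than one `<inputFile>` is not confirmed by this gist.

```python
import xml.etree.ElementTree as ET

def build_import_spec(input_files):
    """Return a mavisImport spec listing the given input files as a string."""
    root = ET.Element("mavisImport")
    file_list = ET.SubElement(root, "fileList")
    for path in input_files:
        ET.SubElement(file_list, "inputFile").text = path
    return ET.tostring(root, encoding="unicode")

# Hypothetical path, for illustration only
spec = build_import_spec([r"T:\example\titles.xml"])
```

The resulting string would be written to a file and passed to `mavisXml -xmlImport:` in the same way as the spec shown above.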
cl4rk3 / digitized.xml
Created May 27, 2015 21:01
working digitized data
<?xml version="1.0" encoding="UTF-8"?>
<pbcoreCollection>
<pbcoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.pbcore.org/PBCore/PBCoreNamespace.html http://www.pbcore.org/xsd/pbcore-2.0.xsd">
<pbcoreAssetType>Episode</pbcoreAssetType>
<pbcoreAssetDate dateType="created">1982-10-09</pbcoreAssetDate>
<pbcoreAssetDate dateType="created">1982-10-10</pbcoreAssetDate>
<pbcoreIdentifier source="http://americanarchiveinventory.org">cpb-aacip/35-5269pf7n</pbcoreIdentifier>
<pbcoreIdentifier source="Sony Ci">081a154e049143a898f6ea7a4728dfe7</pbcoreIdentifier>
<pbcoreTitle titleType="Episode Number">201-214</pbcoreTitle>
<pbcoreTitle titleType="Series">Bluegrass Ramble</pbcoreTitle>
cl4rk3 / workingMinimalMavisTitle-RS-bare2.xml
Last active August 29, 2015 14:18
Minimal working mavis title record with rights info
<?xml version="1.0" encoding="ISO-8859-1"?>
<mavis database="LOC:mbrp" version="03.07.06" organisation="Library of Congress" xmlns:xl="http://www.w3.org/TR/xlink" xmlns="http://www.wizardis.com.au/2005/12/MAVIS">
<TitleWork>
<objectIdentifiers>
<ObjectIdentifier>
<identifier>cpb-aacip/46-41mgqsxz</identifier>
<identifierType>AMARCGUID</identifierType>
</ObjectIdentifier>
</objectIdentifiers>
<mediums>
cl4rk3 / workingMinimalMavisTitle-MI-bare.xml
Last active August 29, 2015 14:17
Example of importable MAVIS title record xml for moving image
<?xml version="1.0" encoding="ISO-8859-1"?>
<mavis database="LOC:mbrs" version="03.07.06" organisation="Library of Congress" xmlns:xl="http://www.w3.org/TR/xlink" xmlns="http://www.wizardis.com.au/2005/12/MAVIS">
<TitleWork>
<objectIdentifiers>
<ObjectIdentifier>
<identifier>cpb-aacip/189-8380gkx9</identifier>
<identifierType>AMARCGUID</identifierType>
</ObjectIdentifier>
<ObjectIdentifier>
<identifier>LAC-2531/1</identifier>
<?xml version="1.0" encoding="ISO-8859-1"?>
<mavis database="LOC:mbrs" version="03.07.06" organisation="Library of Congress" xmlns:xl="http://www.w3.org/TR/xlink" xmlns="http://www.wizardis.com.au/2005/12/MAVIS">
<TitleWork>
<objectIdentifiers>
<ObjectIdentifier>
<identifier>204-053ffc0w</identifier>
<identifierType>AMARCGUID</identifierType>
</ObjectIdentifier>
<ObjectIdentifier>
<identifier>CAS0221</identifier>
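To check which identifiers a title record carries before importing it, the `ObjectIdentifier` elements can be read with a namespace-aware parser. This is a minimal sketch, assuming the record is complete and well-formed. The records in this listing are cut off, so the sample document below closes the tags itself and drops the XML declaration and the extra `mavis` attributes; `identifiers_by_type` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the <mavis> root element
NS = {"m": "http://www.wizardis.com.au/2005/12/MAVIS"}

def identifiers_by_type(xml_text):
    """Return (identifierType, identifier) pairs for every ObjectIdentifier."""
    root = ET.fromstring(xml_text)
    pairs = []
    for obj in root.iterfind(".//m:ObjectIdentifier", NS):
        id_type = obj.findtext("m:identifierType", default="", namespaces=NS)
        value = obj.findtext("m:identifier", default="", namespaces=NS)
        pairs.append((id_type, value))
    return pairs

# Closing tags are added here for the sketch; the source record is truncated.
sample = """\
<mavis xmlns="http://www.wizardis.com.au/2005/12/MAVIS">
  <TitleWork>
    <objectIdentifiers>
      <ObjectIdentifier>
        <identifier>cpb-aacip/46-41mgqsxz</identifier>
        <identifierType>AMARCGUID</identifierType>
      </ObjectIdentifier>
    </objectIdentifiers>
  </TitleWork>
</mavis>
"""

pairs = identifiers_by_type(sample)
guids = [value for id_type, value in pairs if id_type == "AMARCGUID"]
```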