@ddanderson
Created February 13, 2011 10:39
A demonstration of how to convert a Berkeley DB file. Discussion here: libdb.wordpress.com
#!/usr/bin/perl
#
# Copyright (c) 2011
# Donald D. Anderson. All rights reserved.
#
# Redistribution and use in source and binary forms are permitted.
# This software is provided 'as is' and any express or
# implied warranties, including, but not limited to, the implied
# warranties of merchantability, fitness for a particular purpose, or
# non-infringement, are disclaimed.
#
################################################################
#
# Usage: bdb_convert_data fromfile tofile
#
# Converts data in a Berkeley DB database.
# Can be customized to remove or add bytes to each record,
# and even add new records.
#
# Warning: this should normally be used on a quiet system (or best, a
# system that has no live BDB processes). There is nothing
# transactional about this script!
#
# Warning: should not be used if you've changed your btree compare function
# (since db_load will not do the right thing).
#
# See discussion on http://libdb.wordpress.com
die "Usage: $0 fromfile tofile" if ($#ARGV != 1);
$from = $ARGV[0];
$to = $ARGV[1];
die "$0: $from: does not exist" if (! -f $from);
die "$0: $to: exists, will not overwrite" if (-f $to);
# Read the dump from db_dump and feed the converted lines to db_load.
open(IN, "db_dump $from|") or die "$0: cannot run db_dump: $!";
unlink "$to.TMP";
open(OUT, "|db_load $to.TMP") or die "$0: cannot run db_load: $!";
# convert_key and convert_value are called with $_ set to
# a data line from the db_dump output. Each data line starts
# with a single space followed by hex digits, one pair of hex
# digits per 8-bit byte. E.g. ' 65696e7320d0bfd1' is 8 bytes.
# This convert_key passes the key through without modification.
sub convert_key() {
    $line = $_;
    #print "key: $line";
    print OUT "$line";
}
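# The hex format described above can be decoded and re-encoded with
# pack/unpack. A minimal sketch, not part of the original script:
# hex_line_to_bytes and bytes_to_hex_line are hypothetical helpers
# that a customized convert_value could use to edit raw bytes
# instead of counting hex digits.
sub hex_line_to_bytes {
    my ($line) = @_;
    $line =~ s/^ //;      # drop the leading space that marks a data line
    $line =~ s/\s+$//;    # drop the trailing newline
    return pack("H*", $line);
}
sub bytes_to_hex_line {
    my ($bytes) = @_;
    return " " . unpack("H*", $bytes) . "\n";
}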
# This convert_value 'removes' the second 4 bytes, for demonstration
# !! **** modify this as necessary **** !!
#
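sub convert_value() {
    $line = $_;
    # !! **** here's the custom part **** !!
    # Keep the leading space plus the first 4 bytes (8 hex digits),
    # skip the next 4 bytes, and keep the rest of the line.
    if (length($_) > 17) {
        $line = substr($_, 0, 9) . substr($_, 17);
    }
    #print "dat: $line";
    print OUT "$line";
}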
$iskey = 1;
# The db_dump format contains header and footer lines that start
# in the first column. Those lines are copied directly.
# Data lines have a single space in the first column,
# followed by hex digits. Data lines alternate
# between keys and values.
while (<IN>) {
    if (/^ /) {
        if ($iskey) {
            &convert_key;
        } else {
            &convert_value;
        }
        # alternate lines
        $iskey = ! $iskey;
    }
    else {
        print OUT $_;
    }
}
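# close on a piped handle returns false if the command exited non-zero,
# so the temporary file is only renamed after both commands succeed.
close(IN) or die "$0: db_dump failed";
close(OUT) or die "$0: db_load failed";
rename("$to.TMP", $to) or die "$0: cannot rename $to.TMP to $to: $!";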