@chorny
Forked from ddanderson/bdb_convert_file
Created February 14, 2011 18:53
#!/usr/bin/perl
#
# Copyright (c) 2011
# Donald D. Anderson. All rights reserved.
#
# Redistribution and use in source and binary forms are permitted.
# This software is provided 'as is' and any express or
# implied warranties, including, but not limited to, the implied
# warranties of merchantability, fitness for a particular purpose, or
# non-infringement, are disclaimed.
#
################################################################
#
# Usage: bdb_convert_data fromfile tofile
#
# Converts data in a Berkeley DB database.
# Can be customized to remove or add bytes to each record,
# and even add new records.
#
# Warning: this should normally be used on a quiet system (or best, a
# system that has no live BDB processes). There is nothing
# transactional about this script!
#
# Warning: should not be used if you've changed your btree compare function
# (since db_load will not do the right thing).
#
# See discussion on http://libdb.wordpress.com
use strict;
use warnings;
die "Usage: $0 fromfile tofile\n" if ($#ARGV != 1);
my $from = $ARGV[0];
my $to = $ARGV[1];
die "$0: $from: does not exist" if (! -f $from);
die "$0: $to: exists, will not overwrite" if (-f $to);
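# List-form pipe opens bypass the shell, so file names containing spaces
# or shell metacharacters reach db_dump and db_load unchanged.
open(IN, '-|', 'db_dump', $from) or die "$0: cannot run db_dump: $!";
unlink "$to.TMP";
open(OUT, '|-', 'db_load', "$to.TMP") or die "$0: cannot run db_load: $!";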
# convert_key and convert_value are each passed a data line from the
# db_dump output as their argument. Each data line starts
# with a single space, followed by hex digits, one pair of hex
# digits for each 8-bit byte. E.g. ' 65696e7320d0bfd1' is 8 bytes.
# This convert_key passes the key through without modification.
sub convert_key {
    my $line = shift;
    #print "key: $line";
    print OUT $line;
}
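# Illustrative sketch, not part of the original script: the two helpers
# below (hypothetical names) convert between a db_dump data line and the
# raw bytes it encodes. They assume the hex format described above: one
# leading space, two hex digits per byte, and a trailing newline. They may
# be handy when a conversion is easier to express on bytes than on hex text.
sub example_line_to_bytes {
    my $line = shift;
    chomp $line;               # only the local copy is modified
    $line =~ s/^ //;           # drop the single leading space
    return pack('H*', $line);  # two hex digits become one byte
}
sub example_bytes_to_line {
    my $bytes = shift;
    return ' ' . unpack('H*', $bytes) . "\n";  # lowercase hex, as db_dump emits
}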
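# This convert_value 'removes' the second 4 bytes, for demonstration
# !! **** modify this as necessary **** !!
#
sub convert_value {
    my $line = shift;
    # !! **** here's the custom part **** !!
    # A line holding at least 8 bytes is at least 18 chars long: the
    # leading space, 16 hex digits, and the newline. Keep the space plus
    # the first 4 bytes (9 chars) and skip the next 4 bytes (8 hex digits).
    if (length($line) > 17) {
        $line = substr($line,0,9) . substr($line,17);
    }
    #print "dat: $line";
    print OUT $line;
}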
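# Illustrative sketch, not part of the original script: the header says the
# conversion can also add bytes to each record. This hypothetical helper
# returns a data line with extra bytes (given as hex digits) appended. It
# assumes the line ends in a newline, as db_dump data lines do. It could be
# called from convert_value, e.g.
#   print OUT example_append_hex($line, '00000000');
sub example_append_hex {
    my ($line, $hex) = @_;
    # Whole bytes only: an even number of hex digits.
    die "example_append_hex: expected pairs of hex digits\n"
        unless $hex =~ /\A(?:[0-9a-fA-F]{2})*\z/;
    chomp $line;
    return $line . lc($hex) . "\n";
}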
my $iskey = 1;
# The db_dump format contains header lines (and a closing DATA=END line)
# that start in the first column. Those lines are copied directly.
# The data appears with a single space in the first column,
# followed by a bunch of hex numbers. Lines of data alternate
# between keys and values.
while (<IN>) {
    if (/^ /) {
        if ($iskey) {
            convert_key($_);
        } else {
            convert_value($_);
        }
        # alternate lines
        $iskey = ! $iskey;
    }
    else {
        print OUT $_;
    }
}
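# Illustrative sketch, not part of the original script: the header mentions
# adding new records. In the dump format a record is a key line followed by
# a value line, so this hypothetical helper builds that pair from raw bytes.
# The pair would have to be printed to OUT before the DATA=END footer is
# copied through, which the loop above does not do as written.
sub example_new_record {
    my ($key, $value) = @_;
    return ' ' . unpack('H*', $key) . "\n"
         . ' ' . unpack('H*', $value) . "\n";
}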
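# close on a pipe returns false if the child exited with an error, so
# a failed db_dump or db_load does not leave a bad file under the final name.
close(IN) or die "$0: db_dump failed\n";
close(OUT) or die "$0: db_load failed\n";
rename("$to.TMP", $to) or die "$0: cannot rename $to.TMP to $to: $!\n";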