@pete
Created February 1, 2013 21:31
Ridiculous nonsense.
#!/usr/bin/env ruby
# Hey, wanna see something fun?
#
# My disk broke. Thinking quickly, I decided to harness the unlimited power of
# User Error, and partially wreck my backup.
#
# Don't ask.
#
# md0 had most of the data. But the filesystem and partition table were
# wrecked, and some unknown amount of data had been exchanged between it and
# md1, thanks to hastily typed mdadm commands.
#
# No, I didn't confuse drive numbers. I said not to ask.
#
# md1 had most of the data, too. But the partitions were a different format.
# Also the first chunks of it were wrecked.
#
# sdf had a partial dd of md0. How much was a mystery to me.
#
# This script represents me, in a panic ("OH NOES MY DATA!"), dashing
# something off to determine exactly where the disks diverged and where the
# similarities were. The idea was that if I could figure out where md0 and sdf
# sync'd up and where they stopped, as well as where md0 and md1 diverged, then
# I could
#
# The variables are named by bizarre conventions, created and abandoned on the
# fly. The logic is shaky at best. Swaths were rewritten, copied and pasted
# elsewhere, or deleted. Most of the time you see a lambda, it's because I
# didn't want to name or pass function arguments. Global variables cropped up
# casually. Copy-paste was used in place of arrays. Useless optimizations
# appeared, possibly due to hallucinations. The script ran for several seconds
# and was killed and restarted as I added more hacks, more strange
# optimizations, and inane tweaks to the output.
#
# Except the comments, of which there were none originally, here is some raw,
# unedited panic code. The fevered scripting dreams of a madman.
md0 = File.open('/dev/md0')
md1 = File.open('/dev/md1')
sdf = File.open('/dev/sdf')
# The block size? Yeah, it was absolutely chosen arbitrarily.
offt = 0
bs = 4096
t0 = t1 = ts = nil
# This was so the reads would happen in parallel, and the rest of the program
# wouldn't need to stall waiting for them.
nxread = lambda {
  t0 = Thread.new { md0.sysread(bs) }
  t1 = Thread.new { md1.sysread(bs) }
  ts = Thread.new { sdf.sysread(bs) }
}
nxread[]
# puts() et al block. This odd thread-chaining was so that, if I paused the
# output to examine it, the program would merely leak threads rather than block
# when its output buffers filled up, and so that the output would stay in
# order. It's worth noting that pt() relies on being called by only one
# thread. It spits out its pid because I was trapping USR1 at some point.
$t = Thread.new{ puts "Starting as #{Process.pid} at #{Time.now} (#{Time.now.to_i})", '_' * 80, '' }
def pt offt, &b
  pt = $t
  $t = Thread.new { pt.join; b.call(offt) }
end
# Yes, I have heard of queues...why do you ask?
# I don't remember why I was *this* afraid of repeating the same sprintf() more
# than once. I probably did this because I couldn't sit still while watching
# the output.
$fmtofft = Hash.new {|h,k| h[k] = ('0x%016x' % k) }
# Yes, it uses an array as a key, no, there isn't a very good reason. Don't
# ask.
$fmtstat = Hash.new { |h,k|
  omd0 = []
  omd1 = []
  osdf = []
  ls0, ls1, lsb = *k
  if ls0; omd0 << 'sdf'; osdf << 'md0'; end
  if ls1; omd1 << 'sdf'; osdf << 'md1'; end
  if lsb; omd0 << 'md1'; omd1 << 'md0'; end
  h[k] = {
    :'*' => "<<md0:[#{omd0.join(',')}]>>\t<<md1:[#{omd1.join(',')}]>>",
    :'0' => "md0:[#{omd0.join(',')}]\t(sdf:[#{osdf.join(',')}])",
    :'1' => "md1:[#{omd1.join(',')}]\t(sdf:[#{osdf.join(',')}])",
  }
}
# I don't even remember why this came out the way it did.
def pst(offt, c, s0, s1, sb)
  pt(offt) { |lo|
    puts "[#{c}] #{$fmtofft[lo]}: #{$fmtstat[[s0,s1,sb]][c]}"
  }
end
s0, s1, sb = nil, nil, true
# This variable and the accompanying lambda, you can probably tell, were
# afterthoughts. Yes, there are two different output formats: this one and
# the one that flooded stdout. Don't ask.
changelog = {}
logc = lambda { |cof,c0,c1,cb,ls0,ls1,lsb|
  lc = {}
  lc[:changed] = []
  lc[:md0] = []
  lc[:md1] = []
  lc[:sdf] = []
  if c0; lc[:changed] << 'md0<=>sdf'; end
  if c1; lc[:changed] << 'md1<=>sdf'; end
  if cb; lc[:changed] << '[[md0<=>md1]]'; end
  if ls0; lc[:md0] << :sdf; lc[:sdf] << :md0; end
  if ls1; lc[:md1] << :sdf; lc[:sdf] << :md1; end
  if lsb; lc[:md0] << :md1; lc[:md1] << :md0; end
  changelog[$fmtofft[cof]] = lc
}
require 'pp'
loop {
  # So, this pulls the last three blocks read, and then fires off the threads
  # to read the next set of blocks while concerning itself with the logic.
  r0 = t0.value
  r1 = t1.value
  rs = ts.value
  nxread[]
  # This code is awful. Essentially, I wanted to see when a thing started or
  # stopped matching another thing.
  ps0, ps1, psb = s0, s1, sb
  s0 = r0 == rs
  s1 = r1 == rs
  sb = r0 == r1
  c0 = s0 != ps0
  c1 = s1 != ps1
  cb = sb != psb
  # Yep. Not a loop.
  out = false
  if c0
    pst(offt, :'0', s0, s1, sb)
    out = true
  end
  if c1
    pst(offt, :'1', s0, s1, sb)
    out = true
  end
  if cb
    pst(offt, :'*', s0, s1, sb)
    out = true
  end
  if out
    logc[offt,c0,c1,cb,s0,s1,sb]
    pt(offt) { |o|
      puts "#{Time.now} (#{Time.now.to_i})",
        sprintf(' 0x%x / %d ', o, o).center(80, '-'),
        ''
    }
    # This was originally what happened if you sent USR1, which is why it
    # is a different call to pt(), with differently named variables, a
    # different output format, etc. I had the idea that I'd dump state
    # periodically so that I could get the script to resume if interrupted,
    # but I ended up not doing that.
    pt(offt) { |lo|
      # It doesn't use $fmtofft. I'm sure there was a reason for this.
      # Your guess is as good as mine.
      lof = ('0x%016x' % lo)
      sls = ''
      PP.pp({
        :offt => lof,
        :cur => (($fmtstat[[s0, s1, sb]][:'*'] rescue nil)),
        :changelog => changelog,
        '$fmtstat' => $fmtstat,
      }, sls)
      File.open('/tmp/cmppart.stat', 'w') { |f|
        f.puts sls
      }
    }
  end
  offt += bs
}
# Epilogue: I recovered my data. I am running the backup script as we speak.
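
For anyone who wants the core idea without the panic, here is a minimal sketch of the block-by-block comparison the script performs: read each source in lockstep and note the offsets where a pair of sources starts or stops matching. It is not the author's code. The `divergence_points` helper, its arguments, and the in-memory `StringIO` stand-ins for the devices are all hypothetical, chosen so the example runs without touching real disks. It assumes Ruby 2.4 or later (for `Hash#transform_values`).

```ruby
require 'stringio'

# Hypothetical helper, not part of the original script: walks several IO
# objects in lockstep, one block at a time, and records every offset at which
# the "do these two sources match?" answer flips for any pair of sources.
def divergence_points(sources, block_size)
  pairs = sources.keys.combination(2).to_a
  previous = {} # last known match state, keyed by [name_a, name_b]
  changes = []
  offset = 0

  loop do
    blocks = sources.transform_values { |io| io.read(block_size) }
    break if blocks.values.any?(&:nil?) # stop once any source is exhausted

    pairs.each do |a, b|
      matches = blocks[a] == blocks[b]
      next if previous[[a, b]] == matches

      changes << [offset, a, b, matches]
      previous[[a, b]] = matches
    end
    offset += block_size
  end
  changes
end

# Tiny in-memory stand-ins for the devices, compared with a 4-byte block size.
demo = {
  md0: StringIO.new('AAAABBBBCCCC'),
  sdf: StringIO.new('AAAABBBBXXXX'),
}
divergence_points(demo, 4).each do |offset, a, b, matches|
  puts format('0x%016x: %s %s %s', offset, a, matches ? '==' : '!=', b)
end
```

With the demo data, this reports that `md0` and `sdf` match from offset 0 and diverge at offset 8. Unlike the original, the sketch reads sequentially rather than in parallel threads and keeps the pairs in a list instead of copy-pasting one branch per pair.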