@automatthew
Created February 12, 2010 11:45
Script for editing PDF metadata using pdftk
#!/usr/bin/env ruby
require 'fileutils'

# Output directory for the rewritten PDFs.
FileUtils.mkdir_p "processed"
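
# Run `pdftk <file> dump_data` and return the Info entries as a Hash.
def metadata(filename)
  # The array form of popen bypasses the shell, so filenames containing
  # spaces, quotes, or `$` need no escaping (String#dump is not shell quoting).
  command = ["pdftk", filename, "dump_data"]
  puts command.join(" ")
  str = IO.popen(command, &:read)
  pdfprops(str)
end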
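
# Read the metadata, let the caller's block modify it, then write a copy of
# the PDF with the new metadata to processed/<Title>.pdf.
def update(filename)
  hash = metadata(filename)
  yield hash if block_given?
  raise ArgumentError, "no Title set for #{filename}" unless hash["Title"]

  # pdftk reads the new values from a data file of InfoKey/InfoValue pairs.
  dfile = "#{filename}.data"
  File.open(dfile, "w") do |tmp|
    hash.each do |k, v|
      tmp.puts "InfoKey: #{k}\nInfoValue: #{v}"
    end
  end

  newfile = File.join("processed", "#{hash["Title"]}.pdf")
  # Multi-argument system also bypasses the shell, so no quoting is needed.
  system("pdftk", filename, "update_info", dfile, "output", newfile)
end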

# Extract InfoKey/InfoValue pairs from pdftk's dump_data output.
def pdfprops(data)
  recs = data.scan(/InfoKey:\s(.+)\nInfoValue:\s(.+)/)
  recs.to_h
end
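
# Usage: ruby <this script> AUTHOR file1.pdf [file2.pdf ...]
# Sets each PDF's Title to its base filename and its Author to AUTHOR.
if $PROGRAM_NAME == __FILE__
  author = ARGV.shift
  ARGV.each do |file|
    update(file) do |meta|
      meta["Title"] = File.basename(file, ".pdf")
      meta["Author"] = author
    end
  end
end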
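
The parse and rewrite steps can be tried without pdftk installed. The sketch below feeds a made-up sample, in the InfoKey/InfoValue shape that `pdfprops` scans for, through the same regular expression, changes the title, and prints the records in the form `update` writes to its data file. `SAMPLE_DUMP`, `parse_info`, and `serialize_info` are hypothetical names used only for this illustration, and the sample values are invented rather than taken from a real PDF.

```ruby
# Invented sample in the shape pdfprops expects; not output from a real PDF.
SAMPLE_DUMP = <<~DATA
  InfoKey: Title
  InfoValue: Old Title
  InfoKey: Author
  InfoValue: Someone
  NumberOfPages: 12
DATA

# Same scan as pdfprops: each InfoKey line paired with the InfoValue line after it.
def parse_info(data)
  data.scan(/InfoKey:\s(.+)\nInfoValue:\s(.+)/).to_h
end

# Hypothetical helper: build the records that update() writes to the data file.
def serialize_info(hash)
  hash.map { |k, v| "InfoKey: #{k}\nInfoValue: #{v}\n" }.join
end

meta = parse_info(SAMPLE_DUMP)
meta["Title"] = "New Title"
puts serialize_info(meta)
```

Lines that are not part of an InfoKey/InfoValue pair, such as `NumberOfPages`, are skipped by the scan, so only the Info entries are rewritten.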