@magegu
Created March 16, 2017 15:44
Print DIN A4 / DIN A3 page-count stats for all PDF files in a folder tree (searched recursively)
require 'pdf-reader'

# Counts the pages of one PDF by DIN paper size.
# Returns [A5 count, A4 count, count of A3 and larger].
def check_pdf(input_path)
  dina5count = 0
  dina4count = 0
  dina3count = 0
  dina2count = 0
  dina1count = 0
  dina0count = 0
  # Tolerance factor: a page still counts as a given format if each side
  # is at most 25% larger than that format's nominal size.
  threshold = 1.25

  reader = PDF::Reader.new(input_path)
  reader.pages.each do |page|
    # MediaBox is [llx, lly, urx, ury] in PostScript points (1/72 inch).
    bbox = page.attributes[:MediaBox]
    width = bbox[2] - bbox[0]
    height = bbox[3] - bbox[1]
    # Normalize to portrait so that width is the shorter side.
    width, height = height, width if height < width
    # warn "#{width} #{height}"
    if (width <= 420 * threshold) && (height <= 595 * threshold)
      dina5count += 1
    elsif (width <= 595 * threshold) && (height <= 842 * threshold)
      dina4count += 1
    elsif (width <= 842 * threshold) && (height <= 1190 * threshold)
      dina3count += 1
    elsif (width <= 1190 * threshold) && (height <= 1684 * threshold)
      dina2count += 1
    elsif (width <= 1684 * threshold) && (height <= 2380 * threshold)
      dina1count += 1
    elsif (width <= 2380 * threshold) && (height <= 3368 * threshold)
      dina0count += 1
    else
      # No logger is defined in this standalone script, so write to stderr.
      warn "error: unknown page size for mailing: #{page.inspect}"
    end
  end
  [dina5count, dina4count, dina3count + dina2count + dina1count + dina0count]
end
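
docs = 0
dina4 = 0
dina3 = 0
# Walk every PDF below the current directory and print one line per file:
# path, pages up to A4, pages A3 and larger.
Dir["./**/*.pdf"].each do |f|
  dina5count, dina4count, dina3count = check_pdf(f)
  docs += 1
  puts "#{f}, #{dina5count + dina4count}, #{dina3count}"
  dina4 += dina5count + dina4count
  dina3 += dina3count
end
# Totals: number of documents, pages up to A4, pages A3 and larger.
puts "all, #{docs}: #{dina4}, #{dina3}"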
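
The size check in the script can also be written as a table lookup, which makes it possible to try the classification without any PDF at hand. The following is only a sketch under assumptions: `DIN_SIZES` and `classify_page` are hypothetical names that do not appear in the script, and the point values and the 1.25 tolerance are simply copied from it.

```ruby
# Nominal portrait sizes in PostScript points (1/72 inch), smallest first.
# Values are copied from the script; the constant name is hypothetical.
DIN_SIZES = {
  a5: [420, 595],
  a4: [595, 842],
  a3: [842, 1190],
  a2: [1190, 1684],
  a1: [1684, 2380],
  a0: [2380, 3368]
}.freeze

# Hypothetical helper: returns the smallest format the page fits into once
# the nominal limits are scaled by `threshold`, or nil if it fits none.
def classify_page(width, height, threshold = 1.25)
  # Orientation does not matter: compare short side to width, long side to height.
  short, long = [width, height].minmax
  match = DIN_SIZES.find do |_name, (max_w, max_h)|
    short <= max_w * threshold && long <= max_h * threshold
  end
  match && match.first
end
```

For example, `classify_page(842, 595)` (a landscape A4 page) yields `:a4`, and an oversized page such as `classify_page(5000, 5000)` yields `nil`, which corresponds to the error branch in the script.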