@KZhidovinov
Last active January 18, 2017 08:49
Script for converting PDF books into a format that is readable on small screens.
# Tested on ruby-2.4.0
# With rmagick 2.16.0
# And ImageMagick 6.9.7-4
require 'rmagick'
require 'fileutils'
INPUT_FILE = '/Users/kzhidovinov/Downloads/code_complete.pdf'
OUTPUT_FILE = '/Users/kzhidovinov/Downloads/code_complete_1.pdf'
TMP_DIR = File.join(File.dirname(__FILE__), 'temp')
# Target page size in pixels; DENSITY is the rendering resolution in DPI
TARGET_HEIGHT = 560
TARGET_WIDTH = 760
DENSITY = 300
FileUtils.mkpath(TMP_DIR)
# Remove files left over from a previous run
FileUtils.rm_f(Dir.glob(File.join(TMP_DIR, '*')))
output_mask = File.join(TMP_DIR, 'p%04d.png')
# Use the Ghostscript utility to split the PDF into PNG images, one per page.
system("gs -dNOPAUSE -dBATCH -sDEVICE=pngalpha -sOutputFile=\"#{output_mask}\" -r#{DENSITY} \"#{INPUT_FILE}\"")
# Will contain page file names
out_filenames = []
# Process all pages. Sort the list: Dir.glob does not guarantee ordering on Ruby 2.4.
Dir.glob("#{TMP_DIR}/p*.png").sort.each_with_index do |file, idx|
  puts "Processing #{file}"
  # Read the image from the file
  image = Magick::Image.read(file).first
  image.density = "#{DENSITY}x#{DENSITY}"
  # Trim the surrounding margins
  image.trim!(true)
  # Convert the image to grayscale
  image.colorspace = Magick::GRAYColorspace
  # Scale the image so that it covers TARGET_WIDTH x TARGET_HEIGHT (the "^" flag);
  # for pages taller than the target aspect ratio the width becomes TARGET_WIDTH
  image.change_geometry!("#{TARGET_WIDTH}x#{TARGET_HEIGHT}^") do |w, h, img|
    img.resize!(w, h)
  end
  # Split the current page into slices that fit the target page size.
  height = image.rows
  i = 0
  while i * TARGET_HEIGHT < height
    filename = File.join(TMP_DIR, format('%06d.out.pdf', idx * 100 + i))
    out_filenames << filename
    # Every slice after the first starts 10 px higher
    y = [i * TARGET_HEIGHT - 10, 0].max
    out_img = image.crop(Magick::NorthGravity, 0, y, TARGET_WIDTH, TARGET_HEIGHT, true)
                   .extent(TARGET_WIDTH, TARGET_HEIGHT)
                   .rotate(90)
                   .sharpen(5, 10.0)
    out_img.define('pdf:use-trimbox', true)
    out_img.density = "#{DENSITY}x#{DENSITY}"
    out_img.write(filename)
    i += 1
  end
end
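# Merge the pages into a single PDF; quote each file name in case a path contains spaces
quoted_pages = out_filenames.map { |f| "\"#{f}\"" }.join(' ')
system("gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=\"#{OUTPUT_FILE}\" #{quoted_pages}")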
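
The page-splitting loop is easier to reason about when its arithmetic is pulled out of the image code. The sketch below reproduces only the offset calculation from the `while` loop above. `slice_offsets` and its `shift` parameter are hypothetical names introduced for illustration and are not part of the script. It needs neither RMagick nor Ghostscript.

```ruby
# Hypothetical helper (not in the original script): returns the top edge, in
# pixels, of each slice that the splitting loop cuts from a page.
#   height      - height of the resized page in pixels (image.rows)
#   page_height - height of one output slice (TARGET_HEIGHT in the script)
#   shift       - how far every slice after the first is moved up (10 in the script)
def slice_offsets(height, page_height, shift = 10)
  # Same number of iterations as `while i * page_height < height`
  count = (height + page_height - 1) / page_height
  (0...count).map { |i| [i * page_height - shift, 0].max }
end

p slice_offsets(1200, 560) # => [0, 550, 1110]
p slice_offsets(560, 560)  # => [0]
```

Each slice is `page_height` rows tall, so for a 1120 px page the offsets are `[0, 550]` and the second slice ends at row 1109. When a page height is an exact multiple of the slice height, the last 10 rows therefore fall outside every slice, which may be worth checking against your own documents.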