@bbenno
Forked from emad-elsaid/pdf2txt.rb
Last active February 4, 2021 13:04
PDF to text converter using Ruby
#!/usr/bin/env ruby
# frozen_string_literal: true

require 'pdf/reader'

# Credits:
# https://github.com/yob/pdf-reader/blob/master/examples/text.rb
#
# Usage:
# ruby pdf2txt.rb /path-to-file/file1.pdf [/path-to-file/file2.pdf..]

ARGV.each do |filename|
  PDF::Reader.open(filename) do |reader|
    puts "Converting : #{filename}"
    pageno = 0

    # Extract each page's text; a page that fails becomes an empty string
    # so one bad page does not abort the whole document.
    txt = reader.pages.map do |page|
      pageno += 1
      begin
        # "\r" rewrites the same terminal line to show progress.
        print "Converting Page #{pageno}/#{reader.page_count}\r"
        page.text
      rescue StandardError
        puts "Page #{pageno}/#{reader.page_count} Failed to convert"
        ''
      end
    end

    # The output is written next to the input, e.g. file1.pdf.txt
    puts "\nWriting text to disk"
    File.write "#{filename}.txt", txt.join("\n")
  end
end
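
The per-page rescue in the script can be pulled out into a small helper, which makes the "skip a failed page and carry on" behaviour easy to try without a PDF at hand. The following is only a sketch: `extract_pages`, `GoodPage`, and `BadPage` are hypothetical names that do not appear in the original gist or in pdf-reader, and the stand-in page objects merely respond to `text` the way the script expects its pages to.

```ruby
# Hypothetical helper, not part of the original gist.
# Accepts any enumerable of objects that respond to #text and returns
# the extracted strings plus the 1-based numbers of pages that failed.
def extract_pages(pages)
  failed = []
  texts = pages.each_with_index.map do |page, index|
    begin
      page.text
    rescue StandardError
      # Record the failure and substitute an empty string, as the script does.
      failed << index + 1
      ''
    end
  end
  [texts, failed]
end

# Stand-in page objects (assumptions) so the sketch runs without the gem.
GoodPage = Struct.new(:text)

class BadPage
  def text
    raise 'unreadable page'
  end
end

texts, failed = extract_pages([GoodPage.new('one'), BadPage.new, GoodPage.new('three')])
puts texts.join("\n")
puts "Failed pages: #{failed.inspect}"
```

Returning the failed page numbers alongside the text lets a caller print a summary at the end instead of interleaving failure messages with the progress line.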