@ACEfanatic02
Last active August 29, 2015 13:57
Simple grep for a word in a directory of text files. Work in progress...
#!/usr/bin/env ruby
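# Change SEARCH_ROOT to point to the directory that holds the text files.
# USAGE: ruby grepword.rb <word>
SEARCH_ROOT = "/path/to/books/directory"
# Exit with the usage message instead of crashing when no word is given.
abort "USAGE: ruby grepword.rb <word>" if ARGV.empty?
# The argument is compiled as a regular expression, so metacharacters are live.
word = Regexp.new(ARGV[0].encode('utf-8', 'external'))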
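# Convert a line to UTF-8 by trying each candidate source encoding in order.
# Returns "" when no candidate yields valid UTF-8.
def clean(line)
  ['sjis', 'utf-8', 'utf-16'].each do |encoding|
    begin
      cleaned = line.encode('utf-8', encoding)
    rescue EncodingError
      # The bytes are not valid in this encoding; try the next candidate.
      next
    end
    return cleaned if cleaned.valid_encoding?
  end
  ""
end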
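# Yield the file, the cleaned line, and the MatchData for every matching line
# in the .txt files under SEARCH_ROOT.
def match_words(word)
  Dir.glob("#{SEARCH_ROOT}/**/*.txt").each do |file|
    next unless File.file?(file)
    File.open(file) do |f|
      f.each_line do |line|
        line = clean(line)
        if (match = line.match(word))
          yield file, line, match
        end
      end
    end
  end
end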
# Print the file name and the matching line, separated by a blank line.
match_words(word) do |file, line, _match|
  puts File.basename(file)
  puts line
  puts
end
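
The encoding fallback in `clean` can be tried on its own, without a directory of books. The following is a minimal sketch, not part of the gist: `to_utf8` is a hypothetical helper that takes raw bytes and a list of candidate encodings (the defaults here are an assumption) and returns the first conversion that produces valid UTF-8, or an empty string if none does.

```ruby
# Hypothetical helper mirroring the gist's `clean`: interpret the bytes in each
# candidate encoding, in order, and keep the first result that is valid UTF-8.
def to_utf8(bytes, candidates = ['Shift_JIS', 'UTF-8'])
  candidates.each do |name|
    begin
      converted = bytes.dup.force_encoding(name).encode('UTF-8')
    rescue EncodingError
      # Invalid or unmappable bytes for this candidate; try the next one.
      next
    end
    # A UTF-8 to UTF-8 conversion does not validate, so check explicitly.
    return converted if converted.valid_encoding?
  end
  ''
end

# Sample input: Japanese text stored as Shift_JIS bytes.
sjis_bytes = "日本語".encode('Shift_JIS').b
puts to_utf8(sjis_bytes)
```

Candidate order matters: the first encoding that accepts the bytes wins, even when another candidate would have been the right one, so the most restrictive or most likely encoding should come first.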