@abachman
Created November 22, 2014 00:59
selectively filtering word lists
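require 'readline'

# Usage: ruby <script> BADWORDS_FILE WORDS_FILE
abort "usage: #{$0} BADWORDS_FILE WORDS_FILE" unless ARGV.size == 2

list1 = ARGV[0] # file of candidate words to remove
list2 = ARGV[1] # word list to filter; rewritten in place

badwords = File.readlines(list1).map(&:strip)
words = File.readlines(list2).map(&:strip)

puts "remove #{list1} #{badwords.size} words from #{list2} #{words.size}"

# Ask about each bad word that actually appears in the word list.
filter = []
badwords.each do |word|
  next unless words.include?(word)

  answer = Readline.readline("remove #{word}? ", false)
  # Pressing Enter with no input confirms removal; any other input, or EOF
  # (Ctrl-D, where readline returns nil), keeps the word.
  if answer && answer.empty?
    filter << word
  else
    puts "keep #{word}"
  end
end

# Rewrite the word list only if something was confirmed for removal.
unless filter.empty?
  puts "DROPPING #{filter.join(', ')}"
  File.open(list2, 'w') do |f|
    f.puts(words - filter)
  end
end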
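The selection step can also be tried without files or a terminal. The following is a minimal sketch, not part of the original script: `select_removals` is a hypothetical helper name, the word lists are made-up sample data, and the block stands in for the interactive `Readline` prompt. It also assumes a `Set` lookup in place of `Array#include?`, which only matters for large lists.

```ruby
require 'set'

# Hypothetical helper (not in the original script): returns the entries of
# `badwords` that also appear in `words` and for which the block returns true.
def select_removals(badwords, words)
  lookup = words.to_set # constant-time membership checks
  badwords.select { |word| lookup.include?(word) && yield(word) }
end

# Made-up sample data standing in for the two input files.
badwords = %w[darn heck zounds]
words    = %w[apple darn banana heck]

# Stand-in for the prompt: confirm every overlapping word except "heck".
filter = select_removals(badwords, words) { |word| word != 'heck' }
remaining = words - filter

puts "DROPPING #{filter.join(', ')}"
puts remaining
```

Words that appear only in the bad-word list (here `zounds`) never reach the block, which mirrors how the script only prompts for words present in both files.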