Created
February 21, 2012 17:02
-
-
Save kdabir/1877377 to your computer and use it in GitHub Desktop.
to find if any word exists in *every* line in input
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
/** | |
* Problem: to find if any word exists in *every* line in input | |
*/ | |
def input = """This is big line | |
This is another big big line | |
There is something common here | |
""" | |
/* APPROACH ONE : order of O(n) | |
1. store each unique word in line, in a map against its count | |
2. all the words with occurences = line count are the ones we are looking for | |
*/ | |
def word_count = [:] // store word -> occurence map here | |
def max_count=0, line_count=0 // | |
input.eachLine { line, n-> | |
def unique = [] as HashSet | |
line.split(" ").each { unique << it } // O(n) | |
unique.each { word-> // O(n) | |
def count = word_count[word] // O(1) | |
word_count[word]= (count) ? count+1 : 1 // O(1) | |
} | |
line_count++ | |
} | |
word_count.each {word,count-> // complexity O(n) | |
if (count==line_count) println "*$word* appears in every line" | |
} | |
/* APPROACH TWO : order of O(n) (faster and consumes less memory) | |
1. store 1st line's uniq words in set (lets call it 'repeating') | |
2. from next line on, only move forward with those words in 'repeating' that also exists in this line | |
3. may abort loop if 'repeating' becomes empy | |
*/ | |
def repeating = null // would be hashset | |
for (line in input.split(/\n/)){ | |
def this_line = [] as HashSet | |
line.split(" ").each {this_line << it } // O(n) | |
if (repeating==null) repeating = this_line | |
else repeating.retainAll(this_line) // O(n) as contains(find) is O(1) in HashSet becuase it uses map internally | |
if (repeating.size()==0) { | |
println "no repeating element found" | |
break // no further process required | |
} | |
} | |
if (repeating.size()) println "$repeating appears in every line" |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment