Skip to content

Instantly share code, notes, and snippets.

@outro56
Last active August 29, 2015 14:22
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save outro56/20915d7e65ee2f27a03a to your computer and use it in GitHub Desktop.
Save outro56/20915d7e65ee2f27a03a to your computer and use it in GitHub Desktop.
Find the top X frequent numbers in a stream
//Find the top X frequent numbers in a stream
// http://www.cs.berkeley.edu/~satishr/cs270/sp11/rough-notes/Streaming-two.pdf
// Returns the top numbers k and their frequencies in the set
// Complexity:
// space = O(k)
// time = O(N)
map<int, int> getFreq(k, Set) {
A = map<int, int>
for (k : getTop(k, Set) {
A[k] = (select i in Set where i == k).Count()
}
return A
}
// This function returns the top k numbers in the set S
// Complexity:
// space = O(k)
// time = O(N)
List<int> getTop(k, Set) {
auto A = map<int, int>
for (i : Set) {
if (A.size() < k) {
A[i] = 1
} else if (A.contains(i)) {
A[i] += 1 // increment its frequency
} else {
for (t : A.keys) {
if (A[t] == 0)
A.remove(t)
}
if (A.size() < X) {
A[i] = 1
} else {
for (t : A.keys) {
A[t] -= 1; // decrement frequency
}
}
}
}
return A.keys
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment