@copenhas
Created July 13, 2011 03:32
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

///<summary>
/// Helper class to get the word counts from text
///</summary>
public class WordCount {
    ///<summary>
    /// Takes in a string of text, cuts it up, and builds a dictionary of the unique words
    /// and the number of times each occurred in the text. Optionally takes a regular
    /// expression to use as the delimiter. By default the delimiter is whitespace.
    ///</summary>
    public IDictionary<string, int> GetUniqueWords(string text, Regex delimiterPattern = null) {
        throw new NotImplementedException(); // implement me
    }

    ///<summary>
    /// Takes in a StringReader to use to retrieve text. Then cuts the text up and builds
    /// a dictionary of the unique words and the number of times each occurred in the text.
    /// Optionally takes a regular expression to use as the delimiter. By default the
    /// delimiter is whitespace.
    ///</summary>
    public IDictionary<string, int> GetUniqueWords(StringReader reader,
                                                   Regex delimiterPattern = null) {
        throw new NotImplementedException(); // implement me
    }
}
@copenhas
Author

Implement the class, then use the first method to tell me the unique word counts for this text:

"The dog jumped over the moon, but that's because the dog had wings and the moon had fallen"
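One possible way to approach the first method is sketched below. It is only an illustration, not the intended solution: the class name `WordCountSketch` is made up, the method is static for brevity, and it assumes that "whitespace" means one or more whitespace characters (`\s+`) and that words are compared case-sensitively.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordCountSketch
{
    // Assumed default delimiter: any run of whitespace.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    public static IDictionary<string, int> GetUniqueWords(string text, Regex delimiterPattern = null)
    {
        var pattern = delimiterPattern ?? Whitespace;
        var counts = new Dictionary<string, int>();

        foreach (var word in pattern.Split(text))
        {
            // Regex.Split yields empty strings when the text starts or ends
            // with a delimiter; those are not words, so skip them.
            if (word.Length == 0) continue;

            // TryGetValue leaves 'current' at 0 for a word not seen before.
            counts.TryGetValue(word, out var current);
            counts[word] = current + 1;
        }

        return counts;
    }
}
```

Under those assumptions, the sample sentence gives `the` 3 times and `dog` and `had` twice each, while `The`, `moon,` and `moon` are counted as separate words because matching is case-sensitive and the comma is not a delimiter. Passing a pattern such as `new Regex(@"[\s,]+")` would fold `moon,` into `moon`.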

Then open a file and read it in as a stream (you may need to tweak the interface of the class; if you are confused, just skip it) so the text is fed into the second method efficiently. Use the regular expression "\|" (matches a literal pipe; an unescaped "|" is the alternation operator and would not match pipes) to cut the text up.

"dog|dog|moon|over|jump|dog|jump"
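A rough sketch of the streaming variant follows, again as one option rather than the expected answer. It assumes the interface is tweaked to accept a `TextReader` (the common base class of `StringReader` and `StreamReader`), that the text can be processed one line at a time (so a delimiter never spans a line break, and line breaks also separate words), and it uses made-up names (`StreamingWordCountSketch`, `CountFile`, `path`). The pipe is escaped as `\|` because an unescaped `|` is the regex alternation operator.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class StreamingWordCountSketch
{
    // Assumed default delimiter: any run of whitespace.
    private static readonly Regex Whitespace = new Regex(@"\s+");

    // Accepts any TextReader, so both StringReader and StreamReader work.
    public static IDictionary<string, int> GetUniqueWords(TextReader reader, Regex delimiterPattern = null)
    {
        var pattern = delimiterPattern ?? Whitespace;
        var counts = new Dictionary<string, int>();

        // Read one line at a time so the whole file never sits in memory.
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (var word in pattern.Split(line))
            {
                if (word.Length == 0) continue; // skip empty pieces from Split
                counts.TryGetValue(word, out var current);
                counts[word] = current + 1;
            }
        }

        return counts;
    }

    // Hypothetical helper: 'path' is whichever file holds the pipe-delimited text.
    public static IDictionary<string, int> CountFile(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return GetUniqueWords(reader, new Regex(@"\|"));
        }
    }
}
```

For the sample line above, this sketch counts `dog` 3 times, `jump` twice, and `moon` and `over` once each. Reading line by line keeps memory use proportional to the longest line; a file with no line breaks at all would need a chunked reader instead.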
