@itaifrenkel
Created November 19, 2012 14:05
Tokenizing unprocessed tweets in-memory
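import com.gigaspaces.document.SpaceDocument;
import com.j_spaces.core.client.SQLQuery;
import org.openspaces.core.GigaSpace;

public class TweetParser {

    // Space proxy; assumed to be injected by the container (not shown in the original snippet).
    private GigaSpace gigaSpace;

    // Template matching tweets that have not been processed yet.
    SQLQuery<SpaceDocument> getTemplate() {
        return new SQLQuery<SpaceDocument>("Tweet", "Processed = ?", false);
    }

    // Invoked for each tweet that matches the template.
    public SpaceDocument eventListener(SpaceDocument tweet) {
        Long id = (Long) tweet.getProperty("Id");
        String text = tweet.getProperty("Text");
        if (text != null) {
            // Tokenize the tweet in memory and write the result to the space.
            // tokenize(...) and TokenizedTweet are defined elsewhere (not shown here).
            gigaSpace.write(new TokenizedTweet(id, tokenize(text)));
        }
        // Mark the tweet as processed in memory.
        tweet.setProperty("Processed", true);
        return tweet;
    }
}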
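
The snippet above calls `tokenize(text)` without showing it. As a rough illustration only, the helper might look something like the sketch below. The class name `TweetTokenizer`, the `Map<String, Integer>` return type, and the splitting rule (keep letters, digits, `#` and `@`; lowercase everything; count occurrences) are all assumptions made for this example, not the author's implementation.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class; the name and behavior are assumptions for illustration.
public class TweetTokenizer {

    // Splits the text on any run of characters that is not a letter, digit, '#' or '@',
    // lowercases the tokens, and counts how often each one occurs.
    public static Map<String, Integer> tokenize(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String normalized = text.toLowerCase(Locale.ROOT);
        for (String token : normalized.split("[^\\p{L}\\p{N}#@]+")) {
            if (token.isEmpty()) {
                continue; // a leading delimiter yields one empty token
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }
}
```

With this sketch, `TweetTokenizer.tokenize("Hello hello #java, world!")` yields a map with `hello` counted twice and `#java` and `world` counted once each. Keeping `#` and `@` inside tokens is a design choice so that hashtags and mentions stay distinguishable from plain words.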