@wesslen
Created March 16, 2017 17:05
PySpark Gnip Filter by Hashtag
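from pyspark.sql.functions import array_contains

# Keep only the tweets in the DataFrame `tweets` whose hashtag entities include
# #lovetrumpshate. array_contains does an exact, case-sensitive match on the text.
activities = tweets.filter(array_contains(tweets.twitter_entities.hashtags.text, "lovetrumpshate"))
activities.count()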
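
To make explicit what the filter checks, here is a plain-Python sketch of the same logic on a handful of records. It is not Spark code and not part of the original gist: the sample records are invented, and the helper names `hashtag_texts` and `has_hashtag` are hypothetical. It only assumes the nested layout the filter already reads (`twitter_entities.hashtags[].text`). It also shows a case-insensitive variant, since an exact match skips spellings such as `LoveTrumpsHate`.

```python
# Invented sample records shaped like the nested fields the Spark filter reads
# (twitter_entities -> hashtags -> text). Real Gnip activities carry many more fields.
sample_tweets = [
    {"body": "a", "twitter_entities": {"hashtags": [{"text": "lovetrumpshate"}]}},
    {"body": "b", "twitter_entities": {"hashtags": [{"text": "LoveTrumpsHate"}, {"text": "nyc"}]}},
    {"body": "c", "twitter_entities": {"hashtags": []}},
    {"body": "d", "twitter_entities": {}},
]


def hashtag_texts(tweet):
    """Return the list of hashtag strings for one record (empty if none are present)."""
    entities = tweet.get("twitter_entities") or {}
    return [h.get("text", "") for h in entities.get("hashtags") or []]


def has_hashtag(tweet, tag, ignore_case=False):
    """Exact membership test by default; optionally compare lowercased strings."""
    texts = hashtag_texts(tweet)
    if ignore_case:
        return tag.lower() in [t.lower() for t in texts]
    return tag in texts


# Exact match, mirroring the array_contains filter above
exact = [t for t in sample_tweets if has_hashtag(t, "lovetrumpshate")]
# Case-insensitive match, which also catches "LoveTrumpsHate"
relaxed = [t for t in sample_tweets if has_hashtag(t, "lovetrumpshate", ignore_case=True)]

print(len(exact), len(relaxed))  # prints: 1 2
```

If the case-insensitive count matters for the real data, the hashtag text would need to be lowercased on the Spark side before matching; how best to do that depends on the Spark version in use.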