@XWilliamY
Last active September 10, 2020 15:55
groups = scraper.get_result_similar(url, grouped=True)

Since groups is a dictionary, you can get the names of the rules by calling:

groups.keys()
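To see at a glance what each rule captured, it can help to look at the first few results of every rule at once. The following is a minimal sketch, not part of the library: `preview_rules` is a hypothetical helper, and `sample_groups` is a made-up stand-in for the dictionary returned by `get_result_similar(url, grouped=True)`.

```python
def preview_rules(groups, n=3):
    """Return the first n results of each rule, keyed by rule name."""
    return {rule: results[:n] for rule, results in groups.items()}


# Hypothetical stand-in for the grouped results; real rule names and
# contents will differ from run to run.
sample_groups = {
    "rule_a": ["8/26/2020", "7/19/2020"],
    "rule_b": ["Useful 2FunnyCool 1", "Useful 1FunnyCool 1"],
}

for rule, head in preview_rules(sample_groups, n=1).items():
    print(rule, head)
```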

You can then index into the dictionary with a particular rule name to see what that rule captured:

groups['rule_1o6e'][:10]

Output:

['Kenny P.Sunset Park, NY2 friends3 reviews',
 'Share reviewEmbed review',
 '8/26/2020',
 'Tried their Brown Sugar milk tea and it was not bad compare to Tiger Sugar. I prefer this over Tiger Sugar due to the L size option and sweetness content. It was my to go bubble tea spot for the last two days straight. Will visit again!',
 'Useful 2FunnyCool 1',
 'Grace W.Hackensack River Waterfront, Jersey City, NJ0 friends1 review',
 'Share reviewEmbed review',
 '7/19/2020',
 "The Oolong Tea Latte with Pudding is amazing! \xa0You can tell it's made of real tea..not from the powder like other places. The oolong fragrance is so nice and refreshing. Not to mention the pudding is so creamy and the sweetness is just right. If you are looking for a high quality tea place with excellent service this is the spot!",
 'Useful 1FunnyCool 1']
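In the output above, each review appears to span five consecutive entries: author line, share/embed links, date, review text, and reaction counts. Assuming that pattern holds for the whole list (it may not on every page), a flat result list could be regrouped into one record per review. The helper `chunk_reviews` and the field names below are illustrative assumptions, and the sample strings are shortened placeholders.

```python
# Assumed field order, inferred from the sample output only.
FIELDS = ("author", "actions", "date", "text", "reactions")


def chunk_reviews(flat):
    """Group a flat list into dicts of len(FIELDS) consecutive entries.

    Any incomplete trailing chunk is dropped.
    """
    size = len(FIELDS)
    return [
        dict(zip(FIELDS, flat[i:i + size]))
        for i in range(0, len(flat) - size + 1, size)
    ]


sample = [
    "Kenny P.Sunset Park, NY2 friends3 reviews",
    "Share reviewEmbed review",
    "8/26/2020",
    "(first review text)",
    "Useful 2FunnyCool 1",
    "Grace W.Hackensack River Waterfront, Jersey City, NJ0 friends1 review",
    "Share reviewEmbed review",
    "7/19/2020",
    "(second review text)",
    "Useful 1FunnyCool 1",
]

records = chunk_reviews(sample)
print(records[0]["date"], "|", records[1]["date"])
```

With real data, this would be called as `chunk_reviews(groups[rule_name])` for whichever rule returns the full review blocks.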

If the results of this rule accurately reflect the results you want, you can keep the rule with the following method call:

scraper.keep_rules(['rule_1o6e'])

Then, save this model:

scraper.save('yelp-reviews')