@tesths
Created November 16, 2018 12:29
SanaChina961229 Web Scraper
{
  "_id": "weibo",
  "startUrl": [
    "https://weibo.com/SanaChina961229?is_search=0&visible=0&is_all=1&is_tag=0&profile_ftype=1&page=[1-2]"
  ],
  "selectors": [
    {
      "id": "real-content",
      "type": "SelectorElementScroll",
      "parentSelectors": [
        "_root"
      ],
      "selector": "div.WB_cardwrap.WB_feed_type.S_bg2",
      "multiple": true,
      "delay": "15000"
    },
    {
      "id": "content1",
      "type": "SelectorText",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_text.W_f14",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "@someone",
      "type": "SelectorText",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_expand div.WB_info",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "forward",
      "type": "SelectorText",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_expand div.WB_text",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "link",
      "type": "SelectorLink",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_text a",
      "multiple": false,
      "delay": 0
    },
    {
      "id": "picture",
      "type": "SelectorImage",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "li.WB_pic img",
      "multiple": false,
      "delay": 0
    },
    {
      "id": "group",
      "type": "SelectorHTML",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_media_wrap.clearfix",
      "multiple": false,
      "regex": "",
      "delay": 0
    },
    {
      "id": "time",
      "type": "SelectorText",
      "parentSelectors": [
        "real-content"
      ],
      "selector": "div.WB_from a.S_txt2:nth-of-type(1)",
      "multiple": false,
      "regex": "",
      "delay": 0
    }
  ]
}
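
The sitemap above relies on two conventions that are easy to miss when reading the raw JSON: the page=[1-2] range in startUrl, and the parentSelectors links that nest every field under real-content. The Python sketch below is only an illustration of how those two pieces can be read, not how the Web Scraper extension itself is implemented. The helper names expand_start_url and children_by_parent are hypothetical, the sitemap dict is a trimmed copy with a shortened query string, and only the simple [start-end] form of the range is handled (zero-padded or stepped ranges are left out).

```python
import re
from collections import defaultdict

# Trimmed, illustrative copy of the sitemap: only the fields this sketch reads.
sitemap = {
    "startUrl": ["https://weibo.com/SanaChina961229?is_all=1&page=[1-2]"],
    "selectors": [
        {"id": "real-content", "parentSelectors": ["_root"]},
        {"id": "content1", "parentSelectors": ["real-content"]},
        {"id": "time", "parentSelectors": ["real-content"]},
    ],
}

# Matches a simple numeric range such as [1-2].
RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand_start_url(url):
    """Expand the first [start-end] range in a URL into one URL per number."""
    match = RANGE.search(url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    prefix, suffix = url[:match.start()], url[match.end():]
    return [f"{prefix}{n}{suffix}" for n in range(start, end + 1)]


def children_by_parent(selectors):
    """Group selector ids under each parent id listed in parentSelectors."""
    tree = defaultdict(list)
    for sel in selectors:
        for parent in sel["parentSelectors"]:
            tree[parent].append(sel["id"])
    return dict(tree)


pages = [u for url in sitemap["startUrl"] for u in expand_start_url(url)]
tree = children_by_parent(sitemap["selectors"])
print(pages)
print(tree)
```

Read this way, the sitemap visits two profile pages, and on each one real-content selects every post card (scrolling to load more), while the remaining selectors extract one value per card.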