@yhonzou
Created April 21, 2018 21:04
# awesome-spider
A collection of web crawlers (Python unless noted otherwise). PRs and issues are welcome. The script used to collect these entries lives in [github-search](https://github.com/facert/github_search).
### A
* [Dark web crawler (Go)](https://github.com/s-rah/onionscan)
* [Aiss (爱丝) app image crawler](https://github.com/x-spiders/aiss-spider)
### B
* [Bilibili users](https://github.com/airingursb/bilibili-user)
* [Bilibili videos](https://github.com/airingursb/bilibili-video)
* [Bilibili crawler for 7.6 million video records](https://github.com/chenjiandongx/bili-spider)
* [Cnblogs (博客园) (node.js)](https://github.com/chokcoco/cnblogSpider)
* [Baidu Baike (node.js)](https://github.com/nswbmw/micro-scraper)
* [BYR (北邮人) and Shuimu Tsinghua (水木清华) job postings](https://github.com/Marcus-T/Crawler_Job)
* [Baidu Cloud drive (百度云网盘)](https://github.com/gudegg/yunSpider)
* [Liuli Shenshe (琉璃神社) crawler](https://github.com/Chion82/hello-old-driver)
### C
* [cnblog](https://github.com/jackgitgz/CnblogsSpider)
* [caoliu 1024](https://github.com/LintBin/1024crawer)
### D
* [Douban Books](https://github.com/lanbing510/DouBanSpider)
* [Douban crawler collection](https://github.com/dontcontactme/doubanspiders)
* [Douban Haixiu group (害羞组)](https://github.com/rockdai/haixiu)
* [DNS records and subdomains](https://github.com/TheRook/subbrute)
### E
* [E-Hentai (E绅士)](https://github.com/shuiqukeyou/E-HentaiCrawler)
### G
* [Girl-atlas](https://github.com/pein0119/girl-atlas-crawler)
* [girl13](https://github.com/xuelangcxy/girlCrawler)
* [github trending](https://github.com/bonfy/github-trending)
* [GitHub repository and user analysis crawler](https://github.com/chenjiandongx/Github)
### I
* [Instagram](https://github.com/xTEddie/Scrapstagram)
* [INC500 world top 5000 companies crawler](https://github.com/XetRAHF/Scrapping-INC500)
### J
* [JD.com (京东)](https://github.com/taizilongxu/scrapy_jingdong)
* [JD.com search + reviews](https://github.com/Chyroc/JDong)
* [JD.com products + reviews](https://github.com/samrayleung/jd_spider)
* [Airline tickets](https://github.com/fankcoder/findtrip)
* [Jandan girl pictures (煎蛋妹纸)](https://github.com/kulovecc/jandan_spider)
* [News from Toutiao, NetEase, Tencent, and others](https://github.com/lzjqsdd/NewsSpider)
### K
* [Kanzhihu (看知乎)](https://github.com/atonasting/zhihuspider)
* [Kechenggezi (课程格子) campus beauty ranking](https://github.com/xinqiu/kechenggezi-Spider)
* [konachan](https://github.com/wudaown/konachanDL)
### L
* [Lianjia (链家)](https://github.com/lanbing510/LianJiaSpider)
* [Lianjia sold, for-sale, and rental listings](https://github.com/XuefengHuang/lianjia-scrawler)
* [Lagou (拉勾)](https://github.com/GuozhuHe/webspider)
* [Hearthstone (炉石传说)](https://github.com/youfou/hsdata)
* [leetcode](https://github.com/bonfy/leetcode)
* [LinkedIn Sales Navigator crawler (LinkedInSalesNavigator)](https://github.com/XetRAHF/Spider_LinkedInSalesNavigatorURL)
### M
* [Mafengwo (马蜂窝) (node.js)](https://github.com/golmic/mafengwo-spider)
* [MyCar](https://github.com/Thoxvi/MyCar_python)
* [Manhua Miao (漫画喵): one-click comic downloader](https://github.com/miaoerduo/cartoon-cat)
* Model photo-set crawlers [(1)](https://github.com/chenjiandongx/mmjpg)[(2)](https://github.com/chenjiandongx/mzitu)
### N
* [News monitoring](https://github.com/NolanZhao/news_feed)
### O
* [ofo bike-sharing crawler](https://github.com/SilverBooker/ofoSpider)
### P
* [Pixiv](https://github.com/littleVege/pixiv_crawl)
* [PornHub](https://github.com/xiyouMc/WebHubBot)
* [packtpub](https://github.com/niqdev/packtpub-crawler)
* [91porn](https://github.com/eqblog/91_porn_spider)
### Q
* [Qzone (QQ空间)](https://github.com/LiuXingMing/QQSpider)
* [QQ groups](https://github.com/caspartse/QQ-Groups-Spider)
* [Tsinghua University Web Learning (网络学堂) crawler](https://github.com/kehao95/thu_learn)
* [Qunar (去哪儿)](https://github.com/lining0806/QunarSpider)
* [51job (前程无忧) Python job postings: crawling and analysis](https://github.com/chenjiandongx/51job)
### R
* [YYeTs (人人影视)](https://github.com/gnehsoah/yyets-spider)
* [RSS crawler](https://github.com/shanelau/rssSpider)
* [rosi girl pictures](https://github.com/evilcos/crawlers)
* [reddit wallpapers](https://github.com/tsarjak/WallpapersFromReddit)
* [reddit](https://github.com/dannyvai/reddit_crawlers)
### S
* [soundcloud](https://github.com/Cortexelus/dadabots)
* [Stack Overflow crawler for 1 million Q&A entries](https://github.com/chenjiandongx/stackoverflow)
* [Shadowsocks account crawler](https://github.com/chenjiandongx/soksaccounts)
### T
* [tumblr](https://github.com/facert/tumblr_spider)
* [TuShare](https://github.com/waditu/tushare)
* [Tmall Double 12 (双12) crawler](https://github.com/LiuXingMing/Tmall1212)
* [Taobao mm](https://github.com/carlonelong/TaobaoMMCrawler)
* [Tmall women's bra size crawler](https://github.com/chenjiandongx/cup-size)
* [Taobao Live bullet-comment (danmaku) crawler (node)](https://github.com/xiaozhongliu/taobao-live-crawler)
### V
* [Video information crawler](https://github.com/billvsme/videoSpider)
* [Movie websites](https://github.com/chenqing/spider)
### W
* [WooYun public vulnerabilities](https://github.com/hanc00l/wooyun_public)
* [WeChat official accounts](https://github.com/bowenpay/wechat-spider)
* [Scraping WeChat official account articles via a "proxy"](https://github.com/lijinma/wechat_spider)
* [NetEase News](https://github.com/armysheng/tech163newsSpider)
* [NetEase featured comments](https://github.com/dongweiming/commentbox)
* [Weibo topic search and analysis](https://github.com/luzhijun/weiboSA)
* [NetEase Cloud Music](https://github.com/RitterHou/music-163)
### X
* [Xueqiu stock information (java)](https://github.com/decaywood/XueQiuSuperSpider)
* [Sina Weibo](https://github.com/LiuXingMing/SinaSpider)
* [Sina Weibo distributed crawler](https://github.com/ResolveWang/weibospider)
### Y
* [British and American TV series (node.js)](https://github.com/pockry/tv-crawler)
### Z
* [ZOL mobile wallpaper crawler](https://github.com/chenjiandongx/wallpaper)
* [Zhihu (python)](https://github.com/LiuRoy/zhihu_spider)
* [Zhihu (php)](https://github.com/owner888/phpspider)
* [CNKI (知网)](https://github.com/yanzhou/CnkiSpider)
* [Zhihu girls](https://github.com/yjm12321/zhihu-girl)
* [Ziroom real-time listing alerts](https://github.com/facert/ziroom_realtime_spider)
### Other
* [Assorted crawlers](https://github.com/Nyloner/Nyspider)
* [DHT crawler](https://github.com/blueskyz/DHTCrawler)
* [SimDHT](https://github.com/dontcontactme/simDHT)
* [p2pspider](https://github.com/dontcontactme/p2pspider)