The following gist is an extract from the article "Building a simple crawler". It crawls outward from a starting URL for a given number of bounces (link hops).
from crawler import Crawler
crawler = Crawler()
crawler.crawl('http://techcrunch.com/')
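The extract does not show the `crawler` module itself, so the following is only a minimal sketch of how a bounce-limited crawl could work, not the article's implementation. It assumes a breadth-first traversal using the Python standard library. The names `SketchCrawler`, `LinkParser`, `fetch_html`, and the `max_bounces` and `fetch` parameters are hypothetical and chosen for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Hypothetical default fetcher: download a page and return its text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class SketchCrawler:
    """Breadth-first crawler limited to a number of link hops (assumed design)."""

    def __init__(self, fetch=fetch_html):
        # The fetch function is injectable so the crawler can run without a network.
        self.fetch = fetch

    def crawl(self, start_url, max_bounces=1):
        """Return {url: [outgoing links]} for pages within max_bounces hops."""
        pages = {}
        seen = {start_url}
        queue = deque([(start_url, 0)])
        while queue:
            url, depth = queue.popleft()
            try:
                html = self.fetch(url)
            except Exception:
                continue  # skip pages that cannot be fetched
            parser = LinkParser()
            parser.feed(html)
            links = []
            for href in parser.links:
                # Resolve relative links and drop #fragments.
                link = urldefrag(urljoin(url, href)).url
                if link.startswith(("http://", "https://")):
                    links.append(link)
            pages[url] = links
            if depth < max_bounces:
                for link in links:
                    if link not in seen:
                        seen.add(link)
                        queue.append((link, depth + 1))
        return pages
```

Under these assumptions, `SketchCrawler().crawl('http://techcrunch.com/', max_bounces=1)` would fetch the start page and every page it links to directly, then stop. Passing a custom `fetch` function makes the crawler usable against in-memory pages, which is convenient for testing.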
{
  "bitcoin": [
    {
      "24h_vol": null,
      "date": "12/5/2013",
      "market_cap": null,
      "price_usd": null
    },
    {
      "24h_vol": null,
<!DOCTYPE html>
<html lang="en">
<meta charset="utf-8">
<script src="https://d3js.org/d3.v4.min.js"></script>
<!-- Google fonts Second Reference-->
<link href="https://fonts.googleapis.com/css?family=Barlow:100,200,300,400,500&display=swap" rel="stylesheet">
<!-- Connecting with D3 library-->
<script src="https://d3js.org/d3.v3.min.js"></script>
license: gpl-3.0