@FrankSpierings
FrankSpierings / crewl.py
Created February 5, 2019 12:27
CeWL alternative in Python, based on Scrapy Framework.
# -*- coding: utf-8 -*-
import argparse
import re

import scrapy
from scrapy import signals
from scrapy.crawler import CrawlerProcess
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule