cdgz / crawl.py
Created October 25, 2011 15:14 — forked from jonhurlock/crawl.py
Python Web Crawler - jonhurlock
#!/usr/bin/env python
"""
Simple Indexer
=================================
Author: Jon Hurlock, October 2011
This script crawls a whole domain (not just a single page),
extracts every link (<a href=""></a>), and finds all links
on that domain. It is also able to extract different file types