- Tim Jones homepage (author of the primary textbook; note that the book's code is available only on the CD-ROM included with the book)
- GNU/Linux Application Programming (1st ed.) by Tim Jones (much of the first edition is readable on Google Books)
- Beginning Linux Programming (4th ed.) by Matthew and Stones (alternative text; the code can be downloaded from this site)
- Free download of Beginning Linux Programming (4th ed.)
"""Render HTML for scraping"""
# -*- coding: utf-8 -*-
import os
import sys
from contextlib import contextmanager
from multiprocessing import Pool
try:
    TimeoutError
Adapted from the article "Crawling anonymously with Tor in Python" by S. Acharya, Nov 2, 2013.
The most common use case is hiding one's identity with Tor, or changing identities programmatically, for example when crawling a website such as Google without being rate-limited or blocked by IP address.
Install Tor.
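Once Tor is installed and running, a crawler typically sends its traffic through Tor's local SOCKS proxy. The following is a minimal sketch, not the article's code: the helper names `tor_proxies` and `tor_is_listening` are hypothetical, and the address 127.0.0.1:9050 assumes Tor's default SOCKS listener, which your torrc may override.

```python
import socket

# Assumption: Tor's default SOCKS listener; adjust if your torrc differs.
TOR_SOCKS_HOST = "127.0.0.1"
TOR_SOCKS_PORT = 9050


def tor_proxies(host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT):
    """Build a proxies mapping in the shape the `requests` library accepts.

    The socks5h scheme (rather than socks5) asks the proxy to resolve
    hostnames, so DNS lookups also travel through Tor.
    """
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def tor_is_listening(host=TOR_SOCKS_HOST, port=TOR_SOCKS_PORT, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the `requests` library and its SOCKS extra installed (`pip install requests[socks]`), the mapping can be passed as `requests.get(url, proxies=tor_proxies())`. Changing identity programmatically needs Tor's control port and is not covered by this sketch.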
// XPath CheatSheet
// To test XPath in your Chrome Debugger: $x('/html/body')
// http://www.jittuu.com/2012/2/14/Testing-XPath-In-Chrome/
// 0. XPath Examples.
// More: http://xpath.alephzarro.com/content/cheatsheet.html
'//hr[@class="edge" and position()=1]' // every hr that is the first hr child of its parent and has class 'edge'
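To make the meaning of that predicate concrete, here is a small Python sketch that spells out the same selection by hand. It deliberately avoids passing the expression to an XPath engine, because the standard library's `xml.etree.ElementTree` supports only a subset of XPath (no `and`, no `position()`). The sample markup and the function name `first_edge_hrs` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the ids only make the result easy to read.
SAMPLE = """<html><body>
<div><hr class="edge" id="a1"/><hr class="edge" id="a2"/></div>
<div><hr class="plain" id="b1"/><hr class="edge" id="b2"/></div>
<div><p/><hr class="edge" id="c1"/></div>
</body></html>"""


def first_edge_hrs(root):
    """Emulate //hr[@class="edge" and position()=1].

    position() counts only the hr children of each parent, so an hr
    matches when it is its parent's first hr child AND has class "edge".
    """
    matches = []
    for parent in root.iter():
        hrs = parent.findall("hr")  # direct hr children, in document order
        if hrs and hrs[0].get("class") == "edge":
            matches.append(hrs[0])
    return matches


root = ET.fromstring(SAMPLE)
print([hr.get("id") for hr in first_edge_hrs(root)])  # prints ['a1', 'c1']
```

Note that `b2` is excluded even though it has the right class: it is the second hr child of its parent, so `position()=1` fails for it.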
FILE SPACING:
# double space a file
sed G
# double space a file which already has blank lines in it. Output file
# should contain no more than one blank line between lines of text.
sed '/^$/d;G'
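For readers less familiar with sed's hold space, the two one-liners above can be sketched in Python. `G` appends a newline plus the (empty) hold space to every line, which is what produces the blank line; `/^$/d` first deletes lines that are completely empty. The function names below are hypothetical, and the functions work on lines that have already had their trailing newline removed.

```python
def double_space(lines):
    """Like `sed G`: follow every input line with one blank line."""
    for line in lines:
        yield line
        yield ""


def double_space_squeezed(lines):
    """Like `sed '/^$/d;G'`: drop empty lines, then double space the rest."""
    for line in lines:
        if line == "":  # /^$/ matches only truly empty lines, not whitespace
            continue
        yield line
        yield ""


if __name__ == "__main__":
    text = ["first", "", "", "second"]
    print("\n".join(double_space_squeezed(text)))
```

As with sed, the output ends with a blank line after the last line of text.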