- Backwards incompatible: switched HTTPCacheMiddleware backend to filesystem (541). To restore the old backend, set HTTPCACHE_STORAGE to scrapy.contrib.httpcache.DbmCacheStorage.
- Proxy https:// URLs using the CONNECT method (392, 397)
- Add a middleware to crawl AJAX crawlable pages as defined by Google (343)
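The first entry above can be sketched as a settings fragment. This is a minimal example, assuming a project-level `settings.py`; the storage path is the one named in the entry, and `HTTPCACHE_ENABLED` is included only on the assumption that the cache middleware is meant to be active.

```python
# settings.py (sketch): restore the DBM-based HTTP cache backend.
# Assumption: the HTTP cache is wanted at all; leave this out if it is
# already enabled elsewhere in the project.
HTTPCACHE_ENABLED = True

# Storage class path as given in the changelog entry.
HTTPCACHE_STORAGE = "scrapy.contrib.httpcache.DbmCacheStorage"
```

Leaving `HTTPCACHE_STORAGE` unset keeps the new filesystem backend.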
$ scrapy shell https://www.ssehl.co.uk/HALO/publicLogon.do -c "response.xpath('//title').extract()"
2014-05-08 16:33:22-0300 [scrapy] INFO: Scrapy 0.23.0 started (bot: scrapybot)
2014-05-08 16:33:22-0300 [scrapy] INFO: Optional features available: ssl, http11
2014-05-08 16:33:22-0300 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-05-08 16:33:22-0300 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-05-08 16:33:22-0300 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-05-08 16:33:22-0300 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-05-08 16:33:22-0300 [scrapy] INFO: Enabled item pipelines:
2014-05-08
$ scrapy shell http://scrapy.org/images/logo.png
2014-04-21 23:53:11-0300 [scrapy] INFO: Scrapy 0.23.0 started (bot: scrapybot)
2014-04-21 23:53:11-0300 [scrapy] INFO: Optional features available: ssl, http11
2014-04-21 23:53:11-0300 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled item pipelines:
2014-04-21 23:53:12-0300 [scrapy] DEBUG: Telnet console listen
------------------------------------------------------------
/home/daniel/envs/setup3/bin/pip run on Thu Mar 6 12:23:59 2014
Downloading/unpacking cryptography
Getting page https://pypi.python.org/simple/cryptography/
URLs to search for versions for cryptography:
* https://pypi.python.org/simple/cryptography/
Analyzing links from page https://pypi.python.org/simple/cryptography/
Skipping https://pypi.python.org/packages/cp26/c/cryptography/cryptography-0.2-cp26-none-win32.whl#md5=13e5c4b19520e7dc6f07c6502b3f74e2 (from https://pypi.python.org/simple/cryptography/) because it is not compatible with this Python
Skipping https://pypi.python.org/packages/cp26/c/cryptography/cryptography-0.2.1-cp26-none-win32.whl#md5=00e733648ee5cdb9e58876238b1328f8 (from https://pypi.python.org/simple/cryptography/) because it is not compatible with this Python
Skipping https://pypi.python.org/packages/cp26/c/cryptography/cryptography-0.2.2-cp26-none-win32.whl#md5=b52f9b5f5c980ebbe090f945a44be2a5 (from https:/
Downloading cryptography-0.2.2.tar.gz (13.8MB): 13.8MB downloaded
Running setup.py (path:/tmp/pip_build_root/cryptography/setup.py) egg_info for package cryptography
no previously-included directories found matching 'documentation/_build'
zip_safe flag not set; analyzing archive contents...
six: module references __file__
Installed /tmp/pip_build_root/cryptography/six-1.5.2-py2.7.egg
Searching for cffi>=0.8
Reading http://33.33.33.41:3141/vagrant/dev/+simple/cffi/
Best match: cffi 0.8.1
# global parameters
global
    # log on syslog of 127.0.0.1 udp port 514 (default) using local0 facility.
    log 127.0.0.1 local0
    # maximum number of concurrent connections
    maxconn 4096
    # drop privileges after port binding
    user nobody
    group nogroup
    # run in daemon mode
import encodings
import lxml.etree

# Try every encoding name Python knows and report the ones lxml rejects.
for enc in set(encodings.aliases.aliases.values()):
    try:
        parser = lxml.etree.HTMLParser(recover=True, encoding=enc)
    except LookupError as exc:
        # print() as a function runs on both Python 2 and Python 3.
        print(str(exc))
~$ scrapy shell http://www.jobberman.com/jobs-in-nigeria/3/by-industry/vacancies-in-ict-telecommunications-companies-in-nigeria/
2014-01-22 23:09:26-0200 [scrapy] INFO: Scrapy 0.23.0 started (bot: scrapybot)
2014-01-22 23:09:26-0200 [scrapy] INFO: Optional features available: ssl, http11, boto, django
2014-01-22 23:09:26-0200 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-01-22 23:09:27-0200 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-01-22 23:09:28-0200 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-01-22 23:09:28-0200 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-01-22 23:09:28-0200 [scrapy]
----------
          ID: app
    Function: docker.running
      Result: False
     Comment: Container 'shipyard' cannot be started
              Traceback (most recent call last):
                File "/var/cache/salt/minion/extmods/modules/dockerio.py", line 904, in start
                  for k, v in port_bindings.iteritems():
              AttributeError: 'list' object has no attribute 'iteritems'
     Changes:
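The traceback above shows `port_bindings` arriving as a list where the loop expects a dict. A minimal sketch of a more tolerant loop follows. `iter_port_bindings` is a hypothetical helper, not part of the `dockerio` module, and treating each list entry as a container port with no host binding is an assumption made only for illustration.

```python
def iter_port_bindings(port_bindings):
    """Yield (container_port, host_binding) pairs from a dict or a list.

    Hypothetical helper: the real module may interpret lists differently.
    """
    if port_bindings is None:
        return
    if isinstance(port_bindings, dict):
        # dict.items() exists on both Python 2 and 3, unlike iteritems().
        for container_port, host_binding in port_bindings.items():
            yield container_port, host_binding
    else:
        # Assumption: a list carries container ports only, with no host binding.
        for container_port in port_bindings:
            yield container_port, None


for pair in iter_port_bindings(["8080/tcp"]):
    print(pair)
```

Whether a list should be accepted here at all, or rejected with a clearer error message, depends on what the state is meant to support.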