@dangra
Created April 22, 2014 02:53
$ scrapy shell http://scrapy.org/images/logo.png
2014-04-21 23:53:11-0300 [scrapy] INFO: Scrapy 0.23.0 started (bot: scrapybot)
2014-04-21 23:53:11-0300 [scrapy] INFO: Optional features available: ssl, http11
2014-04-21 23:53:11-0300 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled extensions: TelnetConsole, CloseSpider, WebService, CoreStats, SpiderState
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2014-04-21 23:53:12-0300 [scrapy] INFO: Enabled item pipelines:
2014-04-21 23:53:12-0300 [scrapy] DEBUG: Telnet console listening on 0.0.0.0:6023
2014-04-21 23:53:12-0300 [scrapy] DEBUG: Web service listening on 0.0.0.0:6080
2014-04-21 23:53:12-0300 [default] INFO: Spider opened
2014-04-21 23:53:13-0300 [default] DEBUG: Redirecting (302) to <GET http://scrapy.org/images/logo.png> from <GET http://scrapy.org/images/logo.png>
2014-04-21 23:53:13-0300 [default] DEBUG: Crawled (200) <GET http://scrapy.org/images/logo.png> (referer: None)
Traceback (most recent call last):
  File "/home/daniel/envs/scrapy/bin/scrapy", line 10, in <module>
    execfile(__file__)
  File "/home/daniel/src/scrapy/bin/scrapy", line 4, in <module>
    execute()
  File "/home/daniel/src/scrapy/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/daniel/src/scrapy/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/home/daniel/src/scrapy/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/home/daniel/src/scrapy/scrapy/commands/shell.py", line 50, in run
    shell.start(url=url, spider=spider)
  File "/home/daniel/src/scrapy/scrapy/shell.py", line 45, in start
    self.fetch(url, spider)
  File "/home/daniel/src/scrapy/scrapy/shell.py", line 93, in fetch
    self.populate_vars(response, request, spider)
  File "/home/daniel/src/scrapy/scrapy/shell.py", line 102, in populate_vars
    self.vars['sel'] = Selector(response)
  File "/home/daniel/src/scrapy/scrapy/selector/unified.py", line 79, in __init__
    _root = LxmlDocument(response, self._parser)
  File "/home/daniel/src/scrapy/scrapy/selector/lxmldocument.py", line 27, in __new__
    cache[parser] = _factory(response, parser)
  File "/home/daniel/src/scrapy/scrapy/selector/lxmldocument.py", line 13, in _factory
    body = response.body_as_unicode().strip().encode('utf8') or '<html/>'
AttributeError: 'Response' object has no attribute 'body_as_unicode'
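The failure mode in the traceback: the shell fetched a PNG, so Scrapy returned a plain binary `Response` rather than a `TextResponse`, and the selector factory called `body_as_unicode()`, which only `TextResponse` provides. A minimal sketch of the guard that avoids this (an illustration with stand-in classes, not the actual Scrapy fix or its internal API) checks the response type before attempting to decode a body for parsing:

```python
# Stand-in classes that mirror the relevant distinction between
# scrapy.http.Response (binary body) and scrapy.http.TextResponse
# (adds body_as_unicode). These are hypothetical simplifications.

class Response:
    """Binary response: has a bytes body, no unicode accessor."""
    def __init__(self, body=b""):
        self.body = body


class TextResponse(Response):
    """Text response: knows its encoding and can decode its body."""
    def __init__(self, body=b"", encoding="utf-8"):
        super().__init__(body)
        self.encoding = encoding

    def body_as_unicode(self):
        return self.body.decode(self.encoding)


def selector_body(response):
    """Return a unicode body suitable for selector parsing, or None
    for binary payloads such as the PNG fetched in the session above."""
    if isinstance(response, TextResponse):
        return response.body_as_unicode().strip() or "<html/>"
    return None  # binary response: nothing to parse, avoid AttributeError


print(selector_body(TextResponse(b"<html><body/></html>")))
print(selector_body(Response(b"\x89PNG")))
```

The point of the guard is that type, not content sniffing, decides whether a unicode body exists at all; the unguarded factory in `lxmldocument.py` assumed every response was text-like.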