@jfeldstein
Created April 10, 2015 21:58
Crawlera Logs
Time (UTC) Level Message
2015-04-08 21:35:55 INFO Log opened.
2015-04-08 21:35:55 INFO Scrapy 0.25.0-298-g5846d61 started
2015-04-08 21:35:55 INFO using set_wakeup_fd
2015-04-08 21:35:55 INFO Scrapy 0.25.0-298-g5846d61 started (bot: amazon)
2015-04-08 21:35:55 INFO Optional features available: ssl, http11, boto
2015-04-08 21:35:55 INFO Overridden settings: {'NEWSPIDER_MODULE': 'amazon.spiders', 'LOG_LEVEL': 'INFO', 'CONCURRENT_REQUESTS_PER_DOMAIN': 32, 'CONCURRENT_REQUESTS': 32, 'SPIDER_MODULES': ['amazon.spiders'], 'STATS_CLASS': 'hworker.bot.stats.HubStorageStatsCollector', 'BOT_NAME': 'amazon', 'DOWNLOAD_TIMEOUT': 600, 'MEMUSAGE_LIMIT_MB': 512, 'MEMUSAGE_ENABLED': True, 'TELNETCONSOLE_HOST': '0.0.0.0', 'LOG_FILE': 'scrapy.log', 'DOWNLOAD_DELAY': 2.0, 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'}
2015-04-08 21:35:56 INFO HubStorage: writing items to http://storage.scrapinghub.com/items/12747/2/19
2015-04-08 21:35:56 INFO Enabled extensions: LogStats, TelnetConsole, StackTraceDump, CloseSpider, MemoryUsage, CoreStats, SpiderState, HubstorageExtension
2015-04-08 21:35:56 INFO Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CrawleraMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-04-08 21:35:56 INFO Enabled spider middlewares: HubstorageMiddleware, HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-04-08 21:35:56 INFO Enabled item pipelines:
2015-04-08 21:35:56 INFO Spider opened
2015-04-08 21:35:56 INFO Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-04-08 21:35:56 INFO TelnetConsole starting on 6023
2015-04-08 21:36:56 INFO Crawled 26 pages (at 26 pages/min), scraped 0 items (at 0 items/min)
2015-04-08 21:37:56 INFO Crawled 52 pages (at 26 pages/min), scraped 3 items (at 3 items/min)
2015-04-08 21:38:56 INFO Crawled 77 pages (at 25 pages/min), scraped 6 items (at 3 items/min)
2015-04-08 21:39:56 INFO Crawled 103 pages (at 26 pages/min), scraped 13 items (at 7 items/min)
2015-04-08 21:40:56 INFO Crawled 128 pages (at 25 pages/min), scraped 15 items (at 2 items/min)
2015-04-08 21:41:56 INFO Crawled 153 pages (at 25 pages/min), scraped 22 items (at 7 items/min)
2015-04-08 21:42:56 INFO Crawled 178 pages (at 25 pages/min), scraped 28 items (at 6 items/min)

2015-04-05 20:33:04 INFO Log opened.
2015-04-05 20:33:04 INFO Scrapy 0.25.0-269-gee17902 started
2015-04-05 20:33:04 INFO using set_wakeup_fd
2015-04-05 20:33:05 INFO Scrapy 0.25.0-269-gee17902 started (bot: amazon)
2015-04-05 20:33:05 INFO Optional features available: ssl, http11, boto
2015-04-05 20:33:05 INFO Overridden settings: {'NEWSPIDER_MODULE': 'amazon.spiders', 'LOG_LEVEL': 'INFO', 'CONCURRENT_REQUESTS_PER_DOMAIN': 32, 'CONCURRENT_REQUESTS': 32, 'SPIDER_MODULES': ['amazon.spiders'], 'STATS_CLASS': 'hworker.bot.stats.HubStorageStatsCollector', 'BOT_NAME': 'amazon', 'DOWNLOAD_TIMEOUT': 600, 'MEMUSAGE_LIMIT_MB': 512, 'MEMUSAGE_ENABLED': True, 'TELNETCONSOLE_HOST': '0.0.0.0', 'LOG_FILE': 'scrapy.log', 'DOWNLOAD_DELAY': 5.0, 'USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.104 Safari/537.36'}
2015-04-05 20:33:06 INFO HubStorage: writing items to http://storage.scrapinghub.com/items/12747/2/3
2015-04-05 20:33:06 INFO Enabled extensions: LogStats, TelnetConsole, StackTraceDump, CloseSpider, MemoryUsage, CoreStats, SpiderState, HubstorageExtension
2015-04-05 20:33:06 INFO Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CrawleraMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2015-04-05 20:33:07 INFO Enabled spider middlewares: HubstorageMiddleware, HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2015-04-05 20:33:07 INFO Enabled item pipelines:
2015-04-05 20:33:07 INFO Spider opened
2015-04-05 20:33:07 INFO Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2015-04-05 20:33:07 INFO Using crawlera at http://paygo.crawlera.com:8010?noconnect (user: canopy)
2015-04-05 20:33:07 INFO Setting spider download delay to 0. It's default CrawleraMiddleware behavior, to preserve original delay set CRAWLERA_PRESERVE_DELAY = True in settings.
2015-04-05 20:33:07 INFO TelnetConsole starting on 6023
2015-04-05 20:33:20 INFO Closing spider (banned)
2015-04-05 20:34:02 INFO Dumping Scrapy stats:
2015-04-05 20:34:02 INFO Spider closed (banned)
2015-04-05 20:34:02 INFO (TCP Port 6023 Closed)
2015-04-05 20:34:02 INFO Main loop terminated.
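The second run's log explains itself: CrawleraMiddleware reset the spider's download delay to 0 unless `CRAWLERA_PRESERVE_DELAY = True` is set. A minimal `settings.py` sketch acting on that hint, assuming the scrapy-crawlera setting names (`CRAWLERA_ENABLED`, `CRAWLERA_URL`, `CRAWLERA_USER`, `CRAWLERA_PASS`); values echo the log where available, and the rest are illustrative placeholders, not the gist author's actual configuration:

```python
# Scrapy settings.py fragment (sketch) for crawling through Crawlera
# while keeping the spider's own politeness delay.

BOT_NAME = 'amazon'
DOWNLOAD_DELAY = 5.0  # by default CrawleraMiddleware resets this to 0

CRAWLERA_ENABLED = True
CRAWLERA_URL = 'http://paygo.crawlera.com:8010'  # endpoint from the log above
CRAWLERA_USER = 'canopy'                         # user from the log above
CRAWLERA_PASS = 'changeme'                       # placeholder; not in the log

# Keep DOWNLOAD_DELAY in effect instead of letting the middleware zero it,
# exactly as the log message "Setting spider download delay to 0 ..." advises.
CRAWLERA_PRESERVE_DELAY = True
```

With `CRAWLERA_PRESERVE_DELAY` unset, all 32 concurrent requests would hit Crawlera with no delay, which is consistent with the second run being banned within 13 seconds while the first run (2.0 s delay, ~25 pages/min) ran cleanly.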