@JoaoGFarias
Last active October 10, 2017 06:39
2017-10-10 03:19:09 [scrapy.core.engine] INFO: Closing spider (closespider_pagecount)
2017-10-10 03:19:09 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.brandsmartusa.com/samsung/215588/galaxy+j7+prime+smartphone.htm> (referer: None) ['cached']
<extractor.BrandsMartUSAExtractor.BrandsMartUSAExtractor object at 0x10f559ac8>
name: Galaxy J7 Prime 5.5", 8MP Rear/5MP Front Camera, Octa Core Processor, 16GB Memory, 3GB RAM, 3300 mAh Battery, Android 6.0.1 Marshmallow Smartphone - Gold
model:
price: $149.00
brand: Samsung
OS: Android 6.0 "Marshmallow"
color: Gold
network: MetroPCS
SIM_card_slot: Dual SIM
back_camera: 13.0 Megapixels (MP)
ROM: 16 GB
RAM: 3GB
screen_size: 5.5"
screen_resolution: 1920 x 1080
battery_capacity:
battery_voltage:
weight: 6.0 oz
dimensions: 5.9 in x 3.0 in x 0.3 in
----------------------
None
2017-10-10 03:19:10 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.brandsmartusa.com/samsung/215588/galaxy+j7+prime+smartphone.htm> (referer: None)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
for x in result:
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
return (_set_referer(r) for r in result or ())
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
return (r for r in result or () if _filter(r))
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
return (r for r in result or () if _filter(r))
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/spiders/Anchor_Spider.py", line 19, in parse
self.add_page(response)
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/spiders/Smartphone_Spider.py", line 94, in add_page
page.insertCrawlerInfo(self.name)
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/database/Page.py", line 110, in insertCrawlerInfo
self.save_smartphone_data(cursor, self.smartphone_data)
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/database/Page.py", line 76, in save_smartphone_data
values = smartphone_data.get_data_as_tuple()
AttributeError: 'NoneType' object has no attribute 'get_data_as_tuple'
2017-10-10 03:19:10 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 563,
'downloader/request_count': 2,
'downloader/request_method_count/GET': 2,
'downloader/response_bytes': 25538,
'downloader/response_count': 2,
'downloader/response_status_count/200': 2,
'finish_reason': 'closespider_pagecount',
'finish_time': datetime.datetime(2017, 10, 10, 6, 19, 10, 197076),
'httpcache/hit': 2,
'log_count/DEBUG': 4,
'log_count/ERROR': 1,
'log_count/INFO': 7,
'memusage/max': 76435456,
'memusage/startup': 76435456,
'response_received_count': 2,
'scheduler/dequeued': 1,
'scheduler/dequeued/memory': 1,
'scheduler/enqueued': 1,
'scheduler/enqueued/memory': 1,
'spider_exceptions/AttributeError': 1,
'start_time': datetime.datetime(2017, 10, 10, 6, 19, 9, 433662)}
2017-10-10 03:19:10 [scrapy.core.engine] INFO: Spider closed (closespider_pagecount)
2017-10-10 03:39:32 [scrapy.core.scraper] ERROR: Spider error processing <GET https://www.brandsmartusa.com/headphones/on+sale+items/_/N-103504> (referer: https://www.brandsmartusa.com/headphones/bluetooth+headphones/_/N-102991)
Traceback (most recent call last):
File "/usr/local/lib/python3.6/site-packages/scrapy/utils/defer.py", line 102, in iter_errback
yield next(it)
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/offsite.py", line 29, in process_spider_output
for x in result:
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/referer.py", line 339, in <genexpr>
return (_set_referer(r) for r in result or ())
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/urllength.py", line 37, in <genexpr>
return (r for r in result or () if _filter(r))
File "/usr/local/lib/python3.6/site-packages/scrapy/spidermiddlewares/depth.py", line 58, in <genexpr>
return (r for r in result or () if _filter(r))
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/spiders/Anchor_Spider.py", line 19, in parse
self.add_page(response)
File "/Users/joaofarias/Documents/UFPE/jgfd-tpshc_IF962-proj1/crawler/spiders/Smartphone_Spider.py", line 91, in add_page
page.smartphone_data = extractor.extract(response)
File "../extractor/BrandsMartUSAExtractor.py", line 23, in extract
smartphone.price = price[price.rfind('$'):]
AttributeError: 'NoneType' object has no attribute 'rfind'