Created December 21, 2012 22:24
Scrapy partial response downloader middleware
class PartialResponse(object):
    """Downloader middleware that returns only the first n bytes of a response.

    n is read from the spider's ``response_max_size`` attribute;
    0 or a missing attribute disables truncation.
    """

    def process_response(self, request, response, spider):
        max_size = getattr(spider, 'response_max_size', 0)
        if max_size and len(response.body) > max_size:
            h = response.headers.copy()
            # Header values are strings, so convert the new length.
            h['Content-Length'] = str(max_size)
            # response.body is already a byte string: slice it directly
            # instead of calling .encode() on it.
            response = response.replace(
                body=response.body[:max_size],
                headers=h)
        return response
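To try the middleware, it has to be registered in the project settings, and the spider needs a `response_max_size` attribute. The fragment below is a minimal sketch: the module path `myproject.middlewares` and the priority `543` are assumptions for illustration, not values from the gist.

```python
# settings.py (sketch)
# 'myproject.middlewares' is a hypothetical module path; use the module
# where PartialResponse actually lives. 543 is an arbitrary priority.
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.PartialResponse': 543,
}

# On the spider class, set the limit in bytes, for example:
#     response_max_size = 1024
```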
This still downloads the whole file from the server. It would be more interesting if it could actually save bandwidth…
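One way to address the bandwidth point, sketched here as an assumption rather than as part of the gist, is to send an HTTP `Range` header so that the server itself sends only the first n bytes. The class name `RangeRequestMiddleware` is hypothetical; it reuses the same `response_max_size` spider attribute. Servers are free to ignore `Range` and return the full body, so the truncating middleware would still be useful as a fallback.

```python
class RangeRequestMiddleware(object):
    """Hypothetical downloader middleware: request only the first n bytes."""

    def process_request(self, request, spider):
        max_size = getattr(spider, 'response_max_size', 0)
        if max_size:
            # HTTP byte ranges are inclusive, so the first n bytes are 0..n-1.
            # setdefault keeps any Range header the request already carries.
            request.headers.setdefault('Range', 'bytes=0-%d' % (max_size - 1))
        # Returning None lets Scrapy continue processing the request.
        return None
```

A server that honors the header typically answers with status 206 (Partial Content) and a body of at most n bytes.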