@stav
Created December 21, 2012 22:24
Scrapy partial response downloader middleware
class PartialResponse(object):
    """Downloader middleware that returns only the first n bytes of a response.

    n is read from the spider's ``response_max_size`` attribute;
    0 (the default) leaves responses untouched.
    """

    def process_response(self, request, response, spider):
        max_size = getattr(spider, 'response_max_size', 0)
        if max_size and len(response.body) > max_size:
            h = response.headers.copy()
            h['Content-Length'] = max_size
            # response.body is already a byte string, so slice it directly
            # instead of calling .encode() on it.
            response = response.replace(
                body=response.body[:max_size],
                headers=h)
        return response
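For the middleware to run, it has to be registered in the project settings, and the spider has to declare a limit. A minimal sketch, assuming the class lives in a hypothetical module `myproject.middlewares`; the order value 543 is likewise only an illustrative choice:

```python
# settings.py (the module path and the order value are assumptions)
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.PartialResponse": 543,
}

# In the spider, the limit is a plain class attribute, which the
# middleware reads with getattr(spider, 'response_max_size', 0):
#
#     class MySpider(scrapy.Spider):
#         name = "example"
#         response_max_size = 1024  # keep only the first 1024 bytes
```

Spiders that do not define `response_max_size` are unaffected, since the middleware falls back to 0 and skips truncation.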
@wetneb
wetneb commented Apr 4, 2016

That still downloads the whole file from the server. It would be more interesting if it could actually save bandwidth…
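The point of this comment can be illustrated outside Scrapy: to save bandwidth, the client has to stop reading the body early rather than truncate it afterwards. The following is only a sketch using the Python standard library, not part of the gist; `read_prefix` and `fetch_prefix` are hypothetical helper names, and the `Range` header is a request that servers are free to ignore.

```python
import io
import urllib.request


def read_prefix(stream, max_size, chunk_size=8192):
    """Read at most max_size bytes from a file-like object, then stop."""
    chunks = []
    remaining = max_size
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:  # the stream ended before max_size was reached
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def fetch_prefix(url, max_size, timeout=10):
    """Hypothetical helper: fetch only the first max_size bytes of url."""
    if max_size <= 0:
        return b""
    # Ask for just the leading bytes; a server may ignore Range and
    # start sending the full body anyway.
    request = urllib.request.Request(
        url, headers={"Range": "bytes=0-%d" % (max_size - 1)})
    # Leaving the with-block closes the connection, so the unread
    # remainder of the body is not requested from the socket.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return read_prefix(response, max_size)


if __name__ == "__main__":
    # Offline demonstration: an in-memory stream stands in for a socket.
    stream = io.BytesIO(b"x" * 100000)
    head = read_prefix(stream, 1024)
    print(len(head), stream.tell())  # prints "1024 1024"
```

Some extra bytes may still arrive in operating system buffers before the connection closes, so the saving is approximate. Within Scrapy itself, newer releases document a `bytes_received` signal and a `StopDownload` exception for stopping a download partway through; check the documentation for the version in use.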
