@a1Gupta
Created June 22, 2017 10:54
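import gc


def queryset_iterator(queryset, chunksize=1000):
    '''
    Iterate over a Django QuerySet ordered by the primary key.

    This function loads at most chunksize (default: 1000) rows into memory
    at a time, whereas Django would normally load all rows into memory.
    Using the iterator() method alone only stops Django from preloading all
    the model instances.

    Note that this implementation does not support ordered querysets: any
    existing ordering is replaced by ordering on the primary key.
    '''
    pk = 0  # starting point; assumes positive integer primary keys
    last_pk = queryset.order_by('-pk').values_list('pk', flat=True).first()
    if last_pk is not None:  # None means the queryset is empty
        queryset = queryset.order_by('pk')
        while pk < last_pk:
            found = False
            # Fetch the next chunk of rows whose pk is above the last one seen.
            for row in queryset.filter(pk__gt=pk)[:chunksize]:
                found = True
                pk = row.pk
                yield row
            if not found:
                # The remaining rows were deleted mid-iteration; stop
                # instead of looping forever.
                break
            gc.collect()  # release the chunk that was just consumed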
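
The loop above amounts to keyset pagination: each chunk is fetched with a "pk greater than the last one seen, ordered by pk, limited to chunksize" query. As a rough illustration of that idea outside Django, here is a minimal sketch using the standard-library sqlite3 module. The `item` table, its columns, and the `iterate_by_pk` helper are hypothetical names made up for this example, and integer primary keys are assumed.

```python
import sqlite3


def iterate_by_pk(conn, chunksize=1000):
    """Yield (id, name) rows from the hypothetical `item` table in pk order,
    holding at most `chunksize` rows in memory at a time."""
    last_pk = None
    while True:
        if last_pk is None:
            # First chunk: no lower bound on the primary key yet.
            cur = conn.execute(
                "SELECT id, name FROM item ORDER BY id LIMIT ?",
                (chunksize,),
            )
        else:
            # Later chunks: resume just after the last primary key seen.
            cur = conn.execute(
                "SELECT id, name FROM item WHERE id > ? ORDER BY id LIMIT ?",
                (last_pk, chunksize),
            )
        rows = cur.fetchall()
        if not rows:
            return  # no rows left
        for row in rows:
            yield row
        last_pk = rows[-1][0]


# Build a small in-memory table to iterate over.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO item (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 26)],
)

rows = list(iterate_by_pk(conn, chunksize=10))
```

Stopping on an empty chunk, rather than comparing against a precomputed last pk, is a design choice of this sketch: it avoids the extra query and still terminates if rows are deleted while iterating.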