@aykut
Last active August 29, 2015 14:19
QuerySet chunker
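def queryset_chunker(queryset, order_by='-pk', chunk_size=5000):
    """
    Lazily split a queryset into slices of at most chunk_size rows.

    :param queryset: lazy queryset to split into chunks
    :type queryset: QuerySet
    :param order_by: field to order by when the queryset is unordered
    :type order_by: str
    :param chunk_size: maximum number of rows per chunk
    :type chunk_size: int
    """
    # Slicing by offset is only stable with a deterministic ordering.
    if not queryset.query.order_by:
        # distinct_fields is empty for a plain .distinct(), so test it
        # directly rather than indexing on query.distinct alone.
        if queryset.query.distinct_fields:
            order_by = queryset.query.distinct_fields[0]
        queryset = queryset.order_by(order_by)
    for pivot in xrange(0, queryset.count(), chunk_size):
        yield queryset[pivot:pivot + chunk_size]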
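
The pivot arithmetic can be checked without a database. The following is a minimal sketch, not part of the gist: `chunk_bounds` is a hypothetical helper that reproduces only the `(pivot, pivot + chunk_size)` slice bounds, a plain list stands in for the ordered rows, and Python 3's `range` is used where the gist uses Python 2's `xrange`.

```python
def chunk_bounds(total, chunk_size=5000):
    """Yield the (start, stop) slice bounds the chunker would use.

    Hypothetical helper for illustration; `total` stands in for
    queryset.count().
    """
    for pivot in range(0, total, chunk_size):
        yield pivot, pivot + chunk_size


# An empty queryset produces no chunks at all.
empty = list(chunk_bounds(0, 1000))

# 47603 rows in chunks of 5000: the last slice is [45000:50000],
# which slicing simply truncates at the end of the data.
bounds = list(chunk_bounds(47603))
rows = list(range(47603))  # stand-in for the ordered rows
sizes = [len(rows[start:stop]) for start, stop in bounds]
```

The final stop bound may run past the end of the data. That is harmless because slicing stops at the last available row, so the last chunk is just shorter than the others.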
@aykut
Author

aykut commented Apr 15, 2015

Actually, it wouldn't throw an error if an empty queryset is passed: count() returns 0, so xrange(0, 0, chunk_size) is empty and the loop body never runs.
Example:

In [1]: list(xrange(0,0,1000))
Out[1]: []

Calling gc.collect() seems unnecessary. It returns the number of unreachable objects it found, and in this example it reports 0 after every chunk:

In [1]: users=User.objects.all()

In [2]: users.count()
Out[2]: 47603

In [3]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:import gc
:
:
:def queryset_chunker(queryset, chunk_size=5000):
:    """
:    Takes lazy queryset and chunks them by given chunk_size.
:
:    :param queryset: type of queryset. This is a lazy queryset to
:        split into chunks
:    :type queryset: QuerySet
:    :param chunk_size: size of smaller pieces
:    :type chunk_size: int
:    """
:
:    queryset = queryset.order_by('-pk')
:    for pivot in xrange(0, queryset.count(), chunk_size):
:        yield queryset[pivot:pivot + chunk_size]
:        print gc.collect()
:
:<EOF>

In [4]: for chunk in queryset_chunker(users):
   ...:     pass
   ...:
0
0
0
0
0
0
0
0
0
0
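
For reading the zeros above: `gc.collect()` returns the number of unreachable objects it found. A minimal standalone sketch (the `Node` class and `make_cycle` are hypothetical and not from the gist; written for Python 3) suggests when a nonzero result would show up, namely when reference cycles are left behind:

```python
import gc


class Node(object):
    """Hypothetical object used only to build a reference cycle."""

    def __init__(self):
        self.other = None


def make_cycle():
    # Two objects that reference each other cannot be freed by
    # reference counting alone; the cyclic collector has to find them.
    a, b = Node(), Node()
    a.other, b.other = b, a


gc.disable()          # keep automatic collection from running mid-demo
gc.collect()          # start from a clean state
make_cycle()
found = gc.collect()  # counts the unreachable objects left by the cycle
gc.enable()
```

Objects that are not part of a cycle are released by reference counting as soon as the last reference goes away, so a result of 0 is what one would expect when nothing cyclic is created. Note also that the loop in the transcript only does `pass`, so the sliced querysets are never evaluated and no rows are loaded while it runs.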
