Skip to content

Instantly share code, notes, and snippets.

@ryanpitts
Created October 21, 2011 19:34
Show Gist options
  • Star 54 You must be signed in to star a gist
  • Fork 6 You must be signed in to fork a gist
  • Save ryanpitts/1304725 to your computer and use it in GitHub Desktop.
Save ryanpitts/1304725 to your computer and use it in GitHub Desktop.
GROUP BY and Select MAX from each group in Django, 2 queries
'''
given a Model with:
category = models.CharField(max_length=32, choices=CATEGORY_CHOICES)
pubdate = models.DateTimeField(default=datetime.now)
<other fields>
Fetch the item from each category with the latest pubdate.
'''
model_max_set = Model.objects.values('category').annotate(max_pubdate=Max('pubdate')).order_by()
q_statement = Q()
for pair in model_max_set:
q_statement |= (Q(category__exact=pair['category']) & Q(pubdate=pair['max_pubdate']))
model_set = Model.objects.filter(q_statement)
@formido
Copy link

formido commented Sep 6, 2013

I used this idea, thanks. :)

@Phlip
Copy link

Phlip commented Mar 21, 2015

the number of pub_dates is going to go up forever, so this query will get slow, and its generated SQL statement longer, each time a new pub_date appears

@Phlip
Copy link

Phlip commented Mar 21, 2015

I went with:

qs = MyModel.objects.all()
latest_dates = qs.values('frib', 'frob').annotate(latest_created_at=Max('created_at'))
qs = qs.filter(created_at__in=latest_dates.values('latest_created_at')).order_by('-created_at')

the last order_by() is optional, but it stands to reason it's needed.

@shenqihui
Copy link

@Phlip 👍

@keithcallenberg
Copy link

@Phlip In your filter operation your only check is that the model instance has the same created_at date as one of the values in the list. But wouldn't this break down if any instance from one category had the same date/datetime as the max for a different category? It would be included even if it wasn't the max for its category.

E.g.

id category pubdate
1 1 20150101
2 1 20150525
3 2 20140901
4 2 20150101

So even though the pubdate for id=1 is less than id=2, it will be included because it has the same pubdate as id=4. IDs 1, 2 and 4 will all be included.

I guess if you have a datetime (as opposed to date, or some other less granular value) you may feel more confident that they wouldn't share the exact same datetime, but I think it's worth pointing out this assumption/limitation.

@ns5d
Copy link

ns5d commented Jun 25, 2017

MyModel.objects.all().order_by('id', '-date').distinct('id')

@seanucd2000
Copy link

Thanks @ns5d for posting this line here. Learning django and this made my CBV work perfectly in my use. Was fiddling with a less than ideal solution before, but this was worked beautifully. (did need to finally move my dev db from sqlite to postgres for distinct functionality)

@lucascnunes
Copy link

My savior. Thanks, I've spent 3 days looking for something like it

@TomStobbe
Copy link

Thanks @ns5d, exactly what I needed. Worked perfectly

@devashish-patel
Copy link

God bless you..:)

@michel4j
Copy link

michel4j commented Sep 16, 2018

You can use sub-queries to achieve the same thing in a single query as follows:

    sq = Item.objects.filter(category=OuterRef('category')).order_by('-pubdate')  # deferred execution
    Item.objects.filter(pk=Subquery(sq.values('pk')[:1]))

Alternatively, If you make Category a model like:

   category    = models.ForeignKey(Category, on_delete=models.SET_NULL, related_name='items')
   pubdate     = models.DateTimeField(default=datetime.now)

You can then do it in one query irrespective of database backend as follows:

    Item.objects.annotate(_sel=Max('category__items__pubdate')).filter(pubdate=F('_sel'))

I haven't tested which of them is faster, but both execute a single query on the database backend and work with every backend supported by Django, including SQLite.

@JakLiao
Copy link

JakLiao commented Sep 19, 2018

Thanks, that's awesome.

@Zakui
Copy link

Zakui commented Nov 7, 2018

Thanks, exactly what I needed :)

@MewX
Copy link

MewX commented Dec 17, 2018

Awesome! Thanks a lot!

@evdoks
Copy link

evdoks commented Mar 27, 2020

@michel4j Took me some time to adapt to my needs, but in the end it worked! Amazing solution!

@rluts
Copy link

rluts commented Apr 9, 2020

It is very slow for large tables. Use the version provided by @ns5d

@pythonist77
Copy link

Thanks @ryanpitts for this trick.

@Elawphant
Copy link

MyModel.objects.all().order_by('id', '-date').distinct('id')

Works only on PostGres?

@ns5d
Copy link

ns5d commented Jan 1, 2021

@Elawphant In theory, if Django supports the database, and the database supports GROUP BY and DISTINCT (these are basic operations), then everything should work.

@William-Wildridge
Copy link

Thank you Michel. Nice solution if PostGres isn't an option.

@rodrialbanese
Copy link

Thanks a lot. This worked for sqlite where distinct isn't an option.

@Prashanyh
Copy link

Prashanyh commented Oct 22, 2021

Any Idea...???

class AllLogout(models.Model):

user = models.ForeignKey(Account,on_delete= models.CASCADE)
login_time = models.DateTimeField(null=True,)
logout_time = models.DateTimeField(null=True)


def __str__(self):
    return str(self.user) + ': ' + str(self.logout_time)

this is my model, i want output like this
one day user login&logout multiple times but i need day1 user login,logout time again day 2 same user login, logout time
i tried with this but not working

data1 = Account.objects.filter(Role='Agent').values_list('id', flat=True)

res = []

for agentname in data1:
agentdata = AllLogout.objects.filter(user_id=agentname).order_by('login_time').distinct()
res.append(agentdata1)

output like this:
agent1 login 2021-10-20 17:09:50.957295 logout 2021-10-21 21:45:54.955938
agent 3 login 2021-10-20 12:57:57.923346 logout 2021-10-20 12:58:20.246713
day 2
agent1 login 2021-10-21 12:58:40.147530 logout 2021-10-21 12:59:02.528687
agent 3 login 2021-10-21 21:45:54.858950 logout 2021-10-21 14:01:22.189149

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment