Skip to content

Instantly share code, notes, and snippets.

@n1k0
Created May 2, 2010 16:29
Show Gist options
  • Save n1k0/387260 to your computer and use it in GitHub Desktop.
Save n1k0/387260 to your computer and use it in GitHub Desktop.

Hey, Django rookie here.

I have this model, comments are managed with the django_comments contrib:

class Fortune(models.Model):
    author = models.CharField(max_length=45, blank=False)
    title = models.CharField(max_length=200, blank=False)
    slug = models.SlugField(_('slug'), db_index=True, max_length=255, unique_for_date='pub_date')
    content = models.TextField(blank=False)
    pub_date = models.DateTimeField(_('published date'), db_index=True, default=datetime.now())
    votes = models.IntegerField(default=0)
    comments = generic.GenericRelation(
        Comment,
        content_type_field='content_type',
        object_id_field='object_pk'
    )

I want to retrieve Fortune objects with a supplementary nb_comments value for each, counting their respectve number of comments ; I try this query:

>>> Fortune.objects.annotate(nb_comments=models.Count('comments'))

From the shell:

>>> from django_fortunes.models import Fortune
>>> from django.db.models import Count
>>> Fortune.objects.annotate(nb_comments=Count('comments'))
[<Fortune: My first fortune, from NiKo>, <Fortune: Another One, from Dude>, <Fortune: A funny one, from NiKo>]
>>> from django.db import connection
>>> connection.queries.pop()
{'time': '0.000', 'sql': u'SELECT "django_fortunes_fortune"."id", "django_fortunes_fortune"."author", "django_fortunes_fortune"."title", "django_fortunes_fortune"."slug", "django_fortunes_fortune"."content", "django_fortunes_fortune"."pub_date", "django_fortunes_fortune"."votes", COUNT("django_comments"."id") AS "nb_comments" FROM "django_fortunes_fortune" LEFT OUTER JOIN "django_comments" ON ("django_fortunes_fortune"."id" = "django_comments"."object_pk") GROUP BY "django_fortunes_fortune"."id", "django_fortunes_fortune"."author", "django_fortunes_fortune"."title", "django_fortunes_fortune"."slug", "django_fortunes_fortune"."content", "django_fortunes_fortune"."pub_date", "django_fortunes_fortune"."votes" LIMIT 21'}

Below is the properly formatted sql query:

SELECT "django_fortunes_fortune"."id", 
       "django_fortunes_fortune"."author", 
       "django_fortunes_fortune"."title", 
       "django_fortunes_fortune"."slug", 
       "django_fortunes_fortune"."content", 
       "django_fortunes_fortune"."pub_date", 
       "django_fortunes_fortune"."votes", 
       COUNT("django_comments"."id") AS "nb_comments" 
FROM "django_fortunes_fortune" 
LEFT OUTER JOIN "django_comments" 
    ON ("django_fortunes_fortune"."id" = "django_comments"."object_pk") 
GROUP BY "django_fortunes_fortune"."id", 
         "django_fortunes_fortune"."author", 
         "django_fortunes_fortune"."title", 
         "django_fortunes_fortune"."slug", 
         "django_fortunes_fortune"."content", 
         "django_fortunes_fortune"."pub_date", 
         "django_fortunes_fortune"."votes" 
LIMIT 21

Can you spot the problem? Django won't LEFT JOIN the django_comments table with the content_type data (which contains a reference to the fortune one).

How could I remedy this and make Django doing the expected query?

Hint: I'm using Django 1.2-DEV

Thanks in advance for your help.

@thoas
Copy link

thoas commented May 2, 2010

I think It doesn't work because there isn't a physical foreign key constraint between a fortune and a comment: it's just a proxy with the content type and the primary key (object_pk) of the object.

The solution could be a counter cache column in fortune table which is incremented with signals (comment_was_posted, post_delete, etc.) or an extra parameter.

I have to go back to my PHP now :p

@n1k0
Copy link
Author

n1k0 commented May 2, 2010

thoas> Well, yes and no, you could perfectly do something like:

SELECT "django_fortunes_fortune"."id", 
       "django_fortunes_fortune"."author", 
       "django_fortunes_fortune"."title", 
       COUNT("django_comments"."id") AS "nb_comments" 
FROM "django_fortunes_fortune" 
    LEFT OUTER JOIN "django_comments" 
        ON ("django_fortunes_fortune"."id" = "django_comments"."object_pk") 
    LEFT OUTER JOIN "django_content_type" 
        ON ("django_comments"."content_type_id" = "django_content_type"."id") 
GROUP BY "django_fortunes_fortune"."id", 
         "django_fortunes_fortune"."author", 
         "django_fortunes_fortune"."title", 
         "django_fortunes_fortune"."slug", 
         "django_fortunes_fortune"."content", 
         "django_fortunes_fortune"."pub_date", 
         "django_fortunes_fortune"."votes" 
LIMIT 21

This query will ensure that the retrieved comments are for fortunes and not other models potentially sharing the same primary key value. I tried to achieve the query above with the Django ORM but without any success, my skills and knowledge of t are far too limited. Would you have more success?

To me this is between a bug and an possible enhancement, so I wonder if I should open an issue on Django's trac... with no clue at all regarding a possible implementation for a fix :/

PS: PHP is evil.

@thoas
Copy link

thoas commented May 2, 2010

This query returns nothing in my Database viewer... weird behavior.

I have successfully tested it with an extra parameter, something like that:

from django.utils.datastructures import SortedDict
fortune_list = Fortune.objects.extra(
    select=SortedDict([('comment_count', 'SELECT COUNT(*) FROM django_comments \
                                          WHERE django_comments.object_pk = django_fortunes_fortune.id \
                                          AND django_comments.content_type_id = %s')]),
    select_params=(ContentType.objects.get_for_model(Fortune).id,)
)

I only have tested it in shell_plus of django-command-extensions:

From 'auth' autoload: Permission, Group, User, Message
From 'contenttypes' autoload: ContentType
From 'sessions' autoload: Session
From 'comments' autoload: Comment, CommentFlag
From 'sites' autoload: Site
From 'admin' autoload: LogEntry
From 'django_fortunes' autoload: Fortune
Python 2.6.5 (r265:79063, Apr 18 2010, 19:45:48) 
[GCC 4.2.1 (Apple Inc. build 5646) (dot 1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
(InteractiveConsole)
>>> from django.utils.datastructures import SortedDict
>>> fortune_list = Fortune.objects.extra(
...     select=SortedDict([('comment_count', 'SELECT COUNT(*) FROM django_comments \
...                                           WHERE django_comments.object_pk = django_fortunes_fortune.id \
...                                           AND django_comments.content_type_id = %s')]),
...     select_params=(ContentType.objects.get_for_model(Fortune).id,)
... )
>>> for fortune in fortune_list:
...     print fortune.comment_count
... 
2
1
0
>>> 

James Bennett wrote an excellent article about extra parameters with a comment count example ;)

I hope it helps.

PS: yeah, PHP is evil, but I can't do Python "market reality" :p

@gillesfabio
Copy link

@n1k0
Copy link
Author

n1k0 commented May 2, 2010

thoas> Yeah, nice, but it's raw sql and as I dev using sqlite and plan to host the thing using postgres as the database backend... not sure if it's a good solution, even if it works :/

Gilles> Yes, I saw that and it won't be fixed for 1.2 stable :'(

For the records, I opened a question on stackoverflow.

@thoas
Copy link

thoas commented May 2, 2010

Yeah, you are right raw SQL is always a bad solution, if you find the answer let me know.

@n1k0
Copy link
Author

n1k0 commented May 2, 2010

thoas> I checked out twice but the sql query - even if quite heavy and especially ugly for what it does, works... though on SQLite. What RDBMS are you using?

img

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment