@EricFries
Last active September 19, 2019 20:32

Django Performance Guidelines

ORM

  • Incorrect usage of the Django ORM is a common cause of performance issues, so it is important to know when querysets are evaluated (i.e., when they hit the database).

  • Use exists() to check if a queryset has any results in it.

    • Good:
     queryset = Account.objects.filter(some_attribute=True)
     if queryset.exists():
         # do something
    
    • Bad:
     queryset = Account.objects.filter(some_attribute=True)
     if len(queryset):
         # do something
    
  • Consider breaking up complex queries by assigning “intermediate” querysets to variables. This can be very helpful in debugging, and does not impact performance since the intermediate steps do not cause the queryset to be evaluated (see above).

    queryset = Account.objects.filter(some_attribute=True)
    queryset = queryset.exclude(in_production=False)
    queryset = queryset.exclude(theme__old_context=False)
    
  • Avoid looping over large querysets where you can. Iterating causes Django to create an object for each row and load it into memory. (See the Stack Overflow question "Why is iterating through a large Django QuerySet consuming massive amounts of memory?")

  • If you cannot avoid looping, consider using iterator(). From the Django documentation:

    Evaluates the queryset (by performing the query) and returns an iterator (see PEP 234) over the results. A queryset typically caches its results internally so that repeated evaluations do not result in additional queries. In contrast, iterator() will read results directly, without doing any caching at the queryset level (internally, the default iterator calls iterator() and caches the return value). For a queryset which returns a large number of objects that you only need to access once, this can result in better performance…

  • Use update() to bulk update a queryset, rather than looping, updating each item and saving individually.

    • Be aware that update() does not call save() or trigger pre_save and post_save signals. This can be useful when you need to make updates without running that logic (e.g., in a data migration), but in other contexts it may bypass important business logic.
    • Reference: Making queries | Django documentation | Django
  • Leverage select_related to follow foreign key relationships in order to select related objects, and reduce the number of queries performed. This will work with ForeignKey and OneToOne relationships, but not ManyToMany.

    • A potential example of select_related usage is a situation where data cannot be retrieved via the ORM and iteration is necessary (e.g., accessing a model property instead of a model database field).

      display_names = []
      # display_name is a calculated property, not a database field, so it can't be retrieved using the ORM.
      for menu in Menu.objects.select_related('location'):
          display_names.append(menu.location.display_name)
      
    • The above example results in a single query. However, if you were to remove select_related and use all(), an additional query would run for each menu the first time menu.location is accessed.

    • Be aware that calling select_related without any arguments will follow every non-null foreign key it can find, which may fetch far more data than you need and decrease performance.

    • Reference: https://docs.djangoproject.com/en/2.2/ref/models/querysets/#select-related

  • Similar to select_related, prefetch_related can be used to reduce the number of queries, but it also works with ManyToMany and reverse ForeignKey relationships. One key difference to note, however, is that it runs a separate query for each relationship and the "join" is done in Python, not SQL.

Common Bugs
