Django - Accent-insensitive queries for full-text search (unaccent extension).

Steps in Django for performing full-text search queries using the unaccent extension

Disclaimer

This covers full-text queries that use SearchVector, SearchQuery and the __search lookup. These need a few extra steps before they can be combined with the unaccent extension, because unaccent has to be wired into a language-specific text search configuration.

If you're doing simple queries instead of full-text search, take a look at the __unaccent lookup.
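
For that simpler case, a minimal sketch might look like the following. It assumes django.contrib.postgres is in INSTALLED_APPS, the unaccent extension is already enabled, and the model has a title field like the News model used later in this article. The helper name find_by_title is made up for illustration.

```python
def find_by_title(model, term):
    """Return rows whose title contains `term`, ignoring accents and case.

    `model` is any Django model class with a `title` field (an assumption
    made for this sketch).
    """
    # __unaccent applies unaccent() to both sides of the comparison and can
    # be chained with other lookups such as __icontains.
    return model.objects.filter(title__unaccent__icontains=term)
```

With this, find_by_title(News, "atencao") should also match a title containing "Atenção".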

Note: I'm not a "Postgres expert". If you are and can make this article better, please leave a comment.

Enable and configure unaccent extension on Postgres

There are several ways to do this, and you may find one you like better than this solution. One option is to enable the extension from a Django migration, which works if your Postgres user has permission to create extensions.

Create a migration for enabling the unaccent extension

python manage.py makemigrations --empty <app_name>

The generated migration looks like this:

# Generated by Django 3.0.2 on 2020-03-15 14:01

from django.db import migrations


class Migration(migrations.Migration):

    dependencies = [
        # ... Dependencies here, something like:
        # ('news', '0009_auto_20200315_1338'),
    ]

    operations = [
    ]

Then customize it. Instead of writing raw SQL, you can use the operation Django provides, so the migration ends up looking like this:

# Generated by Django 3.0.2 on 2020-03-15 13:38

from django.db import migrations
from django.contrib.postgres.operations import UnaccentExtension


class Migration(migrations.Migration):
    """
    Make it possible to perform accent-insensitive queries in PostgreSQL.
    - https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/operations/
    - https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/lookups/
    - https://docs.djangoproject.com/en/3.0/ref/contrib/postgres/
    - https://www.postgresql.org/docs/current/unaccent.html
    """

    dependencies = [
        # your dependencies
        # ("news", "0008_article_date_pinned_until"),
    ]

    operations = [UnaccentExtension()]

Configure your specific language in Postgres to use with unaccent

In this example, I copy the existing portuguese configuration into a new one, portuguese_unaccent, and remap its word tokens so they pass through unaccent before the Portuguese stemmer. The new configuration can then be used when performing full-text queries.

# Generated by Django 3.0.2 on 2020-03-15 14:01

from django.db import migrations


class Migration(migrations.Migration):
    """Accent-insensitive queries in Portuguese language for full-text search.
    https://stackoverflow.com/a/47248109/1808134
    """

    dependencies = [
        # your dependencies
        # ("news", "0009_auto_20200315_1338"),
    ]

    operations = [
        migrations.RunSQL(
            "CREATE TEXT SEARCH CONFIGURATION portuguese_unaccent( COPY = portuguese );"
        ),
        migrations.RunSQL(
            "ALTER TEXT SEARCH CONFIGURATION portuguese_unaccent "
            "ALTER MAPPING FOR hword, hword_part, word "
            "WITH unaccent, portuguese_stem;"
        ),
    ]
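
The migration above has no reverse step, so it cannot be unapplied. One possible way to handle that, and to reuse the same statements for other languages, is sketched below. The helper name unaccent_config_sql is made up for illustration, and it assumes the base configuration has a matching <name>_stem dictionary, which holds for built-in Snowball configurations such as portuguese but is worth checking for others.

```python
import re


def unaccent_config_sql(base_config):
    """Build forward and reverse SQL for a `<base_config>_unaccent` configuration."""
    # Only allow plain identifiers, since the name is interpolated into SQL.
    if not re.fullmatch(r"[a-z_][a-z0-9_]*", base_config):
        raise ValueError(f"unexpected configuration name: {base_config!r}")
    name = f"{base_config}_unaccent"
    forward = [
        f"CREATE TEXT SEARCH CONFIGURATION {name} (COPY = {base_config});",
        f"ALTER TEXT SEARCH CONFIGURATION {name} "
        "ALTER MAPPING FOR hword, hword_part, word "
        f"WITH unaccent, {base_config}_stem;",
    ]
    # Dropping the configuration undoes both forward statements.
    reverse = [f"DROP TEXT SEARCH CONFIGURATION {name};"]
    return forward, reverse
```

In the migration, the two lists could then be passed as migrations.RunSQL(sql=forward, reverse_sql=reverse), since RunSQL accepts a list of statements for either argument.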

Example usage

These two queries should now return the same results:

from django.contrib.postgres.search import SearchVector

# Search term without accents
news = News.objects.annotate(
    search=SearchVector("title", config="portuguese_unaccent")
).filter(search="Atencao")

# Search term with accents
news = News.objects.annotate(
    search=SearchVector("title", config="portuguese_unaccent")
).filter(search="Atenção")
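
Both spellings match because the portuguese_unaccent configuration strips accents from the indexed text and from the search term before stemming. As a rough illustration in plain Python (an approximation only: Postgres' unaccent uses its own rules file rather than Unicode decomposition, and the function name strip_accents is made up here):

```python
import unicodedata


def strip_accents(text):
    # Decompose each character into a base letter plus combining marks,
    # then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("Atenção") == strip_accents("Atencao"))  # True
```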

SearchQuery also accepts the config parameter.
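
As a sketch of what that might look like, the helper below passes the same configuration to both SearchVector and SearchQuery. The helper name search_news_titles and its arguments are made up for illustration; the title field and the portuguese_unaccent configuration come from the examples above.

```python
def search_news_titles(news_model, term, config="portuguese_unaccent"):
    """Full-text search on `title`, using the same config for vector and query."""
    # Imported inside the function so this sketch can be loaded on its own;
    # in a real project a top-level import is the usual choice.
    from django.contrib.postgres.search import SearchQuery, SearchVector

    vector = SearchVector("title", config=config)
    query = SearchQuery(term, config=config)
    return news_model.objects.annotate(search=vector).filter(search=query)
```

Building the SearchQuery explicitly is mostly useful when you also want to pass options such as search_type, or reuse the same query object elsewhere.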
