Django-haystack Whoosh backend with character folding
# -*- coding: utf-8 -*-
"""
Whoosh backend for haystack that implements character folding, as per
http://packages.python.org/Whoosh/stemming.html#character-folding .

Tested with Haystack 2.4.0 and Whoosh 2.7.0

To use, put this file on your path and add it to your haystack settings, e.g.

    HAYSTACK_CONNECTIONS = {
        'default': {
            'ENGINE': 'search_backends.FoldingWhooshEngine',
            'PATH': 'path-to-whoosh-index',
        },
    }
"""

from haystack.backends.whoosh_backend import WhooshEngine, WhooshSearchBackend
from whoosh.analysis import CharsetFilter, StemmingAnalyzer
from whoosh.support.charset import accent_map
from whoosh.fields import TEXT


class FoldingWhooshSearchBackend(WhooshSearchBackend):

    def build_schema(self, fields):
        # build_schema returns a (content_field_name, schema) tuple
        schema = super(FoldingWhooshSearchBackend, self).build_schema(fields)

        # Swap the analyzer on every TEXT field for one that also folds
        # accented characters to their unaccented equivalents.
        for name, field in schema[1].items():
            if isinstance(field, TEXT):
                field.analyzer = StemmingAnalyzer() | CharsetFilter(accent_map)

        return schema


class FoldingWhooshEngine(WhooshEngine):
    backend = FoldingWhooshSearchBackend
@paweloque

I still cannot search using words without accents. For example, I want to
search for 'cafe' and get back results for both 'café' and 'cafe'.
Do I have to do something additional, like changing the index template?

@gregplaysguitar
Owner

@paweloque, no, you should just be able to change the backend. Make sure you reindex the content after doing this.
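
To illustrate why the reindex matters, here is a minimal standard-library sketch. It does not use Whoosh or Haystack: `fold`, `build_index`, and `search` are hypothetical helpers, and `unicodedata` stands in for Whoosh's `accent_map`. The assumption is that changing the analyzer only affects how new text and queries are tokenized, so an index built earlier still holds the unfolded tokens.

```python
import unicodedata


def fold(text):
    """Lowercase and strip combining accents (a stand-in for accent folding)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(docs, analyzer):
    """Map each analyzed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.split():
            index.setdefault(analyzer(word), set()).add(doc_id)
    return index


def search(index, query, analyzer):
    """Analyze the query the same way and look it up in the index."""
    return sorted(index.get(analyzer(query), set()))


docs = {1: "Café culture", 2: "cafe menu"}

# Index built before the backend change: tokens are lowercased but not folded.
stale_index = build_index(docs, str.lower)
# Index rebuilt after the backend change: tokens are folded.
fresh_index = build_index(docs, fold)

# The query is folded in both cases, but only the rebuilt index matches both docs.
stale_hits = search(stale_index, "cafe", fold)
fresh_hits = search(fresh_index, "cafe", fold)
```

In a Haystack project the index is normally rebuilt with the `rebuild_index` management command.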

@dmarcelino

Great stuff, thanks @gregplaysguitar
