@davebarkerxyz
Last active May 19, 2020 07:43
Rebuild Flask-WhooshAlchemy search indices
#!/usr/bin/env python
import datetime
from app import app, models
import whoosh
import flask_whooshalchemy

"""
Rebuild all Whoosh search indices

Useful after manually importing data (side-stepping the SQLAlchemy ORM
and automatic Whoosh index updates)

If this is intended as a full rebuild, you should consider deleting the
Whoosh search database (as specified in app.config["WHOOSH_BASE"])
before running the rebuild. This will ensure that no old/stale
data is left in the search indices (this process doesn't delete removed
data, it only recreates search entries for current data).
"""

program_start = datetime.datetime.utcnow()


def log(message):
    """Print a message with a UTC timestamp and the seconds elapsed since start"""
    logtime = datetime.datetime.utcnow()
    logdiff = logtime - program_start
    print("{0} (+{1:.3f}): {2}".format(logtime.strftime("%Y-%m-%d %H:%M:%S"),
                                       logdiff.total_seconds(),
                                       message))


def rebuild_index(model):
    """Rebuild search index of Flask-SQLAlchemy model"""
    log("Rebuilding {0} index...".format(model.__name__))
    # whoosh_index() must run first: it attaches pure_whoosh to the model
    index_writer = flask_whooshalchemy.whoosh_index(app, model)
    primary_field = model.pure_whoosh.primary_key_name
    searchables = model.__searchable__

    # Fetch all data
    entries = model.query.all()

    entry_count = 0
    with index_writer.writer() as writer:
        for entry in entries:
            index_attrs = {}
            # Python 2: unicode(); on Python 3 use str() instead
            for field in searchables:
                index_attrs[field] = unicode(getattr(entry, field))

            index_attrs[primary_field] = unicode(getattr(entry, primary_field))
            # update_document() replaces any existing entry with the same key
            writer.update_document(**index_attrs)
            entry_count += 1

    log("Rebuilt {0} {1} search index entries.".format(entry_count, model.__name__))


if __name__ == "__main__":
    model_list = [models.Product,
                  models.Commodity,
                  models.Category,
                  models.Page]

    for model in model_list:
        rebuild_index(model)
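
The docstring in the script recommends deleting the Whoosh search database before a full rebuild, but the script itself does not do so. A minimal sketch of that step follows, using only the standard library. The helper name `clear_whoosh_base` is hypothetical and not part of the gist or of Flask-WhooshAlchemy; it assumes the index directory is the path stored in `app.config["WHOOSH_BASE"]` and that it is called before the rebuild opens any index.

```python
import os
import shutil


def clear_whoosh_base(whoosh_base):
    """Remove the Whoosh index directory if it exists (hypothetical helper).

    Returns True if a directory was removed, False if there was nothing
    to delete.
    """
    if os.path.isdir(whoosh_base):
        # Deletes every per-model index stored under this directory
        shutil.rmtree(whoosh_base)
        return True
    return False


# Assumed usage, before looping over the models:
#     clear_whoosh_base(app.config["WHOOSH_BASE"])
```

Because this removes every index under the directory, it only makes sense when all searchable models are rebuilt in the same run.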

amumtaz commented Aug 14, 2015

I am new to Flask/Whoosh and am working in a virtualenv (Python 2.7) on indexing some text. I keep getting the following error when I run the script:

File "rebuildwhooshindex.py", line 4, in <module>
import lib
ImportError: No module named lib

Any suggestions?

@dhamaniasad

You can just remove the import lib line and the script will work. As you can see, lib is not being used anywhere in the program.

@davebarkerxyz
Author

@amumtaz @dhamaniasad Oops, my bad. lib was a project-specific package where I was holding frozen libs to avoid polluting our CI & production servers' site-packages. I'd probably do things differently this time around. I've removed the offending line.


chrisphyffer commented May 30, 2017

Thank you very much Dave, this snippet is very helpful. :)

@tanatarca

Hi! I have a problem here: my model doesn't seem to have the attribute pure_whoosh. I created the index beforehand and have run Whoosh queries without problems, so I don't know where this issue comes from.

@SollyTaylor

model.pure_whoosh.primary_key_name is added by the whoosh_index function.
In flask_whooshalchemy.py, whoosh_index calls _create_index, which intentionally sets pure_whoosh and whoosh_primary_key on the model:

model.pure_whoosh = _Searcher(primary_key, indx)
model.whoosh_primary_key = primary_key

In a Flask script, I wrote something like:

wa.whoosh_index(app, Contract)
rebuild_index(Contract, app)

which eventually rebuilt the indexes.
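
The explanation above comes down to one ordering rule: call whoosh_index for a model before reading model.pure_whoosh. A minimal sketch of a guard for that is shown below. The helper name `whoosh_primary_key_name` is hypothetical; the `whoosh_index` argument is assumed to be `flask_whooshalchemy.whoosh_index`, passed in rather than imported so the snippet runs without the extension installed.

```python
def whoosh_primary_key_name(app, model, whoosh_index):
    """Return the Whoosh primary key name for a model (hypothetical helper).

    If the model has no pure_whoosh attribute yet, whoosh_index(app, model)
    is called first, since that call is what attaches it.
    """
    if not hasattr(model, "pure_whoosh"):
        whoosh_index(app, model)
    return model.pure_whoosh.primary_key_name


# Assumed usage inside rebuild_index(model):
#     from flask_whooshalchemy import whoosh_index
#     primary_field = whoosh_primary_key_name(app, model, whoosh_index)
```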
