@kalv
Created September 13, 2010 16:09

Playing with keywords on message

Chris and I worked on removing keywords, on the basis that they were messy and with the aim of reducing the size of the database.

This was prompted by fixing the bug where whitespace in messages was creating a lot of empty keywords.
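
To illustrate how whitespace can turn into empty keywords, here is a minimal sketch. The actual keyword extraction code is not shown in this note, so both method names are hypothetical and the cause of the real bug is an assumption: splitting on a single space character produces an empty string for every extra space, while splitting on runs of whitespace does not.

```ruby
# Hypothetical keyword extraction; these method names are illustrative only
# and are not taken from the real codebase.

# Splitting on a single space yields an empty string for each extra space,
# and leaves other whitespace (such as a newline) behind as a "keyword".
def naive_keywords(content)
  content.split(/ /)
end

# String#split with no arguments splits on runs of whitespace and ignores
# leading and trailing whitespace, so it never returns an empty keyword.
def clean_keywords(content)
  content.split
end

message = "hello  world \n hello"
p naive_keywords(message)  # => ["hello", "", "world", "\n", "hello"]
p clean_keywords(message)  # => ["hello", "world", "hello"]
```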

results

We wrote new code that searches message content with a regexp. All tests were passing, but before we committed we ran some benchmarks on realistic data (1.1m messages on a development machine).

These were the results of running just 3 searches:

Name                          user       system     total        real
Old keywords search           0.100000   0.010000   0.110000     (0.116258)
New keywords by regexp search 0.010000   0.000000   0.010000     (16.192722)
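
The table above has the shape of output from Ruby's standard Benchmark module. The benchmark script itself is not included in this note, so the following is only a sketch of how such a comparison could be set up. The data, `keyword_search` and `regexp_search` are hypothetical in-memory stand-ins for the real database queries, which means the timings it prints are not comparable to the ones above.

```ruby
require "benchmark"

# Tiny in-memory stand-in for the messages table (the real run used 1.1m rows).
MESSAGES = Array.new(10_000) { |i| "message #{i} about topic#{i % 100}" }

# Stand-in for the keywords table: keyword => ids of messages containing it.
KEYWORD_INDEX = Hash.new { |hash, key| hash[key] = [] }
MESSAGES.each_with_index do |content, id|
  content.split.uniq.each { |word| KEYWORD_INDEX[word] << id }
end

# Hypothetical "old" search: a single lookup in the prebuilt index.
def keyword_search(term)
  KEYWORD_INDEX.fetch(term, [])
end

# Hypothetical "new" search: test every message against a regexp that
# matches the term as a whole whitespace-delimited word.
def regexp_search(term)
  pattern = /(?<!\S)#{Regexp.escape(term)}(?!\S)/
  MESSAGES.each_index.select { |id| MESSAGES[id].match?(pattern) }
end

TERMS = %w[topic1 topic42 topic99]

# The argument to Benchmark.bm is the width of the label column.
Benchmark.bm(30) do |bm|
  bm.report("Old keywords search") { TERMS.each { |t| keyword_search(t) } }
  bm.report("New keywords by regexp search") { TERMS.each { |t| regexp_search(t) } }
end
```

Both searches return the same ids. The difference is that the index lookup does its work once, when keywords are written, while the regexp search repeats a full scan on every query.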

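So it was very slow: about 16.2 seconds of real time against 0.12 seconds, roughly 140 times slower. Almost none of that is CPU time in our own process (user and system time are near zero), so the time is spent waiting on the query. Regexp queries on large string fields are not good.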

We are going to leave the keywords in place and fix the issues that we have (the whitespace bug). We would rather have a larger database than a slow one.
