Created May 7, 2014 23:59
Profile of zope.index.TextIndex.apply
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    66                                           @profile
    67                                           def apply(self, querytext, start=0, count=None):
    68      8734        34597      4.0      0.3      parser = QueryParser(self.lexicon)
    69      8734      1622182    185.7     14.1      tree = parser.parseQuery(querytext)
    70      8645      2041557    236.2     17.7      results = tree.executeQuery(self.index)
    71      8645        12093      1.4      0.1      if results:
    72      8645       675324     78.1      5.9          qw = self.index.query_weight(tree.terms())
    73
    74                                                   # Hack to avoid ZeroDivisionError
    75      8645        12596      1.5      0.1          if qw == 0:
    76                                                       qw = 1.0
    77
    78      8645         9764      1.1      0.1          qw *= 1.0
    79
    80   2623759      2421427      0.9     21.0          for docid, score in six.iteritems(results):
    81   2615114      2049659      0.8     17.8              try:
    82   2615114      2636516      1.0     22.9                  results[docid] = score/qw
    83                                                       except TypeError:
    84                                                           # We overflowed the score, perhaps wildly unlikely.
    85                                                           # Who knows.
    86                                                           results[docid] = 2**64 // 10
    87
    88      8645         7268      0.8      0.1      return results
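
Lines 80 to 82 of the profile account for roughly 62% of the measured time (21.0 + 17.8 + 22.9), driven by about 2.6 million loop iterations over 8645 calls rather than by any expensive single statement. A minimal sketch of that normalization step in isolation, so it can be timed on its own, might look like the following. It assumes a plain dict stands in for the result mapping returned by executeQuery, and the function name normalize_scores is hypothetical, not part of zope.index.

```python
def normalize_scores(results, qw):
    """Divide every score in `results` by the query weight `qw`, in place.

    Stand-alone replica of lines 72-86 of the profiled method. `results`
    is assumed to be a plain dict of docid -> score (an assumption; the
    real index returns its own mapping type).
    """
    # Same guard as the original: avoid ZeroDivisionError.
    if qw == 0:
        qw = 1.0

    # Force float division, as the original does.
    qw *= 1.0

    # Python 3 dict.items() takes the place of six.iteritems().
    # Only existing keys are reassigned, so the dict size never changes.
    for docid, score in results.items():
        try:
            results[docid] = score / qw
        except TypeError:
            # Same fallback value as the original code.
            results[docid] = 2**64 // 10
    return results


scores = {1: 2.0, 2: 4.0, 3: 0.5}
normalized = normalize_scores(scores, 2.0)
```

Wrapping the call in timeit.timeit with a dict of a few hundred entries (the profile averages about 300 iterations per call) gives a rough per-call cost for the loop alone.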
TF-IDF:0.8 canopy creation with 5000 records.
tfidf blocking...
tfidf blocking... name
Index created: 0.27
INFO:dedupe.tfidf:Canopy: TF-IDF:0.8name
Canopy Keys: 9.09
tfidf blocking... address
Index created: 0.22
INFO:dedupe.tfidf:Canopy: TF-IDF:0.8address
Canopy Keys: 9.79