Last active
October 21, 2015 18:14
SOLR Config - AutoPhrasing
cel-2000
CEL-2000
CEL 2000
CEL2000
document with an entity_name of 'CEL-2000'
<fieldType name="text_autophrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" includeTokens="true" replaceWhitespaceWith="_" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" />
  </analyzer>
</fieldType>
webapp=/solr path=/autophrase params={q="cel-2000"&defType=dismax&qf=entity_name^100.0+content+entity_author&pf=entity_name+content&rows=100&wt=json&debugQuery=true}
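To replay the logged request while testing each query variant, the same parameter set can be rebuilt and URL-encoded. The following is a minimal sketch, not part of the original gist: the host, port, and core name in `BASE_URL` are hypothetical placeholders (the log line only shows `webapp=/solr` and `path=/autophrase`), and `build_query_url` is an illustrative helper name. Note that Solr applies a handler's `invariants` over request parameters, so a `defType` sent in the URL may not be the one that takes effect.

```python
from urllib.parse import urlencode

# Hypothetical base URL: host, port, and core name ("mycore") are assumptions.
# Only the "/autophrase" handler path comes from the logged request.
BASE_URL = "http://localhost:8983/solr/mycore/autophrase"


def build_query_url(user_query: str, base_url: str = BASE_URL) -> str:
    """Return a request URL carrying the same parameters as the logged query."""
    params = {
        "q": user_query,
        "defType": "dismax",
        # Spaces here are encoded as '+' by urlencode, matching the log line.
        "qf": "entity_name^100.0 content entity_author",
        "pf": "entity_name content",
        "rows": 100,
        "wt": "json",
        "debugQuery": "true",
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Print one URL per query variant so each can be pasted into a browser or curl.
    for variant in ['cel-2000', '"cel-2000"', 'CEL 2000', '"CEL 2000"', 'CEL2000']:
        print(build_query_url(variant))
```

Keeping `debugQuery=true` in the parameter set makes it possible to compare the parsed query for each variant in the response.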
<requestHandler name="/autophrase" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="df">_text_</str>
  </lst>
  <lst name="invariants">
    <str name="defType">autophrasingParser</str>
  </lst>
</requestHandler>
CEL-2000,CEL-SCI,CEL_2000,CEL2000
- CEL-2000 - pass, but also returns a lot of 'noise' based on '2000' and 'CEL'
- "CEL-2000" - pass, only matching record found
- CEL 2000 - fail (no results)
- "CEL 2000" - pass, only matching record found
- CEL2000 - fail (no results)
- "CEL2000" - fail (no results)
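All of the cases above target the same document, so the intended behavior appears to be that every spelling resolves to one term. The sketch below only illustrates that target equivalence in plain Python; it is not the analyzer chain from this gist and makes no claim about how the auto-phrasing filter behaves. The helper name `normalize_key` and the rule it applies (lowercase, then drop every non-alphanumeric character) are assumptions chosen for illustration.

```python
import re


def normalize_key(text: str) -> str:
    """Lowercase the input and drop every character that is not a-z or 0-9."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


# Spellings taken from the query list and the phrase list in this gist.
variants = ["cel-2000", "CEL-2000", "CEL 2000", "CEL2000", "CEL_2000"]

# Collapse the spellings into the set of distinct normalized keys.
keys = {normalize_key(v) for v in variants}

if __name__ == "__main__":
    print(keys)
```

Under this rule the hyphenated, spaced, underscored, and joined spellings all share a single key, which is the outcome the failing cases (CEL 2000 unquoted, CEL2000 quoted and unquoted) do not yet reach.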