@defp
Forked from tigerneil/elasticsearch-ik-mmseg.rmd
Created August 29, 2014 08:08
Create two directories: `/config/mmseg` and `/plugins/analysis-mmseg`.
1. Go to [`https://github.com/medcl/elasticsearch-rtf/tree/master/config/mmseg`](https://github.com/medcl/elasticsearch-rtf/tree/master/config/mmseg) and download the files
`chars.dic`, `units.dic`, `words-my.dic`, and `words.dic`, then move them to `/config/mmseg`.
2. Go to [`https://github.com/medcl/elasticsearch-rtf/tree/master/plugins/analysis-mmseg`](https://github.com/medcl/elasticsearch-rtf/tree/master/plugins/analysis-mmseg) and download the jar
`elasticsearch-analysis-mmseg-1.2.2.jar`, then move it to `/plugins/analysis-mmseg`.
3. Add the following configuration to `elasticsearch.yml`:
index:
  analysis:
    tokenizer:
      mmseg_maxword:
        type: mmseg
        seg_type: max_word
      mmseg_complex:
        type: mmseg
        seg_type: complex
      mmseg_simple:
        type: mmseg
        seg_type: simple
    analyzer:
      ik:
        alias:
          - ik_analyzer
        type: org.elasticsearch.index.analysis.IkAnalyzerProvider
      ik_max_word:
        type: ik
        use_smart: false
      ik_smart:
        type: ik
        use_smart: true
      mmseg:
        alias:
          - mmseg_analyzer
        type: org.elasticsearch.index.analysis.MMsegAnalyzerProvider
      mmseg_maxword:
        type: custom
        filter:
          - lowercase
        tokenizer: mmseg_maxword
      mmseg_complex:
        type: custom
        filter:
          - lowercase
        tokenizer: mmseg_complex
      mmseg_simple:
        type: custom
        filter:
          - lowercase
        tokenizer: mmseg_simple
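
The file placement in steps 1 and 2 can be sketched as a small shell helper. This is only an illustration: the function name `install_mmseg` and its two arguments (a download directory and an Elasticsearch home directory) are assumptions, not part of the original instructions, and it assumes the two target directories live under the Elasticsearch home. The file names are the ones listed above.

```shell
# Hypothetical helper: move the downloaded mmseg files into place.
#   $1 = directory holding the downloaded files (assumption)
#   $2 = Elasticsearch home directory (assumption)
install_mmseg() {
  src_dir="$1"
  es_home="$2"
  jar="elasticsearch-analysis-mmseg-1.2.2.jar"

  # Check that everything is present before touching the target directories.
  for f in chars.dic units.dic words-my.dic words.dic "$jar"; do
    if [ ! -f "$src_dir/$f" ]; then
      echo "missing file: $src_dir/$f" >&2
      return 1
    fi
  done

  # Create the two directories named at the top of these notes.
  mkdir -p "$es_home/config/mmseg" "$es_home/plugins/analysis-mmseg" || return 1

  # Step 1: dictionaries go to config/mmseg.
  for f in chars.dic units.dic words-my.dic words.dic; do
    mv "$src_dir/$f" "$es_home/config/mmseg/" || return 1
  done

  # Step 2: the plugin jar goes to plugins/analysis-mmseg.
  mv "$src_dir/$jar" "$es_home/plugins/analysis-mmseg/"
}

# Example call (both paths are placeholders):
#   install_mmseg "$HOME/Downloads" "/path/to/elasticsearch"
```

Because the settings live in `elasticsearch.yml`, the node has to be restarted before the new tokenizers and analyzers are picked up.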