@igor-kupczynski
Created October 22, 2014 21:59
How to implement good search on product name in Elasticsearch. http://igor.kupczynski.info/2014/10/22/search-for-product-name.html
analyzer:
  generic_name_analyzer:
    type: "custom"
    tokenizer: "icu_tokenizer"
    filter: ["word_split", "icu_folding", "english_stop"]
  trigram_name_analyzer:
    type: "custom"
    tokenizer: "icu_tokenizer"
    filter: ["icu_folding", "english_stop", "trigram_filter"]
filter:
  word_split:
    type: "word_delimiter"
    preserve_original: 1
  english_stop:
    type: "stop"
    stopwords: "_english_"
  trigram_filter:
    type: "ngram"
    min_gram: 3
    max_gram: 3
_type: file
fileName: TheSmallYellowDog.txt
---
_type: file
fileName: black cat.txt
---
_type: file
fileName: สวัสดี ผมมาจากกรุงเทพฯ.png
file:
  properties:
    fileName:
      type: multi_field
      fields:
        fileName:
          type: string
          analyzer: generic_name_analyzer
        trigram:
          type: string
          analyzer: trigram_name_analyzer
#!/bin/bash
export ELASTICSEARCH_ENDPOINT="http://localhost:9200"
# Create indexes
curl -XPUT "$ELASTICSEARCH_ENDPOINT/play" -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "generic_name_analyzer": {
          "type": "custom",
          "tokenizer": "icu_tokenizer",
          "filter": [
            "word_split",
            "icu_folding",
            "english_stop"
          ]
        },
        "trigram_name_analyzer": {
          "type": "custom",
          "tokenizer": "icu_tokenizer",
          "filter": [
            "icu_folding",
            "english_stop",
            "trigram_filter"
          ]
        }
      },
      "filter": {
        "word_split": {
          "type": "word_delimiter",
          "preserve_original": 1
        },
        "english_stop": {
          "type": "stop",
          "stopwords": "_english_"
        },
        "trigram_filter": {
          "type": "ngram",
          "min_gram": 3,
          "max_gram": 3
        }
      }
    }
  },
  "mappings": {
    "file": {
      "properties": {
        "fileName": {
          "type": "multi_field",
          "fields": {
            "fileName": {
              "type": "string",
              "analyzer": "generic_name_analyzer"
            },
            "trigram": {
              "type": "string",
              "analyzer": "trigram_name_analyzer"
            }
          }
        }
      }
    }
  }
}'
# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"file"}}
{"fileName":"TheSmallYellowDog.txt"}
{"index":{"_index":"play","_type":"file"}}
{"fileName":"black cat.txt"}
{"index":{"_index":"play","_type":"file"}}
{"fileName":"สวัสดี ผมมาจากกรุงเทพฯ.png"}
'
# Do searches
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
  "query": {
    "match": {
      "fileName": "ผม"
    }
  }
}
'
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
  "query": {
    "bool": {
      "should": [
        {
          "match": {
            "fileName": {
              "query": "yellowish",
              "boost": 3
            }
          }
        },
        {
          "match": {
            "fileName.trigram": {
              "query": "yellowish",
              "minimum_should_match": "50%",
              "boost": 1
            }
          }
        }
      ]
    }
  }
}
'
# Auto generated by Found's Play-tool at 2014-10-22T23:59:19+02:00
version: 0
title: Searching for Product Name in Elasticsearch
description: "How to implement good search on product name in Elasticsearch. http://igor.kupczynski.info/2014/10/22/search-for-product-name.html"
---
query:
  match:
    fileName: "ผม"
---
query:
  bool:
    should:
      - match:
          fileName:
            query: "yellowish"
            boost: 3
      - match:
          fileName.trigram:
            query: "yellowish"
            minimum_should_match: "50%"
            boost: 1
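The second search relies on the trigram sub-field: "yellowish" is not a token in any document, but it shares several 3-character grams with "TheSmallYellowDog.txt". The Python sketch below is only an approximation of that idea, not Elasticsearch's actual implementation. It assumes the file name stays a single token, replaces icu_folding with plain lowercasing, skips stop-word removal, and assumes the "50%" threshold is rounded down. The helper names `trigrams` and `trigram_match` are hypothetical.

```python
def trigrams(text):
    """Return the 3-character grams of a lowercased token, in order."""
    token = text.lower()  # stand-in for icu_folding (assumption)
    return [token[i:i + 3] for i in range(len(token) - 2)]


def trigram_match(query, indexed, minimum_should_match=0.5):
    """Count query grams found in the indexed value and compare to the threshold.

    Returns (hits, required, matched). The threshold is rounded down,
    which is an assumption about how a percentage like "50%" is applied.
    """
    query_grams = trigrams(query)
    indexed_grams = set(trigrams(indexed))
    hits = sum(1 for gram in query_grams if gram in indexed_grams)
    required = int(len(query_grams) * minimum_should_match)
    return hits, required, hits >= required


if __name__ == "__main__":
    for name in ["TheSmallYellowDog.txt", "black cat.txt"]:
        print(name, trigram_match("yellowish", name))
```

Under these assumptions, "yellowish" yields seven grams (yel, ell, llo, low, owi, wis, ish). Four of them occur in "TheSmallYellowDog.txt", which clears the threshold of three, while "black cat.txt" shares none. The exact-token clause on fileName carries the higher boost, so documents that match whole words would still rank above trigram-only matches.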
kodeine commented Jan 16, 2016

What if we want to match on multiple fields, e.g. Yellow Dog / Brown Dog?
Can you make a revision?
