Last active
August 29, 2015 13:56
elasticsearch session memo
@naokiainoya from Recruit Technologies (リクルートテクノロジーズ)
tech stack: AWS, node.js, elasticsearch, DynamoDB
built their own push notification service.
the talk was too detailed…
summary:
elasticsearch is used for (flexible) rule-based notification routing.
requirements:
- can handle many notifications (tens of millions of notifications)
- scan feature (they need the full search result, i.e. all candidates to be notified)
Q/A:
- how to autoscale an elasticsearch cluster?
- a node failure during rebalancing will be problematic.
kentaro yoshida from Livesense (リブセンス)
elasticsearch experience: since summer 2013
theme: switching to elasticsearch with minimum effort.
- use elasticsearch-river-jdbc
- caveat: unstable
So they created yamabiko (their own river).
- it lets you reliably sync MySQL/Amazon RDS/MariaDB/Percona Server to elasticsearch.
yamabiko's unique feature:
- detects (MySQL) record deletion.
demo: yamabiko and head-plugin
also described geo search, mapping, kuromoji.
caution: dynamic mapping can be problematic.
why they love elasticsearch:
- an official rpm is provided.
- REST API
- facet
- array fields are useful for tag search
- flexible indexing
caveats:
- Query DSL can be confusing.
- less flexible than SQL (e.g. GROUP BY, JOIN)
- etc…
@tady_jp from zigen
they use it as a search engine, not for kibana.
elasticsearch benefits:
- well automated sharding/replication
- easy to scale.
- active plugin dev community.
elasticsearch information sources:
- documentation
- books in English
- Stack Overflow
what a modern search engine includes:
- full text search
- multi column index
- boosting
- facet
- grouping
- suggester
- autocomplete
- geo search
search is often treated as an optional feature of a website.
but once you add search, it becomes problematic:
- poor ranking
- poor matching
- etc...
So they test and ensure search result quality with specs:
tool: elasticsearch-ruby gem, rails
demo: how to build a full-featured restaurant search.
including
- mapping
- analyzer
- kuromoji
tips:
- set doc['id'] to the _id field
- use multi_field
- the highlighter requires stored fields
demo: testing against a real elasticsearch instead of stubbing it.
- 東京都 (Tokyo prefecture) should not be matched by a query for 京都 (Kyoto prefecture)
  - prepare example docs in the test db
  - query
  - check the result.
  - it passes if you use kuromoji instead of ngram.
- ensure ranking (boosting)
  - prepare example docs in the test db
  - query
  - check the *order* of the result docs.
  - it passes if you tweak the boost value.
etc… (suggester)
=> several requirements can be tested with specs!
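
The 東京都 / 京都 case above can be reproduced without a cluster. The following is only a minimal Python sketch of why a 2-gram tokenizer produces the false match and a dictionary-based tokenizer does not. The token table is hand-written and hypothetical; it stands in for a morphological analyzer such as kuromoji and is not that analyzer's actual output. `bigrams`, `dictionary_tokens` and `matches` are illustrative names, not elasticsearch APIs.

```python
def bigrams(text):
    """Split text into overlapping 2-character grams, like a 2-gram tokenizer."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


# ASSUMPTION: hand-written stand-in for a morphological analyzer's output.
HYPOTHETICAL_DICTIONARY_TOKENS = {
    "東京都": ["東京", "都"],
    "京都": ["京都"],
}


def dictionary_tokens(text):
    """Look up the pre-tokenized form of text in the hypothetical table."""
    return HYPOTHETICAL_DICTIONARY_TOKENS[text]


def matches(query, doc, tokenize):
    """A doc matches when every query token appears among the doc's tokens."""
    doc_tokens = set(tokenize(doc))
    return all(token in doc_tokens for token in tokenize(query))


# Query 京都 against a doc containing 東京都, with each tokenizer.
ngram_hit = matches("京都", "東京都", bigrams)
dictionary_hit = matches("京都", "東京都", dictionary_tokens)
print("ngram:", ngram_hit, "dictionary:", dictionary_hit)
```

With 2-grams, 東京都 contains the gram 京都, so the query matches. With the dictionary tokens it does not, which is the behavior the spec in the demo checks for.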
toyama-san from ipros
- elasticsearch in production
- performance tuning
- routing is important
- how to grab logs from fluentd
intro:
environment: AWS, git, Elasticsearch
using the elasticsearch-cloud-aws plugin
demonstrated how easy the plugin is to use.
note: distinguish clusters by EC2 instance tag (not security group).
explained that multicast is disabled in AWS, which is why they use the plugin.
discovery.zen.ping.multicast.enabled: false (recommended)
routing:
very important for performance.
balance docs by routing (they balance by user_id).
querying with routing lets you reduce shard access (= avoid a full scan).
routing across types (the routing strategy is the same for all types).
this is good because sequential queries by a user all relate to the same user_id.
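
The routing idea above can be illustrated with a toy model. This is a hedged Python sketch, not elasticsearch's implementation: `zlib.crc32` is an assumed stand-in for the internal routing hash, the shard count of 5 is arbitrary, and `shard_for`, `index_docs` and the two search functions are hypothetical names. It only shows that a routed query needs to visit one shard while an unrouted query visits all of them.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # ASSUMPTION: arbitrary shard count for illustration


def shard_for(routing_value, num_shards=NUM_PRIMARY_SHARDS):
    """Map a routing value to a shard number.

    ASSUMPTION: crc32 stands in for the real internal hash function.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def index_docs(docs, num_shards=NUM_PRIMARY_SHARDS):
    """Place every doc on the shard chosen by its user_id, whatever its type."""
    shards = {n: [] for n in range(num_shards)}
    for doc in docs:
        shards[shard_for(doc["user_id"], num_shards)].append(doc)
    return shards


def search_with_routing(shards, user_id):
    """Visit only the shard that owns user_id; returns (hits, shards visited)."""
    target = shard_for(user_id, len(shards))
    hits = [d for d in shards[target] if d["user_id"] == user_id]
    return hits, 1


def search_without_routing(shards, user_id):
    """Visit every shard; returns (hits, shards visited)."""
    hits = [d for n in shards for d in shards[n] if d["user_id"] == user_id]
    return hits, len(shards)


# Twelve docs of three types spread over four users.
docs = [{"user_id": f"user-{i % 4}", "type": t, "seq": i}
        for i, t in enumerate(["order", "message", "profile"] * 4)]
shards = index_docs(docs)
routed_hits, routed_visits = search_with_routing(shards, "user-1")
full_hits, full_visits = search_without_routing(shards, "user-1")
print(len(routed_hits), routed_visits, len(full_hits), full_visits)
```

Both searches return the same docs for `user-1`, but the routed one touches a single shard. Because all types share the same routing key, a user's follow-up queries land on that same shard.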
logstash:
fits time series
- easy to back up
- short-term logging (weeks or months)
not a fit for year-level logging
pros/cons of splitting at the index level
…(I couldn't keep up with this part… the speech was too fast…)
request: improve backup functions.
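
As a rough illustration of the time-series fit noted above: logstash by default writes to one index per day, named like `logstash-YYYY.MM.DD`, so short-term retention can be expressed as picking whole indices by date. The Python sketch below only computes index names; it does not talk to a cluster, and the 30-day window, the example date, and the function names are assumptions made for illustration, not something stated in the talk.

```python
from datetime import date, datetime, timedelta


def daily_index(day, prefix="logstash"):
    """Name of the daily index for a given date, e.g. logstash-2014.02.07."""
    return f"{prefix}-{day:%Y.%m.%d}"


def index_day(name, prefix="logstash"):
    """Parse the date back out of a daily index name."""
    return datetime.strptime(name[len(prefix) + 1:], "%Y.%m.%d").date()


def expired_indices(names, today, keep_days, prefix="logstash"):
    """Indices whose day is more than keep_days before today, oldest first."""
    cutoff = today - timedelta(days=keep_days)
    return sorted(n for n in names if index_day(n, prefix) < cutoff)


# ASSUMPTION: arbitrary example date and a hypothetical 30-day retention.
today = date(2014, 2, 7)
names = [daily_index(today - timedelta(days=n)) for n in range(40)]
expired = expired_indices(names, today, keep_days=30)
print(expired[0], expired[-1], len(expired))
```

Each name in `expired` is a whole index that could be backed up or dropped as a unit, which is what makes per-day indices convenient for logs kept for weeks or months.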
fluentd -> elasticsearch
- use td-logger (ruby gem)
- all logs come through fluentd
fluentd benefits:
- built-in retry (elasticsearch can be down, unreachable, or in a GC pause.)
note: use record_reformer (fluentd plugin) to convert the time field
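
To make the retry benefit above concrete, here is a minimal Python sketch of retrying a write with exponential backoff. It is not fluentd's actual retry algorithm or configuration; `Unreachable`, `send_with_retry`, `flaky_send` and the delay values are all hypothetical and exist only to show the general pattern of a forwarder riding out a short outage.

```python
import time


class Unreachable(Exception):
    """Hypothetical error raised when the search backend cannot be reached."""


def send_with_retry(send, record, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send(record), waiting base_delay * 2**attempt between failures.

    Re-raises Unreachable if the last attempt also fails.
    """
    for attempt in range(max_attempts):
        try:
            return send(record)
        except Unreachable:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


# A fake backend that fails twice (e.g. during a GC pause), then recovers.
attempts = []


def flaky_send(record):
    attempts.append(record)
    if len(attempts) < 3:
        raise Unreachable("backend is busy")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = send_with_retry(flaky_send, {"msg": "hello"}, sleep=delays.append)
print(result, len(attempts), delays)
```

Passing `sleep=delays.append` keeps the example fast and makes the backoff schedule visible; in real use the default `time.sleep` would apply.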
O/R mapping:
- use tire (ruby gem)
- same interface as ActiveRecord
- if you are starting now, you should use another gem because tire has been renamed to retire.
they handle 100GB of logs for search.
Q/A
- they use m1.large x2
- how to delete old data?
  - add only. add machines.