Updated: | 2013-09-22 |
---|---|
Version: | 0.0.2 |
Author: | @voluntas |
URL: | http://voluntas.github.io/ |
Python: | 2.7.5 |
---|---|
Elasticsearch: | 0.90.5 |
redis: | 2.6.16 |
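- Use full-text search from Django
- Use Elasticsearch as the full-text search engine
- Use Haystack as the Django search framework
- Build the search index asynchronously through a Celery queue
- Use Redis as the queue
- Check whether the setup can withstand an update load of around 10,000 per second
- Indexing foreign keys
- Japanese full-text search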
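For the goal of checking an update load of around 10,000 per second, a small timing helper is one way to get a first number. This is a minimal sketch, not a benchmark harness: `measure_rate` is a hypothetical helper name, and the `operation` passed in (for example, a function that saves one model instance) is left to the caller.

```python
import time


def measure_rate(operation, count):
    """Call operation(i) for i in range(count) and return calls per second.

    `operation` is supplied by the caller; in this note's setup it could be a
    function that creates or updates one row (an assumption, not shown here).
    """
    started = time.time()
    for i in range(count):
        operation(i)
    elapsed = time.time() - started
    # Guard against a zero interval on very fast runs or coarse clocks.
    return count / elapsed if elapsed > 0 else float('inf')
```

The returned rate only covers the synchronous part of each call. With Celery in the path, index updates are queued, so the worker's throughput has to be observed separately.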
url: | http://www.elasticsearch.org/ |
---|
Downloading it and just getting it running is very easy.
$ curl -O https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.tar.gz
$ tar xvfz elasticsearch-0.90.5.tar.gz
$ cd elasticsearch-0.90.5
$ ./bin/elasticsearch -f
[2013-09-22 22:36:42,191][INFO ][node ] [The Night Man] version[0.90.5], pid[22901], build[c8714e8/2013-09-17T12:50:20Z]
[2013-09-22 22:36:42,192][INFO ][node ] [The Night Man] initializing ...
[2013-09-22 22:36:42,199][INFO ][plugins ] [The Night Man] loaded [], sites []
[2013-09-22 22:36:44,092][INFO ][node ] [The Night Man] initialized
[2013-09-22 22:36:44,092][INFO ][node ] [The Night Man] starting ...
[2013-09-22 22:36:44,181][INFO ][transport ] [The Night Man] bound_address {inet[/0:0:0:0:0:0:0:0%0:9300]}, publish_address {inet[/192.0.2.1:9300]}
[2013-09-22 22:36:47,243][INFO ][cluster.service ] [The Night Man] new_master [The Night Man][gjXA_KlvQ7aGmg2zoiRmPQ][inet[/192.0.2.1:9300]], reason: zen-disco-join (elected_as_master)
[2013-09-22 22:36:47,277][INFO ][discovery ] [The Night Man] elasticsearch/gjXA_KlvQ7aGmg2zoiRmPQ
[2013-09-22 22:36:47,288][INFO ][http ] [The Night Man] bound_address {inet[/0:0:0:0:0:0:0:0%0:9200]}, publish_address {inet[/192.0.2.1:9200]}
[2013-09-22 22:36:47,289][INFO ][node ] [The Night Man] started
[2013-09-22 22:36:47,307][INFO ][gateway ] [The Night Man] recovered [0] indices into cluster_state
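Once the node reports `started`, it can be checked from Python over HTTP. The sketch below assumes the node listens on `http://127.0.0.1:9200/` and that the root endpoint returns JSON with a `version.number` field. The function names are illustrative, not part of any library.

```python
import json

try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2.7

# Assumed local address; adjust if Elasticsearch is bound elsewhere.
ES_URL = 'http://127.0.0.1:9200/'


def parse_es_version(body):
    """Extract the version string from the root endpoint's JSON body."""
    info = json.loads(body)
    return info['version']['number']


def fetch_es_version(url=ES_URL, timeout=5):
    """Ask a running Elasticsearch node for its version over HTTP."""
    response = urlopen(url, timeout=timeout)
    try:
        return parse_es_version(response.read().decode('utf-8'))
    finally:
        response.close()
```

With the node from the log above running, `fetch_es_version()` should report the 0.90.5 version it was started with. A connection error usually means the node is not up yet.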
url: | http://redis.io/ |
---|
Downloading, building, testing, and just getting it running is very easy.
$ curl -O http://download.redis.io/releases/redis-2.6.16.tar.gz
$ tar xvfz redis-2.6.16.tar.gz
$ cd redis-2.6.16
$ make
$ make test
$ ./src/redis-server
[22722] 22 Sep 22:25:10.242 # Warning: no config file specified, using the default config. In order to specify a config file use ./src/redis-server /path/to/redis.conf
[22722] 22 Sep 22:25:10.243 * Max number of open files set to 10032
(Redis ASCII-art startup banner: Redis 2.6.16 (00000000/0) 64 bit, Running in stand alone mode, Port: 6379, PID: 22722, http://redis.io)
[22722] 22 Sep 22:25:10.244 # Server started, Redis version 2.6.16
[22722] 22 Sep 22:25:10.244 * The server is now ready to accept connections on port 6379
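To confirm that the server accepts connections, a `PING` can be sent over a plain socket. This is a minimal sketch using only the standard library rather than a Redis client. It assumes the default address `127.0.0.1:6379` shown in the log, and `encode_command` and `redis_ping` are hypothetical helper names.

```python
import socket


def encode_command(*args):
    """Encode a command in the Redis multi-bulk request format."""
    parts = [('*%d\r\n' % len(args)).encode('ascii')]
    for arg in args:
        data = arg.encode('utf-8')
        parts.append(('$%d\r\n' % len(data)).encode('ascii') + data + b'\r\n')
    return b''.join(parts)


def redis_ping(host='127.0.0.1', port=6379, timeout=5):
    """Return True if the server answers PING with a +PONG status reply."""
    sock = socket.create_connection((host, port), timeout)
    try:
        sock.sendall(encode_command('PING'))
        return sock.recv(64).startswith(b'+PONG')
    finally:
        sock.close()
```

In a real application a client library is the better choice. The raw socket is used here only so that the check works before any Python packages are installed.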
url: | https://www.djangoproject.com/ |
---|
requirements.txt:
Django==1.5.4
celery==3.0.23
django-celery==3.0.23
celery-haystack==0.7.2
django-haystack==2.1.0
redis==2.8.0
$ pip install -r requirements.txt
settings.py needs a few settings. Start with the bare minimum.
- Redis and Elasticsearch are both running locally
INSTALLED_APPS = (
    # ...
    'djcelery',
    'haystack',
    'celery_haystack',
    # ...
)
# haystack
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}
HAYSTACK_SIGNAL_PROCESSOR = 'celery_haystack.signals.CelerySignalProcessor'
# celery
import djcelery
djcelery.setup_loader()
BROKER_URL = 'redis://127.0.0.1:6379/4'
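The `BROKER_URL` above packs three things into one string: the Redis host, the port, and the database number (the trailing `/4`). The sketch below only illustrates how that URL breaks down. It is not Celery's own parser, `describe_broker_url` is a hypothetical name, and the fallback values are assumed defaults.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7


def describe_broker_url(url):
    """Split a redis:// broker URL into host, port and database number."""
    parsed = urlparse(url)
    if parsed.scheme != 'redis':
        raise ValueError('not a redis:// URL: %r' % url)
    db = parsed.path.lstrip('/')
    return {
        # Fallbacks below are assumptions for parts left out of the URL.
        'host': parsed.hostname or 'localhost',
        'port': parsed.port or 6379,
        'db': int(db) if db else 0,
    }
```

Keeping the queue in its own database number, as this note does with 4, keeps Celery's keys apart from any other data stored in the same Redis instance.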
# models.py
from django.db import models


class Note(models.Model):
    title = models.CharField(max_length=255, blank=False, null=False)
    author = models.CharField(max_length=255, blank=False, null=False)
    text = models.TextField(blank=False, null=False)
    # Set once on insert and on every save, respectively.
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
When the index is updated through Celery, the index class must inherit from celery_haystack's CelerySearchIndex rather than Haystack's SearchIndex.
# search_indexes.py
import datetime

from haystack import indexes
from celery_haystack.indexes import CelerySearchIndex

from .models import Note


class NoteIndex(CelerySearchIndex, indexes.Indexable):
    # use_template=True makes Haystack render this field from the template
    # search/indexes/<app_label>/note_text.txt, which must exist.
    text = indexes.CharField(document=True, use_template=True)
    # title and author are CharFields on the model, so index them as CharField.
    title = indexes.CharField(model_attr='title')
    author = indexes.CharField(model_attr='author')
    created_at = indexes.DateTimeField(model_attr='created_at')
    updated_at = indexes.DateTimeField(model_attr='updated_at')

    def get_model(self):
        return Note

    def index_queryset(self, using=None):
        # Index only notes whose last update is not in the future.
        return self.get_model().objects.filter(updated_at__lte=datetime.datetime.now())
$ python manage.py syncdb
$ python manage.py rebuild_index
$ python manage.py celery worker --loglevel=info
$ python manage.py runserver
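With the worker and the server running, the index can be queried through Haystack's `SearchQuerySet`. The following is a minimal sketch, assuming the `Note` model lives in an app named `notes` (a hypothetical name, since the note does not give one). `search_notes` and `format_results` are illustrative helpers, not part of Haystack.

```python
def format_results(results):
    """Turn search results into (title, author) pairs.

    Each result is expected to expose the indexed fields as attributes.
    """
    return [(result.title, result.author) for result in results]


def search_notes(query, limit=10):
    """Run a full-text query against the Note index through Haystack."""
    # Imported lazily so this module can be loaded without Django configured.
    from haystack.query import SearchQuerySet
    from notes.models import Note  # 'notes' is a hypothetical app name

    # 'content' targets the document field (text) declared in NoteIndex.
    results = SearchQuerySet().models(Note).filter(content=query)
    return format_results(results[:limit])
```

Calling `search_notes` from `python manage.py shell` after saving a `Note` is a quick way to confirm that the Celery worker has pushed the update into Elasticsearch. Because indexing is asynchronous, a freshly saved note may take a moment to appear.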
- Django documentation
  - https://docs.djangoproject.com/en/1.5/
- django/django
  - https://github.com/django/django
- Open Source Distributed Real Time Search & Analytics | Elasticsearch
  - http://www.elasticsearch.org/
- elasticsearch/elasticsearch
  - https://github.com/elasticsearch/elasticsearch
- Celery - Distributed Task Queue - Celery 3.0.23 documentation
  - http://docs.celeryproject.org/en/latest/index.html
- celery/celery
  - https://github.com/celery/celery
- Haystack - Search for Django
  - http://haystacksearch.org/
- toastdriven/django-haystack
  - https://github.com/toastdriven/django-haystack
- Django - Celery 3.0.23 documentation
  - http://docs.celeryproject.org/en/latest/django/
- celery/django-celery
  - https://github.com/celery/django-celery
- celery-haystack - celery-haystack 0.7.2 documentation
  - http://celery-haystack.readthedocs.org/en/latest/
- andymccurdy/redis-py
  - https://github.com/andymccurdy/redis-py