ksauzz/solr-memo.md

## solr-memo.md

      
Rough notes on Solr


Solr Wiki by Apache (the official project)
Solr Reference by Lucid Works

Features


indexing

fields (name, type, etc.)
various data types (integer, long, date...)


search

pagination
range query
grouping
join
distributed search(shard)
various response formats (json, xml, python, ruby...)


Schema


Schema Rest API
Schema Docs

Indexing

Documents can be indexed by POSTing XML, JSON, or CSV data.
<add>
  <doc>
    <field name="employeeId">05991</field>
    <field name="office">Bridgewater</field>
    <field name="skills">Perl</field>
    <field name="skills">Java</field>
  </doc>
  [<doc> ... </doc>[<doc> ... </doc>]]
</add>
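As a rough sketch, the `<add>` message above could be built and POSTed from Python using only the standard library. The endpoint URL, the `commit=true` parameter choice, and the helper names `build_add_xml` and `post_update` are assumptions for illustration; adjust the host, port, and core path to match the actual installation.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(docs):
    """Build an <add> message from a list of dicts.

    A list value becomes repeated <field> elements (a multi-valued field).
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, list) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    return ET.tostring(add, encoding="utf-8")


def post_update(xml_bytes, url="http://localhost:8983/solr/update?commit=true"):
    # Assumed endpoint; change host, port, and core path for your setup.
    req = urllib.request.Request(
        url, data=xml_bytes, headers={"Content-Type": "text/xml; charset=utf-8"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


payload = build_add_xml(
    [{"employeeId": "05991", "office": "Bridgewater", "skills": ["Perl", "Java"]}]
)
# post_update(payload)  # uncomment when a Solr instance is running
```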
Query parameters

Trying things out on the Query screen of the admin console is an easy way to understand these.

q: search query, in fieldname:value form, e.g. *:* or title:hoge
fq: filter query, e.g. fq=popularity:[10 TO *]
fl: fields to include in the response (comma-separated)
sort: field to sort on
start, rows: paging parameters (in effect, offset and limit)
wt: response format (e.g. xml or json)
indent: whether to indent the response (true or false)
debugQuery: debug output (true or false)
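The parameters above combine into a single request URL. The following is a minimal sketch using Python's standard library; the base URL, the `id` and `title` field names, and the helpers `page_params` and `build_select_url` are hypothetical and only illustrate how the pieces fit together.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def page_params(page, rows):
    """Translate a 1-based page number into Solr's start/rows pair."""
    return {"start": (page - 1) * rows, "rows": rows}


def build_select_url(base, **params):
    # urlencode takes care of escaping characters such as ':' '[' and spaces.
    return base + "?" + urlencode(params)


url = build_select_url(
    "http://localhost:8983/solr/select",  # assumed endpoint
    q="title:hoge",
    fq="popularity:[10 TO *]",
    fl="id,title",
    sort="popularity desc",
    wt="json",
    indent="true",
    **page_params(page=3, rows=10),
)
print(url)
```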

TODO: range query, data type
http://wiki.apache.org/solr/SolrQuerySyntax
Field


[field type]: dynamic field
data type: integer, date, and so on

Shards

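Distributed search is achieved by listing the Solr nodes to search in the shards parameter.
Documents are distributed by uniqueId.hashCode() % numServers. The coordinating node queries each shard and merges the results, so performance degrades when start is large (see below).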
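The uniqueId.hashCode() % numServers rule can be sketched in Python by emulating Java's String.hashCode(). This is only an illustration of the formula, assuming string IDs; the function names are hypothetical, and the client doing the indexing is responsible for applying whatever rule it chooses.

```python
def java_string_hashcode(s):
    """Emulate Java's String.hashCode() as a signed 32-bit integer."""
    h = 0
    for ch in s:
        # Java hashes UTF-16 code units; ord() matches for BMP characters.
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h - 0x100000000 if h >= 0x80000000 else h


def shard_for(unique_id, num_servers):
    # Python's % is non-negative for a positive modulus, unlike Java's,
    # so a negative hash still maps to a valid shard index here.
    return java_string_hashcode(unique_id) % num_servers


for doc_id in ["05991", "05992", "05993"]:
    print(doc_id, "-> shard", shard_for(doc_id, 3))
```

Note that in Java itself a negative hashCode() gives a negative remainder, so real code would need to take the absolute value or mask the sign bit first.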


The number of shards is constrained by URL length.
The number of shards is limited by number of characters allowed for GET method's URI; most Web servers generally support at least 4000 characters, but many servers limit URI length to reduce their vulnerability to Denial of Service (DoS) attacks.
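To get a feel for this limit, the sketch below builds a distributed query URL and measures its length. The host names are made up, and `distributed_url` is a hypothetical helper; the shards value is a comma-separated list of host:port/path entries.

```python
from urllib.parse import urlencode


def distributed_url(base, shards, **params):
    """Build a select URL whose shards parameter lists every node to search."""
    params = dict(params, shards=",".join(shards))
    return base + "?" + urlencode(params)


base = "http://localhost:8983/solr/select"  # assumed coordinating node

small = distributed_url(base, ["solr1.example.com:8983/solr",
                               "solr2.example.com:8983/solr"], q="*:*")

# 200 hypothetical shards: the URL grows well past 4000 characters.
many = [f"solr{i}.example.com:8983/solr" for i in range(1, 201)]
large = distributed_url(base, many, q="*:*")

print(len(small), len(large))
```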


The offset affects performance.
Makes it more inefficient to use a high "start" parameter. For example, if you request start=500000&rows=25 on an index with 500,000+ docs per shard, this will currently result in 500,000 records getting sent over the network from the shard to the coordinating Solr instance. If you had a single-shard index, in contrast, only 25 records would ever get sent over the network. (Granted, setting start this high is not something many people need to do.)
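The cost of a high start value can be illustrated with a small simulation. This is a simplified model, not Solr's actual implementation: it assumes each shard must return its best start + rows entries so that the coordinating node can merge them and cut out the requested page. Integers stand in for document scores.

```python
import heapq


def shard_top(scores, n):
    """What one shard returns: its n best entries, best first."""
    return heapq.nlargest(n, scores)


def distributed_page(shards, start, rows):
    needed = start + rows
    per_shard = [shard_top(s, needed) for s in shards]
    # Number of entries sent from the shards to the coordinating node.
    transferred = sum(len(p) for p in per_shard)
    merged = heapq.nlargest(needed, (x for p in per_shard for x in p))
    return merged[start:start + rows], transferred


# Two shards with 500 entries each.
shards = [list(range(0, 1000, 2)), list(range(1, 1000, 2))]
page, transferred = distributed_page(shards, start=400, rows=25)
print(len(page), transferred)
```

In this model, only 25 entries end up on the page, but 850 had to be transferred and merged to produce it.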


join is not supported


docs