ivanahuckova/es_research.md

## es_research.md

      
          
    Elasticsearch

Elasticsearch API that we use in Grafana

/_msearch POST

Multi search API. Executes several searches with a single API request.
Query parameters that we add:
HEADER

search_type: query_then_fetch (the default option, so maybe we don't have to include it). With this option, documents are scored using local term and document frequencies for the shard. This is usually faster but less accurate.
max_concurrent_shard_requests: value based on the data source setting. Maximum number of concurrent shard requests that each sub-search request executes per node. Defaults to 5.
ignore_unavailable: true. If false, the request returns an error if it targets a missing or closed index. Defaults to false.
index: based on settings. Optional string or array of strings naming the indices to search. Supports wildcards (*). Specify multiple targets as an array.
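
To make the options above more concrete, here is a minimal sketch of how one search could be serialized for `_msearch`, which expects newline-delimited JSON: one header line, then one body line, with a trailing newline. The names `SearchHeader`, `buildMsearchPayload` and `buildMsearchPath` are hypothetical and not taken from Grafana. Sending `max_concurrent_shard_requests` as a URL query parameter is an assumption, and the right place may depend on the Elasticsearch version.

```typescript
// Hypothetical shape of one _msearch header line, using the options listed above.
interface SearchHeader {
  search_type: 'query_then_fetch';
  ignore_unavailable: boolean;
  index: string | string[];
}

// Serializes a single search as NDJSON: header line, body line, trailing newline.
function buildMsearchPayload(index: string | string[], body: object): string {
  const header: SearchHeader = {
    search_type: 'query_then_fetch',
    ignore_unavailable: true,
    index,
  };
  return JSON.stringify(header) + '\n' + JSON.stringify(body) + '\n';
}

// Assumption: max_concurrent_shard_requests travels in the URL, not in the header line.
function buildMsearchPath(maxConcurrentShardRequests: number = 5): string {
  return '/_msearch?max_concurrent_shard_requests=' + maxConcurrentShardRequests;
}

const payload = buildMsearchPayload('logs-*', { size: 0, query: { match_all: {} } });
console.log('POST ' + buildMsearchPath());
console.log(payload);
```

Several searches can be sent in one request by concatenating the payloads of the individual searches.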

BODY


query Query docs

bool - Bool docs. A query that matches documents matching boolean combinations of other queries, which lets us combine multiple queries in a logical fashion.
We use:

bool > filter > range > @timestamp : gte, lte, format
bool > filter > query_string > analyze_wildcard: true, query: "test: value"
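
The two filter clauses above could be assembled roughly as follows. This is only a sketch: `buildBoolFilterQuery` is a hypothetical helper, and `epoch_millis` as the `format` value is an assumption (the notes only say that a format is set).

```typescript
// Builds the `query` part of the search body: a bool query whose filter clause
// holds a time range and a Lucene query string.
function buildBoolFilterQuery(timeField: string, from: number, to: number, luceneQuery: string) {
  return {
    bool: {
      filter: [
        // Assumption: the time range is given as epoch milliseconds.
        { range: { [timeField]: { gte: from, lte: to, format: 'epoch_millis' } } },
        { query_string: { analyze_wildcard: true, query: luceneQuery } },
      ],
    },
  };
}

const boolQuery = buildBoolFilterQuery('@timestamp', 1000, 2000, 'test: value');
console.log(JSON.stringify(boolQuery, null, 2));
```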


from Optional, starting offset for returned hits. Defaults to 0.


size: based on limit. Optional, number of hits to return.


aggregations Aggs docs
An aggregation summarizes your data as metrics, statistics, or other analytics. Elasticsearch organizes aggregations into three categories:


Metric aggregations that calculate metrics, such as a sum or average, from field values.


Bucket aggregations that group documents into buckets, also called bins, based on field values, ranges, or other criteria.


Pipeline aggregations that take input from other aggregations instead of documents or fields.


You can run aggregations as part of a search by specifying the search API's aggs parameter.
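
As an illustration of the three categories, the sketch below nests a metric aggregation (`avg`) and a pipeline aggregation (`derivative`) inside a bucket aggregation (`date_histogram`). The helper name `buildAggs`, the aggregation ids `"1"`, `"2"`, `"3"` and the field names are hypothetical. `fixed_interval` assumes a reasonably recent Elasticsearch version.

```typescript
// Hypothetical helper: a date_histogram bucket aggregation ("2") that contains
// an avg metric ("1") and a derivative pipeline aggregation ("3") over that metric.
function buildAggs(timeField: string, metricField: string, interval: string) {
  return {
    '2': {
      date_histogram: { field: timeField, fixed_interval: interval, min_doc_count: 0 },
      aggs: {
        '1': { avg: { field: metricField } },
        // buckets_path points at the sibling metric whose values are differentiated.
        '3': { derivative: { buckets_path: '1' } },
      },
    },
  };
}

// size: 0 asks for aggregation results only, without individual hits.
const aggsBody = { size: 0, aggs: buildAggs('@timestamp', 'cpu', '1m') };
console.log(JSON.stringify(aggsBody, null, 2));
```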
/_mapping GET

Retrieves mapping definitions for one or more indices. For data streams, the API retrieves mappings for the stream’s backing indices. Mapping is the process of defining how a document, and the fields it contains, are stored and indexed.
Used in:

testDataSource - when configuring data source to test if provided index is correct
metricFindQuery - to get variables
getTagKeys - to get tag keys for adhoc queries
useFields - for autocomplete/providing options in query editor
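
For the use cases above we mostly need a flat list of field names and types. A minimal sketch of how that could be derived from a `_mapping` response is shown below. It assumes the typeless response shape (index name, then `mappings`, then `properties`) and the names `MappingProperty`, `MappingResponse` and `fieldsFromMapping` are hypothetical.

```typescript
// Assumed minimal shape of a typeless _mapping response.
interface MappingProperty {
  type?: string;
  properties?: Record<string, MappingProperty>;
}
type MappingResponse = Record<string, { mappings: { properties?: Record<string, MappingProperty> } }>;

// Flattens nested object fields into dotted names mapped to their types.
function fieldsFromMapping(response: MappingResponse): Record<string, string> {
  const fields: Record<string, string> = {};
  const walk = (props: Record<string, MappingProperty>, prefix: string): void => {
    for (const name of Object.keys(props)) {
      const prop = props[name];
      const path = prefix ? prefix + '.' + name : name;
      if (prop.properties) {
        walk(prop.properties, path); // object field: descend into sub-fields
      } else if (prop.type) {
        fields[path] = prop.type; // leaf field
      }
    }
  };
  for (const indexName of Object.keys(response)) {
    walk(response[indexName].mappings.properties || {}, '');
  }
  return fields;
}

const sampleMapping: MappingResponse = {
  'logs-2021': {
    mappings: {
      properties: {
        '@timestamp': { type: 'date' },
        host: { properties: { name: { type: 'keyword' } } },
      },
    },
  },
};
console.log(fieldsFromMapping(sampleMapping));
```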

Building the query

Currently we build the query in two places:
frontend

We have the query (Lucene query), an array of aggs, an array of metrics and the time field. We pass these to a query builder that converts the ElasticsearchQuery into JSON.
export interface ElasticsearchQuery extends DataQuery {
  alias?: string;
  query?: string;
  bucketAggs?: BucketAggregation[];
  metrics?: MetricAggregation[];
  timeField?: string;
}
backend

We have processQuery.
We should check if we could/should use libraries to build queries:

https://github.com/aquasecurity/esquery
github.com/imishinist/esquery

Parsing the response

On the backend we return only time + value in dataFrame format.
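
To illustrate what "time + value" means, here is a rough sketch that turns `date_histogram` buckets into two parallel columns. It is written in TypeScript for consistency with the rest of these notes, although the backend itself is not. `TimeSeriesFrame` is a simplified, hypothetical stand-in for a real data frame, and `bucketsToFrame` is not an existing function.

```typescript
interface MetricValue {
  value: number | null;
}

// Assumed minimal shape of a date_histogram bucket with single-value metrics keyed by id.
interface HistogramBucket {
  key: number; // bucket start in epoch milliseconds
  doc_count: number;
  [metricId: string]: number | MetricValue;
}

// Simplified stand-in for a data frame with a time column and a value column.
interface TimeSeriesFrame {
  time: number[];
  value: Array<number | null>;
}

function bucketsToFrame(buckets: HistogramBucket[], metricId: string): TimeSeriesFrame {
  const frame: TimeSeriesFrame = { time: [], value: [] };
  for (const bucket of buckets) {
    const metric = bucket[metricId];
    frame.time.push(bucket.key);
    // A missing metric, or a metric over an empty bucket, becomes null.
    frame.value.push(typeof metric === 'object' ? metric.value : null);
  }
  return frame;
}

const sampleBuckets: HistogramBucket[] = [
  { key: 1609459200000, doc_count: 2, '1': { value: 3.5 } },
  { key: 1609459260000, doc_count: 0, '1': { value: null } },
];
console.log(bucketsToFrame(sampleBuckets, '1'));
```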
Questions

API part of the requests - some settings change query/add headers
How to compose queries? Are there any libraries that could help us?
Write snapshot tests to verify that the queries built on the frontend and on the backend are the same. The tests should be based on the tests created on the frontend.
How to parse results to dataFrames? Which dataFrame formats should we use?
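
One possible approach to the snapshot comparison, sketched under the assumption that both sides can dump their generated query as JSON: serialize each query with sorted object keys so that key order does not cause false mismatches. `canonicalJson` is a hypothetical helper, not an existing utility.

```typescript
// Serializes a JSON-compatible value with object keys sorted, so that two
// queries with the same content produce the same string regardless of key order.
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    // Array order is significant and is preserved.
    return '[' + value.map((item) => canonicalJson(item)).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const parts = Object.keys(obj)
      .filter((key) => obj[key] !== undefined) // mirror JSON.stringify, which drops undefined
      .sort()
      .map((key) => JSON.stringify(key) + ':' + canonicalJson(obj[key]));
    return '{' + parts.join(',') + '}';
  }
  return JSON.stringify(value);
}

const frontendQuery = { size: 0, query: { bool: { filter: [{ range: { '@timestamp': { gte: 1, lte: 2 } } }] } } };
const backendQuery = { query: { bool: { filter: [{ range: { '@timestamp': { lte: 2, gte: 1 } } }] } }, size: 0 };
console.log(canonicalJson(frontendQuery) === canonicalJson(backendQuery));
```

A snapshot test could then store the canonical string from one side and compare it with the output of the other.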