This lab is meant to be done on Linux.

0. Set up the environment

  1. Installing ElasticSearch and Kibana
  • Go to the link:
  • Extract the two archives
  • Go to the ElasticSearch/config directory
  • Open the jvm.options file
  • Set the JVM heap sizes (for example 300 megabytes: -Xms300m and -Xmx300m)
  2. Starting the services

Kibana needs a running ElasticSearch, so ElasticSearch must be started first. Open two terminals (one for ElasticSearch, the other for Kibana), then start ElasticSearch from its installation directory:

$ bin/elasticsearch  

Check that localhost:9200 responds, then do the same for Kibana from its installation directory:

$ bin/kibana  

Check that localhost:5601 responds.

1. First steps with the APIs

The goal of this part is to insert the following three documents:

Document 1:

  • Title: Reactive Streams in Java
  • Year: 2019
  • Author: Adam L. Davis
  • Publisher: Apress
  • Language: English

Document 2:

  • Title: Scala Machine Learning Projects
  • Year: 2018
  • Author: Md. Rezaul Karim
  • Publisher: Packt
  • Language: English

Document 3:

  • Title: A Beginner's Guide to Scala, Object Orientation and Functional Programming
  • Year: 2018
  • Author: John Hunt
  • Publisher: Springer
  • Language: English

The requests for this are:

# Create the ebook index

PUT ebook

# Insert the documents

POST ebook/_doc/1
{
  "Title": "Reactive Streams in Java",
  "Year": 2019,
  "Author": "Adam L. Davis",
  "Publisher": "Apress",
  "Language": "English"
}

POST ebook/_doc/2
{
  "Title": "Scala Machine Learning Projects",
  "Year": 2018,
  "Author": "Md. Rezaul Karim",
  "Publisher": "Packt",
  "Language": "English"
}

POST ebook/_doc/3
{
  "Title": "A Beginner's Guide to Scala, Object Orientation and Functional Programming",
  "Year": 2018,
  "Author": "John Hunt",
  "Publisher": "Springer",
  "Language": "English"
}

# Count the documents contained in ebook

GET ebook/_count
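
As a complement to the three separate inserts above, the same documents could be sent in one request through the _bulk API, whose body is newline-delimited JSON (one action line followed by one source line per document). The sketch below only builds that body with the Python standard library; the helper name build_bulk_body and the ids "1" to "3" are choices made for this illustration, not part of the lab.

```python
import json

# The three e-books of the lab, keyed by the document id (ids are an assumption).
EBOOKS = {
    "1": {"Title": "Reactive Streams in Java", "Year": 2019,
          "Author": "Adam L. Davis", "Publisher": "Apress", "Language": "English"},
    "2": {"Title": "Scala Machine Learning Projects", "Year": 2018,
          "Author": "Md. Rezaul Karim", "Publisher": "Packt", "Language": "English"},
    "3": {"Title": "A Beginner's Guide to Scala, Object Orientation and Functional Programming",
          "Year": 2018, "Author": "John Hunt", "Publisher": "Springer", "Language": "English"},
}


def build_bulk_body(index, docs):
    """Return an NDJSON string: one action line and one source line per document."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body("ebook", EBOOKS), end="")
```

The printed text can be pasted under a POST _bulk line in the Kibana console.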

Find the query that returns only the second document (Title: Scala Machine Learning Projects)

GET ebook/_search
{
  "query": {
    "match": { "Title": "Scala Machine Learning Projects" }
  },
  "size": 1
}

# Alternative

GET ebook/_search
{
  "query": {
    "term": { "Title.keyword": "Scala Machine Learning Projects" }
  }
}

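The two requests above behave differently: a match query analyzes its input and, with the default OR operator, hits any document sharing at least one token, while a term query on the keyword sub-field requires the exact stored string. The sketch below imitates this in plain Python. The analyze function is a rough stand-in for the standard analyzer (lowercasing and splitting on non-alphanumeric characters), and all function names are hypothetical.

```python
import re

TITLES = {
    1: "Reactive Streams in Java",
    2: "Scala Machine Learning Projects",
    3: "A Beginner's Guide to Scala, Object Orientation and Functional Programming",
}


def analyze(text):
    """Simplified analyzer: lowercase, then split on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def match_ids(query):
    """match on a text field: hit when at least one query token is in the title."""
    wanted = set(analyze(query))
    return [doc_id for doc_id, title in TITLES.items() if wanted & set(analyze(title))]


def term_ids(value):
    """term on a keyword field: hit only when the stored string equals the value."""
    return [doc_id for doc_id, title in TITLES.items() if title == value]


if __name__ == "__main__":
    print(match_ids("Scala Machine Learning Projects"))
    print(term_ids("Scala Machine Learning Projects"))
```

In this simulation the match version also hits document 3 through the token "scala", which is why the first request relies on "size": 1 (document 2 matches more terms and would be expected to rank first), whereas the term version hits document 2 only.
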
2. Hands-on Exercise: Mapping and Analysis

Delete the ebook index

# Delete the index
DELETE ebook
# Check that the index is gone
GET ebook

Checking the deletion returns the following response:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index [ebook]",
        "resource.type" : "index_or_alias",
        "resource.id" : "ebook",
        "index_uuid" : "_na_",
        "index" : "ebook"
      }
    ],
    "type" : "index_not_found_exception",
    "reason" : "no such index [ebook]",
    "resource.type" : "index_or_alias",
    "resource.id" : "ebook",
    "index_uuid" : "_na_",
    "index" : "ebook"
  },
  "status" : 404
}

Re-create the ebook index with a better mapping. It must support both aggregations and full-text search on the authors and the title.

PUT ebook
{
  "mappings": {
    "properties": {
      "Title": {
        "type": "text"
      },
      "Year": {
        "index": false,
        "type": "date",
        "format": "yyyy"
      },
      "Author": {
        "type": "keyword"
      },
      "Publisher": {
        "index": false,
        "type": "text"
      }
    }
  }
}

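The mapping above declares Title as text and Author as keyword, so each field covers only one of the two needs (full-text search or aggregation). One common way to get both is a multi-field: a text field with a keyword sub-field. The sketch below builds such a request body in Python; it is one possible answer, not necessarily the one the lab expects, and the helper names are hypothetical.

```python
import json


def text_with_keyword():
    """A text field (full-text search) with a keyword sub-field (aggregations)."""
    return {"type": "text", "fields": {"keyword": {"type": "keyword"}}}


def build_ebook_mapping():
    """Body for PUT ebook, keeping the Year and Publisher settings used above."""
    return {
        "mappings": {
            "properties": {
                "Title": text_with_keyword(),
                "Author": text_with_keyword(),
                "Year": {"type": "date", "format": "yyyy", "index": False},
                "Publisher": {"type": "text", "index": False},
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_ebook_mapping(), indent=2))
```

With this mapping, a match query would target Title or Author, while a terms aggregation would target Title.keyword or Author.keyword.
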
Reindex the three documents

# Reindexing the documents
POST ebook/_doc/1
{
  "Title": "Reactive Streams in Java",
  "Year": 2019,
  "Author": "Adam L. Davis",
  "Publisher": "Apress",
  "Language": "English"
}

POST ebook/_doc/2
{
  "Title": "Scala Machine Learning Projects",
  "Year": 2018,
  "Author": "Md. Rezaul Karim",
  "Publisher": "Packt",
  "Language": "English"
}

POST ebook/_doc/3
{
  "Title": "A Beginner's Guide to Scala, Object Orientation and Functional Programming",
  "Year": 2018,
  "Author": "John Hunt",
  "Publisher": "Springer",
  "Language": "English"
}

Update the ebook mapping with a new field named Date that uses the following format: "yyyy-MM-dd'T'HH:mm:ss"

PUT ebook/_mapping
{
  "properties": {
    "Date": {
      "type": "date",
      "format": "yyyy-MM-dd'T'HH:mm:ss"
    }
  }
}

Add the current date to the Date field of the three documents
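
No request is given for this last step. One possible approach, sketched here under assumptions, is the _update_by_query API with a small Painless script that writes the same timestamp into every document. The Python below only builds the request body and formats the timestamp to match the "yyyy-MM-dd'T'HH:mm:ss" pattern; the function name and the params.now parameter are hypothetical.

```python
import json
from datetime import datetime


def build_set_date_request(now):
    """Body for POST ebook/_update_by_query that sets Date on every document."""
    # strftime equivalent of the Elasticsearch pattern yyyy-MM-dd'T'HH:mm:ss
    stamp = now.strftime("%Y-%m-%dT%H:%M:%S")
    return {
        "query": {"match_all": {}},
        "script": {
            "lang": "painless",
            "source": "ctx._source.Date = params.now",
            "params": {"now": stamp},
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_set_date_request(datetime.now()), indent=2))
```

The printed body can be pasted under POST ebook/_update_by_query in the Kibana console.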

3. Hands-On Exercise: Queries

Run a query on the kibana_sample_data_logs index that finds the documents whose host field has exactly a given value

GET kibana_sample_data_logs/_search
{
  "query": {
    "match": { "host": "" }
  }
}

# Or

GET kibana_sample_data_logs/_search
{
  "query": {
    "term": { "host": "" }
  }
}

Run a query on the kibana_sample_data_logs index that finds the documents matching at least two of the values chrome, linux and mozilla

GET kibana_sample_data_logs/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "agent": "chrome" } },
        { "match": { "agent": "linux" } },
        { "match": { "agent": "mozilla" } }
      ],
      "minimum_should_match": 2
    }
  }
}

or

POST kibana_sample_data_logs/_search
{
  "query": {
    "match": {
      "agent": {
        "query": "chrome linux mozilla",
        "minimum_should_match": 2
      }
    }
  }
}

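To make the effect of minimum_should_match concrete, the sketch below counts how many of the three terms appear in an agent string and keeps the document only when the count reaches the threshold. It is a plain Python imitation, not the Elasticsearch implementation: the tokenizer is a simplification, and the three agent strings are illustrative user-agent values chosen for this example.

```python
import re

# Illustrative user-agent strings (assumed sample values).
AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1",
    "Mozilla/5.0 (X11; Linux i686) AppleWebKit/534.24 (KHTML, like Gecko) "
    "Chrome/11.0.696.50 Safari/534.24",
    "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)",
]


def tokens(text):
    """Simplified analyzer: lowercase, then split on non-alphanumeric characters."""
    return {tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok}


def matches(agent, terms, minimum_should_match):
    """True when at least minimum_should_match of the terms occur in the agent."""
    agent_tokens = tokens(agent)
    hits = sum(1 for term in terms if term in agent_tokens)
    return hits >= minimum_should_match


if __name__ == "__main__":
    for agent in AGENTS:
        print(matches(agent, ["chrome", "linux", "mozilla"], 2), agent)
```

The first two strings contain at least two of the terms and are kept; the third contains only "mozilla" and is rejected.
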
Execute a match query on the kibana_sample_data_logs index that hits documents matching one of the terms “chrome” and “safari” in the agent field.

  • Find two other queries that hit exactly the same result (same documents and same scores)
POST kibana_sample_data_logs/_search
{
  "query": {
    "match": { "agent": "chrome safari" }
  }
}

POST kibana_sample_data_logs/_search
{
  "query": {
    "bool": {
      "should": [
        { "match": { "agent": "chrome" } },
        { "match": { "agent": "safari" } }
      ]
    }
  }
}
  • Find another query that hits the same result while ignoring the scores
POST kibana_sample_data_logs/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "terms": { "agent": ["chrome", "safari"] }
      },
      "boost": 1
    }
  }
}

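The equivalence between the multi-term match query and the bool/should form can be made explicit by generating one from the other. The sketch below does this in Python; splitting on whitespace and lowercasing is a simplification of what the field analyzer would do, and the function name match_as_bool_should is hypothetical.

```python
import json


def match_as_bool_should(field, text):
    """Rewrite a multi-term match (default OR operator) as a bool/should query."""
    terms = text.lower().split()  # simplification of the field's analyzer
    return {
        "query": {
            "bool": {
                "should": [{"match": {field: term}} for term in terms]
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(match_as_bool_should("agent", "chrome safari"), indent=2))
```

Each term becomes its own should clause, so a document matching either term is a hit and matching both adds the two clause scores together.
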
4. Aggregations

All aggregations should be executed on the kibana_sample_data_logs index

  1. Find the average memory over all requests
GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "avg_mem": {
      "avg": { "field": "memory" }
    }
  }
}
  2. Find the number of distinct client IPs (clientip field)
GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "nbr_clientip": {
      "cardinality": { "field": "clientip" }
    }
  }
}

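As a reminder of what these two metric aggregations compute, the sketch below reproduces them on a tiny in-memory list. The sample records are invented for the illustration. Two details worth keeping in mind: avg skips documents that have no value for the field, and cardinality in Elasticsearch is an approximate distinct count, whereas the Python set used here is exact.

```python
from statistics import mean

# Invented sample records, not taken from the real index.
LOGS = [
    {"clientip": "10.0.0.1", "memory": 100},
    {"clientip": "10.0.0.2", "memory": 300},
    {"clientip": "10.0.0.1", "memory": None},
    {"clientip": "10.0.0.3", "memory": 200},
]


def avg_memory(logs):
    """Mean of the memory field, ignoring documents without a value."""
    values = [doc["memory"] for doc in logs if doc.get("memory") is not None]
    return mean(values) if values else None


def distinct_clients(logs):
    """Exact number of distinct clientip values."""
    return len({doc["clientip"] for doc in logs})


if __name__ == "__main__":
    print(avg_memory(LOGS), distinct_clients(LOGS))
```
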
1001 is the expected result of the cardinality request:

{
  "took" : 29,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "nbr_clientip" : {
      "value" : 1001
    }
  }
}
  3. Find the 3 most used IPs per month
GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "monthly_agg": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "occurence_ip": {
          "terms": {
            "field": "clientip",
            "size": 3
          }
        }
      }
    }
  }
}
  4. Add a pipeline aggregation that computes the month with the most requests (documents) (not finished)
GET kibana_sample_data_logs/_search?size=0
{
  "aggs": {
    "monthly_aggs": {
      "date_histogram": {
        "field": "timestamp",
        "calendar_interval": "month"
      },
      "aggs": {
        "requests_count": {
          "value_count": {
            "field": "timestamp"
          }
        }
      }
    },
    "most_requests_month": {
      "max_bucket": {
        "buckets_path": "monthly_aggs>requests_count"
      }
    }
  }
}
  5. Add a pipeline aggregation named "monthly_max_daily_avg" that computes the month with the highest daily memory average
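
No request is given for this last exercise. The sketch below builds one possible request body in Python, under one reading of the task: inside each month, a daily date_histogram computes the average memory per day, an avg_bucket pipeline takes the mean of those daily averages, and a top-level max_bucket named monthly_max_daily_avg picks the month where that mean is highest. The inner names (monthly, daily, avg_mem, daily_avg) are hypothetical, and the body has not been checked against a running cluster.

```python
import json


def build_monthly_max_daily_avg():
    """Body for GET kibana_sample_data_logs/_search (one possible reading)."""
    daily = {
        "date_histogram": {"field": "timestamp", "calendar_interval": "day"},
        "aggs": {"avg_mem": {"avg": {"field": "memory"}}},
    }
    monthly = {
        "date_histogram": {"field": "timestamp", "calendar_interval": "month"},
        "aggs": {
            "daily": daily,
            # Mean of the daily averages inside each month.
            "daily_avg": {"avg_bucket": {"buckets_path": "daily>avg_mem"}},
        },
    }
    return {
        "size": 0,
        "aggs": {
            "monthly": monthly,
            # Month whose mean daily average is the highest.
            "monthly_max_daily_avg": {
                "max_bucket": {"buckets_path": "monthly>daily_avg"}
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_monthly_max_daily_avg(), indent=2))
```

If the intended reading is instead the month containing the single highest daily average, the inner avg_bucket could be swapped for a max_bucket with the same buckets_path.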

Noobzik commented Mar 19, 2021

GET ebook/_search
{
  "aggs": {
    "Aggregate-Title": {
      "terms": { "field": "Title" }
    },
    "Aggregate-Author": {
      "terms": { "field": "Author" }
    }
  }
}

Switching the fields from text to keyword makes the aggregations work; what remains is to find out how to run the full-text search.

