Christian Dahlqvist cdahlqvist

  • Independent
  • Valencia, Spain
@cdahlqvist
cdahlqvist / bulk_rejections.md
Last active April 5, 2023 06:27
rally-bulk-rejections-track

Bulk Rejections Test

This Rally track is used to test the relationship between bulk indexing rejections and the following parameters:

  • Number of concurrent clients indexing into Elasticsearch
  • Number of shards actively being indexed into
  • Number of data nodes in the cluster
  • Size of bulk requests

The track contains a number of challenges. Each challenge indexes into an index with a fixed number of shards, using an increasing number of concurrent client connections and two different bulk sizes.
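The shape of this test matrix can be sketched in Python. This is only an illustration of how shard counts, client counts, and bulk sizes combine; the concrete values and the `build_test_matrix` helper are hypothetical and are not taken from the track itself.

```python
from itertools import product

# Hypothetical values for illustration only; the track's real settings
# are not listed in this description.
SHARD_COUNTS = [2, 4, 8]          # one challenge per shard count
CLIENT_COUNTS = [1, 2, 4, 8, 16]  # concurrency increases within a challenge
BULK_SIZES = [100, 1000]          # two bulk sizes, in documents per request


def build_test_matrix(shard_counts, client_counts, bulk_sizes):
    """Return one entry per (shards, bulk size, clients) combination.

    Client counts are sorted so that each challenge ramps concurrency up.
    """
    return [
        {"shards": shards, "bulk_size": bulk_size, "clients": clients}
        for shards, bulk_size, clients in product(
            shard_counts, bulk_sizes, sorted(client_counts)
        )
    ]


matrix = build_test_matrix(SHARD_COUNTS, CLIENT_COUNTS, BULK_SIZES)
for step in matrix[:3]:
    print(step)
```

Comparing rejection counts across the entries of such a matrix is what lets the effect of each parameter be separated from the others.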

@cdahlqvist
cdahlqvist / ingest_pipeline_delay
Last active February 14, 2023 21:06
Ingest pipeline definition for measuring ingest delay based on @timestamp field
# Ingest pipeline that records the timestamp the event was processed (`@received`)
# by the ingest pipeline and calculates the difference compared to the event
# timestamp (`@timestamp`). The script divides the millisecond difference by
# 1000.0, so `ingest_delay` is stored in seconds.
POST _scripts/calculate_ingest_delay
{
  "script": {
    "lang": "painless",
    "source": "SimpleDateFormat sdf = new SimpleDateFormat(\"yyyy-MM-dd'T'HH:mm:ss.SSS'Z'\"); ctx.ingest_delay = (sdf.parse(ctx['received']).getTime() - sdf.parse(ctx['@timestamp']).getTime()) / 1000.0"
  }
}
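The arithmetic performed by the stored script can be sketched in Python. This is a rough equivalent, not part of the gist: the function name `ingest_delay_seconds` and the sample timestamps are assumptions made for illustration. Dividing a millisecond difference by 1000.0 yields a value in seconds, which is what the sketch returns.

```python
from datetime import datetime

# Python equivalent of the Java pattern yyyy-MM-dd'T'HH:mm:ss.SSS'Z'
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def ingest_delay_seconds(received, event_timestamp):
    """Return the delay between the event time and the processing time, in seconds."""
    received_dt = datetime.strptime(received, TS_FORMAT)
    event_dt = datetime.strptime(event_timestamp, TS_FORMAT)
    return (received_dt - event_dt).total_seconds()


# Hypothetical sample values: the event was processed 1.5 seconds after it occurred.
delay = ingest_delay_seconds("2011-04-19T03:44:02.603Z", "2011-04-19T03:44:01.103Z")
print(delay)
```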
Start a Riak 2.0 cluster. This has been tested against Riak 2.0.0pre11.
First set up bucket types (you can name these as you like for your domain and add other properties):
$ rel/riak/bin/riak-admin bucket-type create maps '{"props":{"datatype":"map"}}'
maps created
$ rel/riak/bin/riak-admin bucket-type create sets '{"props":{"datatype":"set"}}'
sets created
$ rel/riak/bin/riak-admin bucket-type create counters '{"props":{"datatype":"counter"}}'
counters created
@cdahlqvist
cdahlqvist / epoch_prefixed_md5_identifier.conf
Last active July 3, 2020 03:50
Logstash config showing how to create a document identifier built from MD5 hash prefixed by hex formatted epoch date
input {
  generator {
    lines => ['2011-04-19T03:44:01.103Z testlog1',
              '2011-04-19T03:44:02.035Z testlog2',
              '2011-04-19T03:44:03.654Z testlog3',
              '2011-04-19T03:44:03.654Z testlog3']
    count => 1
  }
}
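The identifier scheme named in the description (an MD5 hash prefixed by the hex-formatted epoch date) can be sketched in Python. The rest of the Logstash config is not shown here, so several details are assumptions: that the epoch is in whole seconds, that it is zero-padded to 8 hex digits, and that the whole line is hashed. The `epoch_prefixed_id` helper is hypothetical.

```python
import hashlib
from datetime import datetime, timezone

# The same sample lines as the generator input above.
LINES = [
    "2011-04-19T03:44:01.103Z testlog1",
    "2011-04-19T03:44:02.035Z testlog2",
    "2011-04-19T03:44:03.654Z testlog3",
    "2011-04-19T03:44:03.654Z testlog3",
]


def epoch_prefixed_id(line):
    """Build an ID of the form <hex epoch seconds><md5 of the line>."""
    timestamp_text = line.partition(" ")[0]
    event_time = datetime.strptime(
        timestamp_text, "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    # Assumption: whole seconds since the epoch, zero-padded to 8 hex digits.
    epoch_hex = format(int(event_time.timestamp()), "08x")
    # Assumption: the full line is hashed.
    digest = hashlib.md5(line.encode("utf-8")).hexdigest()
    return epoch_hex + digest


ids = [epoch_prefixed_id(line) for line in LINES]
for doc_id in ids:
    print(doc_id)
```

Under these assumptions the two identical `testlog3` lines map to the same identifier, so the second one would overwrite the first rather than create a duplicate document, and identifiers sort roughly by event time because of the epoch prefix.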
@cdahlqvist
cdahlqvist / gdpr_access_controls.txt
Last active March 30, 2020 08:45
Securing GDPR Personal Data with Access Controls
# Tested with version 6.2.x of the Elastic Stack
# Add index templates
PUT _template/identity_store
{
"index_patterns": ["identity_store"],
"settings": {
"number_of_shards": 1
},
#!/bin/bash
TIMESTAMP=$(date +%s)
ES_HOST=$1
REPOSITORY=$2
INDEX_NAME=$3
SNAPSHOT_ID=$4
NEW_INDEX_NAME=$5
@cdahlqvist
cdahlqvist / README.md
Created April 23, 2017 10:38
Access log index size test

Access log size test

This gist contains supporting files for evaluating Elasticsearch index sizes for web access logs.

Prerequisites

  • Machine with Linux or Mac OS X.
  • Local Elasticsearch 5.3.x instance accessible via 127.0.0.1:9200
  • The local Elasticsearch 5.3.x instance must have the geoip and useragent ingest plugins installed
  • Local installation of Filebeat 5.3.x with environment variable FILEBEAT_HOME pointing to the directory containing the filebeat binary.
@cdahlqvist
cdahlqvist / create_repositories.sh
Created November 26, 2018 11:48
Frozen indices benchmark
#!/bin/bash
echo $(date) "Create snapshot repositories"
curl -X PUT "localhost:9200/_snapshot/elasticlogs-nofm" -H 'Content-Type: application/json' -d'
{
"type": "fs",
"settings": {
"location": "/data/snapshots/elasticlogs-nofm"
}
{
"trigger": {
"schedule": {
"interval": "10s"
}
},
"input": {
"http" : {
"request" : {
"host" : "127.0.0.1:9200",
@cdahlqvist
cdahlqvist / filter_logs.conf
Created October 23, 2018 20:34
HTTP log replayer
input {
stdin {}
}
filter {
grok {
match => { "message" => [ '%{IP:ip}" %{GREEDYDATA:a}',
'%{IP:ip1}, %{IP:ip}" %{GREEDYDATA:a}' ] }
}