Eike Dehling EikeDehling

  • Textkernel
  • Amsterdam
@EikeDehling
EikeDehling / gist:a015a5137ac5d99dc850
Created March 30, 2015 13:21
Elasticsearch problematic mapping (causes endless mapping refresh in our cluster)
{
"postings-0": {
"mappings": {
"posting": {
"_all": {
"enabled": false
},
"properties": {
[2015-03-31 10:02:59,815][DEBUG][indices.cluster ] [Strobe] [postings-5360000000] parsed mapping [posting], and got different sources
original:
{"posting":{"_all":{"enabled":false},"properties":{"blogurl":{"type":"string","index_analyzer":"bc_indexing_analyzer","search_analyzer":"bc_query_analyzer","fielddata":{"format":"disabled"}},"body":{"type":"string","index_analyzer":"bc_indexing_analyzer","search_analyzer":"bc_query_analyzer","fielddata":{"format":"disabled"},"path":"just_name","fields":{"_text_":{"type":"string","index_analyzer":"bc_indexing_analyzer","search_analyzer":"bc_query_analyzer","fielddata":{"filter.frequency.min":"2","filter.regex.pattern":"^(?!(?:(?:maandag)|(?:dinsdag)|(?:woensdag)|(?:donderdag)|(?:vrijdag)|(?:zaterdag)|(?:zondag)|(?:citaat)|(?:christa)|(?:4sq)|(?:I)|(?:a)|(?:aan)|(?:about)|(?:ae)|(?:af)|(?:after)|(?:ajax-coach)|(?:al)|(?:all)|(?:alle)|(?:alleen)|(?:als)|(?:also)|(?:an)|(?:and)|(?:andere)|(?:anders)|(?:any)|(?:april)|(?:augustus)|(?:august)|(?:auteur)|(?:as)|(?:a
{
"posting": {
"_all": {
"enabled": false
},
"properties": {
"blogurl": {
"type": "string",
--- a/src/main/java/org/elasticsearch/index/mapper/core/AbstractFieldMapper.java
+++ b/src/main/java/org/elasticsearch/index/mapper/core/AbstractFieldMapper.java
@@ -24,7 +24,6 @@ import com.carrotsearch.hppc.cursors.ObjectCursor;
import com.carrotsearch.hppc.cursors.ObjectObjectCursor;
import com.google.common.base.Objects;
import com.google.common.collect.ImmutableList;
-
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
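The debug log above reports that the parsed mapping and the original source differ, but the two JSON blobs are too long to compare by eye. Below is a minimal sketch of one way to locate the differing leaves, assuming both sources have been saved as JSON. The helper `diff_mappings` and the sample values are hypothetical; they are not taken from the gist or from Elasticsearch itself.

```python
import json

MISSING = "<missing>"  # placeholder reported when a key exists on one side only


def diff_mappings(original, parsed, path=""):
    """Yield (path, original_value, parsed_value) for every leaf that differs."""
    if isinstance(original, dict) and isinstance(parsed, dict):
        # Walk the union of keys so additions and removals are both reported.
        for key in sorted(set(original) | set(parsed)):
            child = f"{path}/{key}" if path else key
            yield from diff_mappings(
                original.get(key, MISSING), parsed.get(key, MISSING), child
            )
    elif original != parsed:
        yield path, original, parsed


# Made-up sample: the same setting serialized once as a string, once as a number.
original = json.loads(
    '{"posting": {"properties": {"body": {"fielddata": {"filter.frequency.min": "2"}}}}}'
)
parsed = json.loads(
    '{"posting": {"properties": {"body": {"fielddata": {"filter.frequency.min": 2}}}}}'
)

for path, before, after in diff_mappings(original, parsed):
    print(f"{path}: {before!r} != {after!r}")
```

Paths are joined with `/` rather than `.` because mapping keys such as `filter.frequency.min` already contain dots.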
EikeDehling / filebeat.yml
Created February 3, 2017 13:39
Basic filebeat config
filebeat.prospectors:
- input_type: log
paths:
- ./random_apache_log
output.elasticsearch:
hosts: ["127.0.0.1:9200"]
EikeDehling / random-apache-log.sh
Last active February 17, 2017 12:51
Bash script to generate a random (Apache) log line at random intervals
#!/usr/bin/env bash
while true
do
random_ip=$(dd if=/dev/urandom bs=4 count=1 2>/dev/null | od -An -tu1 | sed -e 's/^ *//' -e 's/ \{1,\}/./g')
random_size=$(( (RANDOM % 65535) + 1 ))
current_date_time=$(date '+%d/%b/%Y:%H:%M:%S %z')
echo "$random_ip - - [$current_date_time] \"GET /data.php HTTP/1.1\" 200 $random_size" | tee -a 'random_log'
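For comparison, the same log-line format can be produced without `dd`, `od`, and `sed`. This is a rough Python sketch, not part of the gist: the function name `random_log_line` is hypothetical, and it draws sizes from 1 to 65535, the range the script's modulo suggests.

```python
import random
import time


def random_log_line(rng=random, now=None):
    """Return one Apache-style access log line with a random IP and size."""
    # Four random octets joined with dots, e.g. 12.200.7.91
    ip = ".".join(str(rng.randrange(256)) for _ in range(4))
    # Response size between 1 and 65535 bytes
    size = rng.randrange(65535) + 1
    # Same timestamp layout as date '+%d/%b/%Y:%H:%M:%S %z'
    timestamp = time.strftime("%d/%b/%Y:%H:%M:%S %z", now or time.localtime())
    return f'{ip} - - [{timestamp}] "GET /data.php HTTP/1.1" 200 {size}'


if __name__ == "__main__":
    print(random_log_line())
```

Passing a seeded `random.Random` instance as `rng` makes the generated IPs and sizes reproducible, which helps when testing a log shipper against known input.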
EikeDehling / random-apache-log.cmd
Last active March 10, 2017 15:12
Windows batch file for generating a random apache log
@echo off
Setlocal EnableDelayedExpansion
for /L %%n in (1,0,5) do (
SET /A N1=!RANDOM! * 255 / 32768
SET /A N2=!RANDOM! * 255 / 32768
SET /A N3=!RANDOM! * 255 / 32768
SET /A N4=!RANDOM! * 255 / 32768
EikeDehling / titanic.py
Last active March 11, 2017 11:42
Some experiments for the Kaggle Titanic survivors machine learning competition (https://www.kaggle.com/c/titanic)
import pandas
from sklearn import linear_model, svm, tree, naive_bayes
from sklearn.model_selection import cross_val_score
import numpy as np
data = pandas.read_csv('train.csv')
def preprocess(data):
data['Fare'] = data['Fare'].fillna(data['Fare'].mean())
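The `fillna(...mean())` line above replaces missing fares with the column average; pandas skips missing values by default when computing that mean. The dependency-free sketch below illustrates the same idea on a plain list, with `None` standing in for a missing value. The helper name `fill_missing_with_mean` is hypothetical and not part of the gist.

```python
from statistics import mean


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the entries that are present."""
    known = [v for v in values if v is not None]
    if not known:
        # Nothing to average over, so leave the data unchanged.
        return list(values)
    fill = mean(known)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    print(fill_missing_with_mean([10.0, None, 20.0]))
```

Mean imputation keeps the column average unchanged but shrinks its variance, so it is usually only a first baseline.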
EikeDehling / install-elasticsearch.sh
Last active May 10, 2017 07:24 — forked from gourneau/ElasticSearch.sh
Script to install Elasticsearch (latest 5.x) on an Ubuntu machine (16.04 LTS)
# Install latest OpenJDK
sudo apt-get update
sudo apt-get install openjdk-8-jre-headless
# Install elastic
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
sudo apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-5.x.list
sudo apt-get update
sudo apt-get install elasticsearch
EikeDehling / install-elastic-5.6-debian.sh
Last active September 14, 2017 09:45
Shell script to install Elasticsearch 5.6 (as root) on Debian
# Install elasticsearch
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
apt-get install apt-transport-https
echo "deb https://artifacts.elastic.co/packages/5.x/apt stable main" | tee -a /etc/apt/sources.list.d/elastic-5.x.list
apt-get update
apt-get install elasticsearch
# Configure memory settings
mkdir -p /etc/systemd/system/elasticsearch.service.d
echo -e "[Service]\nLimitMEMLOCK=infinity" > /etc/systemd/system/elasticsearch.service.d/elasticsearch.conf