@karussell
karussell / backup.sh
Created July 10, 2011 20:05
Backup ElasticSearch with rsync
# TO_FOLDER=/something
# FROM=/your-es-installation
DATE=`date +%Y-%m-%d_%H-%M`
TO=$TO_FOLDER/$DATE/
echo "rsync from $FROM to $TO"
# the first time, rsync can take a while - do not disable flushing yet
rsync -a "$FROM" "$TO"
# now disable flushing and do one manual flush
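The preview cuts off before the flush toggle that last comment refers to. A minimal sketch of how it might continue, assuming a pre-1.x Elasticsearch on localhost:9200 (the settings key, the address, and the second rsync pass are assumptions, not the gist's actual code):

# sketch: stop flushing, flush once by hand, copy a consistent view, re-enable
curl -XPUT 'http://localhost:9200/_settings' -d '{"index.translog.disable_flush": true}'
curl -XPOST 'http://localhost:9200/_flush'
rsync -a "$FROM" "$TO"
curl -XPUT 'http://localhost:9200/_settings' -d '{"index.translog.disable_flush": false}'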
@nchammas
nchammas / TRUNCATE and DROP are both minimally logged.sql
Created November 8, 2011 21:32
Demonstration that TRUNCATE and DROP are logged and just as fast as one another.
SET NOCOUNT ON;
USE [tempdb];
CREATE TABLE a_farting_farthing (
    an_integer INT DEFAULT (1)
);
INSERT INTO a_farting_farthing
DEFAULT VALUES;
@nherment
nherment / backup.sh
Created February 29, 2012 10:42
Back up and restore an Elasticsearch index (shamelessly copied from http://tech.superhappykittymeow.com/?p=296)
#!/bin/bash
# herein we back up our indexes! this script should run at like 6pm or something, after logstash
# rotates to a new ES index and there's no new data coming in to the old one. we grab the metadata,
# compress the data files, create a restore script, and push it all up to S3.
TODAY=`date +"%Y.%m.%d"`
INDEXNAME="logstash-$TODAY" # this had better match the index name in ES
INDEXDIR="/usr/local/elasticsearch/data/logstash/nodes/0/indices/"
BACKUPCMD="/usr/local/backupTools/s3cmd --config=/usr/local/backupTools/s3cfg put"
BACKUPDIR="/mnt/es-backups/"
YEARMONTH=`date +"%Y-%m"`
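The preview stops at the variable definitions, but the header comment already names the remaining steps: compress the data files and push them to S3. A hedged sketch of those steps reusing the variables above (the bucket name and exact index layout are assumptions):

# compress the finished index and push it to S3 (sketch)
mkdir -p "$BACKUPDIR"
tar czf "$BACKUPDIR/$INDEXNAME.tar.gz" -C "$INDEXDIR" "$INDEXNAME"
$BACKUPCMD "$BACKUPDIR/$INDEXNAME.tar.gz" "s3://es-backups/$YEARMONTH/$INDEXNAME.tar.gz"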
@radu-gheorghe
radu-gheorghe / log_backup.bash
Created July 26, 2012 08:31
Optimize & back up an Elasticsearch index. And restore.
#!/usr/bin/env bash
###############FUNCTIONS############
function prepare {
  # optimize the index
  echo -n "Optimizing index $INDEX_NAME..."
  curl -XPOST "$ADDRESS/$INDEX_NAME/_optimize" 2>/dev/null | grep 'failed":0' >/dev/null
  if [ $? -eq 0 ]; then
    echo "done"
@dadoonet
dadoonet / backup.sh
Created December 26, 2012 14:50
Backup Elasticsearch node
# Script to be placed in elasticsearch/bin
# Launch it from the elasticsearch dir:
# bin/backup indexname
# We assume the data lives under elasticsearch/data
# It will create a backup file under elasticsearch/backup
if [ -z "$1" ]; then
  INDEX_NAME="dummy"
else
  INDEX_NAME="$1"
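The preview ends inside the argument handling; per the header comments, the script presumably closes the if and archives that index's data into elasticsearch/backup. A hedged continuation (the node/indices layout is an assumption):

fi
# archive this index's data into elasticsearch/backup (sketch)
mkdir -p backup
tar czf "backup/$INDEX_NAME.tar.gz" data/*/nodes/0/indices/"$INDEX_NAME"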
@nathanlws
nathanlws / CustomCase.java
Last active December 18, 2015 14:29
FoundationDB SQL Parser IdentifierCase
//
// See: https://github.com/foundationdb/sql-parser
//
import com.foundationdb.sql.parser.*;
import com.foundationdb.sql.parser.SQLParserContext.*;
public class CustomCase {
public static class ColumnNamePrinter implements Visitor {
@Override
@miketheman
miketheman / zook_grow.md
Created July 22, 2013 21:36
Adding nodes to a ZooKeeper ensemble

Adding 2 nodes to an existing 3-node ZooKeeper ensemble without losing the Quorum

Since many deployments start out with 3 nodes, and little is known about how to grow a cluster from 3 members to 5 without losing the existing quorum, here is an example of how this might be achieved.

In this example, all 5 nodes will be running on the same Vagrant host for the purpose of illustration, running on distinct configurations (ports and data directories) without the actual load of clients.

YMMV. Caveat usufructuarius.

Step 1: Have a healthy 3-node ensemble
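One way to confirm the ensemble is healthy before adding anything is ZooKeeper's four-letter-word commands; a quick check, assuming the three nodes listen on ports 2181-2183 as in this example's Vagrant setup:

# expect one "Mode: leader" and two "Mode: follower" lines
for port in 2181 2182 2183; do
  echo stat | nc localhost $port | grep Mode
done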

@jpountz
jpountz / Recover.java
Last active December 22, 2015 10:48
File to restore a corrupted segment if the stored fields are not corrupted.
// Set codec, dir and segmentName according to the segment you are trying to restore
Codec codec = new Lucene42Codec();
Directory dir = FSDirectory.open(new File("/tmp/test"));
String segmentName = "_0";
IOContext ioContext = new IOContext();
SegmentInfo segmentInfos = codec.segmentInfoFormat().getSegmentInfoReader().read(dir, segmentName, ioContext);
Directory segmentDir;
if (segmentInfos.getUseCompoundFile()) {
segmentDir = new CompoundFileDirectory(dir, IndexFileNames.segmentFileName(segmentName, "", IndexFileNames.COMPOUND_FILE_EXTENSION), ioContext, false);
@Pyrolistical
Pyrolistical / functions.js
Last active December 28, 2017 04:10 — forked from RedBeard0531/functions.js
Mongo map/reduce functions to calculate sum, min, max, count, average, population variance, sample variance, population standard deviation, and sample standard deviation. Public Domain License.
function map() {
  emit(1, {
    sum: this.value, // the field you want stats for
    min: this.value,
    max: this.value,
    count: 1,
    diff: 0
  });
}
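To actually run the pair, the functions get handed to mapReduce from the mongo shell; a minimal invocation sketch, assuming the gist's reduce function is loaded alongside map and that the database and collection names (mydb, values) are placeholders:

# load the gist's functions and run them inline, printing the inline result (sketch)
mongo mydb --eval "$(cat functions.js); printjson(db.values.mapReduce(map, reduce, {out: {inline: 1}}))"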
@debasishg
debasishg / gist:8172796
Last active March 15, 2024 15:05
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the historical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t