🐫
OCamling

Théo Crevon oleiade

#!/usr/bin/python
# -*- coding: utf-8 -*-
from __future__ import with_statement
import os
from fabric.api import *
from fabric.contrib.files import exists
@oleiade
oleiade / telecomix_dig
Created June 11, 2012 14:39
telecomix dig
; <<>> DiG 9.7.0-P1 <<>> telecomix.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 49117
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;telecomix.org. IN A
;; Query time: 188 msec
@oleiade
oleiade / thread_pool.py
Created September 20, 2012 16:08
Thread pool python
from Queue import Queue
from threading import Thread


class Worker(Thread):
    """Thread executing tasks from a given tasks queue"""

    def __init__(self, tasks):
        Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True
        self.start()
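
The preview above stops at the worker's constructor. As a rough illustration of how such a worker is usually driven, here is a minimal sketch, not the gist's actual code: `PoolWorker`, `ThreadPool`, `add_task`, and `wait_completion` are hypothetical names, and the Python 3 `queue` module is used in place of the Python 2 `Queue` module.

```python
from queue import Queue
from threading import Thread


class PoolWorker(Thread):
    """Hypothetical worker: runs tasks pulled from a shared queue."""

    def __init__(self, tasks):
        Thread.__init__(self)
        self.tasks = tasks
        self.daemon = True  # do not block interpreter exit
        self.start()

    def run(self):
        while True:
            func, args, kwargs = self.tasks.get()
            try:
                func(*args, **kwargs)
            except Exception as exc:
                # A failing task must not kill the worker thread.
                print("task failed: {0!r}".format(exc))
            finally:
                self.tasks.task_done()


class ThreadPool(object):
    """Hypothetical pool: a bounded queue consumed by num_threads workers."""

    def __init__(self, num_threads):
        self.tasks = Queue(num_threads)
        for _ in range(num_threads):
            PoolWorker(self.tasks)

    def add_task(self, func, *args, **kwargs):
        # Blocks when the queue is full, which throttles the producer.
        self.tasks.put((func, args, kwargs))

    def wait_completion(self):
        # Returns once every queued task has been marked done.
        self.tasks.join()


results = []
pool = ThreadPool(4)
for n in range(10):
    pool.add_task(results.append, n * n)
pool.wait_completion()
```

Because the workers are daemon threads, the program can exit while they are still blocked waiting on the queue; `wait_completion` is what guarantees the submitted tasks have finished.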
@oleiade
oleiade / multi_bench.py
Created September 21, 2012 08:28
Benchmark of a Python procedure run in a single process, with threads, and with processes
import time
import random
import itertools
from fastset.bitvector import bitvector
from Queue import Queue
from threading import Thread
from multiprocessing import Process
@oleiade
oleiade / MGet.py
Created October 4, 2012 09:18
MGet
def MGet(self, db, keys, fill_cache=True, *args, **kwargs):
    def get_or_none(key, context):
        try:
            res = db.Get(key, fill_cache=fill_cache)
        except KeyError:
            warning_msg = "Key {0} does not exist".format(key)
            context.update({'status': WARNING_STATUS})
            self.errors_logger.warning(warning_msg)
            res = None
        return res
@oleiade
oleiade / bench_leveldb.py
Created November 10, 2012 13:47
Python LevelDB benchmarks using Hurdles
import tempfile
import hurdles
import leveldb
import shutil
from hurdles.tools import extra_setup
common_setup = "import random\n"
@oleiade
oleiade / elevator.md
Created December 6, 2012 14:39
Elevator plan

First Article (Addressing the problem)

Special needs

Here at Botify

  • Storage for server log analytics data (on the order of terabytes)
  • Bulk writes and reads of gigabytes of data that would not fit in the server's memory
  • Need for persistence
@oleiade
oleiade / spark-thoughts.md
Created December 7, 2012 11:32
Spark thoughts

Spark

RDD

  • Spark handles input data as RDDs (resilient distributed datasets), which are essentially datasets distributed across the cluster.
  • RDD transformations (such as map) are lazy: they form a roadmap of operations to apply to the dataset, and nothing is computed when they are declared.
  • RDD actions evaluate the transformations and reductions in order to generate and return the result.
  • RDD transformations are re-evaluated on each action by default, unless you cache them.
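
The lazy evaluation and caching behaviour described in these notes can be illustrated without Spark. The following is a minimal pure-Python sketch, not Spark's API: `LazyDataset`, its `evaluations` counter, and its methods are hypothetical names chosen to mirror the vocabulary of the notes.

```python
class LazyDataset(object):
    """Hypothetical stand-in for an RDD; not part of Spark."""

    def __init__(self, source, transforms=()):
        self._source = source
        self._transforms = tuple(transforms)
        self._use_cache = False
        self._cached = None
        self.evaluations = 0  # how many times the pipeline actually ran

    def map(self, func):
        # Transformation: only records the function, computes nothing.
        return LazyDataset(self._source, self._transforms + (func,))

    def cache(self):
        # Ask for the result of the first evaluation to be kept.
        self._use_cache = True
        return self

    def _evaluate(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        self.evaluations += 1
        result = list(self._source)
        for func in self._transforms:
            result = [func(item) for item in result]
        if self._use_cache:
            self._cached = result
        return result

    def collect(self):
        # Action: forces evaluation and returns the elements.
        return list(self._evaluate())

    def sum(self):
        # Action: forces evaluation and reduces to a single value.
        return sum(self._evaluate())


squares = LazyDataset(range(5)).map(lambda x: x * x)
squares.collect()  # runs the pipeline
squares.sum()      # runs the pipeline again

cached = LazyDataset(range(5)).map(lambda x: x * x).cache()
cached.collect()   # runs the pipeline once and stores the result
cached.sum()       # reuses the stored result
```

After these calls, `squares.evaluations` is 2 while `cached.evaluations` is 1, which is the difference the last bullet describes.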

tips

@oleiade
oleiade / divide_file.py
Created January 4, 2013 16:49
Gzip file divider in Python, based on a lines-per-file sampling method. Usage: ./divide.py input_file output_dir lines_per_file
#!/usr/bin/env python
import sys
import os
import gzip
fpath = sys.argv[1]
output = sys.argv[2]
parts_size = int(sys.argv[3])
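
The preview ends after argument parsing. One way the lines-per-file split described above could work is sketched below. This is an assumption, not the gist's actual implementation: the `divide_gzip` function and the `part_0000.gz` naming scheme are hypothetical.

```python
import gzip
import itertools
import os


def divide_gzip(fpath, output_dir, lines_per_file):
    """Split a gzip file into gzip parts of at most lines_per_file lines."""
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")
    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)

    part_paths = []
    with gzip.open(fpath, "rb") as source:
        for index in itertools.count():
            # Read the next chunk of lines without loading the whole file.
            chunk = list(itertools.islice(source, lines_per_file))
            if not chunk:
                break
            # Hypothetical naming scheme: part_0000.gz, part_0001.gz, ...
            part_path = os.path.join(output_dir, "part_{0:04d}.gz".format(index))
            with gzip.open(part_path, "wb") as part:
                part.writelines(chunk)
            part_paths.append(part_path)
    return part_paths
```

Only one chunk of `lines_per_file` lines is held in memory at a time, so the input can be much larger than available RAM.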
@oleiade
oleiade / compile_leveldb.sh
Last active December 12, 2015 06:39
install_leveldb.sh
#!/bin/sh
SANDBOX_DIR=/tmp/leveldb_install
## Bootstrap a sandbox
create_sandbox() {
    if [ ! -d "$SANDBOX_DIR" ]
    then
        mkdir -p "$SANDBOX_DIR"
    fi