Sergei Turukin rampage644

@rampage644
rampage644 / strace-analysis.py
Created August 10, 2014 13:03
Python strace log analysis script
#!/usr/bin/env python
import sys
import re
def main():
    regexp = re.compile(r'^(\S+)\((.*)\)\s+=\s+(\d+)$')
    whitelist = ['read', 'write', 'fstat', 'lseek', 'fcntl']
    opened_fd = {}
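The preview stops at `opened_fd = {}`, so the rest of the script is not shown here. What follows is only a guess at how those three variables might fit together, assuming the goal is to count whitelisted syscalls per opened file. The `analyze` function, the `stats` structure, and the handling of `open`/`close` lines are hypothetical and not taken from the gist.

```python
import re

# Same pattern as in the script: syscall name, raw argument string,
# and a non-negative integer return value at the end of the line.
SYSCALL_RE = re.compile(r'^(\S+)\((.*)\)\s+=\s+(\d+)$')
WHITELIST = ['read', 'write', 'fstat', 'lseek', 'fcntl']


def analyze(lines):
    """Return {path: {syscall: count}} for whitelisted calls on opened files.

    Assumptions (not from the gist): open() lines look like
    open("/path", FLAGS) = 3, and the first argument of every
    whitelisted call is the file descriptor.
    """
    opened_fd = {}  # fd -> path
    stats = {}      # path -> {syscall name -> count}
    for line in lines:
        match = SYSCALL_RE.match(line.strip())
        if match is None:
            # Failed calls ("= -1 ENOENT ..."), signals, unfinished lines.
            continue
        name, args, ret = match.groups()
        if name == 'open':
            path = args.split(',', 1)[0].strip().strip('"')
            opened_fd[int(ret)] = path
        elif name == 'close':
            fd = args.strip()
            if fd.isdigit():
                opened_fd.pop(int(fd), None)
        elif name in WHITELIST:
            fd = args.split(',', 1)[0].strip()
            if fd.isdigit() and int(fd) in opened_fd:
                per_file = stats.setdefault(opened_fd[int(fd)], {})
                per_file[name] = per_file.get(name, 0) + 1
    return stats


sample = [
    'open("/etc/hosts", O_RDONLY) = 3',
    'read(3, "127.0.0.1 localhost\\n", 4096) = 20',
    'read(3, "", 4096) = 0',
    'fstat(3, {st_mode=S_IFREG|0644, st_size=20, ...}) = 0',
    'close(3) = 0',
    'read(3, "", 4096) = 0',
    'open("/missing", O_RDONLY) = -1 ENOENT (No such file or directory)',
]
result = analyze(sample)
```

Note that the pattern only accepts non-negative return values, so failed calls never match and are skipped.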
@rampage644
rampage644 / seccomp.md
Created August 13, 2014 12:42
Seccomp

Benchmarking

Jump to Results:

A simple experiment showed a seccomp-based syscall to be ~5 times slower than a vanilla one.

Calling write syscall directly:

const unsigned count = UINT_MAX / 10000;
	unsigned i = 0;
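The C listing is cut off in this preview. For readers who want to reproduce the baseline measurement, here is a minimal timing sketch in Python, assuming the loop simply issues `write` calls in a tight loop. The `time_writes` helper, the `/dev/null` target, and the one-byte payload are assumptions, and interpreter overhead is included in the measured time, so only ratios between runs of the same harness are meaningful.

```python
import os
import time


def time_writes(count, payload=b"x"):
    """Issue `count` write(2) calls to /dev/null and return elapsed seconds.

    Hypothetical helper: the original benchmark is written in C and its
    write target is not visible in the preview.
    """
    fd = os.open(os.devnull, os.O_WRONLY)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, payload)
        return time.perf_counter() - start
    finally:
        os.close(fd)


# The C snippet uses UINT_MAX / 10000 iterations (about 429,000);
# a smaller count keeps this sketch quick.
elapsed = time_writes(10000)
```

Running the same function once in a plain process and once in a sandboxed one, then dividing the two times, would give a slowdown factor comparable to the one quoted above.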
@rampage644
rampage644 / impala-pth
Created August 20, 2014 13:29
Impala Pth
# GNU Pth as thread library for impalad
In short, it's impossible to use the GNU Pth library with `impalad` as is, i.e. without modification.
GNU Pth:
* GNU Pth can't fully replace `pthreads`. It lacks some functions and some entities.
* It doesn't provide versioned symbols.
There are some `*.so` libraries (system/thirdparty) which come precompiled and are linked against versioned symbols. Be prepared to recompile them, replace them somehow, or find some other workaround. Example:
@rampage644
rampage644 / impalad-summary
Created August 24, 2014 09:59
Impalad summary
## Git repo
Find the modified Impala [here](https://github.com/rampage644/impala-cut). First, have a look at [this](https://github.com/rampage644/impala-cut/blob/executor/README.md) *README* file.
## Task description
The original task was to prune impalad down to some sort of *executor* binary that only executes part of a query. Two approaches were suggested: top-down and bottom-up. I used the bottom-up approach.
My intention was to write a unit test that will actually test the behavior we need. So, look at `be/src/runtime/plan-fragment-executior-test.cc`. It contains all possible tests (that is, actual code snippets) to run part of a query with or without data. Doing so helped me a lot to understand the parts of the impalad codebase related to query execution.
@rampage644
rampage644 / week-result.md
Last active August 29, 2015 14:05
Week results

Results

  • Haven't found how to cut off the hardware layer. The virtio lead didn't help.
  • OSv builds very tricky libraries. They are impossible to use as is on the host.
  • The bottom-up approach seems reasonable for now.

01 Sep

Just collecting information about unikernels/KVM and friends. A little OSv source code digging with no actual result. Discussions.

@rampage644
rampage644 / osv.md
Last active August 29, 2015 14:07
Osv

OSv + Impala status

  1. I think I got plan-fragment-executor-test to run under OSv
  2. But it fails very quickly
  3. The problem is with tcmallocstatic. First, OSv doesn't support sbrk-based memory management. One has to tune tcmallocstatic not to use SbrkMemoryAllocator at all (comment out #undef HAVE_SBRK in config.h.in). Second, it still fails with an invalid opcode exception.

Issues

tcmallocstatic

@rampage644
rampage644 / impala-build.md
Created November 24, 2014 15:56
Impala build

Building Impala

  • Version: cdh5-2.0_5.2.0
  • OS: Archlinux 3.17.2-1-ARCH x86_64
  • gcc version 4.9.2

Berkeley DB version >= 5

@rampage644
rampage644 / ds-dev.md
Created August 2, 2016 13:21
DS dev process comments

Dataservices spider development process

Disclaimer: Everything described in this document is my personal opinion that doesn't have to be true for everyone.

Common

Key information

@rampage644
rampage644 / talk2907.md
Last active August 18, 2016 08:08
Shub talk

My Shub Talks

29/07 - Introduce workflow manager

Brief intro

First, I'd like to say hello to everyone and thank you for coming.

@rampage644
rampage644 / airflow_deploy_design.md
Created October 6, 2015 20:53
Airflow flows deployment

Introduction

This document describes how Airflow jobs (or workflows) get deployed onto the production system.

Directory structure

  • HOME directory: /home/airflow
  • DAG directory: $HOME/airflow-git-dir/dags/
  • Config directory: $HOME/airflow-git-dir/configs/
  • Unittest directory: $HOME/airflow-git-dir/tests/. Preferably discoverable by both nose and py.test
  • Credentials should be accessed by some library
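The layout above can be written down as a small helper so that deployment scripts and tests resolve the same paths. This is only a sketch: the `airflow_layout` function and its dictionary keys are hypothetical names, and only the directory paths themselves come from the list.

```python
from pathlib import PurePosixPath


def airflow_layout(home="/home/airflow"):
    """Return the deployment directories described above, keyed by role.

    Hypothetical helper; the key names are illustrative only.
    """
    home = PurePosixPath(home)
    repo = home / "airflow-git-dir"
    return {
        "home": home,
        "dags": repo / "dags",
        "configs": repo / "configs",
        "tests": repo / "tests",
    }


layout = airflow_layout()
dag_dir = str(layout["dags"])
```

Keeping the paths in one place means a change of HOME only has to be made once, for example when running the tests on a developer machine.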