Skip to content

Instantly share code, notes, and snippets.

View bracki's full-sized avatar

Jan Brauer bracki

View GitHub Profile
@thelinuxkid
thelinuxkid / subprocess_stream.py
Last active December 4, 2023 06:43
Get a Python subprocess' output without buffering. Normally when you want to get the output of a subprocess in Python you have to wait until the process finishes. This is bad for long running processes. Here's a way to get the output unbuffered (in real-time.)
import contextlib
import subprocess
# Unix, Windows and old Macintosh end-of-line
newlines = ['\n', '\r\n', '\r']
def unbuffered(proc, stream='stdout'):
stream = getattr(proc, stream)
with contextlib.closing(stream):
while True:
out = []
@duydo
duydo / elasticsearch_best_practices.txt
Last active December 15, 2021 06:12
Elasticsearch - Index best practices from Shay Banon
If you want, I can try and help with pointers as to how to improve the indexing speed you get. Its quite easy to really increase it by using some simple guidelines, for example:
- Use create in the index API (assuming you can).
- Relax the real time aspect from 1 second to something a bit higher (index.engine.robin.refresh_interval).
- Increase the indexing buffer size (indices.memory.index_buffer_size), it defaults to the value 10% which is 10% of the heap.
- Increase the number of dirty operations that trigger automatic flush (so the translog won't get really big, even though its FS based) by setting index.translog.flush_threshold (defaults to 5000).
- Increase the memory allocated to elasticsearch node. By default its 1g.
- Start with a lower replica count (even 0), and then once the bulk loading is done, increate it to the value you want it to be using the update_settings API. This will improve things as possibly less shards will be allocated to each machine.
- Increase the number of machines you have so
@didip
didip / tornado-nginx-example.conf
Created January 30, 2011 05:19
Nginx config example for Tornado
worker_processes 2;
error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;
events {
worker_connections 1024;
use epoll;
}