Skip to content

Instantly share code, notes, and snippets.

View filimonov's full-sized avatar

filimonov filimonov

View GitHub Profile
@filimonov
filimonov / clickhouse_build_centos.sh
Last active January 26, 2024 01:26
Clickhouse build on Centos 6/ Centos 7 / RedHat
# Tested on Centos 6.9 and Centos 7.4. ClickHouse vesion 1.1.54236
# Official build instructions:
# https://github.com/yandex/ClickHouse/blob/master/doc/build.md
# See also https://github.com/redsoftbiz/clickhouse-rpm
# and https://github.com/Altinity/clickhouse-rpm
export THREADS=$(grep -c ^processor /proc/cpuinfo)
# Determine RHEL major version
RHEL_VERSION=`rpm -qa --queryformat '%{VERSION}\n' '(redhat|sl|slf|centos|oraclelinux|goslinux)-release(|-server|-workstation|-client|-computenode)'`
<!-- Put it to ~/.clickhouse-client/config.xml -->
<yandex>
<config-file></config-file>
<stacktrace>false<stacktrace>
<format>Vertical</format>
<format_max_block_size>10000</format_max_block_size>
<insert_format_max_block_size>100000</insert_format_max_block_size>
<progress>false</progress>
<echo>false</echo>
<use_client_time_zone>false</use_client_time_zone>
@filimonov
filimonov / export_mysql_table.sh
Last active March 29, 2020 18:29
mysql to clickhouse (bash)
set -o errexit
TABLE_NAME="${1:?You must pass a TABLE_NAME as first argument}"
STARTID="${2:?You must pass a STARTID as 2nd argument}"
ENDID="${3:?You must pass a ENDID as 3rd argument}"
[[ -z "$4" ]] && LIMIT="" || LIMIT="LIMIT $4"
. logins.sh
INSERT_COMMAND="INSERT INTO clickhouse_table(column1,column2,...) FORMAT TSV"
@filimonov
filimonov / geoip2_bench.pl
Created April 4, 2018 15:48
perl geoip benchmark of particular parts
use strict;
use warnings;
use Time::HiRes;
use GeoIP2::Database::Reader;
my $reader = GeoIP2::Database::Reader->new(
file => '/usr/local/www/maxminddb/GeoIP2-City.mmdb',
locales => [ 'en' ]
);
@filimonov
filimonov / geoip2_bench2.pl
Created April 11, 2018 12:33
GeoIP2::Database::Reader vs MaxMind::DB::Reader benchmark or why shouldn't you use GeoIP2::Database::Reader.
## GeoIP2::Database::Reader vs MaxMind::DB::Reader benchmark or
## why you shouldn't use GeoIP2::Database::Reader
use Benchmark (cmpthese);
use strict;
use warnings;
use GeoIP2::Database::Reader;
use MaxMind::DB::Reader;
@filimonov
filimonov / sample.sql
Last active May 9, 2018 10:09
clickhouse partition key in system.parts : tuple vs simple type
-- adding braces around partition name (even for simple types) allows to do cast safely
:) create table xxx(d Date, a UInt8) Engine = MergeTree() PARTITION by d ORDER BY a;
CREATE TABLE xxx
(
d Date,
a UInt8
)
ENGINE = MergeTree()
PARTITION BY d
@filimonov
filimonov / prewhere.sql
Created May 14, 2018 12:58
Clickhouse prewhere demonstration
SELECT version()
┌─version()─┐
│ 1.1.54380 │
└───────────┘
SELECT *
FROM system.settings
WHERE name LIKE '%prewh%';
┌─name──────────────────────┬─value─┬─changed─┬─description───────────────────────────────────────────────────────────────────────┐
@filimonov
filimonov / runningDifference_group_by.sql
Created July 4, 2018 10:17
clickhouse runningDifference inside groups
DROP TABLE IF EXISTS test;
CREATE TABLE test (sessionid String, timestamp DateTime) ENGINE = Log;
INSERT INTO test SELECT 's1', arrayJoin([now(), now() + 10, now() + 21, now() + 33]);
INSERT INTO test SELECT 's2', arrayJoin([now()+33, now() + 34, now() + 35, now() + 36]);
INSERT INTO test SELECT 's3', arrayJoin([now()+40, now() + 60, now() + 61, now() + 80]);
WITH
groupArray( timestamp ) as all_timestamps_in_group,
@filimonov
filimonov / csv2tsv.py
Last active July 10, 2018 13:24
Reformat CSV to clickhouse-compatible TSV
#!/usr/bin/env python
import csv, sys
csv.register_dialect('clickhouse', escapechar="\\", doublequote=0, quotechar='\'',skipinitialspace = 0,delimiter = '\t', quoting=csv.QUOTE_NONE, lineterminator='\n')
csv.writer(sys.stdout, dialect='clickhouse').writerows(csv.reader(sys.stdin))
# install:
# put in $PATH
# chmod +x csv2tsv.py
# usage:
@filimonov
filimonov / _etc_clickhouse-server_dict_sources.xml
Created August 31, 2018 09:17
ClickHouse dictionaries with connection credentials substituted from another file
<?xml version="1.0"?>
<yandex>
<mysql_config>
<port>3306</port>
<user>user</user>
<password>123</password>
<replica>
<host>mysql_host</host>
<priority>1</priority>