javisantana / quota.sql
Created Oct 11, 2011
soft quota limit on postgres
-- You need to create the plpython language in the database:
-- createlang -U postgres plpythonu DATABASE_NAME
-- This small trigger prevents a user from exceeding a database size quota. Once the user exceeds the quota,
-- the database size is checked every time an update or insert is performed, until the user deletes some data.
-- If the server is restarted, the quota is checked the first time the user executes an update or insert.
CREATE OR REPLACE FUNCTION check_quota() RETURNS trigger AS
$$
c = SD.get('quota_counter', 0)
if c % 1000 == 0:  # check the size every ~1000 writes
    size = plpy.execute("SELECT pg_database_size(current_database()) AS size")[0]['size']
    if size > 100 * 1024 * 1024:  # example quota: 100 MB
        plpy.error('database size quota exceeded')  # the error aborts the write and skips the counter update, so every later write re-checks
SD['quota_counter'] = c + 1
$$ LANGUAGE plpythonu;
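
To enforce it you attach the function to every table that should count against the quota; a minimal sketch, assuming a hypothetical table called user_data:

    CREATE TRIGGER quota_check
        BEFORE INSERT OR UPDATE ON user_data
        FOR EACH STATEMENT EXECUTE PROCEDURE check_quota();
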
javisantana / license.txt
Last active Aug 9, 2022
NMEA parser in java which does not suck
This software is under the terms of the MIT license: http://opensource.org/licenses/MIT
javisantana / carto.py
# Python 2 one-liner: incremental mean of the tip_amount column of a CSV read from a URL
import urllib,sys,csv;
print reduce(lambda p, new: (p[0] + (new - p[0])/p[1], p[1] + 1), map(float, (x['tip_amount'] for x in csv.DictReader(urllib.urlopen(sys.argv[1])))), (0, 1))
javisantana / python_code_test_carto.md

Build the following and make it run as fast as you possibly can using Python 3 (vanilla). The faster it runs, the more you will impress us!

Your code should:

All of that in the most efficient way you can come up with.

javisantana / gps.java
package hardware;
import java.util.ArrayList;
import android.app.Activity;
import android.content.Context;
import android.location.Location;
import android.location.LocationListener;
import android.location.LocationManager;
import android.location.LocationProvider;
javisantana / clickhouse_query_log_replicas.md

When you have a ClickHouse cluster and you run queries on all the replicas, it's not easy to get all the queries that ran. I use system.query_log all the time to check timings, errors and so on.

So what I do is create a global query_log:

:) create view query_log_all on cluster my_cluster as select * from remote('10.0.0.1,10.0.0.2', 'system.query_log')

So I can inspect queries in all the replicas with a single query:
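
For example, to spot the slowest recent queries across all the replicas — a sketch; the columns are standard system.query_log fields, and hostName() is evaluated on each remote server, so it tells you which replica ran each query:

    :) select hostName() as replica, type, query_duration_ms, query
       from query_log_all
       where event_date = today() and type = 'QueryFinish'
       order by query_duration_ms desc
       limit 10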

javisantana / generating_scripts_from_clickhouse.md

Sometimes you have to move data from one table to a different one. You usually use

insert into target select * from source

This works but has several problems:

  1. materialized columns are not properly copied
  2. it's slow
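
A sketch of the idea behind the title: generate the copy commands from ClickHouse itself, one per partition, so the copy runs in smaller, retryable chunks. The source and target table names are hypothetical, and it assumes a ClickHouse new enough to have the _partition_id virtual column:

    :) select distinct concat(
           'clickhouse-client -q "insert into target select * from source where _partition_id = \'',
           partition_id, '\'"')
       from system.parts
       where database = 'default' and table = 'source' and active
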
javisantana / til_clickhouse_replicas_status.md

ClickHouse has a pretty good endpoint, /replicas_status, which gives information about the, guess what, replication status. When you are working on a cluster where you use replication to increase the amount of QPS you can serve, you usually have a load balancer in front, something like this:

    +----------------+               +--------------+
    |                |               |              |
    |  load balancer +------+------->+  clickhouse  |
    |                |      |        |              |
    +----------------+      |        +--------------+
                            |
                            |        +--------------+
                            |        |              |
                            +------->+  clickhouse  |
                                     |              |
                                     +--------------+
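
The load balancer can then use /replicas_status as its health check, so a lagging replica is taken out of rotation automatically. If you want to eyeball the lag from SQL instead, a sketch using the standard system.replicas table (absolute_delay is in seconds, the threshold here is an example):

    :) select database, table, absolute_delay
       from system.replicas
       where absolute_delay > 10
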
javisantana / union_all_and_push_down_in_clickhouse.md

It looks like filters are pushed down when filtering a UNION ALL. This is also an example of how to use EXPLAIN in ClickHouse, and of a different way of seeing what is going on through the traces; these lines show how much data ClickHouse is reading:

Selected 1 parts by date, 1 parts by key, 2 marks by primary key, 2 marks to read from 1 ranges
Reading approx. 16384 rows with 1 streams

The example, sketched below with hypothetical table names:
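
    :) explain syntax
       select * from (
           select key, value from events_2019
           union all
           select key, value from events_2020
       ) where key = 10

If the push-down kicks in, the rewritten query printed by EXPLAIN SYNTAX shows the WHERE moved inside each branch of the UNION ALL, which is what lets each table prune parts the way the trace lines above show.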