https://docs.google.com/presentation/d/1vURI9yYJvAm0couPbFeKz09sr8bvQuSzCfcTEBow7ec/edit#slide=id.g5294e95553_0_101
@javisantana
javisantana / clickhouse_osx.md
Last active May 21, 2019 10:00
how to build clickhouse on osx high sierra

IMPORTANT: this worked with the ClickHouse (CH) stable version (76629e9) on 2019/02/14.

install requirements

from the original osx build doc page https://clickhouse.yandex/docs/en/development/build_osx.html

build clickhouse

Add /usr/local/include to the default path for gcc (for some reason gcc-7 was not using that folder on osx)

javisantana / mercator_numpy.py
Created January 20, 2018 08:56
mercator projection using numpy
"""
projects a numpy array with (lon, lat) to (x, y) in mercator coordinates using numpy
license: MIT
adapted from https://github.com/mapbox/mercantile
"""
import numpy as np
import math
"""haversine function adapted to work with numpy arrays
Adapted from https://github.com/mapado/haversine/blob/master/haversine/__init__.py
License: MIT
"""
import math
import numpy as np
AVG_EARTH_RADIUS = 6371 * 1000 # m
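
The haversine preview above stops at the radius constant. As a rough sketch of the formula it refers to, here is a scalar, standard-library version. The function name and argument order are assumptions, and this is not the numpy-array code from the gist, which is not shown on this page.

```python
import math

AVG_EARTH_RADIUS = 6371 * 1000  # m, same constant as in the preview above


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees.

    Hypothetical scalar helper for illustration only.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine term: sin^2(dlat/2) + cos(lat1) * cos(lat2) * sin^2(dlon/2)
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * AVG_EARTH_RADIUS * math.asin(math.sqrt(h))
```

Two antipodal points on the equator, `haversine(0, 0, 0, 180)`, come out at half the sphere's circumference, which is a quick sanity check.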
set nocompatible
set rtp+=~/.vim/bundle/vundle/
call vundle#rc()
let g:ackprg = 'ag --nogroup --nocolor --column'
map <F6> :%s/>\s*</>\r</g<CR>ggVG=
"colorscheme koehler
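// license: BSD3
const WEBMERCATOR_R = 6378137.0; // Web Mercator sphere radius, in meters
// Named DIAMETER in the original; the value is the sphere's circumference (2 * PI * R).
const DIAMETER = WEBMERCATOR_R * 2 * Math.PI;
class Mercator {
  // Projects (lon, lat) in degrees to Web Mercator (x, y) in meters.
  static project(lon, lat) {
    const x = DIAMETER * lon / 360.0;
    const sinlat = Math.sin(lat * Math.PI / 180.0);
    const y = DIAMETER * Math.log((1 + sinlat) / (1 - sinlat)) / (4 * Math.PI);
    return { x, y };
  }
}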
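
For comparison, the same projection can be sketched in plain Python with only the standard library. This is a line-by-line transcription of the JavaScript snippet above, not the numpy version from the mercator_numpy.py gist. The constant names are carried over, and returning a tuple instead of an object is my own choice.

```python
import math

WEBMERCATOR_R = 6378137.0  # sphere radius in metres, as in the snippet above
# Name kept from the snippet; numerically this is the circumference.
DIAMETER = WEBMERCATOR_R * 2 * math.pi


def project(lon, lat):
    """Project (lon, lat) in degrees to Web Mercator (x, y) in metres."""
    x = DIAMETER * lon / 360.0
    sinlat = math.sin(math.radians(lat))
    y = DIAMETER * math.log((1 + sinlat) / (1 - sinlat)) / (4 * math.pi)
    return x, y
```

Note that `lat` must stay strictly between -90 and 90: at the poles `1 - sinlat` or `1 + sinlat` becomes zero and the logarithm is undefined.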
import urllib,sys,csv;
print reduce(lambda p, new: (p[0] + (new - p[0])/p[1], p[1] + 1), map(float, (x['tip_amount'] for x in csv.DictReader(urllib.urlopen(sys.argv[1])))), (0, 1))
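
The one-liner above is Python 2 and folds a running mean over the `tip_amount` column. A minimal Python 3 sketch of the same incremental update, mean_n = mean_(n-1) + (x - mean_(n-1)) / n, could look like the following. The function names and the in-memory CSV are assumptions for illustration, and the URL download is left out.

```python
import csv
import io
from functools import reduce


def running_mean(values):
    """Fold (mean, count) over the values without storing them."""
    mean, _count = reduce(
        lambda acc, x: (acc[0] + (x - acc[0]) / (acc[1] + 1), acc[1] + 1),
        values,
        (0.0, 0),
    )
    return mean


def mean_of_column(lines, column):
    """Running mean of one numeric CSV column (hypothetical helper)."""
    rows = csv.DictReader(lines)
    return running_mean(float(row[column]) for row in rows)


# Tiny in-memory stand-in for the CSV the original reads from a URL.
sample = io.StringIO("tip_amount\n1.0\n3.0\n")
result = mean_of_column(sample, "tip_amount")
```

Unlike the original, which seeds the counter at 1 and prints the whole `(mean, count)` tuple, this version starts the count at 0 and returns only the mean.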
javisantana / python_code_test_carto.md
Last active November 17, 2021 07:27 — forked from jorgesancha/python_code_test_carto.md
Python code test - CARTO

Build the following and make it run as fast as you possibly can using Python 3 (vanilla). The faster it runs, the more you will impress us!

Your code should:

All of that in the most efficient way you can come up with.

javisantana / index.html
Created July 27, 2016 15:38 — forked from naguher/index.html
Web Máster Economia
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title>HOJA DE ESTILO DISEÑO ECONOMIA</title>
<meta name="description" content="Hoja de estilo para diseño del Master Economia">

Sampling to calculate buckets

I was thinking of using the new TABLESAMPLE feature included in Postgres 9.5 to calculate clusters for visualization. Obviously, sampling adds some error to the final clusters, but on the other hand it provides (theoretically) some time improvements: it avoids full scans with the SYSTEM method and reduces CPU usage.

I only tested with this table; it should be tested with different data distributions and data sizes, but it looks promising for some clustering methods.
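
To get a feel for the error that sampling introduces, here is a small Python sketch that compares bucket breaks computed on all rows with breaks computed on a random sample. Everything in it is an assumption for illustration: the data is synthetic, the row-level sampling only imitates Postgres's BERNOULLI method (not the block-level SYSTEM method), and quantile breaks stand in for the clustering methods, since they are easy to compute with the standard library.

```python
import random
import statistics


def bernoulli_sample(rows, fraction, rng):
    """Keep each row independently with probability `fraction`."""
    return [r for r in rows if rng.random() < fraction]


def quantile_breaks(values, buckets):
    """Return the buckets - 1 cut points that split values into equal-count buckets."""
    return statistics.quantiles(values, n=buckets)


rng = random.Random(42)  # fixed seed so the run is repeatable
data = [rng.gauss(0, 1) for _ in range(50_000)]  # synthetic column, not real data

full = quantile_breaks(data, 5)
sampled = quantile_breaks(bernoulli_sample(data, 0.1, rng), 5)

# Largest absolute difference between the sampled and the full-scan breaks.
max_err = max(abs(a - b) for a, b in zip(full, sampled))
```

Running the same comparison on skewed or multimodal data would be the natural next step, since the sampling error depends on the distribution.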

timing

jenks