Santiago Palladino spalladino

spalladino / bigdata.md
Last active January 15, 2024 20:36
Class notes for the Big Data course taught by Daniel Yankelevich at FCEN UBA in the first term (1C) of 2015

Probabilistic data structures

  • They do not return an exact answer, only an estimate.
  • In streaming algorithms, an approximate answer right away is usually better than an exact one much later.

Set membership

  • The Bloom filter is a structure for set membership; both add and query run in constant time.
  • Hashing (taking a fingerprint of all the data) is usually a better technique than sampling (reading only some of the data at random), since it produces only false positives, never false negatives.
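
The set-membership notes above can be illustrated with a minimal Bloom filter sketch in Python. This is not taken from the course material: the class name `BloomFilter`, its parameters, the sizing formulas, and the use of SHA-256 with double hashing are all assumptions chosen for illustration. It shows that `add` and the membership query each touch a fixed number of bits, and that an item that was added is always reported as present, so the only possible errors are false positives.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter (hypothetical helper, not from the notes)."""

    def __init__(self, expected_items, false_positive_rate=0.01):
        # Common sizing formulas (an assumption of this sketch):
        # m = -n * ln(p) / (ln 2)^2 bits and k = (m / n) * ln 2 hash functions.
        self.num_bits = max(1, math.ceil(
            -expected_items * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Derive k bit positions from one SHA-256 digest via double hashing.
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force a non-zero step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        # Sets k bits: the cost depends on k, not on how many items are stored.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True if every one of the k bits is set. This can be wrong for an
        # item that was never added (false positive), but never for one that was.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


seen = BloomFilter(expected_items=1000)
for word in ("alpha", "beta", "gamma"):
    seen.add(word)

print("alpha" in seen)  # True: added items are always found
print("delta" in seen)  # very likely False; True would be a false positive
```

The filter stores only bits, never the items themselves, which is why it cannot enumerate or delete entries; that trade-off is what buys the fixed memory footprint.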
spalladino / mbuilder-demo.js
Last active August 29, 2015 14:24
Small mBuilder demo firing external trigger and requesting table data
// Small mBuilder demo firing external trigger and requesting table data
var request = require('request');
var http = require('http');
// Fire external trigger
var symptom = 'headache';
var message = 'Rest well';
var projectId = 278;
spalladino / verboice-demo.js
Created July 4, 2015 18:50
Verboice API demo from node.js
var request = require('request');
var express = require('express');
var http = require('http');
var url = require('url');
// Parse command line options
var program = require('commander');
program
  .version("1.0.0")
spalladino / backup-poirot.sh
Created April 9, 2015 14:03
Backs up all distinguished Poirot entries in a monthly index and deletes the individual per-day indices
#! /bin/bash
# Invoke using the YYYY.MM of the indices to back up
# exit on uncaught error
set -e
# non zero exit status on a pipeline causes the whole pipeline to fail
set -o pipefail
function backup_index {
spalladino / es.sh
Created January 21, 2015 15:53
Hit test in elasticsearch
# Create index
curl -XPUT localhost:9200/locations/
# Create mapping with field 'shape' as geo shape
curl -XPUT localhost:9200/locations/_mapping/location -d '
{
  "location": {
    "properties": {
      "name": {
        "type": "string"
spalladino / hub.gs
Last active August 29, 2015 14:10
Google Spreadsheets InSTEDD Hub
/**
* Adds a custom menu to the active spreadsheet, containing a single menu item
* for invoking the readRows() function specified above.
* The onOpen() function, when defined, is automatically invoked whenever the
* spreadsheet is opened.
* For more information on using the Spreadsheet API, see
* https://developers.google.com/apps-script/service_spreadsheet
*/
function onOpen() {
var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
spalladino / changelog.py
Created November 7, 2014 15:32
Create changelog for a new Cepheid version based on git log and fogbugz data
#! /usr/bin/python
from fogbugz import FogBugz
from datetime import datetime, timedelta
from itertools import groupby
import subprocess
import re
import sys
spalladino / leaflet-nobounce.js
Created August 22, 2014 21:20
Handler for leaflet.js to manage max bounds without bouncing
// Initialize OSM source, based on map.html example
var osmUrl = 'http://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png',
    osmAttrib = '&copy; <a href="http://openstreetmap.org/copyright">OpenStreetMap</a> contributors',
    osm = L.tileLayer(osmUrl, {maxZoom: 18, attribution: osmAttrib});
// Create map with max bounds
var map = L.map('map')
  .setView([50.5, 30.51], 15)
  .addLayer(osm)
  .setMaxBounds([[50.52979753992208, 30.527229309082035], [50.497049833624224, 30.458564758300785]]);
spalladino / stats.py
Created January 5, 2014 16:31
Count number of lines of code per language per week in git repository using cloc
import csv
import os
import sys
from os.path import isfile
from datetime import date, timedelta
from subprocess import call, check_output
# Script expects that each project is in a folder with the same name in the working directory
PROJECTS = ['my-project', 'another-project']
spalladino / Capfile
Created September 2, 2013 22:13
Hacking an Erlang application deployment with Capistrano 3
# Load DSL and set up stages
require 'capistrano/setup'
# Includes default deployment tasks
require 'capistrano/deploy'
# Loads custom tasks from `lib/capistrano/tasks' if you have any defined.
Dir.glob('lib/capistrano/tasks/*.cap').each { |r| import r }
namespace :deploy do