Anton Parkhomenko chuwy

@debasishg
debasishg / gist:8172796
Last active March 15, 2024 15:05
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits some of the historical background and how it all began with Flajolet.
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
@phred
phred / pedantically_commented_playbook.yml
Last active November 3, 2023 01:55
Very complete Ansible playbook, showing off all the options
---
####
#### THIS IS OLD AND OUTDATED
#### LIKE, ANSIBLE 1.0 OLD.
####
#### PROBABLY HIT UP https://docs.ansible.com MY DUDES
####
#### IF IT BREAKS I'M JUST SOME GUY WITH
#### A DOG, OK, SORRY
####
@djspiewak
djspiewak / streams-tutorial.md
Created March 22, 2015 19:55
Introduction to scalaz-stream

Every application ever written can be viewed as some sort of transformation on data. Data can come from different sources, such as a network or a file or user input or the Large Hadron Collider. It can come from many sources all at once to be merged and aggregated in interesting ways, and it can be produced into many different output sinks, such as a network or files or graphical user interfaces. You might produce your output all at once, as a big data dump at the end of the world (right before your program shuts down), or you might produce it more incrementally. Every application fits into this model.

The scalaz-stream project is an attempt to make it easy to construct, test and scale programs that fit within this model (which is to say, everything). It does this by providing an abstraction around a "stream" of data, which is really just this notion of some number of data being sequentially pulled out of some unspecified data source. On top of this abstraction, sca

@oyvindholmstad
oyvindholmstad / schema-generator.js
Last active September 21, 2018 06:54
BigQuery JSON schema generator in JavaScript and Scala
/*
A script to generate a Google BigQuery-compliant JSON schema from a JSON object.
Make sure the JSON object is complete before generating; null values will be skipped.
References:
https://cloud.google.com/bigquery/docs/data
https://cloud.google.com/bigquery/docs/personsDataSchema.json
https://gist.github.com/igrigorik/83334277835625916cd6
... and a couple of visits to StackOverflow
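
The preview above cuts off before any code, so here is a minimal sketch of the idea it describes. This is not the gist's actual implementation. The names `bigQueryType` and `generateSchema` are hypothetical, and the type mapping (STRING, INTEGER, FLOAT, BOOLEAN, RECORD, with REPEATED mode for arrays) is an assumption about what such a generator would emit. Null values are skipped, as the comment says.

```javascript
// Hypothetical helper: map a JavaScript value to a BigQuery type name.
// Returns null for values that have no obvious mapping (functions, symbols, ...).
function bigQueryType(value) {
  if (typeof value === "boolean") return "BOOLEAN";
  // Note: JSON.parse turns 1.0 into 1, so whole-number floats look like INTEGER.
  if (typeof value === "number") return Number.isInteger(value) ? "INTEGER" : "FLOAT";
  if (typeof value === "string") return "STRING";
  if (typeof value === "object") return "RECORD";
  return null;
}

// Hypothetical generator: walk a sample object and build a list of field definitions.
function generateSchema(obj) {
  const fields = [];
  for (const [name, raw] of Object.entries(obj)) {
    const repeated = Array.isArray(raw);
    // For arrays, infer the element type from the first non-null element.
    const sample = repeated ? raw.find((v) => v !== null && v !== undefined) : raw;
    // Skip nulls and empty arrays: there is nothing to infer a type from.
    if (sample === null || sample === undefined) continue;
    // Arrays of arrays are skipped in this sketch.
    if (Array.isArray(sample)) continue;
    const type = bigQueryType(sample);
    if (type === null) continue;
    const field = { name, type, mode: repeated ? "REPEATED" : "NULLABLE" };
    // Nested objects become RECORD fields with their own sub-schema.
    if (type === "RECORD") field.fields = generateSchema(sample);
    fields.push(field);
  }
  return fields;
}

const example = {
  id: 1,
  score: 1.5,
  tags: ["x", "y"],
  owner: { active: true },
  missing: null,
};
console.log(JSON.stringify(generateSchema(example), null, 2));
```

Because types are inferred from a single sample object, any field that is null or absent in that sample is missing from the schema. That is why the input object needs to be complete.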
@runarorama
runarorama / gist:dba2699f064460228315
Last active June 4, 2017 16:28
Finalizers with monadic regions
object SafeIO {
  trait Brace[M[_]] extends Monad[M] {
    def brace[A,B,C](acquire: M[A])(release: A => M[B], go: A => M[C]): M[C]
    def snag[A](m: M[A], f: Throwable => M[A]): M[A]
    def lift[A](t: Task[A]): M[A]
  }
  object Brace {
    def apply[M[_]:Brace]: Brace[M] = implicitly[Brace[M]]
@sortega
sortega / ForFree.scala
Created October 3, 2016 16:19
I informally demoed the concept of Free monad to a colleague and he asked for the code. I've added a couple comments to make it standalone.
package experiments
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.{Await, Future}
import scalaz._
import Scalaz._
import scala.concurrent.duration.Duration
import natural.TypeSafeMap