- Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
- Models and Issues in Data Stream Systems
- Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that revisits the historical background and how it all began with Flajolet.
- Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
- [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&rep=rep1&t
#!/bin/bash
# The script does automatic checking on a Go package and its sub-packages, including:
# 1. gofmt (http://golang.org/cmd/gofmt/)
# 2. goimports (https://github.com/bradfitz/goimports)
# 3. golint (https://github.com/golang/lint)
# 4. go vet (http://golang.org/cmd/vet)
# 5. race detector (http://blog.golang.org/race-detector)
# 6. test coverage (http://blog.golang.org/cover)
set -e
# Install Arch Linux with an encrypted file system and UEFI
# The official installation guide (https://wiki.archlinux.org/index.php/Installation_Guide) contains a more verbose description.
# Download the archiso image from https://www.archlinux.org/
# Copy it to a USB drive (replace /dev/sdX with the USB device; everything on it is overwritten)
dd if=archlinux.img of=/dev/sdX bs=16M && sync # on Linux
# Boot from the USB drive. If it fails to boot, make sure that secure boot is disabled in the BIOS configuration.
# Set Swedish keymap
package main

import (
	"fmt"
	"log"
	"sort"
	"strconv"
	"strings"
	"unicode"
)
Each of these commands will run an ad hoc static HTTP server in your current (or specified) directory, available at http://localhost:8000. Use this power wisely.
$ python -m SimpleHTTPServer 8000
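On Python 3 the `SimpleHTTPServer` module no longer exists; its functionality lives in `http.server`. The following is a minimal sketch of an equivalent, assuming `python3` (3.7 or later, for `--directory`) is on the PATH. The function name `serve_dir` is made up for illustration, and binding to `127.0.0.1` is a choice made here, not something the command above does.

```shell
# serve_dir is a hypothetical helper name, not part of Python or any tool.
# usage: serve_dir [port] [directory]
serve_dir() {
  port="${1:-8000}"   # default port matches the command above
  dir="${2:-.}"       # default to the current directory
  # --bind 127.0.0.1 keeps the server reachable from this machine only;
  # drop it to listen on all interfaces, as the Python 2 command does.
  python3 -m http.server --bind 127.0.0.1 --directory "$dir" "$port"
}
```

Running `serve_dir` with no arguments serves the current directory at http://localhost:8000 until interrupted with Ctrl-C.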
#!/usr/bin/ruby
# Convert a Markdown README to HTML with Github Flavored Markdown
# Github and Pygments styles are included in the output
#
# Requirements: json gem (`gem install json`)
#
# Input: STDIN or filename
# Output: STDOUT
# Arguments: "-c" to copy to clipboard (or "| pbcopy"), or "> filename.html" to output to a file
# cat README.md | flavor > README.html
These steps show two less common interactions with git to extract a single file that sits inside a subfolder of a git repository. They rewrite history and reduce the repository to just the desired files, so they should be performed on a copy of the original repository (1.).
First the repository is reduced to just the subfolder containing the files in question using `git filter-branch --subdirectory-filter` (2.), which is a useful step by itself if just a subfolder needs to be extracted. This step moves the desired files to the top level of the repository.
Finally all remaining files are listed using `git ls`, the files to keep are removed from that list using `grep -v`, and the resulting list is passed to `git rm`, which is invoked by `git filter-branch --index-filter` (3.). A bit convoluted, but it does the trick.
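The three steps might be sketched as the shell function below. This is an illustration under stated assumptions, not the original commands: the function name `extract_files`, its arguments, and the `KEEP_PATTERN` variable are hypothetical, `git ls-files` is used for the listing step, `xargs -r` assumes GNU xargs, and file names are assumed to contain no whitespace.

```shell
# extract_files is a hypothetical helper; every argument is a placeholder.
# usage: extract_files <source-repo> <work-copy> <subfolder> <pattern-of-files-to-keep>
extract_files() (
  set -e
  src="$1"; work="$2"; subdir="$3"; keep="$4"

  # (1.) Work on a copy, because filter-branch rewrites history.
  git clone "$src" "$work"
  cd "$work"

  # (2.) Make the subfolder the new top level of the repository.
  git filter-branch --prune-empty --subdirectory-filter "$subdir" HEAD

  # (3.) In every commit, list the tracked files, drop the ones to keep
  # from the list (grep -v), and remove the rest from the index.
  # -f overwrites the refs/original backup left by step (2.).
  KEEP_PATTERN="$keep" git filter-branch -f --prune-empty --index-filter \
    'git ls-files | grep -v -e "$KEEP_PATTERN" | xargs -r git rm -q --cached --ignore-unmatch' HEAD
)
```

For example, `extract_files ~/project /tmp/project-copy docs 'README\.md'` would leave a copy whose history contains only `README.md` from the `docs` subfolder; the paths and pattern here are made up. The function body runs in a subshell, so the caller's working directory is left unchanged.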