Skip to content

Instantly share code, notes, and snippets.

@ArtemGr
ArtemGr / alternative, using regex
Created May 21, 2009 16:29
CSV parser in Scala
val pattern = java.util.regex.Pattern.compile ("""(?xs) ("(.*?)"|) ; ("(.*?)"|) (?: \r?\n | \z ) """)
val matcher = pattern.matcher (input)
while (matcher.find) {
val col1 = matcher.group (2)
val col2 = matcher.group (4)
// ...
}
@ssp
ssp / git-extract-file.markdown
Created January 23, 2012 13:21
Extract a single file from a git repository

How to extract a single file with its history from a git repository

These steps show two less common interactions with git to extract a single file which is inside a subfolder from a git repository. These steps essentially reduce the repository to just the desired files and should performed on a copy of the original repository (1.).

First the repository is reduced to just the subfolder containing the files in question using git filter-branch --subdirectory-filter (2.) which is a useful step by itself if just a subfolder needs to be extracted. This step moves the desired files to the top level of the repository.

Finally all remaining files are listed using git ls, the files to keep are removed from that using grep -v and the resulting list is passed to git rm which is invoked by git filter-branch --index-filter (3.). A bit convoluted but it does the trick.

1. copy the repository to extract the file from and go to the desired branch

@fozziethebeat
fozziethebeat / Schisel.scala
Created February 8, 2012 00:05
Schisel, An example of how to run Latent Dirichelte Allocation (via Mallet) from Scala
/*
* Copyright (c) 2012, Lawrence Livermore National Security, LLC. Produced at
* the Lawrence Livermore National Laboratory. Written by Keith Stevens,
* kstevens@cs.ucla.edu OCEC-10-073 All rights reserved.
*
* This file is part of the S-Space package and is covered under the terms and
* conditions therein.
*
* The S-Space package is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as published
@domoritz
domoritz / wos.php
Created March 10, 2012 19:32 — forked from pol/wos.php
Web of Science API access with ruby, python and php libs
<?php
$auth_url = "http://search.isiknowledge.com/esti/wokmws/ws/WOKMWSAuthenticate?wsdl";
$auth_client = @new SoapClient($auth_url);
$auth_response = $auth_client->authenticate();
$search_url = "http://search.isiknowledge.com/esti/wokmws/ws/WokSearchLite?wsdl";
$search_client = @new SoapClient($search_url);
$search_client->__setCookie('SID',$auth_response->return);
$search_array = array(
@jlong
jlong / uri.js
Created April 20, 2012 13:29
URI Parsing with Javascript
var parser = document.createElement('a');
parser.href = "http://example.com:3000/pathname/?search=test#hash";
parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port; // => "3000"
parser.pathname; // => "/pathname/"
parser.search; // => "?search=test"
parser.hash; // => "#hash"
parser.host; // => "example.com:3000"
@remcoder
remcoder / typescript.sublime-build
Created October 1, 2012 19:48
Sublime Text 2 build system for Typescript
{
"selector": "source.ts",
"cmd": ["tsc", "$file"],
"file_regex": "^(.+?) \\((\\d+),(\\d+)\\)(: .+)$",
"line_regex": "\\((\\d+),(\\d+)\\)",
"osx": {
"path": "/usr/local/bin:/opt/local/bin"
@Carreau
Carreau / kernel.js
Created December 13, 2012 20:09
A node.js kernel for IPython notebook. You can see the explanation of the ipynb rendered in http://nbviewer.ipython.org
zmq = require("zmq")
fs = require("fs")
var config = JSON.parse(fs.readFileSync(process.argv[2]))
var connexion = "tcp://"+config.ip+":"
var shell_conn = connexion+config.shell_port
var pub_conn = connexion+config.iopub_port
var hb_conn = connexion+config.hb_port
@debasishg
debasishg / gist:8172796
Last active May 10, 2024 13:37
A collection of links for streaming algorithms and data structures

General Background and Overview

  1. Probabilistic Data Structures for Web Analytics and Data Mining : A great overview of the space of probabilistic data structures and how they are used in approximation algorithm implementation.
  2. Models and Issues in Data Stream Systems
  3. Philippe Flajolet’s contribution to streaming algorithms : A presentation by Jérémie Lumbroso that visits some of the hostorical perspectives and how it all began with Flajolet
  4. Approximate Frequency Counts over Data Streams by Gurmeet Singh Manku & Rajeev Motwani : One of the early papers on the subject.
  5. [Methods for Finding Frequent Items in Data Streams](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.187.9800&amp;rep=rep1&amp;t

Latency numbers every programmer should know

L1 cache reference ......................... 0.5 ns
Branch mispredict ............................ 5 ns                     on recent CPU
L2 cache reference ........................... 7 ns                     14x L1 cache
Mutex lock/unlock ........................... 25 ns
Main memory reference ...................... 100 ns                     20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy ............. 3,000 ns  =   3 µs
Send 2K bytes over 1 Gbps network ....... 20,000 ns  =  20 µs
SSD random read ........................ 150,000 ns  = 150 µs

Read 1 MB sequentially from memory ..... 250,000 ns = 250 µs 4X memory

import scalaz._, Scalaz._, concurrent.{ Future, Task }
import argonaut._, Argonaut._
object ArgonautDemo extends JsonDemo {
val entitySum = DecodeJson(c =>
entityNames.traverseU(c.get[JsonArray](_).map(_.size)).map(_.sum)
)
val resultReads = DecodeJson(c =>
c.get[Json]("delete").map(_ => Result.deletion) ||| (