Instantly share code, notes, and snippets.

Lars larsmans

Created August 15, 2011 13:22
Nonogram/paint-by-numbers solver in SWI-Prolog
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 /* * Nonogram/paint-by-numbers solver in SWI-Prolog. Uses CLP(FD), * in particular the automaton/3 (finite-state/RE) constraint. * Copyright 2011, 2014 Lars Buitinck * Copyright 2014 Markus Triska * Do with this code as you like, but don't remove the copyright notice. */ :- use_module(library(clpfd)).
Created September 28, 2011 15:54
Priority queues (pairing heaps) in Erlang
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 % Copyright (c) 2010-2014, Lars Buitinck % May be used, redistributed and modified under the terms of the % GNU Lesser General Public License (LGPL), version 2.1 or later % Heaps/priority queues in Erlang % Heaps are data structures that return the entries inserted into them in % sorted order. This makes them the data structure of choice for implementing % priority queues, a central element of algorithms such as best-first/A* % search and Kruskal's minimum-spanning-tree algorithm.
Created July 15, 2012 13:25
Hellinger distance for discrete probability distributions in Python
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 """ Three ways of computing the Hellinger distance between two discrete probability distributions using NumPy and SciPy. """ import numpy as np from scipy.linalg import norm from scipy.spatial.distance import euclidean
Created September 18, 2012 21:00
Inspecting scikit-learn CountVectorizer output with a Pandas DataFrame
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 >>> from pandas import DataFrame >>> from sklearn.feature_extraction.text import CountVectorizer >>> docs = ["You can catch more flies with honey than you can with vinegar.", ... "You can lead a horse to water, but you can't make him drink."] >>> vect = CountVectorizer(min_df=0., max_df=1.0) >>> X = vect.fit_transform(docs) >>> print(DataFrame(X.A, columns=vect.get_feature_names()).to_string()) but can catch drink flies him honey horse lead make more than to vinegar water with you 0 0 2 1 0 1 0 1 0 0 0 1 1 0 1 0 2 2 1 1 2 0 1 0 1 0 1 1 1 0 0 1 0 1 0 2
Created February 14, 2013 13:38
k-means clustering in pure Python
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 #!/usr/bin/python # # K-means clustering using Lloyd's algorithm in pure Python. # Written by Lars Buitinck. This code is in the public domain. # # The main program runs the clustering algorithm on a bunch of text documents # specified as command-line arguments. These documents are first converted to # sparse vectors, represented as lists of (index, value) pairs. from collections import defaultdict
Created March 11, 2013 22:36
Columnwise maximum of scipy.sparse.csc_matrix, in Cython
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 cimport numpy as np def csc_columnwise_max(np.ndarray[np.float64_t, ndim=1] data, np.ndarray[int, ndim=1] indices, np.ndarray[int, ndim=1] indptr, np.ndarray[np.float64_t, ndim=1] out): cdef double mx cdef int n_features = indptr.shape[0] - 1 cdef int i, j
Created July 14, 2013 21:12
k-means feature mapper for scikit-learn
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 from sklearn.base import BaseEstimator, TransformerMixin from sklearn.metrics.pairwise import rbf_kernel class KMeansTransformer(BaseEstimator, TransformerMixin): def __init__(self, centroids): self.centroids = centroids def fit(self, X, y=None): return self
Last active December 23, 2015 21:59 — forked from glouppe/nearest_developers.py
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 import numpy as np import os import sys from collections import defaultdict from git import Repo from scipy.sparse import csc_matrix path = sys.argv[1] extensions = [".py", ".pyx", ".pxd"]
Created October 21, 2013 17:47
Upgrade Ubuntu 13.04 to Linux Mint 15
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 # Updated version of Jeff Shaffner's instructions, # http://jeffshaffner.wordpress.com/2012/09/27/how-to-convert-ubuntu-12-04-to-linux-mint-13/ # Run with "sudo sh ubuntu-to-mint.sh". set -e set -x apt-get update && apt-get --yes upgrade add-apt-repository "deb http://packages.linuxmint.com/ olivia \
Created December 24, 2013 18:53
Sentiment analysis with scikit-learn
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters. Learn more about bidirectional Unicode characters
 Sentiment analysis experiment using scikit-learn ================================================ The script sentiment.py reproduces the sentiment analysis approach from Pang, Lee and Vaithyanathan (2002), who tried to classify movie reviews as positive or negative, with three differences: * tf-idf weighting is applied to terms * the three-fold cross validation split is different * regularization is tuned by cross validation