@ashenfad
ashenfad / sample-replacement.clj
Last active September 25, 2015 20:27
Streaming (one-pass) technique for sampling with replacement
;; For a fully fleshed out library, see:
;; https://github.com/bigmlcom/sampling
;; -------------------------------------
;; see http://ashenfad.blogspot.com/2011/06/single-pass-sampling-with-replacement.html
(ns sample.replacement
  (:use [clojure.contrib.math :only [expt]]))
(defn- choose [a b]
@ashenfad
ashenfad / reservoir.clj
Last active October 6, 2015 08:58
Reservoir Sampling in Clojure
;; For a fully fleshed out library, see:
;; https://github.com/bigmlcom/sampling
;; -------------------------------------
(ns sample.reservoir
  "Functions for sampling without replacement using a reservoir.")
(defn create
  "Creates a sample reservoir."
  [reservoir-size]
@ashenfad
ashenfad / README.md
Last active December 15, 2015 08:59
BigML Tree - Cover Data

A sunburst visualization of a BigML decision tree built on the cover dataset.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number of training instances captured by a node determines its arc length (that is, its angular size in radians).
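
As a rough illustration of the sizing rule above, the sketch below gives each node an angle equal to its share of the root's training instances times 2π. The function name `arc_angles` and the example counts are hypothetical and are not taken from the gist's visualization code.

```python
import math


def arc_angles(counts, total):
    """Return one angle in radians per node, proportional to its instance count."""
    if total <= 0:
        raise ValueError("total must be positive")
    return [2 * math.pi * count / total for count in counts]


# Hypothetical example: a root holding 150 training instances
# that splits into two children holding 50 and 100 instances.
angles = arc_angles([50, 100], 150)
```

Under this assumption the children of any node together span exactly the angle of their parent, which is why each ring lines up beneath the one inside it.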

Clicking on a node will zoom in to the subtree. After zooming in,

@ashenfad
ashenfad / README.md
Last active December 15, 2015 08:59
BigML Tree - Adult Data

A sunburst visualization of a BigML decision tree built on the adult census dataset.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number of training instances captured by a node determines its arc length (that is, its angular size in radians).

Clicking on a node will zoom in to the subtree. After zooming in,

@ashenfad
ashenfad / README.md
Last active December 15, 2015 08:59
BigML Tree - Iris Data

A sunburst visualization of a BigML decision tree built on the iris dataset.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number of training instances captured by a node determines its arc length (that is, its angular size in radians).

Clicking on a node will zoom in to the subtree. After zooming in,

@ashenfad
ashenfad / README.md
Last active December 15, 2015 08:59
BigML Tree - Pima Diabetes Data

A sunburst visualization of a BigML decision tree built on the Pima diabetes dataset.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number of training instances captured by a node determines its arc length (that is, its angular size in radians).

Clicking on a node will zoom in to the subtree. After zooming in,

@ashenfad
ashenfad / README.md
Last active December 15, 2015 13:29
BigML Tree - Concrete Data
@ashenfad
ashenfad / README.md
Last active December 15, 2015 18:39
BigML Tree - Abalone Data

A sunburst visualization of a BigML decision tree built on the abalone dataset.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number of training instances captured by a node determines its arc length (that is, its angular size in radians).

Clicking on a node will zoom in to the subtree. After zooming in,

@ashenfad
ashenfad / README.md
Last active December 16, 2015 14:29
BigML Tree - SMS Spam Detection

A sunburst visualization of a BigML decision tree built on the SMS Spam dataset.

The model uses BigML's upcoming automatic text processing to find important words when separating the spam from the ham.

The initial center circle represents the root of the tree. Each outer circle contains the children of the inner circle's nodes. The number

@ashenfad
ashenfad / README.md
Last active December 19, 2015 15:49
BigML Tree - Titanic Survival