Skip to content

Instantly share code, notes, and snippets.

An idiosyncratic guide to teaching yourself practical machine learning, without links:

  • Find a binary classification dataset; maybe you have one internally.
  • Implement a simple decision tree algorithm, like CART.
  • Write some code to validate your model; produce an ROC curve and understand the tradeoff it embodies.
  • Compare the ROC for your training set with the ROC for a holdout and understand what it means that they differ.
  • Experiment with some hyperparameters: how does the comparison above change as you adjust the depth of the tree or other stopping criteria?
  • Combine your decision tree algorithm with bagging to produce a random forest. How does its ROC compare?
  • Do the same hyperparameter tuning here. (How many trees?) Reflect on overfitting and on the bias/variance tradeoff.
@pheuter
pheuter / sc-dl.js
Created March 5, 2012 20:44
Bookmarklet that generates download link for a Soundcloud upload
(function(d) {
var dl = d.createElement('a');
dl.innerText = 'Download MP3';
dl.href = "http://media.soundcloud.com/stream/"+d.querySelector('#main-content-inner img[class=waveform]').src.match(/\.com\/(.+)\_/)[1];
dl.download = d.querySelector('em').innerText+".mp3";
d.querySelector('.primary').appendChild(dl);
dl.style.marginLeft = '10px';
dl.style.color = 'red';
dl.style.fontWeight = 700;
})(document);