@Snazz2001
Snazz2001 / ElasticNet.md
Created March 14, 2016 01:09 — forked from shagunsodhani/ElasticNet.md
Notes for "Regularization and variable selection via the elastic net" paper.

Regularization and variable selection via the elastic net

Introduction to elastic net

  • Regularization and variable selection method.
  • Sparse Representation
  • Exhibits grouping effect.
  • Particularly useful when number of predictors (p) >> number of observations (n).
  • LARS-EN algorithm to compute elastic net regularization path.
  • Link to paper.
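
As a rough illustration of the sparsity and grouping effect listed above, here is a minimal sketch in plain Python. It minimizes the naive elastic net criterion |y - Xb|^2 + lam2 * |b|^2 + lam1 * |b|_1 by cyclic coordinate descent. This is a simpler, swapped-in solver, not the LARS-EN algorithm from the paper, and it does not apply the paper's (1 + lam2) rescaling. The function names, toy data, and penalty values are assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def naive_elastic_net(X, y, lam1, lam2, n_iter=500):
    """Cyclic coordinate descent for |y - Xb|^2 + lam2*|b|^2 + lam1*|b|_1.

    X is a list of rows, y a list of responses (illustrative layout).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # x_j' (y - fit without predictor j)
            z = 0.0    # x_j' x_j
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            # Closed-form minimizer of the criterion in coordinate j.
            beta[j] = soft_threshold(rho, lam1 / 2.0) / (z + lam2)
    return beta


# Toy data (assumed): columns 0 and 1 are identical, column 2 is unrelated.
X = [[1.0, 1.0, 0.5], [2.0, 2.0, -1.0], [3.0, 3.0, 0.3], [4.0, 4.0, -0.2]]
y = [2.0, 4.0, 6.0, 8.0]

lasso_like = naive_elastic_net(X, y, lam1=1.0, lam2=0.0)  # L1 penalty only
grouped = naive_elastic_net(X, y, lam1=1.0, lam2=1.0)     # L1 + L2 penalties

print("lam2 = 0:", lasso_like)
print("lam2 = 1:", grouped)
```

In this toy run, the L1-only fit leaves the weight on one of the two duplicated columns, while adding the L2 term makes the duplicated columns share it equally, which is the grouping effect in its simplest form.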
Snazz2001 / EXAMPLE_WATSON_API_README.md
Created February 22, 2016 18:18 — forked from dannguyen/EXAMPLE_WATSON_API_README.md
Transcribing ProPublica podcast with Python and Watson Speech to Text API

Using IBM Watson Speech to Text API to transcribe a ProPublica podcast

An example of using the Watson Speech to Text API to transcribe a podcast from ProPublica: How a Reporter Pierced the Hype Behind Theranos

This is just a simpler demo of the same technique I demonstrate to make automated video supercuts in this repo: https://github.com/dannguyen/watson-word-watcher

The transcription takes just a few minutes (less if you parallelize the requests to IBM) and is free...but it isn't perfect by any means. It doesn't fare super well on proper nouns:

  • Charles Ornstein's last name is transcribed as 'Orenstein'
  • John Carreyrou's last name becomes "John Kerry Roo"
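
The remark about parallelizing requests can be sketched roughly as follows. This assumes the audio has already been split into segment files and that each segment is sent as its own request. `transcribe_segment` is a hypothetical stand-in, not the Watson API, and the file names are made up for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def transcribe_segment(segment_path):
    """Hypothetical placeholder for one speech-to-text request.

    A real version would upload the audio file and return the transcript.
    """
    return "transcript of " + segment_path


def transcribe_all(segment_paths, max_workers=4):
    """Run the per-segment requests concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as segment_paths.
        return list(pool.map(transcribe_segment, segment_paths))


segments = ["part-000.wav", "part-001.wav", "part-002.wav"]  # assumed names
transcripts = transcribe_all(segments)
print("\n".join(transcripts))
```

Threads fit here because each request mostly waits on the network, so several can be in flight at once without extra processes.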
Snazz2001 / exploding_boxplot_test
Created January 27, 2016 12:34 — forked from abresler/exploding_boxplot_test
Exploding Boxplot Function & nbastatR test use case
# load_packages -----------------------------------------------------------
packages <-
  c('nbastatR', #devtools::install_github("abresler/nbastatR")
    'explodingboxplotR', #devtools::install_github("timelyportfolio/explodingboxplotR")
    'ggplot2',
    'dplyr',
    'purrr',
    'magrittr')
Snazz2001 / springer-free-maths-books.md
Created December 29, 2015 11:38 — forked from bishboria/springer-free-maths-books.md
Springer have made a bunch of books available for free, here are the direct links
Snazz2001 / arXiv popularity scoring.ipynb
Last active September 6, 2015 14:23 — forked from nebw/arXiv popularity scoring.ipynb
Popularity scoring for arXiv publications
Snazz2001 / min-char-rnn.py
Last active August 29, 2015 14:25 — forked from karpathy/min-char-rnn.py
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level demo. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('data.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
print('%d unique characters in data.' % (len(chars), ))  # call form works in Python 2 and 3
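
The preview stops once the set of unique characters is built. As a hedged illustration of what a character-level model usually does with such a vocabulary, the sketch below maps characters to integer indices and back. It is not taken from the gist: the in-memory string replaces `data.txt`, the dictionary names are assumptions, and `sorted` is used only to make the mapping deterministic.

```python
# In-memory stand-in for the contents of the text file (assumption).
data = "hello world"

# Sorted so that the character-to-index mapping is reproducible.
chars = sorted(set(data))

# Lookup tables between characters and integer indices (illustrative names).
char_to_ix = {ch: i for i, ch in enumerate(chars)}
ix_to_char = {i: ch for i, ch in enumerate(chars)}

# Encode the text as a list of indices, then decode it back.
encoded = [char_to_ix[ch] for ch in data]
decoded = "".join(ix_to_char[i] for i in encoded)

print("%d unique characters in data." % len(chars))
print(encoded)
```

The round trip through `encoded` and `decoded` is a quick check that the two tables are inverses of each other.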

Horizons in Probabilistic Programming and Bayesian Analysis

Representations:

  • Hierarchical models
  • Hidden Markov models
  • Graphical models
  • Non-parametric Bayes (distributions over functions)

Inference Approaches:

Snazz2001 / r-pkgs.md
Last active August 29, 2015 14:24 — forked from peterhurford/r-pkgs.md

Notes from reading through R Packages by Hadley Wickham. This is meant to review, not replace, a thorough readthrough. I mainly wrote this as a personal review, since writing summaries and attempting to teach others are some of the best ways to learn things.

Introduction

  • Packages are used to organize code together so that it can be used repeatedly and shared with others.

  • A lot of work with packages is done via the devtools package.

If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?

I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:

  • Statistical knowledge
  • Programming/hacking skills
  • Domain expertise

Statistical knowledge