Jason Baldridge jasonbaldridge

@jasonbaldridge
jasonbaldridge / music.json
Created May 12, 2012 20:02
Simple JSON example with some musicians, albums, and songs
[{"name":"Radiohead","albums":[{"title":"The King of Limbs","songs":[{"title":"Bloom","length":"5:15"},{"title":"Morning Mr Magpie","length":"4:41"},{"title":"Little by Little","length":"4:27"},{"title":"Feral","length":"3:13"},{"title":"Lotus Flower","length":"5:01"},{"title":"Codex","length":"4:47"},{"title":"Give Up the Ghost","length":"4:50"},{"title":"Separator","length":"5:20"}],"description":"\n\tThe King of Limbs is the eighth studio album by English rock band Radiohead, produced by Nigel Godrich. It was self-released on 18 February 2011 as a download in MP3 and WAV formats, followed by physical CD and 12\" vinyl releases on 28 March, a wider digital release via AWAL, and a special \"newspaper\" edition on 9 May 2011. The physical editions were released through the band's Ticker Tape imprint on XL in the United Kingdom, TBD in the United States, and Hostess Entertainment in Japan.\n "},{"title":"OK Computer","songs":[{"title":"Airbag","length":"4:44"},{"title":"Paranoid Android","length":"6:23"},
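The preview above is cut off, but the visible part suggests a list of artists, each with `albums`, each album with `songs` carrying a `title` and an `m:ss` `length`. A minimal Python sketch of walking that shape follows. It assumes the rest of the file has the same structure; the `SAMPLE` string is a shortened stand-in rather than the gist's full contents, and the helper names `to_seconds` and `album_durations` are hypothetical.

```python
import json

# Shortened stand-in with the same shape as the visible part of music.json
# (artist -> albums -> songs). Not the full gist contents.
SAMPLE = """
[{"name": "Radiohead",
  "albums": [{"title": "The King of Limbs",
              "songs": [{"title": "Bloom", "length": "5:15"},
                        {"title": "Feral", "length": "3:13"}]}]}]
"""


def to_seconds(length):
    """Convert an "m:ss" string into a number of seconds."""
    minutes, seconds = length.split(":")
    return int(minutes) * 60 + int(seconds)


def album_durations(artists):
    """Map (artist name, album title) to total running time in seconds."""
    return {
        (artist["name"], album["title"]): sum(
            to_seconds(song["length"]) for song in album["songs"]
        )
        for artist in artists
        for album in artist["albums"]
    }


durations = album_durations(json.loads(SAMPLE))
print(durations)
```

Keeping lengths as strings in the data and converting on read keeps the JSON human-readable; a real consumer would likely also want to handle tracks longer than an hour or missing fields.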
jasonbaldridge / music.xml
Created May 4, 2012 20:43
Simple XML example with some musicians, albums, and songs
<music>
<artist name="Radiohead">
<album title="The King of Limbs">
<song title="Bloom" length="5:15"/>
<song title="Morning Mr Magpie" length="4:41"/>
<song title="Little by Little" length="4:27"/>
<song title="Feral" length="3:13"/>
<song title="Lotus Flower" length="5:01"/>
<song title="Codex" length="4:47"/>
<song title="Give Up the Ghost" length="4:50"/>
jasonbaldridge / gist:4978840
Created February 18, 2013 17:00
Spell correction code for part of the spelling correction exercise for the Applied NLP course: https://github.com/utcompling/applied-nlp/wiki/SpellCorrect-Exercise-part2
package appliednlp.spell
/**
* Note: This is exercise code meant to align with the steps of the exercises
* on spelling correction for the Applied NLP class. The design would of course
* be very different for an actual spelling corrector.
*
* Part 1: https://github.com/utcompling/applied-nlp/wiki/SpellCorrect-Exercise
* Part 2: https://github.com/utcompling/applied-nlp/wiki/SpellCorrect-Exercise-part2
*
package bcomposes.twitter
import twitter4j._
object StatusStreamer {
def main(args: Array[String]): Unit = {
val twitterStream = new TwitterStreamFactory(Util.config).getInstance
twitterStream.addListener(Util.simpleStatusListener)
twitterStream.sample
Thread.sleep(2000)
jasonbaldridge / topics_gibbs_sg_example.R
Created May 29, 2012 18:44
Gibbs sampler for topic models for artificial data in Steyvers and Griffiths 2007.
## An implementation of Gibbs sampling for topic models for the
## example in section 4 of Steyvers and Griffiths (2007):
## http://cocosci.berkeley.edu/tom/papers/SteyversGriffiths.pdf
##
## Author: Jason Baldridge (jasonbaldridge@gmail.com)
# Functions to parse the input data
words.to.indices = data.frame(row.names=c("r","s","b","m","l"),1:5)
mysplit = function(x) { strsplit(x,"")[[1]] }
word.vector = function(x) { words.to.indices[mysplit(x),] }
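For readers less familiar with R, the three helper lines above appear to build a lookup from single-letter words to 1-based indices and then map a string to its index sequence, one character at a time. A rough Python equivalent is sketched below. It assumes each document is a string of the letters `r`, `s`, `b`, `m`, `l`; the names `VOCAB`, `WORD_TO_INDEX`, and `word_vector` are illustrative and not part of the gist.

```python
# Vocabulary order copied from the row names in the R snippet.
VOCAB = ["r", "s", "b", "m", "l"]

# 1-based indices, matching R's 1:5.
WORD_TO_INDEX = {word: i for i, word in enumerate(VOCAB, start=1)}


def word_vector(document):
    """Split a string into single characters and look up each one's index."""
    return [WORD_TO_INDEX[ch] for ch in document]


indices = word_vector("rsbml")
print(indices)
```

Unlike the R data-frame lookup, this version raises a `KeyError` on a character outside the vocabulary, which is usually the safer failure mode for input parsing.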
// This is code to accompany my blog post about variations on sequence computations:
// http://bcomposes.wordpress.com/2012/02/14/variations-for-computing-results-from-sequences-in-scala/
//
// Jason Baldridge
// A function to display the results of various ways of doing the same thing.
def display(intro: String, wlengths: List[Int], wcaps: List[Boolean]): Unit = {
println(intro)
println("Lengths: " + wlengths.mkString(" "))
println("Caps: " + wcaps.mkString(" "))
/**
* R functions to create and manipulate Breeze matrices. This should go into
* Breeze or a sub-project of Breeze eventually.
*/
object RFunc {
import breeze.linalg._
import breeze.stats.distributions._
import breeze.stats.DescriptiveStats._
/**