Brendan O'Connor brendano

@brendano
brendano / gist:39760
Created Dec 24, 2008
load the MNIST data set in R
# Load the MNIST digit recognition dataset into R
# http://yann.lecun.com/exdb/mnist/
# assume you have all 4 files and gunzip'd them
# creates train$n, train$x, train$y and test$n, test$x, test$y
# e.g. train$x is a 60000 x 784 matrix, each row is one digit (28x28)
# call: show_digit(train$x[5,]) to see a digit.
# brendan o'connor - gist.github.com/39760 - anyall.org
load_mnist <- function() {
load_image_file <- function(filename) {
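The preview stops before the file-reading logic. As a rough illustration of what loading these files involves, here is a Python sketch that parses the IDX image layout described on the MNIST page: a big-endian header of four 32-bit integers, followed by one unsigned byte per pixel. The name `read_idx_images` and the tiny in-memory file are assumptions made for illustration; this is not a translation of the gist's R code.

```python
import io
import struct


def read_idx_images(stream):
    """Parse an uncompressed IDX3 image file into (n, rows, cols, images).

    Each image is returned as a flat list of rows*cols pixel values (0-255),
    mirroring the "each row is one digit" layout described above.
    """
    magic, n, rows, cols = struct.unpack(">IIII", stream.read(16))
    if magic != 2051:  # 0x00000803: unsigned bytes, 3 dimensions
        raise ValueError(f"unexpected magic number: {magic}")
    size = rows * cols
    data = stream.read(n * size)
    if len(data) != n * size:
        raise ValueError("file is shorter than its header claims")
    images = [list(data[i * size:(i + 1) * size]) for i in range(n)]
    return n, rows, cols, images


# Hypothetical two-image, 2x2 file built in memory so the sketch runs anywhere.
fake = struct.pack(">IIII", 2051, 2, 2, 2) + bytes([0, 255, 10, 20, 1, 2, 3, 4])
n, rows, cols, images = read_idx_images(io.BytesIO(fake))
```

For the real training file the header would report 60000 images of 28 x 28 pixels, which gives the 60000 x 784 matrix mentioned in the comments.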
brendano / autolog.py
Created Oct 10, 2008
python decorators to log all method calls, show call graphs in realtime too
# Written by Brendan O'Connor, brenocon@gmail.com, www.anyall.org
# * Originally written Aug. 2005
# * Posted to gist.github.com/16173 on Oct. 2008
# Copyright (c) 2003-2006 Open Source Applications Foundation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
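The preview shows only the license header, so the decorators themselves are not visible here. As a generic illustration of the idea in the description (logging every method call through a decorator), here is a minimal sketch. The names `logged`, `log_methods`, `call_log`, and `Counter` are all hypothetical, and this is not the gist's implementation; it only handles plain instance methods.

```python
import functools
import inspect
import logging

log = logging.getLogger("autolog_sketch")  # hypothetical logger name
call_log = []  # names of called methods, kept so the behavior is easy to inspect


def logged(func):
    """Wrap func so every call is recorded before it runs."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_log.append(func.__name__)
        log.info("call %s args=%r kwargs=%r", func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper


def log_methods(cls):
    """Class decorator: wrap every public plain function defined on cls."""
    for name, attr in list(vars(cls).items()):
        if inspect.isfunction(attr) and not name.startswith("_"):
            setattr(cls, name, logged(attr))
    return cls


@log_methods
class Counter:
    def __init__(self):
        self.n = 0

    def bump(self, by=1):
        self.n += by
        return self.n


counter = Counter()
result = counter.bump(by=2)
```

Using `inspect.isfunction` rather than `callable` skips `staticmethod` and `classmethod` objects, which would need separate handling.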
brendano / .Rhistory
Last active Aug 31, 2019
longrun_bettng_inequality
log(c(1.05,.6))
log(c(1.5,.6))
ifelse(runif(1000)>.5, 1.5, .6)
x=ifelse(runif(1000)>.5, 1.5, .6)
mean(x)
prod(x)
x=ifelse(runif(10)>.5, 1.5, .6)
x
y=replicate(100000,{x=ifelse(runif(10)>.5, 1.5, .6); prod(x)})
summary(y)
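The session above appears to explore a repeated bet that multiplies wealth by 1.5 or 0.6 with equal probability. The Python sketch below restates it under that assumption: the arithmetic mean of the multipliers is above 1, but the mean of their logs is negative, so the typical (median) outcome of a long run shrinks. The function name `simulate` and the fixed seed are illustrative choices, not part of the original session.

```python
import math
import random
import statistics

UP, DOWN = 1.5, 0.6  # multipliers taken from the session above

arithmetic_mean = (UP + DOWN) / 2                # expected one-round multiplier
mean_log = (math.log(UP) + math.log(DOWN)) / 2   # expected log growth per round
geometric_mean = math.exp(mean_log)              # typical per-round growth factor


def simulate(rounds, trials, seed=0):
    """Return the final wealth multiplier for each of `trials` independent runs."""
    rng = random.Random(seed)
    return [
        math.prod(UP if rng.random() > 0.5 else DOWN for _ in range(rounds))
        for _ in range(trials)
    ]


outcomes = simulate(rounds=10, trials=100_000)
print("arithmetic mean per round:", arithmetic_mean)
print("geometric mean per round:", geometric_mean)
print("sample mean after 10 rounds:", statistics.mean(outcomes))
print("sample median after 10 rounds:", statistics.median(outcomes))
```

The per-round arithmetic mean is 1.05 while the geometric mean is the square root of 0.9, roughly 0.949, which is why the sample mean and the sample median of the products pull in opposite directions.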
brendano / make_views.py
Created Jan 2, 2011
Publish Zotero papers as HTML and symlinks
#!/usr/bin/env python
# From your Zotero database and file storage,
# creates a simple HTML table, and directory full of symlinks,
# for quick-and-dirty web or Dropbox viewing.
# Installation: place in your Zotero folder
# e.g. ~/Documents/zotero/
# And run it
# e.g. python ~/Documents/zotero/make_views.py
brendano / .RData
Last active Jun 27, 2019
google ngram books plot
brendano / example.ipynb
Last active Apr 30, 2019
l1 generative implementation with liblbfgs (owl-qn)
brendano / xlsx2tsv.py
Created Nov 7, 2008
xlsx2tsv: python command-line script to convert xlsx (Excel "OOXML") into tab-separated values
#!/usr/bin/env python
"""
xlsx2tsv filename.xlsx [sheet number or name]
Parse a .xlsx (Excel OOXML, which is not OpenOffice) into tab-separated values.
If it has multiple sheets, need to give a sheet number or name.
Outputs honest-to-goodness tsv, no quoting or embedded \\n\\r\\t.
One reason I wrote this is because Mac Excel 2008 export to csv or tsv messes
up encodings, converting everything to something that's not utf8 (macroman
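The docstring preview ends here. To make the "no quoting or embedded \n\r\t" promise concrete, here is a small Python sketch of one way to produce such rows. Replacing each tab, newline, and carriage return with a single space is an assumption for illustration; the script itself may handle these characters differently. `tsv_cell` and `tsv_row` are hypothetical names.

```python
def tsv_cell(value):
    """Render one cell as text with tabs and line breaks replaced by spaces."""
    text = "" if value is None else str(value)
    for ch in ("\t", "\n", "\r"):
        text = text.replace(ch, " ")
    return text


def tsv_row(cells):
    """Join sanitized cells with tabs; no quoting is ever needed."""
    return "\t".join(tsv_cell(c) for c in cells)


row = tsv_row(["a\tb", None, 3, "x\r\ny"])
```

Because no cell can contain a tab or line break after sanitizing, splitting each output line on tabs recovers the columns exactly.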
brendano / corenlp_client_example.py
Last active Jul 15, 2018
example of using corenlp server from python
"""
example of using corenlp server from python
This code requires server to already be running: https://stanfordnlp.github.io/CoreNLP/corenlp-server.html
To start server:
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
To call it, e.g.:
curl --data "The man wanted to go to work." 'http://localhost:9000/?properties={%22annotators%22%3A%22tokenize%2Cssplit%2Cpos%2Cdepparse%22%2C%22outputFormat%22%3A%22conllu%22}'
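The preview ends before the client code. As a sketch of how the curl call above could be issued from Python with only the standard library, the block below builds the `properties` query string and posts the text as the request body. The helper names `build_url` and `annotate` are hypothetical, and this is not the gist's own code; `annotate` needs the server from the instructions above to be running, so it is defined here but not called.

```python
import json
import urllib.parse
import urllib.request


def build_url(annotators, output_format, host="http://localhost:9000"):
    """Encode the CoreNLP properties as a JSON object in the query string."""
    props = {"annotators": ",".join(annotators), "outputFormat": output_format}
    query = urllib.parse.quote(json.dumps(props, separators=(",", ":")), safe="")
    return f"{host}/?properties={query}"


def annotate(text, url, timeout=15):
    """POST the raw text as the request body, as curl does with --data."""
    body = text.encode("utf-8")
    with urllib.request.urlopen(url, data=body, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = build_url(["tokenize", "ssplit", "pos", "depparse"], "conllu")
# With a server running: print(annotate("The man wanted to go to work.", url))
```

Unlike the hand-written curl URL, `urllib.parse.quote` also percent-encodes the braces, which decodes to the same JSON on the server side.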
View bla.csv
http://brenocon.com/confsize,,,"Number of paper submissions and acceptances for various conferences over time. Trying to only select full-length or ""main"" research papers, though others are sometimes included.
The first several columns are the main data. Sources on the right. Sometimes I tried to put in original source data in different columns. Sometimes data contradicts",,,,,,,,,,,,,,,,,,,,,,,,,,
area,conference,year,submit,accept,accept rates - some are messy or partial from copy-and-paste,joint,attendance,source/notes,other notes,other notes,other notes,other notes,other notes,other notes,,,,,,,,,,,,,,,
,,,,,,,,,https://dl.acm.org/citation.cfm?id=2766462&picked=source&preflayout=tabs,https://dl.acm.org/citation.cfm?id=312624,the ACM table:,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,Year,Submitted,Accepted,Rate,,,,,,,,,,,,,,,
webir,sigir,1999,135,33,,,,https://dl.acm.org/citation.cfm?id=312624,,22nd annual,SIGIR '99,135,33,24%,,,,,,,,,,,,,,,
webir,sigir,2000,??,??,,,,https://dl.acm.org/citation.cfm?id=345508&picked
brendano / log_logistic.py.md
Last active Apr 6, 2018
numerically stable implementation of the log-logistic function

Binary case

This is just the middle section of Bob Carpenter's note on evaluating log-loss via the binary logistic function: https://lingpipe-blog.com/2012/02/16/howprevent-overflow-underflow-logistic-regression/

The logp function calculates the negative cross-entropy:

    dotproduct( [y, 1-y],  [logP(y=1), logP(y=0)] )

where the input s is the beta'x log-odds scalar value. The trick is to make this numerically stable for any choice of s and y.
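One common way to get that stability, sketched here under the definitions above, is to branch on the sign of s so that `exp` is only ever called on a non-positive argument. The helper name `log_sigmoid` is an assumption, and this is a minimal sketch rather than the code from the gist or from the linked note.

```python
import math


def log_sigmoid(s):
    """log(1 / (1 + exp(-s))), computed without overflow for large |s|."""
    if s >= 0:
        return -math.log1p(math.exp(-s))
    return s - math.log1p(math.exp(s))


def logp(s, y):
    """y * logP(y=1) + (1 - y) * logP(y=0), where s is the log-odds beta'x."""
    return y * log_sigmoid(s) + (1 - y) * log_sigmoid(-s)


print(logp(0.0, 1))      # equals -log(2)
print(logp(1000.0, 1))   # very confident and correct: essentially 0
print(logp(-1000.0, 1))  # very confident and wrong: about -1000
```

A naive `math.log(1 / (1 + math.exp(-s)))` raises `OverflowError` for s = -1000, because `math.exp(1000)` is out of range; the branched form avoids ever evaluating it.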
