@andreasvc
andreasvc / gmanevert.user.js
Last active August 29, 2015 14:05
GreaseMonkey script: vertical split for Gmane 'news' interface.
// ==UserScript==
// @name Gmane vertical frames
// @namespace andreas@unstable.nl
// @include http://news.gmane.org/*
// @include http://thread.gmane.org/*
// @version 1
// @grant none
// ==/UserScript==
// The default Gmane 'news' view has horizontal panes, which waste lots of screen space;
andreasvc / TopicModeling.ipynb
Created October 23, 2014 20:51
Topic Modeling with gensim. Load in ipython notebook or view online: http://nbviewer.ipython.org/gist/andreasvc/66fe7547b05569c9a273
andreasvc / jsoneq.py
Last active August 29, 2015 14:09
Unordered equality test of JSON data
"""Convert JSON to an immutable representation so that equality can be tested
without regard for order."""
import json
class decoder(json.JSONDecoder):
    # http://stackoverflow.com/questions/10885238/python-change-list-type-for-json-decoding
    def __init__(self, list_type=list, **kwargs):
        json.JSONDecoder.__init__(self, **kwargs)
        # Use the custom JSONArray
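The preview above is cut off before the conversion itself. As a rough illustration of the idea in the docstring (convert to an immutable representation, then compare), here is a minimal sketch that does not subclass `json.JSONDecoder`. The names `freeze` and `json_equal_unordered` are hypothetical, and treating arrays as multisets (order ignored, duplicates counted) is an assumption, not necessarily what the gist does.

```python
import json
from collections import Counter


def freeze(obj):
    """Recursively convert decoded JSON into a hashable, order-free form.

    Assumption: arrays are compared as multisets, so element order is
    ignored but the number of occurrences still matters.
    """
    if isinstance(obj, dict):
        # Tag the result so that {} and [] do not compare equal.
        return ('object', frozenset((k, freeze(v)) for k, v in obj.items()))
    if isinstance(obj, list):
        # Counter keeps multiplicities; its items form a hashable frozenset.
        counts = Counter(freeze(x) for x in obj)
        return ('array', frozenset(counts.items()))
    # Scalars (str, int, float, bool, None) are already hashable.
    # Caveat: Python treats True == 1, so JSON true and 1 compare equal here.
    return obj


def json_equal_unordered(a, b):
    """Compare two JSON strings without regard for order."""
    return freeze(json.loads(a)) == freeze(json.loads(b))
```

For example, `json_equal_unordered('[1, 2, {"a": [3, 4]}]', '[{"a": [4, 3]}, 2, 1]')` evaluates to `True`, while `'[1, 1, 2]'` and `'[1, 2, 2]'` compare unequal because the counts differ.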
# requires sdsl:
# git clone https://github.com/simongog/sdsl-lite.git
# cd sdsl-lite
# ./install.sh $HOME/.local
# uses pv to display progress (not essential)
# http://www.ivarch.com/programs/pv.shtml
all: fm-index indices
andreasvc / lineidx.py
Last active August 29, 2015 14:26
Benchmark of indexing of line offsets in text file.
"""Benchmark of indexing of line offsets in text file.
Usage example:
>>> index = indexfile_iter('1027.txt')
>>> index[5]
115
>>> import bisect
>>> bisect.bisect(index, 115) - 1
5
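The body of `indexfile_iter` is not shown in this preview. The usage example suggests an index where entry `i` holds the byte offset at which line `i` starts, so that `bisect` can map an offset back to a line number. A minimal sketch under that assumption follows; `index_line_offsets` and `line_of_offset` are hypothetical names, not the gist's functions.

```python
import bisect


def index_line_offsets(path):
    """Return a list whose i-th entry is the byte offset where line i starts.

    Assumption: lines are numbered from 0 and offsets are counted in bytes,
    so the file is opened in binary mode.
    """
    offsets = []
    pos = 0
    with open(path, 'rb') as f:
        for line in f:
            offsets.append(pos)
            pos += len(line)  # includes the trailing newline, if any
    return offsets


def line_of_offset(offsets, offset):
    """Return the number of the line that contains the given byte offset."""
    # bisect returns the insertion point to the right of any equal entry,
    # so subtracting 1 yields the last line starting at or before offset.
    return bisect.bisect(offsets, offset) - 1
```

Because the offsets are strictly increasing, the reverse lookup is a binary search and takes logarithmic time in the number of lines.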
Mary is really very happy --> is Mary really very happy?
0.50 S --> NP VP [(0, 1)]
0.50 S --> VP_2 NP [(0, 1, 0)]
1.00 VP --> V VP|<ADV> [(0, 1)]
1.00 VP_2 --> V VP|<ADV> [(0,), (1,)]
0.50 VP|<ADV> --> ADV VP|<ADV> [(0, 1)]
0.50 VP|<ADV> --> ADV ADJ [(0, 1)]
0.50 ADJ --> Epsilon ['happy']
0.50 ADJ --> Epsilon ['sad']
0.50 ADV --> Epsilon ['really']
maxlen 15 unfolded False arity marks True binarized collinize right h=1 v=1 tailmarker markovize rank > 3 estimator dop1
python -u runexp.py 56312.42s user 296.32s system 96% cpu 16:20:51.28 total
../disco-dop/interp0.dop
labeled f-measure : 76.12293144208039
unlabeled f-measure : 79.19621749408984
../disco-dop/interp1.dop
labeled f-measure : 68.59122401847574
unlabeled f-measure : 72.74826789838338
../disco-dop/interp2.dop
labeled f-measure : 69.07723459647092
andreasvc / array_bench.pyx
Created November 17, 2012 14:45
Benchmark array creation
import time
import numpy as np
cimport numpy as np
from libc.stdlib cimport malloc, free
from cpython.array cimport array, clone
cdef long N = 1000000
cdef double* ptr
cdef array ar, template = array('d')