@dimazest
dimazest / run.sh
Created March 28, 2014 13:23
cw14 presentation run script
cd phd-buildout/notebooks
time ../bin/corpora serafin plain-lsa --swda corpora/swda -v -j 2 --limit 10000 | less
dimazest / readme.rst
Last active August 29, 2015 13:57
fowler installation
python3.3 -m venv .env
source .env/bin/activate

 python3 ez_setup.py
 python3.3 get-pip.py

.env/local/bin/pip install numpy
.env/local/bin/pip install numexpr

Keybase proof

I hereby claim:

  • I am dimazest on github.
  • I am dimazest (https://keybase.io/dimazest) on keybase.
  • I have a public key whose fingerprint is C846 779A CCA3 0C10 B929 D763 B293 7D50 8BF7 FC77

To claim this, I am signing this object:

dimazest / index.py
Created March 12, 2014 12:19
Ordered indexer
from collections import OrderedDict


class Index(OrderedDict):
    """An indexer.

    For every unseen key, a unique integer ID is assigned.
    """

    def __getitem__(self, key):
        """Assign an integer ID to an unseen item."""
        try:
            return super().__getitem__(key)
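
The preview of index.py is cut off inside the `try` block. A minimal sketch of how such an indexer could be completed follows. The `except KeyError` branch and the choice of `len(self)` as the next ID are assumptions, not the gist's confirmed code.

```python
from collections import OrderedDict


class Index(OrderedDict):
    """Map each unseen key to a fresh integer ID (sketch, see assumptions above)."""

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            # Assumption: IDs are handed out in order of first appearance,
            # so the current size of the mapping is the next free ID.
            new_id = len(self)
            self[key] = new_id
            return new_id


index = Index()
ids = [index[word] for word in ["a", "b", "a", "c"]]
print(ids)          # repeated keys reuse their ID
print(list(index))  # keys stay in insertion order
```

Looking a key up is enough to register it, so a corpus can be indexed in a single pass without a separate "add" step.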
#!/opt/local/bin/python2.7
"""My awesome script."""
import opster


@opster.command()
def hello():
    """A hello world implementation."""
    print 'Hello world!'
@view.parallel(block=False)
def cosine_similarity(rows):
    from sklearn.metrics import pairwise
    print('hi!')
    return [len(rows)]
@functools.lru_cache(maxsize=None)
def compute_tensor_dot(word, other_word=None):
[dimazest@mac tools]$ bin/fowler.corpora-py t.py
usage: t.py <command> [options]
commands:
 dat         Dialog act tagging
 help        Show help for a given help topic or a help overview.
 similarity  Word similarity
[dimazest@mac tools]$ bin/fowler.corpora-py t.py dat -h
usage: t.py dat <command> [options]

Input data

I hope this doesn't make me sick..

First step: get all the 7-grams (3 words before, 3 words after)
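
The first step could be sketched as a sliding window over a token list. The function name `seven_grams`, the whitespace tokenization, and the decision to skip words that lack a full 3-word context on either side are assumptions made for illustration. The original only states the window shape.

```python
def seven_grams(tokens, before=3, after=3):
    """Yield (left_context, word, right_context) for every full window."""
    width = before + 1 + after
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        # The target word sits in the middle of the window.
        yield tuple(window[:before]), window[before], tuple(window[before + 1:])


# Assumption: tokens come from a plain whitespace split.
tokens = "the quick brown fox jumps over the lazy dog".split()
grams = list(seven_grams(tokens))
for left, word, right in grams:
    print(left, word, right)
```

Words near the start or end of the sequence get no window here. Padding the sequence with boundary markers would be an alternative if those words are needed.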