Skip to content

Instantly share code, notes, and snippets.

View geovedi's full-sized avatar

jim geovedi geovedi

View GitHub Profile
@odashi
odashi / mert.py
Last active May 1, 2016 14:17
Minimum error-rate training for statistical machine translation
#!/usr/bin/python3
import math
import random
import sys
from argparse import ArgumentParser
from collections import defaultdict
from util.functions import trace
def parse_args():
@lepture
lepture / emoji.py
Created March 10, 2012 15:54
emoji support in python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# Copyright (c) 2012, lepture.com
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions
# are met:
#
# * Redistributions of source code must retain the above copyright
from numpy.random import choice as random_choice, randint as random_randint, rand
MAX_INPUT_LEN = 40
AMOUNT_OF_NOISE = 0.2 / MAX_INPUT_LEN
CHARS = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .")
def add_noise_to_string(a_string, amount_of_noise):
"""Add some artificial spelling mistakes to the string"""
if rand() < amount_of_noise * len(a_string):
# Replace a character with a random character
random_char_position = random_randint(len(a_string))
@odashi
odashi / bleu.py
Last active September 20, 2019 06:46
BLEU calculator
# usage (single sentence):
# ref = ['This', 'is', 'a', 'pen', '.']
# hyp = ['There', 'is', 'a', 'pen', '.']
# stats = get_bleu_stats(ref, hyp)
# bleu = calculate_bleu(stats) # => 0.668740
#
# usage (multiple sentences):
# stats = defaultdict(int)
# for ref, hyp in zip(refs, hyps):
# for k, v in get_bleu_stats(ref, hyp).items():
@basaundi
basaundi / multi_bleu.py
Last active September 20, 2020 07:28
python rewrite of Moses' multi-bleu.perl; usable as a library
#!/usr/bin/env python
# Ander Martinez Sanchez
from __future__ import division, print_function
from math import exp, log
from collections import Counter
def ngram_count(words, n):
if n <= len(words):
@odashi
odashi / chainer_encoder_decoder.py
Last active January 22, 2021 14:03
Training and generation processes for neural encoder-decoder machine translation.
#!/usr/bin/python3
import datetime
import sys
import math
import numpy as np
from argparse import ArgumentParser
from collections import defaultdict
from chainer import FunctionSet, Variable, functions, optimizers
@mangecoeur
mangecoeur / a-conda-workon-tool.md
Last active February 9, 2021 14:53
A "virtualenv activate" for Anaconda environments

A "virtualenv activate" for Anaconda environments

I've been using the Anaconda python package from continuum.io recently and found it to be a good way to get all the complex compiled libs you need for a scientific python environment. Even better, their conda tool lets you create environments much like virtualenv, but without having to re-compile stuff like numpy, which gets old very very quickly with virtualenv and can be a nightmare to get correctly set up on OSX.

The only thing missing was an easy way to switch environments - their docs suggest running python executables from the install folder, which I find a bit of a pain. Coincidentally I came across this article - Virtualenv's bin/activate is Doing It Wrong - which desribes a simple way to launch a sub-shell with certain environment variables set. Now simple was the key word for me since my bash-fu isn't very strong, but I managed to come up with the script below. Put this in a text file called conda-work

@fperez
fperez / TwitterGraphs.ipynb
Last active October 25, 2021 22:21
Exploring graph properties of the Twitter stream with twython, networkx and IPython
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@mbollmann
mbollmann / state_transfer_lstm.py
Created June 18, 2016 08:59
StateTransferLSTM for Keras 1.x
# Source:
# https://github.com/farizrahman4u/seq2seq/blob/master/seq2seq/layers/state_transfer_lstm.py
from keras import backend as K
from keras.layers.recurrent import LSTM
class StateTransferLSTM(LSTM):
"""LSTM with the ability to transfer its hidden state.
This layer behaves just like an LSTM, except that it can transfer (or
@entron
entron / imdb_cnn_kim_small_embedding.py
Last active September 16, 2023 16:23
Keras implementation of Kim's paper "Convolutional Neural Networks for Sentence Classification" with a very small embedding size. The test accuracy is 0.853.
'''This scripts implements Kim's paper "Convolutional Neural Networks for Sentence Classification"
with a very small embedding size (20) than the commonly used values (100 - 300) as it gives better
result with much less parameters.
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_cnn.py
Get to 0.853 test accuracy after 5 epochs. 13s/epoch on Nvidia GTX980 GPU.
'''
from __future__ import print_function