IKEGAMI Yukino ikegami-yukino

@tily
tily / detect_rhyme.rb
Created July 31, 2011 13:36
Extract end rhymes from Japanese text (vowel version)
#!/usr/bin/env ruby
# Usage: ruby detect_rhyme.rb /path/to/file.txt num
# Example: ruby detect_rhyme.rb 坊ちゃん.txt 3
require 'MeCab'
def main(args)
path, num = args[0], args[1].to_i
rhyme = {}
File.open(path).each do |line|
node_list = get_node_list(line)
@phelrine
phelrine / hirakana.sh
Created August 30, 2011 18:01
Kanji to hiragana conversion
#!/bin/sh
mecab --node-format="%f[7] " | nkf -w --hiragana | sed 's/EOS//g'
@stober
stober / softmax.py
Created March 1, 2012 03:05
Softmax in Python
#! /usr/bin/env python
"""
Author: Jeremy M. Stober
Program: SOFTMAX.PY
Date: Wednesday, February 29 2012
Description: Simple softmax function.
"""
import numpy as np
npa = np.array
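The preview above is cut off before the function body, so the following is only a minimal sketch of a softmax, not the gist author's implementation. It assumes plain Python lists and the standard `math` module rather than NumPy, and the name `softmax` and its signature are chosen here for illustration.

```python
import math

def softmax(scores):
    """Map a list of real-valued scores to probabilities that sum to 1."""
    if not scores:
        return []
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
print(probs)  # larger scores receive larger probabilities
```

Subtracting the maximum score does not change the result, because the common factor cancels between numerator and denominator.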
@ttezel
ttezel / gist:4138642
Last active May 7, 2024 13:34
Natural Language Processing Notes

# A Collection of NLP notes

## N-grams

### Calculating unigram probabilities:

P( wi ) = count ( wi ) / count ( total number of words )

In English..
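The unigram formula above can be sketched in a few lines of Python. This is an illustrative example, not part of the original notes: the helper name `unigram_probabilities` and the whitespace tokenization of the sample sentence are assumptions.

```python
from collections import Counter

def unigram_probabilities(tokens):
    """Return P(w) = count(w) / total number of tokens for each distinct token."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}

# Toy corpus, split on whitespace (an assumption made for this example).
tokens = "the cat sat on the mat".split()
probs = unigram_probabilities(tokens)
print(probs["the"])  # "the" accounts for 2 of the 6 tokens
```

Because every token is counted exactly once, the resulting probabilities sum to 1 over the vocabulary.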

{
"IAB1": "Arts & Entertainment",
"IAB1-1": "Books & Literature",
"IAB1-2": "Celebrity Fan/Gossip",
"IAB1-3": "Fine Art",
"IAB1-4": "Humor",
"IAB1-5": "Movies",
"IAB1-6": "Music",
"IAB1-7": "Television",
"IAB2": "Automotive",
@aflc
aflc / edit_distance.cpp
Last active May 4, 2019 18:38
fast implementation of the edit distance (levenshtein distance).
// Copyright (c) 2013 Hiroyuki Tanaka
// Released under the MIT license
#include <stdint.h>
#include <cstdlib>
#include <cstring>
#include <string>
#include <map>
#include <vector>
#include <iostream>
@Mekajiki
Mekajiki / Hiragana2Phoneme.java
Last active October 8, 2023 22:48
Convert hiragana into the phoneme representation (.htkdic) used by the speech recognition application Julius
package net.mekajiki;
import com.ibm.icu.text.Transliterator;
import java.util.ArrayList;
import java.util.List;
public class Hiragana2Phoneme {
public static String hiragana2Phoneme(String text) {
return romaji2Phoneme(hiragana2Romaji(text));
}
@rezoo
rezoo / caffe.md
Last active November 4, 2021 15:28

Caffe tutorial

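This document explains how to extract feature vectors and train parameters using Caffe, a CNN implementation.

Features supported by Caffe

Caffe is a good choice if you want to do any of the following:

  • Multi-class image classification using CNNs
  • Feature vector extraction using CNNs
  • Transfer learning with CNNs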
  • Stacked Auto Encoder
@neubig
neubig / lstm-lm.py
Last active August 23, 2017 09:18
This is a minimal implementation of training for a language model using long short-term memory (LSTM) neural networks
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# This is a simplified implementation of the LSTM language model (by Graham Neubig)
#
# LSTM Neural Networks for Language Modeling
# Martin Sundermeyer, Ralf Schlüter, Hermann Ney
# InterSpeech 2012
#
# The structure of the model is extremely simple. At every time step we
@raine
raine / .gitconfig
Created February 27, 2015 10:12
git add with grep
[alias]
grep-add = "!sh -c 'git ls-files -m -o --exclude-standard | grep $1 | xargs git add' -"
grep-add-patch = "!sh -c 'git add -p `git ls-files -m -o --exclude-standard | grep $1`' -"