@edobashira
edobashira / decompose.cc
Created April 23, 2011 03:43
Using an explicit state table to extract internal paths from a composed FST (OpenFst)
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
@edobashira
edobashira / boolean-arc.cc
Created April 3, 2011 10:34
boolean-weight for OpenFst
// boolean-arc.cc
// Compile with: g++ -fPIC boolean-arc.cc -o boolean-arc.so -shared -O2
// Make sure boolean-arc.so is on the LD_LIBRARY_PATH
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
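The preview above ends before the weight class itself appears, so the following is only a minimal, standalone sketch of what a Boolean semiring looks like, not the gist's actual code. It assumes the usual definition (Plus is logical OR, Times is logical AND, Zero is false, One is true). The name `BooleanWeightSketch` is hypothetical, and the type deliberately does not depend on OpenFst or implement the rest of the interface OpenFst expects of a weight.

```cpp
// Hypothetical standalone type for illustration; it is not the gist's class
// and does not plug into OpenFst as written.
class BooleanWeightSketch {
 public:
  explicit BooleanWeightSketch(bool value = false) : value_(value) {}

  // Additive identity: Plus(x, Zero()) == x.
  static BooleanWeightSketch Zero() { return BooleanWeightSketch(false); }
  // Multiplicative identity: Times(x, One()) == x.
  static BooleanWeightSketch One() { return BooleanWeightSketch(true); }

  bool Value() const { return value_; }

 private:
  bool value_;
};

// Semiring addition: logical OR (a path exists if either alternative exists).
inline BooleanWeightSketch Plus(const BooleanWeightSketch& a,
                                const BooleanWeightSketch& b) {
  return BooleanWeightSketch(a.Value() || b.Value());
}

// Semiring multiplication: logical AND (a path exists only if every arc does).
inline BooleanWeightSketch Times(const BooleanWeightSketch& a,
                                 const BooleanWeightSketch& b) {
  return BooleanWeightSketch(a.Value() && b.Value());
}

inline bool operator==(const BooleanWeightSketch& a,
                       const BooleanWeightSketch& b) {
  return a.Value() == b.Value();
}
```

With these definitions, Zero annihilates under Times and is neutral under Plus, which is the property that lets a weight of this kind answer "is there any accepting path?" questions.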
@edobashira
edobashira / kaldi-arc.cc
Created August 10, 2012 05:39
Shim code to create a shared object to register the Kaldi lattice arc type with the OpenFst command line tools
// Compile with something like the following (assumes the Kaldi/OpenFst headers are on the include path):
// g++ -g kaldi-arc.cc -o lattice4-arc.so \
// -shared -I/path/to/kaldi/src -DHAVE_ATLAS -fPIC
// Add the directory containing lattice4-arc.so to LD_LIBRARY_PATH
#include <fst/const-fst.h>
#include <fst/edit-fst.h>
#include <fst/vector-fst.h>
#include <fst/script/register.h>
#include <fst/script/fstscript.h>
@edobashira
edobashira / utf8helpers.cc
Created April 1, 2011 06:08
Convert UTF-8 to UTF-16 and vice versa in C++
wstring UTF8ToUTF16(const string& utf8) {
  wstring utf16;
  utf16.reserve(utf8.size());
  for ( size_t i = 0; i < utf8.size(); ++i ) {
    unsigned char ch0 = utf8[i];
    if ( (ch0 & 0x80) == 0x00 ) {
      utf16 += ((ch0 & 0x7f));
    } else {
      if ((ch0 & 0xe0) == 0xc0) {
        unsigned char ch1 = utf8[++i];
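The preview above is cut off partway through the two-byte case. As a rough illustration of where a decoder like this usually goes, here is a self-contained sketch that handles one- to four-byte sequences and emits surrogate pairs for code points above U+FFFF. It is not the gist's code: the name `Utf8ToUtf16Sketch` is hypothetical, it returns `std::u16string` rather than `wstring` (because `wchar_t` is not 16 bits on every platform), and it throws on malformed input, which is an assumption about error handling.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper, not taken from the gist. Decodes UTF-8 bytes into
// UTF-16 code units. It rejects bad lead bytes, bad continuation bytes,
// truncated input, and code points above U+10FFFF, but it does NOT reject
// overlong encodings or UTF-8-encoded surrogates.
std::u16string Utf8ToUtf16Sketch(const std::string& utf8) {
  std::u16string utf16;
  utf16.reserve(utf8.size());
  std::size_t i = 0;
  while (i < utf8.size()) {
    const unsigned char ch0 = static_cast<unsigned char>(utf8[i]);
    std::uint32_t cp = 0;   // code point being assembled
    std::size_t extra = 0;  // number of continuation bytes expected
    if ((ch0 & 0x80) == 0x00) {         // 0xxxxxxx
      cp = ch0;
    } else if ((ch0 & 0xe0) == 0xc0) {  // 110xxxxx
      cp = ch0 & 0x1f;
      extra = 1;
    } else if ((ch0 & 0xf0) == 0xe0) {  // 1110xxxx
      cp = ch0 & 0x0f;
      extra = 2;
    } else if ((ch0 & 0xf8) == 0xf0) {  // 11110xxx
      cp = ch0 & 0x07;
      extra = 3;
    } else {
      throw std::invalid_argument("invalid UTF-8 lead byte");
    }
    if (extra > utf8.size() - i - 1) {
      throw std::invalid_argument("truncated UTF-8 sequence");
    }
    for (std::size_t k = 1; k <= extra; ++k) {
      const unsigned char ch = static_cast<unsigned char>(utf8[i + k]);
      if ((ch & 0xc0) != 0x80) {  // continuation bytes look like 10xxxxxx
        throw std::invalid_argument("invalid UTF-8 continuation byte");
      }
      cp = (cp << 6) | (ch & 0x3f);
    }
    i += extra + 1;
    if (cp > 0x10ffff) {
      throw std::invalid_argument("code point out of Unicode range");
    }
    if (cp >= 0x10000) {
      // Encode as a surrogate pair: high 10 bits, then low 10 bits.
      cp -= 0x10000;
      utf16.push_back(static_cast<char16_t>(0xd800 + (cp >> 10)));
      utf16.push_back(static_cast<char16_t>(0xdc00 + (cp & 0x3ff)));
    } else {
      utf16.push_back(static_cast<char16_t>(cp));
    }
  }
  return utf16;
}
```

For example, the four bytes `F0 9F 98 80` (U+1F600) decode to the two code units `0xD83D 0xDE00`, while the two bytes `C3 A9` (U+00E9) decode to the single unit `0x00E9`.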
@edobashira
edobashira / gist:6549668
Last active December 22, 2015 23:49
PhiMatcher skip matching
//Compile something like this
//g++ main.cc -o main -O2 -lfst
#include <iostream>
#include <fst/compose.h>
#include <fst/vector-fst.h>
using namespace std;
using namespace fst;
@edobashira
edobashira / Makefile
Created June 19, 2013 05:53
Simple tool to convert a Kaldi lattice to HTK SLF format using Kaldi and OpenLat
KALDI_ROOT=
OPENLAT_ROOT=
all: lattice-to-htk
ifndef KALDI_ROOT
$(error KALDI_ROOT is not set)
endif
ifndef OPENLAT_ROOT
@edobashira
edobashira / Makefile
Created August 12, 2012 05:45
Example showing how to register a custom Fst type with OpenFst
CXX = g++
CC = g++
CXXFLAGS = -c -fPIC -Isrc/include/ -g -O2
LDFLAGS = -L/usr/local/lib/ -ldl -lfst -lfstscript
objs = mmap-fst.o
all : mmap-fst
@edobashira
edobashira / kaldi.bc
Created August 8, 2012 12:33
Kaldi bash completions
_sgmm2-gselect()
{
    local cur prev opts filters len pprev
    COMPREPLY=()
    cur="${COMP_WORDS[COMP_CWORD]}"
    prev="${COMP_WORDS[COMP_CWORD-1]}"
    if (( $COMP_CWORD > 2)) ; then
        pprev="${COMP_WORDS[COMP_CWORD-2]}"
    else
        pprev="NULL"
@edobashira
edobashira / chrome_speech_input
Created April 16, 2011 00:19
Minimal Speech HTML example
<!DOCTYPE html>
<html lang="en">
<head>
<title>Speech recognition test</title>
</head>
<body>
<input type="text" x-webkit-speech />
</body>
</html>
@edobashira
edobashira / StringEx.cs
Created February 25, 2011 04:54
String extension methods - split on whitespace
public static class StringEx
{
    public static String[] SplitOnWhiteSpace(this String s)
    {
        return s.Split(new String[] { " ", "\t", "\f", "\n", "\r", "\v" },
                       StringSplitOptions.RemoveEmptyEntries);
    }
}