benui benui-dev

#!/usr/bin/env ruby -Ku
# encoding: utf-8
require 'rubygems'
require 'dm-core'
DataMapper::Logger.new($stdout, :debug)
DataMapper.setup(:default, 'sqlite3::memory:')
my $cursor = $collection->find(
    { run_mode => 'production' },
    { sort_by => { start_time => 1 } },
)->fields( { system_id => 1, main_stage => { duration => 1 } } );
# Returns everything under "main_stage", not just "duration".
# Is it possible to do nested field spec like this?
use strict;
use warnings;
use Data::Dumper;
# Assume that we've got sentence-aligned stuff
my $sentence_pairs = [
[ [ qw( the book ) ], [ qw( das buch ) ], ],
# [ [ qw( the house ) ], [ qw( das haus ) ], ],
ben@mini:~ $ sudo gem install rocco
Building native extensions. This could take a while...
ERROR: Error installing rocco:
ERROR: Failed to build gem native extension.
/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby extconf.rb
mkmf.rb can't find header files for ruby at /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/ruby.h
Gem files will remain installed in /Library/Ruby/Gems/1.8/gems/rdiscount-1.6.5 for inspection.
# Hack to make Carp::Assert work with Log4perl
# Stops warnings like "Prototype mismatch: sub Foo::DEBUG: none vs () at ..."
BEGIN {
    use Carp::Assert;
    use Carp::Assert::More;
    *ADEBUG = *DEBUG;
    undef *DEBUG;
};
benui-dev / MongoDB_Encoding.pl
Created September 3, 2010 06:34
MongoDB and Perl utf8 fun
use strict;
use warnings;
use utf8;
use MongoDB;
my $conn = MongoDB::Connection->new(host => 'unixdeva11', port => 21337);
my $db = $conn->get_database('foo');
my $coll = $db->get_collection('bar');
benui-dev / gist:563610
Created September 3, 2010 08:31
MongoDB, Perl, UTF8: Trying to demonstrate the confusion when using encoding and MongoDB. Not clear what's going on yet.
# I admit I'm not really sure what's going on here
# UTF8 in Perl still confuses the hell out of me
# But I'm not sure what MongoDB is doing or trying to do either
# It seems that it's treating keys and values differently
# Namely, by encoding values and not encoding keys.
# Can anyone suggest what I *should* be doing?
# I want to store UTF8 data and get back UTF8 data
use strict;
require 'pp'
input = <<eos
Hello there
eos
# Randomise order of a-z
rand_characters = ('a'..'z').sort_by { rand }
encoding = Hash.new
# Transliterate-hacked from Perl
# http://blog.naver.com/PostView.nhn?blogId=mokomoji&logNo=130013133481
$KCODE = 'UTF8'
class String
  # I think in the original the text was forced to cp949...
  def split_korean
    # ㄱ ㄲ ㄴ ㄷ ㄸ ㄹ ㅁ ㅂ ㅃ ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ
    chosung = [0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142, 0x3143, 0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314a, 0x314b, 0x314c, 0x314d, 0x314e]
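The Ruby preview above is cut off after the chosung table, so the rest of `split_korean` is not visible. As a rough Python sketch of how such a table is typically used, the snippet below maps each precomposed Hangul syllable to its initial consonant. The arithmetic (syllable block starting at U+AC00, 21 x 28 = 588 syllables per initial consonant) comes from the Unicode layout of Hangul syllables; the function names `initial_consonant` and `initials` are hypothetical and not taken from the original gist, which may well split out the vowel and final consonant too.

```python
# Compatibility jamo for the 19 initial consonants (chosung),
# in the same order as the table in the Ruby snippet:
# ㄱ ㄲ ㄴ ㄷ ㄸ ㄹ ㅁ ㅂ ㅃ ㅅ ㅆ ㅇ ㅈ ㅉ ㅊ ㅋ ㅌ ㅍ ㅎ
CHOSUNG = [0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141,
           0x3142, 0x3143, 0x3145, 0x3146, 0x3147, 0x3148, 0x3149,
           0x314a, 0x314b, 0x314c, 0x314d, 0x314e]

HANGUL_BASE = 0xAC00              # first precomposed syllable
HANGUL_LAST = 0xD7A3              # last precomposed syllable
SYLLABLES_PER_INITIAL = 21 * 28   # 21 vowels x 28 final-consonant slots


def initial_consonant(char):
    """Return the initial consonant of a Hangul syllable.

    Characters outside the precomposed syllable block are returned unchanged.
    (Hypothetical helper, not part of the original gist.)
    """
    code = ord(char)
    if not HANGUL_BASE <= code <= HANGUL_LAST:
        return char
    index = (code - HANGUL_BASE) // SYLLABLES_PER_INITIAL
    return chr(CHOSUNG[index])


def initials(text):
    """Replace every Hangul syllable in text with its initial consonant."""
    return "".join(initial_consonant(c) for c in text)


if __name__ == "__main__":
    print(initials("한글"))
```

This only covers the chosung step; the medial vowel and final consonant would use `(offset % 588) // 28` and `offset % 28` against their own tables, which the truncated preview does not show.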
benui-dev / perceptron.py
Created June 6, 2011 02:36
Playing with making a simple perceptron in Python. Used Foundations of Statistical Natural Language Processing as a reference.
# Ben's Magical Perceptron
def dot_product(a, b):
    return sum(a[i] * b[i] for i in range(len(a)))

def decision(x, w, theta):
    return dot_product(x, w) > theta
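The preview stops at the decision function, so the training part of the gist is not visible here. The following is a minimal sketch of a standard perceptron update loop built on the same two functions. It is an assumption about how training might look, not the author's code: the `train` function, its `max_epochs` parameter, and the AND-gate data are all hypothetical.

```python
def dot_product(a, b):
    return sum(a[i] * b[i] for i in range(len(a)))


def decision(x, w, theta):
    return dot_product(x, w) > theta


def train(examples, n_features, max_epochs=100):
    """Fit weights w and threshold theta with the classic perceptron rule.

    examples is a list of (x, label) pairs, where x is a list of numbers
    and label is True or False. (Hypothetical helper, not from the gist.)
    """
    w = [0.0] * n_features
    theta = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in examples:
            if decision(x, w, theta) == label:
                continue
            mistakes += 1
            # Move the boundary towards a missed positive example,
            # away from a false positive.
            sign = 1.0 if label else -1.0
            w = [wi + sign * xi for wi, xi in zip(w, x)]
            theta -= sign
        if mistakes == 0:
            break  # every example classified correctly
    return w, theta


if __name__ == "__main__":
    # Logical AND is linearly separable, so the loop converges.
    and_gate = [([0, 0], False), ([0, 1], False),
                ([1, 0], False), ([1, 1], True)]
    w, theta = train(and_gate, n_features=2)
    for x, label in and_gate:
        print(x, decision(x, w, theta), label)
```

The loop only stops early on linearly separable data; for anything else it runs until `max_epochs` and returns whatever weights it has at that point.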