Pete Rai pete-rai

@pete-rai
pete-rai / cleanse.php
Created February 7, 2018 21:09
A text cleansing function that is useful for preparing strings prior to lexical analysis.
<?php
function cleanse ($text)
{
    $text = iconv ('UTF-8', 'ASCII//TRANSLIT//IGNORE', $text); // transliterate accented characters to plain ASCII
    $text = preg_replace ('/[\r\n\s\t]+/xms', ' ', $text);      // collapse each run of whitespace to a single space
    $text = preg_replace ('/[^\w\s]+/xms', '', $text);         // remove punctuation (underscores survive, since \w matches them)
    return strtolower (trim ($text));                          // lowercase and trim
}
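The same cleansing steps can be sketched in JavaScript, the language of the other gists on this page. This is a rough counterpart, not a port of the author's code: it swaps `iconv` transliteration for Unicode NFD normalisation followed by removal of combining marks, so characters with no decomposition (for example "ß", which `iconv` may transliterate) are simply dropped by the punctuation step.

```javascript
// Hypothetical JavaScript counterpart of the PHP cleanse() function.
// Assumption: NFD normalisation stands in for iconv's //TRANSLIT.
function cleanse(text) {
  return text
    .normalize('NFD')                 // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .replace(/\s+/g, ' ')             // collapse each run of whitespace to one space
    .replace(/[^\w\s]+/g, '')         // remove punctuation (underscores are kept)
    .trim()
    .toLowerCase();
}

console.log(cleanse('  Héllo,\n  Wörld!  ')); // prints "hello world"
```

The output is a lowercase, single-spaced string that can be split on spaces to obtain word tokens.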
pete-rai / loglikelihood.php
Created February 7, 2018 21:00
Log-likelihood is a statistical technique that helps identify significant words in a given body of text when compared with a wider corpus. More information at: https://github.com/pete-rai/words-of-our-culture#log-likelihood
<?php
// for more info see : http://ucrel.lancs.ac.uk/llwizard.html
// $n1 = total words in corpus 1 (usually the normative corpus)
// $n2 = total words in corpus 2
// $o1 = observed count for the word in corpus 1 (usually the normative corpus)
// $o2 = observed count for the word in corpus 2
function logLikelihood ($n1, $o1, $n2, $o2)
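The listing cuts off before the function body. As a rough illustration only, the standard two-corpus log-likelihood statistic described on the UCREL page can be sketched as follows, in JavaScript rather than PHP. The expected counts `e1` and `e2` and the helper `term` are names introduced here; whether the author's version signs or otherwise adjusts the result is not shown in the source.

```javascript
// Sketch of the standard log-likelihood (G2) statistic for one word in two corpora.
// Parameter order mirrors the PHP signature: (n1, o1, n2, o2).
// e1, e2 and term are illustrative names, not taken from the original gist.
function logLikelihood(n1, o1, n2, o2) {
  // Expected counts if the word were equally frequent in both corpora
  const e1 = n1 * (o1 + o2) / (n1 + n2);
  const e2 = n2 * (o1 + o2) / (n1 + n2);

  // A zero observation contributes nothing (the limit of o * ln(o / e) as o -> 0)
  const term = (o, e) => (o > 0 ? o * Math.log(o / e) : 0);

  return 2 * (term(o1, e1) + term(o2, e2));
}

console.log(logLikelihood(1000, 10, 1000, 10)); // prints 0 (same relative frequency)
```

Larger values indicate a bigger difference in relative frequency between the two corpora; a value of 0 means the word is equally frequent in both.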
pete-rai / pretty.js
Created February 7, 2018 20:56
A simple, concise, dependency-free function for pretty-printing JSON output.
/*
use the following in your css file
.json-key { color: red; }
.json-value { color: blue; }
.json-string { color: green; }
set your output like this:
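The listing is truncated at this point, so the author's function is not shown. A minimal sketch of how such a pretty-printer might be written, assuming it wraps keys, strings and other primitive values in spans carrying the three CSS classes above, could look like this. The names `escapeHtml`, `span` and `prettyJson` are hypothetical, and the input is assumed to be JSON-compatible data.

```javascript
// Hypothetical sketch: render JSON-compatible data as indented HTML,
// tagging parts with the json-key / json-value / json-string classes.
function escapeHtml(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function span(cls, text) {
  return `<span class="${cls}">${escapeHtml(text)}</span>`;
}

function prettyJson(value, indent = 0) {
  const pad = '  '.repeat(indent);
  const inner = '  '.repeat(indent + 1);

  if (Array.isArray(value)) {
    if (value.length === 0) return '[]';
    const items = value.map(v => inner + prettyJson(v, indent + 1));
    return '[\n' + items.join(',\n') + '\n' + pad + ']';
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value);
    if (keys.length === 0) return '{}';
    const items = keys.map(k =>
      inner + span('json-key', JSON.stringify(k)) + ': ' + prettyJson(value[k], indent + 1));
    return '{\n' + items.join(',\n') + '\n' + pad + '}';
  }
  if (typeof value === 'string') return span('json-string', JSON.stringify(value));
  return span('json-value', String(value)); // numbers, booleans, null
}

console.log(prettyJson({ name: 'pete', tags: ['a', 'b'], active: true }));
```

The returned string is meant to be placed inside a `<pre>` element so that the line breaks and indentation are preserved.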
pete-rai / polly.js
Last active February 26, 2020 14:27
A wrapper around Amazon Polly that makes text-to-speech in Node.js super simple.
'use strict'
/*
first install the following node modules:
npm install aws-sdk
npm install stream   (optional: stream is a Node.js core module, so require ('stream') works without it)
npm install speaker