Nay San fauxneticien

@fauxneticien
fauxneticien / formants_in_csv.praat
Created October 5, 2016 10:39
Generate a CSV file of formant data using Praat
# Use Praat formant tracker with default settings
#
# Arguments
# input_file: Full file path to location of .wav file
# output_file: Full file path and file name, .csv file recommended
#
#
# Sample usage from command line (OS X)
# /Applications/Praat.app/Contents/MacOS/Praat --run "/Users/Nay/get_formants_as_csv.praat" "/Users/Nay/Desktop/sandboxes/b001_file01.wav" "/Users/Nay/Desktop/sandboxes/b001_file01.csv"
#
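The command-line call above can also be assembled from another script. A minimal Python sketch, assuming the Praat binary path and the `--run` invocation shown in the sample usage; `praat_formants_command` and `run_praat_formants` are hypothetical helper names, not part of the gist:

```python
import subprocess

# Path taken from the sample usage above; adjust for your system.
PRAAT_BIN = "/Applications/Praat.app/Contents/MacOS/Praat"


def praat_formants_command(script_path, input_file, output_file, praat_bin=PRAAT_BIN):
    """Build the argument list for running the Praat script on one .wav file."""
    return [praat_bin, "--run", script_path, input_file, output_file]


def run_praat_formants(script_path, input_file, output_file):
    """Run Praat; raises CalledProcessError if it exits with a non-zero status."""
    cmd = praat_formants_command(script_path, input_file, output_file)
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when a path contains spaces.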
fauxneticien / split_media_by_eaf_tier.R
Created January 25, 2017 06:24
Split a media file associated with an ELAN file according to some tier
#!/usr/bin/env Rscript
# Split a media file associated with an ELAN file according to some tier
#
# Dependencies: ffmpeg, tidyverse (R Package)
#
# Usage: Rscript split_media_by_eaf_tier.R eaf_file tier_name
# Example: Rscript split_media_by_eaf_tier.R example.eaf tier_A
# Load packages; install 'tidyverse' first if needed (it includes both xml2 and stringr)
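The preview stops before the splitting logic, so the following is only a sketch of the general idea in Python, not the gist's R code. It assumes the standard EAF layout (`TIME_SLOT` elements holding millisecond values, `ALIGNABLE_ANNOTATION` elements referencing them) and uses ffmpeg's `-ss` and `-to` options; `tier_segments` and `ffmpeg_command` are hypothetical names:

```python
import xml.etree.ElementTree as ET


def tier_segments(eaf_xml, tier_name):
    """Return (start_ms, end_ms, value) for each time-aligned annotation on a tier."""
    root = ET.fromstring(eaf_xml)
    # Map time slot IDs to millisecond offsets, skipping slots without a value.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    segments = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_name:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            if start is None or end is None:
                continue  # annotation is not anchored to known times
            value = ann.findtext("ANNOTATION_VALUE", default="")
            segments.append((start, end, value))
    return segments


def ffmpeg_command(media_file, start_ms, end_ms, out_file):
    """Build an ffmpeg call that cuts the span start_ms..end_ms out of media_file."""
    return [
        "ffmpeg", "-i", media_file,
        "-ss", f"{start_ms / 1000:.3f}",
        "-to", f"{end_ms / 1000:.3f}",
        out_file,
    ]
```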
fauxneticien / docker_on_ubuntu-16.04.sh
Created February 21, 2017 08:17
Install Docker on Ubuntu 16.04
sudo apt-get update
sudo apt-key adv --keyserver hkp://p80.pool.sks-keyservers.net:80 --recv-keys 58118E89F3A912897C070ADBF76221572C52609D
sudo apt-add-repository 'deb https://apt.dockerproject.org/repo ubuntu-xenial main'
sudo apt-get update
apt-cache policy docker-engine
sudo apt-get install -y docker-engine
docker info
fauxneticien / search-n-download-alveo-list.R
Created May 10, 2017 09:48
Download files with names matching pattern from an Alveo list
############################## 1. Set up ######################################
# Make sure you have the necessary packages installed (see section 2).
# The last number in the list URL, for example 1045 for:
# https://app.alveo.edu.au/item_lists/1045
alveo_list_id <- 904
# Regex pattern to search for; 'speaker16.wav$' means 'ending with speaker16.wav'
alveo_search_pattern <- "speaker16.wav$"
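To make the pattern's behaviour concrete, here is a small Python check (the gist itself is R, but this pattern reads the same way in both regex dialects). The file names are made up for illustration, and the escaped variant is an assumption about the intended match rather than something the gist uses:

```python
import re

# Pattern as written in the gist: "." matches any character, "$" anchors the end.
loose = re.compile(r"speaker16.wav$")
# Stricter variant (an assumption about intent): match a literal dot.
strict = re.compile(r"speaker16\.wav$")

# Hypothetical file names, for illustration only.
names = ["s1_speaker16.wav", "s1_speaker16.wav.bak", "s1_speaker16xwav"]

loose_hits = [n for n in names if loose.search(n)]
strict_hits = [n for n in names if strict.search(n)]
```

The unescaped pattern also accepts `s1_speaker16xwav`, which rarely matters for real file lists but is worth knowing.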
fauxneticien / eafs_vbind.sh
Last active June 26, 2017 07:13
Vertically bind all .eafs in a folder into a single XML file
#!/bin/bash
export SEARCH_FOLDER=/git-repos/asr-daan/komnzo_text
time find "$SEARCH_FOLDER" -name "*.eaf" |
parallel -j 4 '
grep -v "^<?xml" {} |
sed "s%<ANNOTATION_DOCUMENT \(.*\)>%<ANNOTATION_DOCUMENT \1 SRC=\"{}\">%g"
' |
awk 'BEGIN{print "<?xml version=\"1.0\" encoding=\"UTF-8\"?><EAFS_DB>"}
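The same binding steps can be sketched in Python for readers who do not have GNU parallel available. This mirrors the visible grep and sed steps; the closing `</EAFS_DB>` tag is an assumption, since the preview cuts off before the end of the awk program, and `bind_eafs` is a hypothetical name:

```python
import re


def bind_eafs(docs):
    """docs: iterable of (source_path, eaf_text) pairs. Returns one XML string."""
    parts = ['<?xml version="1.0" encoding="UTF-8"?><EAFS_DB>']
    for src, text in docs:
        # Drop each file's own XML declaration line (the grep -v step).
        lines = [ln for ln in text.splitlines() if not ln.startswith("<?xml")]
        body = "\n".join(lines)
        # Record the source file on the root element (the sed step).
        body = re.sub(
            r"<ANNOTATION_DOCUMENT (.*)>",
            lambda m: f'<ANNOTATION_DOCUMENT {m.group(1)} SRC="{src}">',
            body,
        )
        parts.append(body)
    # Assumed closing tag; the original awk END block is not shown.
    parts.append("</EAFS_DB>")
    return "\n".join(parts)
```

Like the shell version, this does not XML-escape the source path, so paths containing quotes or ampersands would need extra handling.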
fauxneticien / chunker.py
Last active July 13, 2017 01:08
Parse backslash-coded lexicon using a defined grammar
#!/usr/bin/python
# Usage: pass in a grammar from a text file, or define it chunk by chunk in the arguments that follow the output format
# python chunker.py < lexicon.txt "xml" $(cat grammar.txt)
# python chunker.py < lexicon.txt "json" "examples:{<text><translation>}" "headword:{<lx><ps><examples>*}"
import cStringIO, json, sys, xmltodict, xml.dom.minidom
from toolz.functoolz import pipe
from nltk.toolbox import ToolboxData
from xml.etree.ElementTree import ElementTree
fauxneticien / .block
Created September 18, 2017 04:46 — forked from mbostock/.block
Equirectangular (Plate Carrée)
license: gpl-3.0
height: 480
fauxneticien / add-50-rusers.R
Last active October 25, 2017 12:45
Generate shell script to add 50 users into rocker/verse
# Generates a script ~/init_users.sh which, when run,
# adds 50 users who can log into the Docker container running rocker/verse
# https://hub.docker.com/r/rocker/verse/
library(purrr)
library(stringr)
install.packages("glue")
library(glue)
fauxneticien / ecuder.js
Created October 19, 2017 05:35
Ecuder — the opposite of reduce
/*
// Helper to split an initial object into a nested array
// by Mark Ellison <https://github.com/tyrannomark> mostly!
ecuder([x => x.split(" ")])("one two")
ecuder([x => x.split(" "), x => x.split("")])("one two")
ecuder([x => x.split(/\.\s?/), x => x.split(" "), x => x.split("")])("sentence one. sentence two")
*/
function ecuder(funcs_array, initial_obj) {
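The function body is cut off in this preview, so the following Python sketch is only one way to get the behaviour shown in the usage comments, not the gist's JavaScript implementation. It follows the curried call style of those examples, and uses `list` where the JavaScript examples use `split("")`:

```python
def ecuder(funcs):
    """Return a function that unfolds one object into a nested list."""
    def unfold(obj, remaining):
        if not remaining:
            return obj
        split, rest = remaining[0], remaining[1:]
        # Apply the first splitter, then unfold each part with the rest.
        return [unfold(part, rest) for part in split(obj)]

    return lambda obj: unfold(obj, list(funcs))


words = ecuder([lambda s: s.split(" ")])("one two")
# words == ["one", "two"]
chars = ecuder([lambda s: s.split(" "), list])("one two")
# chars == [["o", "n", "e"], ["t", "w", "o"]]
```

Each function in the list adds one level of nesting, which is what makes this the mirror image of a reduce.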
fauxneticien / mhsF0_df.R
Last active October 25, 2017 05:43
Read in pitch tracks from .wav files as an R data frame
# This is a tidyverse-friendly wrapper function for wrassp::mhsF0
#
# Usage:
#
# dir(path = ".", pattern = "\\.wav$", full.names = TRUE) %>% mhsF0_df
# dir(path = ".", pattern = "\\.wav$", full.names = TRUE) %>% mhsF0_df(beginTime = 1, gender = "f")
#
# For list of arguments to wrassp::mhsF0, see https://www.rdocumentation.org/packages/wrassp/versions/0.1.4/topics/mhsF0
mhsF0_df <- function(fileList, ...) {