Josh Meyer JRMeyer


What the BookCorpus?

So in the midst of this era of "contextualized" language models named after Sesame Street characters and robots that transform into automobiles, there is this "Toronto Book Corpus" that points to this kinda recently influential paper:

Yukun Zhu, Ryan Kiros, Rich Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler. 2015. "Aligning books and movies: Towards story-like visual explanations by watching movies and reading books." In Proceedings of the IEEE international conference on computer vision, pp. 19-27.

Why do I even care, there's no translations there?

Some might know my personal pet peeve about collecting translation datasets, but this BookCorpus has no translations, so why do I even care about it?

@sshh12
sshh12 / MelSpec4Mic.py
Created February 17, 2019 19:34
Live mic -> live melspectrogram plot
import cv2
import numpy as np
import pyaudio
import librosa
import librosa.display
import matplotlib.pyplot as plt
import time
rate = 16000
chunk_size = rate // 4

Beyond LibriSpeech: About the amount of spoken content stored in LibriVox

Overview

Given that LibriVox contains enough English content for a speech processing corpus, LibriSpeech, to be built from it, I've wondered how much content LibriVox has in languages other than English.

I've downloaded the JSON API contents of LibriVox, separated the audiobooks according to their language, and summed up their lengths, obtaining a language breakdown expressed in spoken time.

This gave results of over 60 thousand hours for English, thousands of hours each for German, Dutch, French, and Spanish, and hundreds of hours for other languages.
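
The per-language summation described above could be sketched roughly as follows. This is a minimal illustration, not the original analysis code: it assumes each audiobook record has already been loaded from the JSON dump as a dict with a `language` field and a `totaltimesecs` field (duration in seconds). Those field names and the `hours_by_language` helper are assumptions made for the example.

```python
from collections import defaultdict


def hours_by_language(books):
    """Sum audiobook durations per language, returning hours.

    `books` is an iterable of dicts. The keys "language" and
    "totaltimesecs" are assumed field names, not confirmed by the text.
    """
    totals = defaultdict(float)
    for book in books:
        # Records with a missing or empty language are grouped together.
        language = book.get("language") or "unknown"
        # Treat a missing or empty duration as zero seconds.
        seconds = float(book.get("totaltimesecs") or 0)
        totals[language] += seconds / 3600.0
    # Sort languages from most to least spoken time.
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Made-up sample records, only to show the shape of the input.
sample = [
    {"language": "English", "totaltimesecs": 7200},
    {"language": "German", "totaltimesecs": 3600},
    {"language": "English", "totaltimesecs": 1800},
]

if __name__ == "__main__":
    for language, hours in hours_by_language(sample).items():
        print(f"{language}: {hours:.1f} h")
```

Keeping the download step separate from the aggregation means the same function can be rerun on a cached dump without hitting the API again.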

@darencard
darencard / gnuplot_quickstart.md
Created August 31, 2017 14:20
A quick-start guide for using gnuplot for in-terminal plotting

A quick-start guide for using gnuplot for in-terminal plotting

Sometimes it is really nice to just take a quick look at some data. However, when working on remote computers, it is a bit of a burden to move data files to a local computer to create a plot in something like R. One solution is to use gnuplot and make a quick plot that is rendered in the terminal. It isn't very pretty by default, but it gets the job done quickly and easily. There are also advanced gnuplot capabilities that aren't covered here at all.

gnuplot has its own internal syntax that can be fed in as a script, which I won't get into. Here is the very simplified gnuplot code we'll be using:

set terminal dumb size 120, 30; set autoscale; plot '-' using 1:3 with lines notitle

Let's break this down:

@awni
awni / ctc_decoder.py
Last active June 1, 2024 14:21
Example CTC Decoder in Python
"""
Author: Awni Hannun
This is an example CTC decoder written in Python. The code is
intended to be a simple example and is not designed to be
especially efficient.
The algorithm is a prefix beam search for a model trained
with the CTC loss function.
@MatthiasWinkelmann
MatthiasWinkelmann / extract.py
Last active March 5, 2024 06:19
Extract PNG images from the cifar-100/cifar-10 pickled dataset
# This extracts png images from the
# packed/pickle'd cifar-100 dataset
# available at http://www.cs.toronto.edu/~kriz/cifar.html
#
# No Rights Reserved/ CC0
# Say thanks @whereismatthi on Twitter if it's useful
#
# probably requires python3
# definitely requires PyPNG: pip3 install pypng
@kastnerkyle
kastnerkyle / install_tts.py
Last active December 1, 2021 17:27
Install speech toolkit and create features
from __future__ import print_function
import subprocess
import shutil
import os
import stat
import time
# This script looks extremely defensive, but *should* let you rerun at
# any stage along the way. Also a lot of code repetition due to eventual support
# for "non-blob" install from something besides the magic kk_all_deps.tar.gz
@protrolium
protrolium / ffmpeg.md
Last active June 15, 2024 01:28
ffmpeg guide

ffmpeg

Converting Audio into Different Formats / Sample Rates

Minimal example: transcode from MP3 to WMA:
ffmpeg -i input.mp3 output.wma

You can get the list of supported formats with:
ffmpeg -formats

You can get the list of installed codecs with:

tmux cheatsheet

As configured in my dotfiles.

start new:

tmux

start new with session name: