ananda seelan scarecrow1123

Beyond LibriSpeech: About the amount of spoken content stored in LibriVox

Overview

Given that LibriVox contains enough English content for a speech processing corpus, LibriSpeech, to be built from it, I've wondered how much content LibriVox has in languages other than English.

I've downloaded the JSON API contents of LibriVox, separated the audiobooks by language, and summed their lengths to obtain a language breakdown expressed in spoken time.

This gave over 60 thousand hours for English, thousands of hours each for German, Dutch, French, and Spanish, and hundreds of hours for other languages.
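
The grouping-and-summing step can be sketched roughly as follows. This is a minimal illustration, not the author's actual script: it assumes the downloaded records are already loaded as a list of dictionaries, and the field names `language` and `totaltimesecs` are assumptions about the record layout. The sample records are made up for demonstration.

```python
from collections import defaultdict


def hours_by_language(books):
    """Sum audiobook durations per language and return them in hours."""
    totals = defaultdict(int)
    for book in books:
        # "language" and "totaltimesecs" are assumed field names.
        language = book.get("language") or "Unknown"
        seconds = int(book.get("totaltimesecs") or 0)
        totals[language] += seconds
    # Convert accumulated seconds to hours.
    return {lang: secs / 3600 for lang, secs in totals.items()}


# Made-up sample records, only to show the shape of the input.
sample = [
    {"language": "English", "totaltimesecs": 7200},
    {"language": "German", "totaltimesecs": 3600},
    {"language": "English", "totaltimesecs": 1800},
]
breakdown = hours_by_language(sample)
for lang, hours in sorted(breakdown.items(), key=lambda kv: -kv[1]):
    print(f"{lang}: {hours:.1f} h")
```

Records with a missing language are grouped under "Unknown" rather than dropped, so the total stays comparable to the sum over all books.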

@fralau
fralau / key-pairs-cmdline.md
Last active December 6, 2023 15:10
Command line: create dictionary from key-value pairs

Converting an arbitrary list of key-value pairs from the command-line into a Python dictionary

Problem

You want to create a series of key-value pairs from the command line, using the argparse library, e.g.:

command par1 par2 --set foo=hello bar="hello world" baz=5

This is typically useful when you want to clearly distinguish:

  1. Ordinary arguments for the command-line utility itself (output, input, format, etc.) from
  2. A set of key-value pairs you want to pass to the Python application. This is especially useful when you do not want that set of values to be predetermined, as this can save a lot of code.
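
One possible way to do this with `argparse` is sketched below. It is a minimal sketch, not necessarily the approach the gist goes on to describe: the helper `parse_pairs` and the destination name `pairs` are hypothetical names chosen for illustration, and all values are kept as strings.

```python
import argparse


def parse_pairs(items):
    """Turn a list of 'key=value' strings into a dictionary."""
    result = {}
    for item in items or []:
        # Split on the first '=' only, so values may contain '='.
        key, sep, value = item.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {item!r}")
        result[key.strip()] = value
    return result


parser = argparse.ArgumentParser(prog="command")
parser.add_argument("params", nargs="*", help="ordinary positional arguments")
parser.add_argument("--set", dest="pairs", metavar="KEY=VALUE", nargs="+",
                    default=[], help="arbitrary key-value pairs")

# The shell strips the quotes, so bar="hello world" arrives as one argument.
args = parser.parse_args(
    ["par1", "par2", "--set", "foo=hello", "bar=hello world", "baz=5"]
)
settings = parse_pairs(args.pairs)
print(args.params, settings)
```

Because `--set` takes one or more values, it has to come after the positional arguments, as in the example command above; otherwise it would swallow them.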
@mbinna
mbinna / effective_modern_cmake.md
Last active May 21, 2024 08:25
Effective Modern CMake

Effective Modern CMake

Getting Started

For a brief user-level introduction to CMake, watch C++ Weekly, Episode 78, Intro to CMake by Jason Turner. LLVM’s CMake Primer provides a good high-level introduction to the CMake syntax. Go read it now.

After that, watch Mathieu Ropert’s CppCon 2017 talk Using Modern CMake Patterns to Enforce a Good Modular Design (slides). It provides a thorough explanation of what modern CMake is and why it is so much better than “old school” CMake. The modular design ideas in this talk are based on the book [Large-Scale C++ Software Design](https://www.amazon.de/Large-Scale-Soft

@keunwoochoi
keunwoochoi / freesound_crawler.py
Last active July 11, 2020 06:38
how to crawl freesound
# Keunwoo Choi
# This example crawls snoring sounds by searching for the keyword 'snore'.
from __future__ import print_function
import freesound # $ git clone https://github.com/MTG/freesound-python
import os
import sys
api_key = 'YOUR_API_KEY'
folder = 'data_freesound/' # folder to save
@wagenet
wagenet / glibc.md
Last active May 13, 2024 03:57
glibc Versions

glibc Versions

List of the oldest supported versions of the top 10 Linux distros and their glibc versions, according to distrowatch.com.

Summary

Out of all versions with published EOLs, 2.12 is the oldest glibc still active, found in CentOS 6.8.

If CentOS 6 and 7 are eliminated, the oldest glibc is 2.23 in Ubuntu and Slackware.

@bt5e
bt5e / gist:7507535
Last active March 29, 2024 07:38
Markdown subscript and superscript

Testing subscript and superscript

Testing subscript subscript level 2

Testing superscript superscript level 2

@dypsilon
dypsilon / frontendDevlopmentBookmarks.md
Last active May 7, 2024 01:27
A badass list of frontend development resources I collected over time.
@NickLaMuro
NickLaMuro / .tmux.conf
Created January 27, 2012 07:27
My .tmux.conf
# act like GNU screen
unbind C-b
set -g prefix C-a
# Allow C-A a to send C-A to application
bind C-a send-prefix
# start window index of 1
set -g base-index 1