yweweler/datasets.md

    Datasets

A listing of datasets that are free to use under certain conditions.

Images
Audio
  Aligned Data
  Treebanks
  Other Sounds
Miscellaneous
Other Database Listings
Text

Images


MNIST: 70,000 images of handwritten digits, 28x28x1
CIFAR-10: 60,000 colour images in 10 classes, 32x32
CIFAR-100: 60,000 colour images in 100 classes, 32x32
STL-10: colour images in 10 classes, 96x96
ILSVRC: ImageNet Large Scale Visual Recognition Challenge
SVHN: Street View House Numbers
BelgiumTS: Belgian traffic signs
GTSRB: German traffic signs
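
The image counts and sizes listed above are enough for a rough estimate of how much memory each dataset needs once decoded. The sketch below assumes one byte per value (`uint8`) and three channels for the colour datasets. Neither is stated in the listing, and the helper name `raw_size_bytes` is made up for illustration.

```python
from math import prod

# (image count, height, width, channels), taken from the listing.
# Three channels for "colour" is an assumption (RGB).
IMAGE_SETS = {
    "MNIST": (70_000, 28, 28, 1),
    "CIFAR-10": (60_000, 32, 32, 3),
    "CIFAR-100": (60_000, 32, 32, 3),
}


def raw_size_bytes(shape, bytes_per_value=1):
    """Size of the decoded pixel data; uint8 storage is an assumption."""
    return prod(shape) * bytes_per_value


for name, shape in IMAGE_SETS.items():
    print(f"{name}: {raw_size_bytes(shape) / 1e6:.1f} MB")
```

Labels and any container format overhead are not included, so downloads will differ from these figures.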

Audio

Aligned Data (Multiple Speakers)


| Name | Language | Sample rate | Format | Length | Description | Info |
|---|---|---|---|---|---|---|
| TIMIT | en | 16 kHz | float | ~5h | Aligned sentences and phonemes. | Paper |
| Open Speech Data Corpus for German | de | 16 kHz | int16 | ~35h | About 180 native speakers. | |
| Open Source Acoustic Models for German Distant Speech Recognition | de | | | | | |
| CMU ARCTIC | en | 16 kHz | int16 | ~14h | 18 speakers | |
| Speech Commands Dataset | en | 16 kHz | int16 | ~18h | Stop, Go, Up, Down, ... | |
| CSTR VCTK Corpus | en | 48 kHz | int16 | ~44h | 109 speakers | |
| TED-LIUMv1 | en | 16 kHz | int16 | ~118h, male: 82h, female: 36h | 774 audio talks + transcripts, 666 speakers | Paper |
| TED-LIUMv2 | en | 16 kHz | int16 | ~207h, male: 141h, female: 66h | 1495 audio talks + transcripts, 1242 speakers | Paper |
| LibriSpeech ASR Corpus | en | 16 kHz | | ~1000h | 1166 speakers | PDF |
| Mozilla Common Voice Corpus | en | | | ~500h | ~20k speakers | ~400k recordings |
| Dimex1000 Corpus | es | | | | | |
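
The sample rate, format, and length columns together give a rough idea of how much disk space the uncompressed audio occupies. The following is a minimal sketch of that arithmetic. It assumes mono recordings and plain PCM storage, neither of which is stated in the table, and the function name `raw_audio_bytes` is hypothetical.

```python
def raw_audio_bytes(hours, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Uncompressed PCM size: duration x sample rate x sample width x channels.

    bytes_per_sample=2 corresponds to int16; channels=1 (mono) is an assumption.
    """
    return int(hours * 3600 * sample_rate_hz * bytes_per_sample * channels)


# Lengths and sample rates are the approximate values from the table.
corpora = {
    "Open Speech Data Corpus for German": (35, 16_000),
    "CSTR VCTK Corpus": (44, 48_000),
}

for name, (hours, rate) in corpora.items():
    print(f"{name}: ~{raw_audio_bytes(hours, rate) / 1e9:.1f} GB")
```

Actual download sizes depend on the container and any compression the corpus uses, so treat the result as an upper-level estimate only.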


Aligned Data (Single Speaker)


| Name | Language | Sample rate | Format | Length | Description | Info |
|---|---|---|---|---|---|---|
| PAVOQUE | de | 44.1 kHz | int16 | ~12h | 1 speaker | |
| LJ Speech Dataset | en | 22 kHz | int16 | ~24h | 1 speaker | |
| Nancy | en | 16 kHz - 96 kHz | int16 | ~17h | Blizzard Challenge 2011 corpus, 1 speaker | License |
| Bayers | en | | | | Blizzard Challenge 2013 corpus | License |
| Usborne | en | 44.1 kHz | | ~6.4h | Blizzard Challenge 2017 corpus | License |
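
Most corpora in the audio tables store samples as int16, while TIMIT is listed as float. When mixing them it can help to bring everything into one representation. The sketch below shows one common convention, scaling int16 values into the range [-1.0, 1.0) by dividing by 32768. The function name is made up for illustration, and a real pipeline would more likely use an array library than Python lists.

```python
def int16_to_float(samples):
    """Scale int16 PCM samples to floats in [-1.0, 1.0).

    Dividing by 32768 (2**15) is one common convention, assumed here.
    """
    return [s / 32768.0 for s in samples]


print(int16_to_float([-32768, 0, 16384]))
```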


Treebanks


| Name | Language | Sample rate | Format | Length | Description | Info |
|---|---|---|---|---|---|---|
| TüBa-D/S | de | | | | | |


Other Sounds


| Name | Sample rate | Format | Description | Info |
|---|---|---|---|---|
| Macaulay Library | | | Animal sounds and photos | |


Miscellaneous


Titanic Casualty data
Iris Flowers

Other Database Listings


Kaggle Datasets
NLTK
Open Data StackExchange
deeplearning.net

Text


CommonCrawl: web-crawled data