A list of datasets that are free to use under certain conditions.
- MNIST: 70,000 28x28x1 images of handwritten digits
- CIFAR-10: 60,000 32x32 colour images in 10 classes
- CIFAR-100: 60,000 32x32 colour images in 100 classes
- STL-10: 96x96 colour images in 10 classes
- ILSVRC: ImageNet Large Scale Visual Recognition Challenge
- SVHN: Street View House Numbers
- BelgiumTS Dataset Traffic Signs
- GTSRB Traffic Signs
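
The image counts and dimensions listed above give a rough idea of how much memory each dataset needs once decoded. The sketch below works through that arithmetic. It assumes one byte (uint8) per pixel value and 3 channels for the CIFAR colour images; the `DATASETS` dictionary and the `raw_size_bytes` helper are illustrative names, not part of any dataset's tooling.

```python
from math import prod

# (number of images, height, width, channels), taken from the list above.
# Assumption: "colour images" means 3 channels (RGB).
DATASETS = {
    "MNIST": (70_000, 28, 28, 1),
    "CIFAR-10": (60_000, 32, 32, 3),
    "CIFAR-100": (60_000, 32, 32, 3),
}


def raw_size_bytes(shape, bytes_per_value=1):
    """Uncompressed size, assuming `bytes_per_value` bytes per pixel value."""
    return prod(shape) * bytes_per_value


if __name__ == "__main__":
    for name, shape in DATASETS.items():
        print(f"{name}: {raw_size_bytes(shape) / 1e6:.1f} MB")
```

Under these assumptions MNIST comes to about 55 MB and each CIFAR set to about 184 MB, so all three fit in memory on an ordinary machine. Downloaded archives will differ in size because of compression and labels.
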
Name | Lang. | S. Rate | Format | Length | Description | Infos |
---|---|---|---|---|---|---|
TIMIT | en | 16 kHz | float | ~5h | Aligned sentences and phonemes. | Paper |
Open Speech Data Corpus for German | de | 16 kHz | int16 | ~35h | About 180 native speakers. | |
Open Source Acoustic Models for German Distant Speech Recognition | de | |||||
CMU ARCTIC | en | 16 kHz | int16 | ~14h | 18 speakers | |
Speech Commands Dataset | en | 16 kHz | int16 | ~18h | Stop, Go, Up, Down, ... | |
CSTR VCTK Corpus | en | 48 kHz | int16 | ~44h | 109 speakers | |
TED-LIUMv1 | en | 16 kHz | int16 | ~118h, male: 82h, female: 36h | 774 audio talks + transcripts, 666 speakers | Paper |
TED-LIUMv2 | en | 16 kHz | int16 | ~207h, male: 141h, female: 66h | 1495 audio talks + transcripts, 1242 speakers | Paper |
LibriSpeech ASR Corpus | en | 16 kHz | | ~1000h | 1166 speakers | |
Mozilla Common Voice Corpus | en | | | ~500h | ~20k speakers, ~400k recordings | |
Dimex1000 Corpus | es |
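
The S. Rate, Format, and Length columns above are enough to estimate how much disk space a corpus takes as uncompressed PCM. The following is a minimal sketch of that calculation. It assumes mono audio and 2 bytes per sample for `int16`; `pcm_size_bytes` and the `CORPORA` dictionary are hypothetical names used only for illustration.

```python
def pcm_size_bytes(hours, sample_rate_hz, bytes_per_sample=2, channels=1):
    """Uncompressed PCM size in bytes; int16 means 2 bytes per sample."""
    return int(hours * 3600 * sample_rate_hz * bytes_per_sample * channels)


# (approximate length in hours, sample rate in Hz), taken from the table rows.
# Assumption: single-channel (mono) recordings.
CORPORA = {
    "Open Speech Data Corpus for German": (35, 16_000),
    "CSTR VCTK Corpus": (44, 48_000),
    "TED-LIUMv2": (207, 16_000),
}

if __name__ == "__main__":
    for name, (hours, rate) in CORPORA.items():
        print(f"{name}: ~{pcm_size_bytes(hours, rate) / 1e9:.1f} GB")
```

For example, about 35 hours at 16 kHz in int16 comes to roughly 4 GB. Distributed archives are often smaller because they use compressed formats.
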
Name | Lang. | S. Rate | Format | Length | Description | Infos |
---|---|---|---|---|---|---|
PAVOQUE | de | 44.1 kHz | int16 | ~12h | 1 speaker | |
LJ Speech Dataset | en | 22.05 kHz | int16 | ~24h | 1 speaker | |
Nancy | en | 16 kHz - 96 kHz | int16 | ~17h | Blizzard Challenge 2011 corpus, 1 speaker | License |
Bayers | en | | | | Blizzard Challenge 2013 corpus | License |
Usborne | en | 44.1 kHz | | ~6.4h | Blizzard Challenge 2017 corpus | License |
Name | Lang. | S. Rate | Format | Length | Description | Infos |
---|---|---|---|---|---|---|
TüBa-D/S | de |
Name | S. Rate | Format | Description | Infos |
---|---|---|---|---|
Macaulay Library | | | Animal sounds and photos | |
- Kaggle Datasets
- NLTK
- Open Data StackExchange
- deeplearning.net
- CommonCrawl Web crawled data