CelebA dataset, as described here:
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
- img_celeba.gz: contains unaligned "in the Wild" images
- originally, the dataset was released as a .7z archive split into 14 subfiles (.7z.001 ... .7z.014).
- the problem is that unpacking .7z is not parallelized on Linux (https://unix.stackexchange.com/questions/210671/7-zip-slows-down-over-time-on-ubuntu-but-not-windows)
- it is therefore significantly faster to:
  - download the files
  - extract them on Windows
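
Before extracting, it can help to check that all 14 parts were downloaded and, optionally, to join them into one archive. The following is a minimal sketch, not a definitive procedure. It assumes the parts are named img_celeba.7z.001 ... img_celeba.7z.014 (the base name img_celeba.7z is an assumption; only the .7z.001 ... .7z.014 suffixes come from the notes above) and that the parts are plain byte-wise slices of the archive, so concatenating them in order reproduces the original file. The function names are hypothetical.

```python
import shutil
from pathlib import Path


def part_names(base="img_celeba.7z", count=14):
    """Return the expected part names: base.001 ... base.<count>.

    The default base name is an assumption, not taken from the dataset page.
    """
    return [f"{base}.{i:03d}" for i in range(1, count + 1)]


def missing_parts(directory, base="img_celeba.7z", count=14):
    """List the expected part files that are not present in `directory`."""
    directory = Path(directory)
    return [n for n in part_names(base, count) if not (directory / n).is_file()]


def join_parts(directory, base="img_celeba.7z", count=14):
    """Concatenate the parts, in numeric order, into a single archive file.

    Assumes each part is a raw byte slice of the original archive.
    """
    directory = Path(directory)
    missing = missing_parts(directory, base, count)
    if missing:
        raise FileNotFoundError(f"missing parts: {missing}")
    target = directory / base
    with open(target, "wb") as out:
        for name in part_names(base, count):
            with open(directory / name, "rb") as part:
                # Stream in chunks so large parts are never held in memory.
                shutil.copyfileobj(part, out)
    return target
```

Calling `missing_parts("downloads")` on the download folder (a hypothetical path) returns an empty list when every part is present. The extraction itself still has to be done with a 7-Zip tool; this sketch only covers the download check and the join.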