@leng-yue
Created November 14, 2023 09:26
Random split audio dataset
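#!/usr/bin/env bash
# Dataset root folders to scan for audio files (arrays require bash, not plain sh)
FOLDERS=(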
"./AiDataTang/"
"./AiShell/"
"./AiShell-3/"
"./Genshin/"
"./LibriTTS_R/"
"./StarRail/"
)
# Start from an empty filelist.txt (echo "" would leave a blank first entry)
: > filelist.txt
for folder in "${FOLDERS[@]}"; do
find "$folder" -type f \( -iname "*.wav" -o -iname "*.flac" \) >> filelist.txt
echo "Listed $folder"
done
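# Randomly pick 100 files for validation; the rest go to training.
# Deduplicate before shuffling: uniq only drops adjacent repeats, so it has no effect after shuf.
sort -u filelist.txt | shuf > filelist.shuf.txt
echo "Found $(wc -l < filelist.shuf.txt) files, splitting..."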
head -n 100 filelist.shuf.txt > filelist.split.valid
tail -n +101 filelist.shuf.txt > filelist.split.train
rm filelist.txt filelist.shuf.txt
echo "Done."
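
Since the split is random, it can be worth confirming afterwards that no file ended up in both lists. The following is a minimal sketch of such a check, not part of the original script; the helper name `check_split` is hypothetical, and it assumes one path per line in each file.

```shell
# Hypothetical helper (not in the original script): print the size of each
# split and return non-zero if any path appears in both files.
# The body runs in a subshell so its variables do not leak into the caller.
check_split() (
    valid_file=$1
    train_file=$2
    # Deduplicate each file on its own first, so a repeat inside one file
    # is not mistaken for an overlap; uniq -d then keeps only the lines
    # that occur in both files.
    overlap=$({ sort -u "$valid_file"; sort -u "$train_file"; } | sort | uniq -d | wc -l | tr -d ' ')
    n_valid=$(wc -l < "$valid_file" | tr -d ' ')
    n_train=$(wc -l < "$train_file" | tr -d ' ')
    echo "valid: $n_valid train: $n_train overlap: $overlap"
    [ "$overlap" -eq 0 ]
)
```

It could be called at the end of the script as `check_split filelist.split.valid filelist.split.train`, and the exit status used to abort a training run if the lists overlap.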