@teles
Last active March 16, 2022 20:51
Splits a CSV file into several parts, saving
#!/bin/bash
# Usage: ./splitcsv.sh "file-name.csv" [lines-per-file]
# Splits a CSV file into several files, repeating the header row in each one.
filename="$1"
chunk_size=${2:-10000}
# Count lines without spawning an extra cat process.
total_file_lines=$(wc -l < "$filename")
# The first line is the header, so it does not count as data.
data_lines=$(( total_file_lines - 1 ))
# Ceiling division: the last file may hold fewer than chunk_size lines.
number_of_files_to_create=$(( (data_lines + chunk_size - 1) / chunk_size ))
# Strip the last extension to get the output directory name.
filename_without_extension="${filename%.*}"
mkdir -p "$filename_without_extension"
echo "File: ${filename}"
echo "Total lines: ${total_file_lines}"
echo "Files to be generated: ${number_of_files_to_create}"
echo -e "Maximum lines per file: ${chunk_size}\n"
for (( i = 0; i < number_of_files_to_create; i++ )); do
  start=$(( i * chunk_size + 2 ))
  end=$(( (i + 1) * chunk_size + 1 ))
  new_file_name="${filename_without_extension}/${i}.csv"
  # Print the header (line 1) plus this chunk's range of data lines.
  sed -n "1p;${start},${end}p" "$filename" > "$new_file_name"
  echo "File \"${new_file_name}\" created. Lines: ${start}...${end}"
done
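
The line arithmetic used in the loop can be checked in isolation. What follows is a minimal sketch, not part of the original script: the helper names `count_chunks` and `chunk_range` are hypothetical, and it assumes, as the script does, that line 1 of the CSV is a header that is copied into every output file.

```shell
#!/bin/bash
# Hypothetical helpers (not in the original script) that isolate the
# arithmetic for splitting a CSV whose first line is a header.

# count_chunks TOTAL_LINES CHUNK_SIZE
# Prints how many output files are needed for the data lines
# (TOTAL_LINES minus the header), using ceiling division.
count_chunks() {
  local total_lines=$1 chunk_size=$2
  local data_lines=$(( total_lines - 1 ))
  echo $(( (data_lines + chunk_size - 1) / chunk_size ))
}

# chunk_range INDEX CHUNK_SIZE
# Prints the first and last line numbers (1-based, in the source file)
# covered by the chunk with the given 0-based index.
chunk_range() {
  local index=$1 chunk_size=$2
  local start=$(( index * chunk_size + 2 ))
  local end=$(( (index + 1) * chunk_size + 1 ))
  echo "${start} ${end}"
}

# Example: a 25-line file (1 header + 24 data lines), 10 lines per chunk.
count_chunks 25 10   # prints 3
chunk_range 0 10     # prints "2 11"
chunk_range 2 10     # prints "22 31"
```

The end of the last range may exceed the file length (line 31 in a 25-line file here); this is harmless because `sed` simply stops printing at the last line of the input.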