@cabarbato
Created January 8, 2021 20:26
A script that concatenates all CSV files found in all subfolders
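import csv
import os

input_path = './input'
output_path = './output'

# Label used in the first column for CSVs that sit at the root of input_path.
# Set the FOLDER_TYPE environment variable, or replace the fallback string below.
# Ideally it describes what the folders are organized by. For example, if the folders
# are all named after different months in a calendar year, "months" would be a good choice.
folder_type = os.environ.get("FOLDER_TYPE", "folder")  # "folder" is only a placeholder fallback


def parseCSV(subdir, file, type_name):
    """Append every row of one CSV to concat.csv, prefixed with type_name."""
    # newline='' is what the csv module expects for both reading and writing.
    with open(os.path.join(subdir, file), 'r', newline='') as input_file:
        lines = csv.reader(input_file)
        # Append mode: delete an old concat.csv before rerunning, or rows will be duplicated.
        with open(os.path.join(output_path, "concat.csv"), 'a', newline='') as output_file:
            csvwriter = csv.writer(output_file)
            for line in lines:
                csvwriter.writerow([type_name] + line)


# Make sure the output folder exists before anything is written to it.
os.makedirs(output_path, exist_ok=True)

for subdir, dirs, files in os.walk(input_path):
    # If the subdirectory is within the input folder and not at its root,
    # use the folder's path relative to input_path as the first column value.
    if subdir != input_path:
        type_name = os.path.relpath(subdir, input_path)
    else:
        type_name = folder_type
    for file in files:
        # Match on the extension, not on '.csv' appearing anywhere in the name.
        if file.endswith('.csv'):
            parseCSV(subdir, file, type_name)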
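
To see what the concatenated file looks like, here is a minimal sketch that wraps the same walk-and-prefix logic in a function and runs it on a throwaway folder tree. The names `concat_csvs` and `demo`, the sample folders (`january`, `february`), and the file contents are all made up for illustration and are not part of the original script. It also assumes that a CSV at the root of the input folder holds the header row, which is one plausible reading of why root-level rows get the `FOLDER_TYPE` label.

```python
import csv
import os
import tempfile


def concat_csvs(input_path, output_file, folder_type):
    """Hypothetical helper: prefix every row with its folder label and write it to output_file."""
    with open(output_file, "w", newline="") as out:
        writer = csv.writer(out)
        for subdir, dirs, files in os.walk(input_path):
            dirs.sort()  # visit subfolders in a predictable (alphabetical) order
            # Root-level files get folder_type; nested files get their relative folder path.
            if subdir == input_path:
                label = folder_type
            else:
                label = os.path.relpath(subdir, input_path)
            for name in sorted(files):
                if name.endswith(".csv"):
                    with open(os.path.join(subdir, name), newline="") as src:
                        for row in csv.reader(src):
                            writer.writerow([label] + row)


def demo():
    """Build a temporary input tree (sample data only) and return the concatenated rows."""
    with tempfile.TemporaryDirectory() as tmp:
        input_path = os.path.join(tmp, "input")
        os.makedirs(os.path.join(input_path, "january"))
        os.makedirs(os.path.join(input_path, "february"))
        # Assumed layout: a header-only CSV at the root, data CSVs in the subfolders.
        with open(os.path.join(input_path, "header.csv"), "w", newline="") as f:
            csv.writer(f).writerow(["name", "value"])
        with open(os.path.join(input_path, "january", "a.csv"), "w", newline="") as f:
            csv.writer(f).writerow(["x", "1"])
        with open(os.path.join(input_path, "february", "b.csv"), "w", newline="") as f:
            csv.writer(f).writerow(["y", "2"])
        # The output lives outside input_path so the walk never picks it up.
        output_file = os.path.join(tmp, "concat.csv")
        concat_csvs(input_path, output_file, "months")
        with open(output_file, newline="") as f:
            return list(csv.reader(f))


if __name__ == "__main__":
    for row in demo():
        print(row)
```

Unlike the script above, this sketch opens the output once in write mode, so rerunning it replaces the file instead of appending to it. Whether that is preferable depends on whether you want to accumulate rows across runs.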