@zupo
Created June 24, 2013 12:56
Split a folder with many files into subfolders with N files. Usage: python folder_splitter.py path/to/target/folder
# -*- coding: utf-8 -*-
# @author: Peter Lamut

import argparse
import os
import shutil

N = 10  # the number of files in each subfolder


def move_files(abs_dirname):
    """Move files into subdirectories."""

    files = [os.path.join(abs_dirname, f) for f in os.listdir(abs_dirname)]

    i = 0
    curr_subdir = None

    for f in files:
        # create new subdir if necessary; // keeps the index an int on Python 3
        if i % N == 0:
            subdir_name = os.path.join(abs_dirname, '{0:03d}'.format(i // N + 1))
            os.mkdir(subdir_name)
            curr_subdir = subdir_name

        # move file to current dir
        f_base = os.path.basename(f)
        shutil.move(f, os.path.join(subdir_name, f_base))
        i += 1


def parse_args():
    """Parse command line arguments passed to script invocation."""
    parser = argparse.ArgumentParser(
        description='Split files into multiple subfolders.')

    parser.add_argument('src_dir', help='source directory')

    return parser.parse_args()


def main():
    """Module's main entry point (zopectl.command)."""
    args = parse_args()
    src_dir = args.src_dir

    if not os.path.exists(src_dir):
        raise Exception('Directory does not exist ({0}).'.format(src_dir))

    move_files(os.path.abspath(src_dir))


if __name__ == '__main__':
    main()

@bonny1992

That was exactly what I was searching for! Thank you!

@ace139

ace139 commented Apr 20, 2017

Very helpful snippet.

@jjsahalf

Thank you very much. It is very helpful.

@andrewteal

I'm new to Python and want to use this snippet for a large project. Where do I put my file path in the code so that it runs on a specific folder? Also, when copying the directory path into the code, is C:\Users... the correct format?

Thanks!

@tarvos21

Great, very helpful script!

Hi, @HeresTheTeal, you could save the code as a script, say split.py, in the same folder as all the files, then run python split.py ./. It will create subfolders and move the files into them, 10 files per folder by default, as set by the variable N in the code.

@kaiqiangh

Thanks for sharing. By the way, does it select the files randomly?

@hyzyla

hyzyla commented Dec 9, 2019

With Python 3, I got an error:

    subdir_name = os.path.join(abs_dirname, '{0:03d}'.format(i / N + 1))
ValueError: Unknown format code 'd' for object of type 'float'

Fixed by updating line 22:

subdir_name = os.path.join(abs_dirname, '{0:03d}'.format(i // N + 1))
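
The error appears because `/` always returns a float on Python 3, and the `d` format code rejects floats. A minimal sketch of the difference, reusing the script's format string; the helper name `subdir_label` is made up here for illustration and is not part of the original script:

```python
N = 10  # files per subfolder, as in the script


def subdir_label(i, n=N):
    """Return the zero-padded subfolder name for the i-th file (0-based)."""
    # Floor division keeps the result an int on both Python 2 and 3,
    # so the 'd' format code accepts it.
    return '{0:03d}'.format(i // n + 1)


if __name__ == '__main__':
    print(subdir_label(0))    # first file belongs in 001
    print(subdir_label(25))   # 26th file belongs in 003
    try:
        # True division produces a float on Python 3, which 'd' rejects.
        '{0:03d}'.format(25 / N + 1)
    except ValueError as exc:
        print('ValueError:', exc)
```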

@s4nyam

s4nyam commented May 18, 2020

I had 700+ files named like CommonName - (1), CommonName - (2), CommonName - (3), CommonName - (4), CommonName - (5), CommonName - (6), CommonName - (7) ... CommonName - (756), all in one folder.

Now I have 11 folders, as I selected N=70. The only thing is, the files are selected randomly: they are thrown into the folders in no particular order.

Rest works smoothly.

Thanks a lot for sharing code. Please try to add this function too. Thanks again.

@avabelieve

I got the following error:
UnboundLocalError: local variable 'subdir_name' referenced before assignment

Should I increase the indentation of the following?
f_base = os.path.basename(f)
shutil.move(f, os.path.join(subdir_name, f_base))

@s4nyam

s4nyam commented Jul 21, 2020

Try debugging it, you are skipping line 22

@odelmalv

Thank you! The error got fixed with @hyzyla's comment. It would be good if the files were transferred in order.

@jbaldw

jbaldw commented Sep 18, 2020

I had over 200,000 images that needed to be put in separate folders for a machine learning program so this was an excellent fix. I had the same unknown format code error that I was able to fix by editing Line 22 thanks to @hyzyla.

@fyse-nassar

I had 700+ files named like CommonName - (1), CommonName - (2), CommonName - (3), CommonName - (4), CommonName - (5), CommonName - (6), CommonName - (7) ... CommonName - (756), all in one folder.

Now I have 11 folders, as I selected N=70. The only thing is, the files are selected randomly: they are thrown into the folders in no particular order.

Rest works smoothly.

Thanks a lot for sharing code. Please try to add this function too. Thanks again.

Adding the snippet below on line 15 will sort the files by name:
files.sort()
Note that this is a lexicographical sort, not a natural sort.
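
For names like the CommonName - (N) ones quoted above, a natural sort can be approximated with a key function that compares digit runs as integers. This is only a sketch under that assumption; `natural_key` is a hypothetical helper, not something in the original script:

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so '(10)' sorts after '(2)'."""
    # re.split with a capture group alternates text, digits, text, ...,
    # so every odd-indexed part is a run of digits.
    parts = re.split(r'(\d+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


if __name__ == '__main__':
    names = ['CommonName - (10)', 'CommonName - (2)', 'CommonName - (1)']
    print(sorted(names))                   # lexicographic: (1), (10), (2)
    print(sorted(names, key=natural_key))  # natural: (1), (2), (10)
```

In the script, this would presumably be applied as files.sort(key=lambda p: natural_key(os.path.basename(p))) right after the files list is built.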

@imTHAI

imTHAI commented May 8, 2021

It works perfectly, thanks. I needed to split a folder with 450,000+ French epubs.

@ceres99

ceres99 commented May 11, 2021

This just spared me from creating 50000 folders manually and moving each image into them separately. Real life saver! Thanks awfully!

@ajeema

ajeema commented Jul 29, 2021

If you have sequential images whose order you want to keep, replace line 14 with:

files = sorted([os.path.join(abs_dirname, f) for f in os.listdir(abs_dirname)])

@atindra50

With Python 3, I got an error:

    subdir_name = os.path.join(abs_dirname, '{0:03d}'.format(i / N + 1))
ValueError: Unknown format code 'd' for object of type 'float'

Fixed by updating line 22:

subdir_name = os.path.join(abs_dirname, '{0:03d}'.format(i // N + 1))

Thanks @hyzyla ... This edit worked for me in Python 3.8.
Thanks @zupo ... I was searching for exactly this script ...

Thanks

@viswanathanTJ

This is awesome. My requirement is very similar. I want to move up to 200 files into each folder, but nested by the leading characters of the file names.
For example, files such as 'absolute.txt' and 'ab.txt' would be moved to an "a" folder and 'bingo.txt' to a "b" folder. If "a" ends up with more than 200 files, it needs subfolders "aa", "ab", and so on; if "aa" in turn holds more than 200 files, it needs "aaa", "aab", and so on.
This would make it easy to find a required file.
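
One possible reading of this request, sketched as a planning step that only computes target paths and moves nothing. The function name `plan_prefix_tree`, the lowercasing of prefixes, and the rule that a folder is split only when it exceeds the limit are all assumptions, not part of the original script:

```python
import os
from collections import defaultdict

LIMIT = 200  # max files per folder, taken from the comment above


def plan_prefix_tree(names, limit=LIMIT, depth=1, parent=''):
    """Map each file name to a nested folder path built from its leading characters."""
    buckets = defaultdict(list)
    for name in names:
        stem = os.path.splitext(name)[0].lower()
        buckets[stem[:depth]].append(name)

    plan = {}
    for prefix, group in buckets.items():
        folder = os.path.join(parent, prefix)
        # Split further only while the bucket is over the limit and at
        # least one name is long enough to yield a longer prefix.
        can_split = any(len(os.path.splitext(n)[0]) > depth for n in group)
        if len(group) > limit and can_split:
            plan.update(plan_prefix_tree(group, limit, depth + 1, folder))
        else:
            for name in group:
                plan[name] = os.path.join(folder, name)
    return plan


if __name__ == '__main__':
    demo = ['ab.txt', 'absolute.txt', 'abc.txt', 'bingo.txt']
    # A tiny limit is used here only to force nesting in the demo.
    for name, target in sorted(plan_prefix_tree(demo, limit=2).items()):
        print(name, '->', target)
```

Each planned path could then be created with os.makedirs and filled with shutil.move, the same call the script already uses. Names shorter than the current prefix depth end up in a folder named after their whole stem, which may or may not be what is wanted.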

@zupo
Author

zupo commented Sep 23, 2021

I posted this snippet years ago and people still use it, I can't believe it. 😄

In any case, if you are on macOS, check out my new thing: https://paretosecurity.app/
