@qwtel
Last active September 9, 2023 13:43
[node.js 8+] Recursively get all files in a directory
// Node 8+
// --------------------------------------------------------------
// No external dependencies

const { promisify } = require('util');
const { resolve } = require('path');
const fs = require('fs');
const readdir = promisify(fs.readdir);
const stat = promisify(fs.stat);

async function getFiles(dir) {
  const subdirs = await readdir(dir);
  const files = await Promise.all(subdirs.map(async (subdir) => {
    const res = resolve(dir, subdir);
    return (await stat(res)).isDirectory() ? getFiles(res) : res;
  }));
  return files.reduce((a, f) => a.concat(f), []);
}

// Usage
getFiles(__dirname)
  .then(files => console.log(files))
  .catch(e => console.error(e));

// Node 10.10+
// --------------------------------------------------------------
// Updated for node 10+ with even more whizbang:

const { resolve } = require('path');
const { readdir } = require('fs').promises;

async function getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });
  const files = await Promise.all(dirents.map((dirent) => {
    const res = resolve(dir, dirent.name);
    return dirent.isDirectory() ? getFiles(res) : res;
  }));
  return Array.prototype.concat(...files);
}

// Note that starting with node 11.15.0 you can use
// `files.flat()` instead of `Array.prototype.concat(...files)`
// to flatten the array.

// Node 11+
// --------------------------------------------------------------
// The following version uses async iterators.

const { resolve } = require('path');
const { readdir } = require('fs').promises;

async function* getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });
  for (const dirent of dirents) {
    const res = resolve(dir, dirent.name);
    if (dirent.isDirectory()) {
      yield* getFiles(res);
    } else {
      yield res;
    }
  }
}

// Usage has changed because the return type is now an
// async iterator instead of a promise:
(async () => {
  for await (const f of getFiles('.')) {
    console.log(f);
  }
})();

// I've written more about async iterators here:
// https://qwtel.com/posts/software/async-generators-in-the-wild/
@syzer

syzer commented Apr 29, 2019

Thanks!

@wiill

wiill commented Jun 15, 2019

Cheers !

@wiill

wiill commented Jun 16, 2019

Quick question, just to be sure about this syntax:
const { promisify } = require('util');

Does it mean that you only get promisify from the util module?
Is it equal to:
const promisify = require('util').promisify ?

@qwtel
Author

qwtel commented Jun 16, 2019

Does it mean that you only get promisify from the util module?
Is it equal to:
const promisify = require('util').promisify ?

Yes. https://ponyfoo.com/articles/es6-destructuring-in-depth
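
A minimal sketch of that equivalence, in case it helps later readers. The name `promisifyDirect` is made up for this illustration and does not appear in the gist:

```javascript
const util = require('util');

// Destructuring creates a local binding for one property of the module object...
const { promisify } = util;

// ...which gives the same function as reading the property directly.
const promisifyDirect = require('util').promisify;

console.log(promisify === promisifyDirect); // true
```

Note that the whole `util` module is still loaded either way; destructuring only decides which of its properties get a local name.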

@PraGitHub

PraGitHub commented Aug 5, 2019

const dirents = await readdir(dir, { withFileTypes: true });
This line will block the thread, right?

In that case, couldn't we have used the readdirSync function instead?

If my thinking is wrong, please correct me.

@qwtel
Author

qwtel commented Aug 5, 2019

No, you got it backwards: readdirSync will block the thread and, since JS only has one, the entire application. await readdir will yield to the event loop, potentially letting other code run.
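
A small sketch of the difference, assuming Node 10+ for `fs.promises`. The `order` array, `snapshot`, and `listAsync` are illustrative names, not part of the gist:

```javascript
const fs = require('fs');

const order = [];

async function listAsync(dir) {
  order.push('before await');
  // The function pauses here and hands control back to its caller.
  const names = await fs.promises.readdir(dir);
  order.push('after await');
  return names;
}

const pending = listAsync('.');

// This runs while the directory read is still in flight.
order.push('sync code after the call');
const snapshot = order.slice();
console.log(snapshot); // ['before await', 'sync code after the call']

// By contrast, nothing else can run until this call returns.
const namesSync = fs.readdirSync('.');

pending
  .then(() => console.log(order)) // now also contains 'after await'
  .catch((e) => console.error(e));
```

The synchronous line after the call runs before `'after await'` is recorded, which is the "yield to the event loop" behaviour described above.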

@PraGitHub

PraGitHub commented Aug 6, 2019

async function getFiles(dir) {
  const subdirs = await readdir(dir);
  const files = await Promise.all(subdirs.map(async (subdir) => {
    const res = resolve(dir, subdir);
    return (await stat(res)).isDirectory() ? getFiles(res) : res;
  }));
  return files.reduce((a, f) => a.concat(f), []);
}

In the above function, line 3 gets executed only after line 2 has completely finished (i.e. when readdir returns), right?
If yes, then the thread is blocked, isn't it?

@wiill

wiill commented Aug 6, 2019

Others, please correct me if I'm wrong; the following is based on my recent understanding of the promises and async/await world 😅

Since readdir is declared as the promisified version of the Node-style callback function fs.readdir, the second line, which calls it, does not block thread execution. Under the hood it is still an async operation that reads the directory entries.

The await keyword before a call to a function that returns a promise (or is defined as an async function) lets you wait for and get the result of the operation behind the promise. It gives the coder a simpler syntax and a code style that resembles the traditional synchronous way of writing code. But other operations in the background may still be executed while "awaiting" the result.

@mrchief

mrchief commented Sep 7, 2019

In the above function, line 3 gets executed only after line 2 has completely finished (i.e. when readdir returns), right?

Right.

If yes, then the thread is blocked, isn't it?

No. When it hits line 2, it sees the await and does 2 things:

  • it schedules the rest of the function as a continuation, roughly readdir(dir).then(<after await code here>)
  • it returns control to the caller, so the event loop can continue with other work

await is an easier way of writing myFunction().then(anotherFunction)

So the desugared version of this is:

async function getFiles(dir) {
	const subdirs = await readdir(dir);
	/* after await code here */
}

// desugared

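function getFiles(dir) {
	// The resolved value that `await` would assign to `subdirs`
	// arrives as the callback's argument instead.
	return readdir(dir).then(function (subdirs) {
		/* after await code here */
	});
}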

As you can see, with the desugared version, there is no blocking whatsoever.

@PraGitHub

Thanks, mrchief.
I understand how it works now.
By the way, where did you get this information from?
It is very hard to search for this explanation...

Thanks again.... :)

@mrchief

mrchief commented Sep 7, 2019

There's a good amount of explanation in MDN docs. And if you google for it, you can find many online resources covering the same topic as well.

@hermosillaeveris

hermosillaeveris commented Sep 24, 2019

For example "Node 10.10+":

// Node 10.10+
// --------------------------------------------------------------
// Updated for node 10+ with even more whizbang:

const { resolve } = require('path');
const { readdir } = require('fs').promises;

async function getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });
  const files = dirents.map((dirent) => {
    const res = resolve(dir, dirent.name);
    return dirent.isDirectory() ? getFiles(res) : res;
  });
  return Array.prototype.concat(...files);
}

That doesn't work properly because it returns unresolved promises. This should work:

async function getFiles(dir) {
  const dirents = await readdir(dir, { withFileTypes: true });

  const files = await Promise.all(dirents.map((dirent) => {
    const res = resolve(dir, dirent.name);
    return dirent.isDirectory() ? getFiles(res) : res;
  }));

  return Array.prototype.concat(...files);
}
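
To see why the missing `Promise.all` matters, here is a minimal sketch that needs no filesystem. `fakeGetFiles` and the entry names are hypothetical stand-ins for the recursive call and the directory contents:

```javascript
// Hypothetical stand-in for the recursive call: resolves to an array of paths.
const fakeGetFiles = async (name) => [`${name}/a.txt`, `${name}/b.txt`];

const entries = ['x.txt', 'sub'];

// Plain files map to strings; directories map to a pending promise.
const mapped = entries.map((e) => (e === 'sub' ? fakeGetFiles(e) : e));

// Without Promise.all, concat treats the promise as a single opaque element.
const withoutAll = Array.prototype.concat(...mapped);
const hasPromise = withoutAll.some((v) => v instanceof Promise);
console.log(hasPromise); // true

// With Promise.all, every element is resolved before flattening.
const fixed = Promise.all(mapped).then((resolved) =>
  Array.prototype.concat(...resolved)
);
fixed.then((files) => console.log(files));
// resolves to ['x.txt', 'sub/a.txt', 'sub/b.txt']
```

`Promise.all` passes non-promise values such as `'x.txt'` through unchanged, which is why the mixed array of strings and promises works.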

@qwtel
Author

qwtel commented Jan 3, 2020

That doesn't work properly because it returns unresolved promises. This should work:

What an oversight. I've fixed it in the gist. Thanks for posting a correction.

@cleidigh

cleidigh commented Jan 16, 2020

@qwtel
Great article/examples. Despite years of background in embedded multitasking OS development, Promises still have many gotchas and ambiguities from my perspective. Anyway, a question:

I'm thoroughly confused about the following statement:

const res = resolve(dir, subdir)

Everything I have read is very clear that Promise.resolve() can only return one value. Using an object or array can obviously get around that.
What paradigm allows the above resolve with two parameters? More importantly, resolve is undefined for me (I'm executing under Thunderbird
with a directory iterator, so I had to do some adaptation), but I don't think that changes the resolve question.
Thanks
@cleidigh

@mrchief

mrchief commented Jan 16, 2020

@cleidigh resolve here is actually path.resolve which takes multiple paths and returns a single value.

Note the const { resolve } = require('path'); in the snippet.
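
A short sketch of what `path.resolve` does with several arguments. The segment names `'some'`, `'dir'`, and `'file.txt'` are made up for the example:

```javascript
const path = require('path');
const { resolve, isAbsolute } = path;

// Relative segments are resolved against the current working directory.
const dir = resolve('some', 'dir');

// Segments are applied left to right and collapsed into one absolute path string.
const res = resolve(dir, 'file.txt');

console.log(isAbsolute(res));                     // true
console.log(res === path.join(dir, 'file.txt'));  // true
console.log(typeof res);                          // 'string'
```

It is a synchronous string operation and has nothing to do with `Promise.resolve`.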

@cleidigh

@mrchief
Thanks! I got completely hung up on the resolve and missed the reference to the path module.
I'm trying to adapt this using a Mozilla Thunderbird DirectoryIterator as opposed to readdir.
I have been trying every possible combination of promise handling, but I am obviously flummoxed.
Removing the path combination, which is unnecessary in my case, still fails.
I'm going to keep experimenting, but I may post my best attempt if I can't get it working. This approach is certainly what I need to use, just with some tweaks that confound me.
