@isaacs
Forked from laverdet/rimraf.js
Created August 8, 2011 03:05
rimraf with futures
// Callback version
module.exports = rimraf
var path = require("path")
  , fs = require("fs")
// for EBUSY handling
var waitBusy = {}
// for EMFILE handling
var resetTimer = null
  , timeout = 0
function rimraf (p, opt, cb_) {
  if (typeof cb_ !== "function") cb_ = opt, opt = {}
  opt.maxBusyTries = opt.maxBusyTries || 3
  rimraf_(p, opt, function cb (er) {
    if (er) {
      if (er.message.match(/^EBUSY/)) {
        // windows is annoying.
        if (!waitBusy.hasOwnProperty(p)) waitBusy[p] = opt.maxBusyTries
        if (waitBusy[p]) {
          waitBusy[p] --
          // give it 100ms more each time
          var time = (opt.maxBusyTries - waitBusy[p]) * 100
          return setTimeout(function () { rimraf_(p, opt, cb) }, time)
        }
      }
      if (er.message.match(/^EMFILE/)) {
        // back off a little longer on each consecutive EMFILE
        setTimeout(function () {
          rimraf_(p, opt, cb)
        }, timeout ++)
        return
      }
    }
    timeout = 0
    cb_(er)
  })
}
// opt is passed in explicitly so the recursive rimraf() call can see it
function rimraf_ (p, opt, cb) {
  fs.lstat(p, function (er, s) {
    // if lstat fails, treat the path as already gone
    if (er) return cb()
    if (!s.isDirectory()) return fs.unlink(p, cb)
    fs.readdir(p, function (er, files) {
      if (er) return cb(er)
      asyncForEach(files.map(function (f) {
        return path.join(p, f)
      }), function (f, cb) {
        rimraf(f, opt, cb)
      }, function (er) {
        if (er) return cb(er)
        fs.rmdir(p, cb)
      })
    })
  })
}
// this is the flow control util
function asyncForEach (list, fn, cb) {
  if (!list.length) return cb()
  var c = list.length
    , errState = null
  list.forEach(function (item, i, list) {
    fn(item, function (er) {
      if (errState) return
      if (er) return cb(errState = er)
      if (-- c === 0) return cb()
    })
  })
}
// Futures version, using node-fibers
var path = require('path'),
    fs = require('fs'),
    Future = require('fibers/future');
// Create future-returning fs functions
var fs2 = {};
for (var ii in fs) {
  fs2[ii] = Future.wrap(fs[ii]);
}
// Return a future which just pauses for a certain amount of time
function timer(ms) {
  var future = new Future;
  setTimeout(function() {
    future.return();
  }, ms);
  return future;
}
var timeout = 0;
var rimraf = module.exports = function(p, opts) {
  opts = opts || {};
  opts.maxBusyTries = opts.maxBusyTries || 3;
  var busyTries = 0;
  while (true) {
    try {
      try {
        var stat = fs2.lstat(p).wait();
      } catch (ex) {
        continue;
      }
      if (!stat.isDirectory()) return fs2.unlink(p).wait();
      var rimrafs = fs2.readdir(p).wait().map(function(file) {
        return rimraf(path.join(p, file), opts);
      });
      Future.wait(rimrafs);
      fs2.rmdir(p).wait();
      timeout = 0;
      return;
    } catch (ex) {
      if (ex.message.match(/^EMFILE/)) {
        // back off 10ms longer on each consecutive EMFILE
        timer(timeout += 10).wait();
      } else if (ex.message.match(/^EBUSY/) && busyTries < opts.maxBusyTries) {
        // give it 100ms more each time
        timer(++busyTries * 100).wait();
      } else {
        throw ex;
      }
    }
  }
}.future();

laverdet commented Aug 8, 2011

Think that continue in the lstat catch should just be a return, no? Otherwise if you run into the mentioned lstat race condition it's an infinite loop of stat'ing a file that doesn't exist.
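
The difference between `continue` and `return` in that catch can be seen in a synchronous analog. This is only a sketch: `removeFileSync` is a hypothetical name, it uses the `*Sync` fs calls instead of futures, and it checks `ex.code`, where the gist matches on `ex.message`.

```javascript
var fs = require("fs")

// Hypothetical synchronous analog of the futures loop, for illustration.
// Returns true if something was removed, false if the path was already gone.
function removeFileSync (p, maxBusyTries) {
  var busyTries = 0
  while (true) {
    var stat
    try {
      stat = fs.lstatSync(p)
    } catch (ex) {
      // lstat failed, so treat the path as already gone. A `continue`
      // here would re-run lstat on the same missing path forever.
      return false
    }
    try {
      if (stat.isDirectory()) fs.rmdirSync(p)
      else fs.unlinkSync(p)
      return true
    } catch (ex) {
      // Retry a bounded number of times on EBUSY, otherwise rethrow.
      if (ex.code === "EBUSY" && busyTries < maxBusyTries) {
        busyTries++
        continue
      }
      throw ex
    }
  }
}
```

Calling it twice on the same file removes the file on the first call and returns `false` on the second instead of spinning.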

isaacs commented Aug 8, 2011

@laverdet Oh, right. What is that while(true) even doing there?

laverdet commented Aug 8, 2011

The while(true) is there because it needs to keep retrying the operation in case of failure (EBUSY, EMFILE). But the logic for when to stop retrying after a failure isn't simple; I can't just do for (var ii = 0; ii < tries; ++ii), so it's handled in the top catch block. The two ways out of the function are returning on line 40 (the bare return after the rmdir), or (re)throwing on line 47 (the throw ex in the final else).
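
The shape described here, an unbounded loop whose exit policy lives in the catch block, can be sketched without fibers. `retrySync` and `busyPolicy` are hypothetical names, the sketch omits the delays between attempts, and it checks `ex.code` where the gist matches on `ex.message`.

```javascript
// Hypothetical helper: keep calling operation() until it succeeds or the
// policy says to stop. The loop has exactly two exits.
function retrySync (operation, shouldRetry) {
  while (true) {
    try {
      return operation()               // exit 1: success
    } catch (ex) {
      if (!shouldRetry(ex)) throw ex   // exit 2: give up and rethrow
      // otherwise fall through and go around the loop again
    }
  }
}

// Hypothetical policy mirroring the gist: EMFILE always retries,
// EBUSY retries at most maxBusyTries times, anything else aborts.
function busyPolicy (maxBusyTries) {
  var busyTries = 0
  return function (ex) {
    if (ex.code === "EMFILE") return true
    if (ex.code === "EBUSY" && busyTries < maxBusyTries) {
      busyTries++
      return true
    }
    return false
  }
}
```

Because each error code has its own stopping rule, a single counted `for` loop does not express the policy, which is why the decision sits in the catch.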

isaacs commented Aug 8, 2011

Aha, I see. Even in sync-land JS, I'd probably do that with a function that calls itself, but whatever. One pseudo-goto is as good as another :)
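
A self-calling version of the retry might look like the sketch below. `retryRecursive` and `shouldRetry` are hypothetical names, and the delay between attempts is left out.

```javascript
// Hypothetical recursive form of the retry: on a retryable failure the
// function simply calls itself again instead of looping.
function retryRecursive (operation, shouldRetry) {
  try {
    return operation()
  } catch (ex) {
    if (!shouldRetry(ex)) throw ex
    // In synchronous code each retry adds a stack frame, so the number
    // of retries is bounded by the available stack depth.
    return retryRecursive(operation, shouldRetry)
  }
}
```

In callback-style code the retry is usually scheduled with `setTimeout`, which starts from a fresh stack, so the depth concern applies mainly to synchronous or fiber-based code like the futures version.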

laverdet commented Aug 8, 2011

Yeah, making it self-invoke might be cleaner. I don't usually implement "keep retrying in the case of failure" applications because it tends to sharply increase the severity of the thundering herd problem, so I haven't had a chance to experiment with the pros/cons of different styles. Most of my applications just do something else in the case of failure, and usually I don't handle the error from within the function in which it occurred.

isaacs commented Aug 8, 2011

Oh, actually, since this isn't losing the stack, a recursive solution might suck, due to JS's lack of TCO.

> I don't usually implement "keep retrying in the case of failure" applications because it tends to sharply increase the severity of the thundering herd problem

Yep. That's why the EMFILE handler backs off progressively.

An EBUSY would indicate some kind of serious failure, except on Cygwin, where it happens almost every time you try to remove more than 10 or so files in rapid succession. That's why it only tries 3 times, instead of continually getting slower and slower forever.

It'd probably be a good idea for the EMFILE handler to have some kind of limit, as well. You could run into some leaky situations where you open MAX_FILE_DESCRIPTORS files and never close them, and then it keeps trying forever. Since I frequently have to remove fairly large folders (dozens of levels deep, thousands of files in each level, etc.) npm bumps into MAX_FILE_DESCRIPTORS quite often, and usually the right thing to do is just back off. Maybe I'll cap the timeout to 1000, which should be suitable for all non-pathological scenarios.
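
A capped version of the progressive EMFILE backoff could be sketched as follows. `createBackoff` is a hypothetical helper, and the sketch reads "cap the timeout to 1000" as a ceiling on the delay in milliseconds; giving up entirely once the cap is reached would be a separate decision.

```javascript
// Hypothetical helper: a delay that grows by `step` ms on each failure
// and never exceeds `max` ms.
function createBackoff (step, max) {
  var timeout = 0
  return {
    // Call on each EMFILE; returns how long to wait before retrying.
    next: function () {
      timeout = Math.min(timeout + step, max)
      return timeout
    },
    // Call after a successful operation.
    reset: function () {
      timeout = 0
    }
  }
}

var backoff = createBackoff(10, 1000)
```

In the callback version, the EMFILE branch would then schedule its retry with `setTimeout(retry, backoff.next())`, where `retry` stands for the retry closure, and call `backoff.reset()` on success.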

isaacs commented Aug 8, 2011

Oh, I should have asked already, but can I assume you have no problem including this in the rimraf repo under the MIT license?

laverdet commented Aug 8, 2011 via email
