@PaulMougel
Last active September 5, 2016 01:13
MongoDB document insertion from a CSV stream using async.queue
var csv = require('csv');
var async = require('async');
var fs = require('fs');
var MongoClient = require('mongodb').MongoClient;

MongoClient.connect('mongodb://localhost:27017/csvdb', function(err, db) {
    if (err) throw err;
    var collection = db.collection('myCSVs');

    var queue = async.queue(collection.insert.bind(collection), 5); // at most 5 concurrent inserts
    csv()
    .from.path('./input.csv', { columns: true }) // first CSV row supplies the field names
    .transform(function (data, index, cb) {
        queue.push(data, function (err, res) { // one insert per CSV row
            if (err) return cb(err);
            cb(null, res[0]);
        });
    })
    .on('error', function (err) {
        console.log('ERROR: ' + err.message);
    })
    .on('end', function () {
        queue.drain = function() { // runs once the queue has emptied
            collection.count(function(err, count) {
                console.log('Number of documents:', count);
                db.close();
            });
        };
    });
});
~/tmp ❯❯❯ head input.csv
a,b,c
1,2,3
4,5,6
1,2,3
1,2,3
1,2,3
1,2,3
1,2,3
1,2,3
1,2,3
~/tmp ❯❯❯ node index.js
Number of documents: 78360
@sobharani

I tried this code and it was very helpful, but it takes a long time: about 1 hour for 4L records.
Could you please share a way to load more than 4L records in less time (5 to 10 minutes)?

@PaulMougel
Author

On line 10, try increasing the second argument to async.queue, which is the number of concurrent insert requests (I set it to 5).
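
To illustrate what that concurrency limit controls, here is a minimal sketch of a concurrency-limited queue. It is not the async library's implementation: `createQueue` and `fakeInsert` are hypothetical names invented for this example, and `fakeInsert` stands in for `collection.insert` without touching a database.

```javascript
// Hypothetical stand-in for async.queue, for illustration only.
function createQueue(worker, concurrency) {
    var pending = [];
    var running = 0;

    function startJob(job) {
        worker(job.task, function (err, res) {
            running--;            // a slot is free again
            job.done(err, res);
            next();               // start the next waiting job, if any
        });
    }

    function next() {
        while (running < concurrency && pending.length > 0) {
            running++;
            startJob(pending.shift());
        }
    }

    return {
        push: function (task, done) {
            pending.push({ task: task, done: done });
            next();
        }
    };
}

// Fake insert: it only records the callback, so the example decides
// when each "insert" finishes.
var waiting = [];
function fakeInsert(doc, cb) { waiting.push(cb); }

var queue = createQueue(fakeInsert, 2);
var finished = [];
[1, 2, 3, 4, 5].forEach(function (n) {
    queue.push({ a: n }, function () { finished.push(n); });
});

var startedAtOnce = waiting.length;     // only 2 of the 5 inserts have started
waiting.shift()(null, 'ok');            // finish the first insert
var inFlightAfterOne = waiting.length;  // a third insert took the free slot
```

With a limit of 2, no more than two inserts are outstanding at any moment. Raising the limit lets more inserts overlap, although how much time that saves depends on the database and the network.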

@santhiraj

Dear Paul,

I get the error below when running this:

/home/action/workspace/uploadApp/node_modules/mongodb/lib/mongodb/mongo_client.js:475
throw err
^
TypeError: object is not a function
at /home/action/workspace/uploadApp/server.js:12:2
at /home/action/workspace/uploadApp/node_modules/mongodb/lib/mongodb/mongo_client.js:472:11
at process._tickCallback (node.js:415:13)

Could you please help me work out what went wrong?

Regards,
santhiraj

@nalinnarayan

Same "object is not a function" error here. Any help, Paul?

@nalinnarayan

I found out what caused the trouble: the latest version of csv, 0.4, exposes csv as an object instead of a function. Installing version 0.3.7 fixes it. I did that with "npm install csv@0.3.7".
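
The error message follows from the gist calling `csv()` directly. The sketch below uses two hypothetical stand-in values, not the real csv package, to show why calling a module that exports an object throws a TypeError while a function export works:

```javascript
// Hypothetical stand-ins; neither is the real csv package.
var functionStyleExport = function () { return 'called'; }; // can be invoked like csv()
var objectStyleExport = { someMethod: function () {} };     // an object cannot be invoked

// Call the export the way the gist calls csv(), and report what happens.
function tryCall(mod) {
    try {
        return mod();
    } catch (e) {
        return e instanceof TypeError ? 'TypeError' : 'other error';
    }
}

var functionResult = tryCall(functionStyleExport);
var objectResult = tryCall(objectStyleExport);
console.log(functionResult, objectResult);
```

A quick `typeof require('csv')` check in your own project should tell you which style the installed version uses.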
