@michahell
Last active August 29, 2015 14:24

Finding duplicate Pinboard bookmarks.

Run this Node script against your exported Pinboard JSON file. Set the FILEPATH variable to the path of that file first.

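The keyfor function returns the part of the URL that is used to compare bookmarks: the hostname without a leading "www.", followed by the path and query string. Two bookmarks count as duplicates when their keys are equal. You can customize the key based on the available URL properties.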
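As a sketch of one such customization, the variant below also ignores the query string, the fragment, and trailing slashes, so that more near-identical bookmarks end up with the same key. The name keyforLoose and the example.com URLs are made up for illustration and are not part of the original script. It assumes the WHATWG URL class described on the MDN page, which is available as a global in current Node versions.

```javascript
// Hypothetical looser variant of keyfor. It takes the raw href string and
// ignores the query string, the fragment and any trailing slashes.
function keyforLoose(href) {
  var parsed = new URL(href); // throws if href is not an absolute URL
  var host = parsed.hostname.replace(/^www\./, '');
  var path = parsed.pathname.replace(/\/+$/, ''); // '/page/' becomes '/page'
  return host + path;
}

console.log(keyforLoose('http://www.example.com/page/'));
// prints "example.com/page"
console.log(keyforLoose('https://example.com/page?utm_source=x#top'));
// prints "example.com/page"
```

A looser key finds more candidates but can also group pages that really differ, for example sites that select content through the query string, so it is worth reviewing the output by hand.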

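var fs = require('fs'), url = require('url');

// Path to your exported Pinboard JSON file
var FILEPATH = 'format_json.json';

// Build the comparison key for a parsed URL.
// See https://developer.mozilla.org/en/docs/Web/API/URL for available URL properties
function keyfor(parsed) {
  // pathname excludes the query string, so search is appended only once;
  // url.parse() returns null for missing parts, hence the fallbacks
  return (parsed.hostname || '').replace(/^www\./, '') +
    (parsed.pathname || '') + (parsed.search || '');
}

// Group the bookmarks by key
var index = {};
JSON.parse(
  fs.readFileSync(FILEPATH, 'utf8')
).forEach(function(item) {
  var key = keyfor(url.parse(item.href));
  if (!index[key]) index[key] = [];
  index[key].push(item);
});

// Keep only the keys with more than one bookmark and list their URLs,
// one per line, with a blank line between groups
var duplicates = Object.keys(index).map(function(k) {
  return this[k];
}, index).filter(function(list) {
  return list.length > 1;
}).map(function(list) {
  return list.map(function(item) {
    return item.href;
  }).join('\n');
}).join('\n\n');

console.log(duplicates);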
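
To try the grouping without an export file, here is a minimal sketch that runs the same index-and-filter steps on an in-memory array. The sample bookmarks and the findDuplicates helper name are assumptions made for illustration. Only the href field is assumed to exist on each item, as in the script. The sketch parses with the WHATWG URL class instead of url.parse.

```javascript
// Same key as the script: hostname without "www." plus path and query string.
function keyfor(parsed) {
  return parsed.hostname.replace(/^www\./, '') + parsed.pathname + parsed.search;
}

// Hypothetical helper: returns an array of groups, each holding two or more
// bookmarks that share a key.
function findDuplicates(items) {
  var index = {};
  items.forEach(function(item) {
    var key = keyfor(new URL(item.href));
    if (!index[key]) index[key] = [];
    index[key].push(item);
  });
  return Object.keys(index).map(function(k) {
    return index[k];
  }).filter(function(list) {
    return list.length > 1;
  });
}

// Made-up sample data: the first two entries differ only in scheme and "www."
var sample = [
  { href: 'http://www.example.com/a?x=1' },
  { href: 'https://example.com/a?x=1' },
  { href: 'https://example.com/b' }
];

var groups = findDuplicates(sample);
console.log(groups.length); // prints 1
console.log(groups[0].map(function(item) { return item.href; }).join('\n'));
// prints:
// http://www.example.com/a?x=1
// https://example.com/a?x=1
```

Because the key leaves out the protocol, the http and https versions of the same page are reported together.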