@rgsoto
Created December 10, 2019 17:49
wikp
#!/usr/local/bin/node
// Prints the paragraphs of a Wikipedia article, stripped of reference markers.
// Usage: wikp <article-url>
const request = require("request");
const { JSDOM } = require("jsdom");

const url = process.argv[2];
if (!url) {
  console.error("Usage: wikp <article-url>");
  process.exit(1);
}

request(url, function (error, response, body) {
  // Stop early if the page could not be fetched.
  if (error) {
    console.error("Request failed: " + error.message);
    process.exit(1);
  }
  // Simulate a document object model from the fetched HTML.
  const { document } = new JSDOM(body).window;
  // Remove the reference markers (e.g. "[1]") before reading any text.
  document.querySelectorAll(".reference").forEach(function (reference) {
    reference.remove();
  });
  // Print the text of every paragraph.
  document.querySelectorAll("p").forEach(function (paragraph) {
    console.log(paragraph.textContent);
  });
});
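
The script above strips reference markers by deleting `.reference` elements from the DOM. When only already-extracted plain text is available, a rough dependency-free alternative is a regular expression. The sketch below is an illustration under stated assumptions, not part of the original script: `stripReferenceMarkers` is a hypothetical helper, and it only handles purely numeric markers such as `[1]`.

```javascript
// Hypothetical helper (not part of the original script): removes numeric
// reference markers such as "[1]" or "[23]" from already-extracted text.
// Assumption: markers are purely numeric; "[a]" or "[note 1]" are left alone.
function stripReferenceMarkers(text) {
  return text.replace(/\[\d+\]/g, "");
}

const sample = "Paris is the capital of France.[1][2] It lies on the Seine.[3]";
console.log(stripReferenceMarkers(sample));
// prints "Paris is the capital of France. It lies on the Seine."
```

The DOM-based approach in the script is more robust, since it does not depend on how the markers are rendered as text; the regular expression is only a fallback for text that has already lost its markup.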