@mikeal
Last active June 14, 2022 12:36
The easiest way to get comments out of any code file... seriously?!?
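var highlight = require('highlight.js')
var cheerio = require('cheerio')

// Leading characters to trim from each comment (comment markers and padding).
var strip = ['/', '#', ' ', '*', "<", ">", '-', '\\']

// Returns the text of every comment highlight.js detects in `str`.
function getComments (str) {
  // Let highlight.js guess the language and emit HTML with hljs-* classes.
  var html = highlight.highlightAuto(str).value
  var $ = cheerio.load(html)
  // Comments end up in span.hljs-comment elements.
  var lines = $('span.hljs-comment').map(function(i, el) {return $(this).text();}).get()
  return lines.map(function (l) {
    // Drop leading marker characters; trailing ones are left in place.
    while (l.length && strip.indexOf(l[0]) !== -1) {
      l = l.slice(1)
    }
    return l
  })
}

module.exports = getComments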
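
The stripping loop at the end of getComments can be tried on its own, without highlight.js or cheerio installed. This is only a sketch: the helper name `stripLeadingMarkers` is hypothetical (the gist inlines the loop), and the sample strings are made up for illustration.

```javascript
// Characters treated as comment-marker noise, copied from the gist.
var strip = ['/', '#', ' ', '*', '<', '>', '-', '\\']

// Hypothetical helper name: the gist inlines this loop inside getComments.
function stripLeadingMarkers (text) {
  // Remove characters from the front until one is not in the strip list.
  while (text.length && strip.indexOf(text[0]) !== -1) {
    text = text.slice(1)
  }
  return text
}

console.log(stripLeadingMarkers('// TODO: fix this'))     // TODO: fix this
console.log(stripLeadingMarkers('# python style'))        // python style
console.log(stripLeadingMarkers('/* block comment */'))   // block comment */
console.log(stripLeadingMarkers('<!-- html comment -->')) // !-- html comment -->
```

Only leading characters are removed, so a closing `*/` or `-->` stays in the result, and an HTML comment keeps its `!--` because `!` is not in the strip list.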
@mikeal
Author

mikeal commented Feb 27, 2016

After lots of investigation I figured out that this is actually the easiest way in Node.js to get comments out of code files written in any language.

It's pretty ridiculous. It literally involves running the code through highlight.js, which spits out HTML with highlight classes, then parsing that HTML in cheerio and getting the text out of the span.hljs-comment elements.
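
To make the second half of that pipeline concrete, here is a dependency-free sketch. The `html` string is hand-written to resemble highlighter output (an assumption, not captured output), and a regex stands in for cheerio purely so the example runs without installing anything. The names `commentSpans` and `decodeEntities` are hypothetical.

```javascript
// Hand-written sample of the kind of markup a highlighter emits (an assumption,
// not captured output): comments sit inside span.hljs-comment elements.
var html = '<span class="hljs-comment">// add two numbers</span>\n' +
  '<span class="hljs-keyword">var</span> sum = a + b ' +
  '<span class="hljs-comment">/* a &amp; b are ints */</span>'

// Dependency-free stand-in for the cheerio lookup. A regex cannot cope with
// spans nested inside a comment, which is one reason to prefer an HTML parser.
function commentSpans (markup) {
  var pattern = /<span class="hljs-comment">([\s\S]*?)<\/span>/g
  var found = []
  var match
  while ((match = pattern.exec(markup)) !== null) {
    found.push(decodeEntities(match[1]))
  }
  return found
}

// Decode only the handful of entities HTML-escaped source text would contain.
function decodeEntities (text) {
  return text
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&quot;/g, '"')
    .replace(/&#x27;/g, "'")
    .replace(/&amp;/g, '&') // last, so "&amp;lt;" does not become "<"
}

console.log(commentSpans(html))
// [ '// add two numbers', '/* a & b are ints */' ]
```

cheerio's `.text()` handles both the entity decoding and any nested spans, which is why the gist gets away with a single line for this step.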

There are a few dozen web-based code editors, and Atom and VS Code are both written in JavaScript, but their parsing code is so deeply embedded in each product that it's impossible to rip out. There are also about 3 highlighting libraries in JS, including this one, but the parse trees they create internally are more difficult to use than the HTML output. There's also no way to get the parse tree without generating the HTML, so working from the parse tree wouldn't save much on performance anyway.

@ahmadnassri

Curious: did you investigate doing something at the engine level? Surely that's more efficient and guaranteed to capture all sorts of comments.
