@DesignByOnyx
Last active September 21, 2022 03:31
JS Comments REGEX - Failing Cases

These are cases where a simple regex, like the one on StackOverflow, cannot reliably detect comments in code.

Case 1 - comment-like characters within a string

var foo = "There's no way to tell that this /* is not the beginning of a comment";
var bar = "There's no way to tell that this */ is not the end of a comment";
var baz = "There's no way to tell that this // is not a single line comment";
var buz = "Matters get much worse when there are escaped \" quotes /* inside the string. Definitely need a parser.";
var fiz = `And there is
really no way to // detect that these are
/* not real comments */
within a JS template literal.
`;
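To make Case 1 concrete, here is a minimal sketch of what goes wrong. The pattern `naiveCommentPattern` and the helper `stripCommentsNaively` are hypothetical names used only for illustration; this is not the regex from the StackOverflow answer, just a typical simple one.

```javascript
// Hypothetical naive pattern (an assumption, not the StackOverflow regex):
// matches /* ... */ block comments or // line comments anywhere in the text.
const naiveCommentPattern = /\/\*[\s\S]*?\*\/|\/\/.*/g;

// Removes everything the naive pattern considers a comment.
function stripCommentsNaively(source) {
  return source.replace(naiveCommentPattern, "");
}

// The "//" sits inside a string literal, but the regex cannot know that.
const lineInput = 'var baz = "this // is not a single line comment";';
const lineOutput = stripCommentsNaively(lineInput);
console.log(lineOutput); // var baz = "this 

// The "/*" and "*/" sit in two different strings and still pair up.
const blockInput = 'var foo = "a /* b"; var bar = "c */ d";';
const blockOutput = stripCommentsNaively(blockInput);
console.log(blockOutput); // var foo = "a  d";
```

In both results the string literals are cut apart, so the remaining code no longer parses.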

Case 2 - the dangling property value

var foo = {
    bar:// regex cannot distinguish "bar://" from "http://"
       "the value for bar is dangling down here - this is valid"
};
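Case 2 is the opposite failure: a real comment that a URL-protecting regex leaves in place. The sketch below assumes a hypothetical pattern, `urlSafeLineComment`, that refuses to treat `//` as a comment when it follows a colon. It is an illustration of the general idea, not the regex from the StackOverflow answer.

```javascript
// Hypothetical URL-protecting pattern (an assumption for illustration):
// "//" starts a comment only when the preceding character is not a colon.
// Group 1 holds that preceding character so it can be restored with $1.
const urlSafeLineComment = /(^|[^:])\/\/.*$/gm;

const danglingSource = [
  "var foo = {",
  "    bar:// this is a real comment",
  '       "the value for bar is dangling down here"',
  "};",
].join("\n");

const danglingResult = danglingSource.replace(urlSafeLineComment, "$1");

// The comment after "bar:" looks like "http://" to the regex, so it survives.
console.log(danglingResult === danglingSource); // true
```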

Case 3 - the commonly used glob pattern: /**/

{
  include: [
    "src/**/*.js"
  ],
  exclude: [
    "src/**/*.test.js"
  ]
}
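A short sketch of Case 3, assuming a hypothetical block-comment pattern named `naiveBlockComment` (not the StackOverflow regex). The glob segment `/**/` is exactly an empty block comment as far as such a pattern can tell.

```javascript
// Hypothetical naive block-comment pattern (an assumption for illustration).
const naiveBlockComment = /\/\*[\s\S]*?\*\//g;

const globConfig = 'include: ["src/**/*.js"]';
const globResult = globConfig.replace(naiveBlockComment, "");

// "/**/" is matched as "/*" + "" + "*/" and removed from the glob.
console.log(globResult); // include: ["src*.js"]
```

The resulting glob `src*.js` matches a different set of files than `src/**/*.js`, so the damage is silent rather than a syntax error.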

lewispham commented Sep 2, 2016

It fails to parse this code: x.replace(/\//g, '/');
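For illustration, this is how a simple pattern trips over that line. The pattern `simpleCommentPattern` is a hypothetical stand-in, not the regex under discussion: the escaped slash inside the regex literal is followed by the closing slash, and together they look like `//`.

```javascript
// Hypothetical simple pattern (an assumption, not the regex under discussion).
const simpleCommentPattern = /\/\*[\s\S]*?\*\/|\/\/.*/g;

// Source text of the reported line: x.replace(/\//g, '/');
const regexLiteralInput = "x.replace(/\\//g, '/');";
const regexLiteralOutput = regexLiteralInput.replace(simpleCommentPattern, "");

// Everything from the second "/" of "\//" onward is treated as a comment.
console.log(regexLiteralOutput); // x.replace(/\
```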

@DesignByOnyx
Author

Thank you for the comment @Tresdin - I have updated the regex to allow for that type of code, and I updated the test page to reflect this change. Thanks for helping improve this amazing regex!!


kupietools commented Aug 16, 2019

It deleted a needed closing brace that wasn't part of a comment, immediately preceding a double-slashed comment. In fact, looking at the demo link provided on SO, it also incorrectly removes a comma preceding a double-slashed comment, within the original poster's own test data. And in my own code, it's removing preceding line breaks. It looks like it indiscriminately removes the character before a // comment.

@DesignByOnyx
Author

If you read the SO answer in full, I am very clear about how to keep from losing the character in front of double slashes. I've been dealing with people not reading lately and it's getting frustrating. You MUST use backreference $1 in your replacement value. If you use an empty string, you will lose every character immediately preceding a double slash.
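The difference between the two replacement values can be sketched as follows. The pattern `lineCommentWithPrefix` is a hypothetical simplification that captures the character before `//` in group 1; the actual regex in the SO answer is more involved, and only the role of `$1` is being illustrated.

```javascript
// Hypothetical simplified pattern (an assumption, not the SO regex):
// group 1 captures the character immediately before "//" (or line start).
const lineCommentWithPrefix = /(^|[^:])\/\/.*$/gm;

const braceInput = "if (ok) { run(); }// done";

// Empty replacement: the captured "}" is part of the match, so it is lost.
const withEmptyString = braceInput.replace(lineCommentWithPrefix, "");
console.log(withEmptyString); // if (ok) { run(); 

// "$1" replacement: the captured "}" is written back.
const withBackreference = braceInput.replace(lineCommentWithPrefix, "$1");
console.log(withBackreference); // if (ok) { run(); }
```

The preceding character has to be part of the match so the pattern can inspect it, which is why the replacement has to put it back.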
