Created
September 5, 2012 17:36
/*
 * My evil JavaScript parser tests. Mostly for syntax highlighters. Note that
 * these tests only require regexps to be marked specifically in order to work.
 *
 * The following regexps should be found, and no others.
 *
 * /1/i
 * /2; /
 * /[//*]/
 * / 3 /
 * / \//
 * /4; /\u0069
 * /5/i
 * /6/i
 * /7/i
 * /8/i
 * /9; /
 *
 * Everything in this test is valid ECMAScript. If you don't understand why
 * a certain regexp should or should not be matched, read the specification.
 *
 * 2012 - GlitchMr
 */
// Highlighting any language correctly is not easy. Languages themselves are
// rather... ambiguous. And I'm not talking about Perl and Ruby (neither
// language is statically parsable - the problem is worse in Perl, but Ruby
// has it too: functions and variables are parsed differently, so a +1 could
// be a variable plus 1 (addition) or function a() called with argument +1).
//
// But this isn't about those languages - it's simply impossible to parse them
// correctly. The problem is that highlighters often cannot even highlight
// commonly known languages like JavaScript. One problem is called
// "automatic semicolon insertion" - many syntax highlighters aren't aware
// that you can write JavaScript without semicolons. The second problem is
// that certain constructs (parentheses/function calls, arrays/object access,
// dictionaries/blocks, regexps/division) are simply ambiguous - the parser
// chooses one of them depending on whether it expected an infix operator or
// an expression.
//
// In this test, regular expressions were chosen because editors usually
// highlight them differently and their syntax is ambiguous.
//
// Please note that even if this test fails it doesn't matter much - usually
// nobody writes code to intentionally break syntax highlighters. Unless
// they want to obfuscate code, but for that I would use another language,
// such as Python.
//
// Do you want a language that is easy to highlight correctly? Well, try
// Brainfuck then.

// Wrapped in a function because 'return' is only allowed inside one
(function () {
    // Only three variables :). YAY!
    var regular, notreturn, i = {};

    // Regular expression in void context
    // REGEXP, COMMENT
    /1/i//
    // DIVIDE BY, -8, DIVIDE BY, i, COMMENT, implied semicolon
    /-8/i//
    // TYPEOF, REGEXP, COMMENT, implied semicolon
    typeof /2; ///
    // VARIABLE (notreturn), DIVIDE BY, -9, ";", COMMENT (///)
    notreturn /-9; ///

    // Mysteriously, many syntax highlighters fail this test, so it makes sense
    // to include it. If yours does, it's the only serious failure in this test,
    // and one likely to show up in real code.
    // -1, DIVIDE BY, -2, DIVIDE BY, -3, DIVIDE BY, -4, SEMICOLON
    -1 / -2 / -3 / -4;

    // /[//*]/ is valid as of ES5 (an unescaped "/" is allowed in a character class)
    // REGEXP, multiline comment
    /[//*]//**/
    // SEMICOLON
    ;

    // REGEXP (with space), DIVIDE BY, REGEXP (space and escaped "/" character), COMMENT, implied semicolon
    //   "/ 3 /"  - REGEXP (with space)
    //   "/"      - DIVIDE BY
    //   "/ \//"  - REGEXP (space and escaped "/" character)
    //   "//"     - COMMENT
    / 3 // / \// //
    // IF, "(", REGEXP, COMMENT
    if(/4; /\u0069//)
    // ")" closing the IF condition, REGEXP, semicolon
    )/5/i;

    // [][0], multiline comment
    [][0]/*
    // end of multiline comment, implied semicolon, ++, REGEXP (with i modifier), source property, semicolon
    */++/6/i.source;
    // [][0], multiline comment, ++, DIVIDE BY, -10, DIVIDE BY, i.source, SEMICOLON
    [][0]/**/++/-10/i.source;

    // REGEXP, IN, REGEXP, implied semicolon
    /7/i in/8/i

    // Return of the regular expression
    // RETURN, REGEXP, COMMENT, implied semicolon
    return /9; ///
}())
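
The rule this test leans on (a "/" starts a regexp where the parser expects an expression, and means division where it expects an infix operator) can be roughly approximated from the previous significant token alone. The sketch below is only an illustration of that idea, not a real tokenizer: `slashStartsRegex`, the token shape `{ type, value }` and both lookup sets are names made up for this example.

```javascript
// Hypothetical helper, not taken from any real highlighter.
// `prev` is the last token that is not whitespace or a comment,
// or null at the start of the input.
const KEYWORDS_BEFORE_EXPRESSION = new Set([
    "return", "typeof", "in", "instanceof", "void",
    "delete", "new", "throw", "case", "do", "else"
]);
const CLOSERS = new Set([")", "]", "}"]);

function slashStartsRegex(prev) {
    // Nothing before the slash: an expression is expected.
    if (prev === null) return true;
    switch (prev.type) {
    case "name":
        // After a keyword such as `return` an expression follows;
        // after an ordinary identifier the slash is division.
        return KEYWORDS_BEFORE_EXPRESSION.has(prev.value);
    case "number":
    case "string":
    case "regexp":
        // A complete operand came first, so the slash is division.
        return false;
    case "punct":
        // Closing brackets usually end an operand (a guess, see below);
        // any other operator or punctuator expects an expression.
        return !CLOSERS.has(prev.value);
    default:
        return true;
    }
}
```

With this guess, the "/" after `typeof` or `(` opens a regexp, and the "/" after `notreturn`, `-1` or `/1/i` is division. It still gets `)/5/i` wrong, because the ")" there closes an `if` condition, and it cannot tell prefix `++` from postfix `++`. Those cases are why a correct highlighter has to track more context than a single token.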