@fcamblor
Last active December 20, 2019 15:44
Variables extractor

The idea is to provide a working implementation of the extractVariablesFrom(str) JavaScript function. It should extract variable names from a string, where a variable name must be surrounded by double or triple braces (mixing is not allowed) and must not contain any spaces. A variable may appear anywhere in the string (typically at the start or end, or "glued" to another variable).

Background: I am translating an application into many languages. A translation can be either a raw string or a Handlebars template string. In the latter case, I give my translators the raw template string (with Handlebars variables embedded in it) and I expect them NOT to alter the template syntax (typically, not to remove braces or translate variable names). I'm going to use extractVariablesFrom(str) to compare the variables in the message to translate with the variables in the translated message: if I detect differences, I want to raise a red flag to the translator.
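The comparison described in the background could look roughly like the sketch below. Everything in it is an assumption made for illustration: findVariableMismatches and simplifiedExtract are hypothetical names, the comparison is set-based (it ignores order and repetition), and simplifiedExtract is only a stand-in that handles well-formed double-brace variables, not an answer to the exercise.

```javascript
// Hypothetical helper, not part of the original gist.
// `extract` is any function honouring the extractVariablesFrom(str) contract.
function findVariableMismatches(sourceMessage, translatedMessage, extract) {
  var sourceNames = extract(sourceMessage);
  var translatedNames = extract(translatedMessage);
  return {
    // Variables present in the source but absent from the translation
    missing: sourceNames.filter(function (name) {
      return translatedNames.indexOf(name) === -1;
    }),
    // Variables present in the translation but absent from the source
    unexpected: translatedNames.filter(function (name) {
      return sourceNames.indexOf(name) === -1;
    })
  };
}

// Simplified stand-in extractor for this demo only: it handles well-formed
// double-brace variables and ignores the triple-brace and mixing rules.
function simplifiedExtract(str) {
  var regex = /\{\{([^ {}]+)\}\}/g, names = [], match;
  while ((match = regex.exec(str)) !== null) { names.push(match[1]); }
  return names;
}

var report = findVariableMismatches(
  "Hello {{name}}, you have {{count}} messages",
  "Bonjour {{nom}}, vous avez {{count}} messages",
  simplifiedExtract
);
console.log(JSON.stringify(report)); // {"missing":["name"],"unexpected":["nom"]}
```

A non-empty missing or unexpected list would be the "red flag" case. A stricter check could compare occurrence counts as well, if repeated variables matter for a given message.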

function extractVariablesFrom(str) {
// Sample non-working implementation
var variableNameExtractorRegex = new RegExp(/\{\{\{([^[{}]*)\}\}\}|[^{]\{\{([^{}]*)\}\}[^}]/gi);
var variableNames = [], match;
while(match = variableNameExtractorRegex.exec(str)){ variableNames.push(match[1] || match[2]); }
return variableNames;
}
function expectVariables(name, str, expectedVariableNames) {
var actual = JSON.stringify(extractVariablesFrom(str));
var expected = JSON.stringify(expectedVariableNames);
if(actual !== expected) {
console.error(`Expectation type [${name}] failed for [${str}] : actual=${actual}, expected=${expected}`);
}
}
expectVariables("Simple cases", " {{a}} {{b}} {{c}} ", ["a", "b", "c"]);
expectVariables("Simple cases", " blah {{a}} blah {{b}} blah {{c}} blah ", ["a", "b", "c"]);
expectVariables("Simple cases", " blah {{{a}}} blah {{{b}}} blah {{{c}}} blah ", ["a", "b", "c"]);
expectVariables("Invalid variables cases", " {{ }} ", []);
expectVariables("Invalid variables cases", " {{{ }}} ", []);
expectVariables("Invalid variables cases", " {{{a}} {{b}}} {{c{{ }}d}} {{e}}} ", []);
expectVariables("Starting/ending cases", "{{a}} {{c}}", ["a", "c"]);
expectVariables("Starting/ending cases", "{{{a}}} {{{c}}}", ["a", "c"]);
expectVariables("Glue cases", " {{a}}{{b}} ", ["a", "b"]);
expectVariables("Glue cases", " {{{a}}}{{{b}}} ", ["a", "b"]);
expectVariables("Glue cases", "{{a}}{{b}}", ["a", "b"]);
expectVariables("Glue cases", "{{{a}}}{{{b}}}", ["a", "b"]);
@fcamblor
Author

Working solution :

function extractVariablesFrom(str) {
  var variableNameExtractorRegex = new RegExp(
    // Backslashes are doubled so that the escapes survive the string literal and reach the regex.
    // Triple braces: the easy part
    "\\{\\{\\{([^ {}]+)(?=\\}\\}\\})" +
    "|" +
    // Double braces: the harder part, since double and triple braces must not be mixed
      // The preceding char must be either the start of the string or anything other than "{"
      "(?:^|[^{])" +
      // Then the {{varName matcher...
      "\\{\\{([^ {}]+)" +
      // ...followed by a lookahead for the closing }} braces.
      // The lookahead matters: if the closing braces and the char after them were consumed,
      // the next match attempt would start *after* that char, so "glued" variables would not match.
      // Take "xxx{{foo}}{{bar}}yyy" as an example:
      //  => the first match would consume "x{{foo}}{"
      //  and the scan would resume at "{bar}}yyy", which contains no double-brace variable
      "(?=\\}\\}" +
        // The char after the closing braces must be either the end of the string or anything other than "}"
        "(?:[^}]|$)" +
      ")",
    // The "i" flag was dropped: the pattern contains no letters, so case-insensitivity has no effect
    "g");
  var variableNames = [], match;
  while(match = variableNameExtractorRegex.exec(str)){ variableNames.push(match[1] || match[2]); }
  return variableNames;
}
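
To make the lookahead remark in the comments concrete, here is a small sketch that runs a consuming variant and a lookahead variant of the double-brace pattern against the glued example. The names collectNames, consumingRegex and lookaheadRegex are hypothetical and only cover the double-brace half of the solution.

```javascript
// Hypothetical helper: collects the first capture group of every match.
function collectNames(regex, str) {
  var names = [], match;
  while ((match = regex.exec(str)) !== null) { names.push(match[1]); }
  return names;
}

var glued = "xxx{{foo}}{{bar}}yyy";

// Consuming variant: the closing braces and the char after them are part of
// the match, so the scan resumes after the "{" that opens {{bar}}.
var consumingRegex = /(?:^|[^{])\{\{([^ {}]+)\}\}(?:[^}]|$)/g;

// Lookahead variant: the closing braces and the following char are only
// checked, not consumed, so the scan resumes right after "foo".
var lookaheadRegex = /(?:^|[^{])\{\{([^ {}]+)(?=\}\}(?:[^}]|$))/g;

var consumed = collectNames(consumingRegex, glued);
var lookedAhead = collectNames(lookaheadRegex, glued);

console.log(JSON.stringify(consumed));    // ["foo"]
console.log(JSON.stringify(lookedAhead)); // ["foo","bar"]
```

The difference comes from where lastIndex lands after the first match: index 11 (past the first "{" of {{bar}}) for the consuming variant, index 8 (right after "foo") for the lookahead variant.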
