Skip to content

Instantly share code, notes, and snippets.

@sebinsua
Last active October 23, 2020 12:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save sebinsua/3d71bd5afd41a6f09ce9046142d2bbff to your computer and use it in GitHub Desktop.
Save sebinsua/3d71bd5afd41a6f09ce9046142d2bbff to your computer and use it in GitHub Desktop.
Check if triangle strings are valid (e.g. abb, abbccc, abbcccdddd, xyyzzz).
// Check if triangle strings are valid.
//
// e.g.
//
// Valid: abb, abbccc, abbcccdddd, xyyzzz
// Invalid: ab, abbbccc, abbcccddddd, abbccddd
function isTriangleString(str) {
// An empty string or a single character are valid.
if (str.length === 0 || str.length === 1) {
return true;
}
let acceptedChunkSize;
let currentChunkSize;
let previousChar;
for (let idx = 0; idx < str.length; idx++) {
const currentChar = str[idx];
// For the first character we do nothing apart from
// setting up our expectations of the first chunk.
if (idx === 0) {
acceptedChunkSize = 1;
// And describing the current state.
previousChar = currentChar;
currentChunkSize = 1;
continue;
}
// When the current character is different from the previous
// character it implies that a chunk has been finished.
if (previousChar !== currentChar) {
// If the previous chunk was smaller than the accepted size it is invalid.
if (currentChunkSize < acceptedChunkSize) {
return false;
}
// Otherwise, the previous chunk was valid and so we silently
// move to setting up our expectations of the next chunk.
acceptedChunkSize++;
// And describing the current state.
previousChar = currentChar;
currentChunkSize = 1;
continue;
}
// When the current character is the same as the previous character
// we increment the current chunk size, and if it goes higher than
// the accepted chunk size bail as invalid.
currentChunkSize++;
if (currentChunkSize > acceptedChunkSize) {
return false;
}
}
// We must cover the case in which the string appears to be a
// triangle string so far but then completes early.
if (currentChunkSize < acceptedChunkSize) {
return false;
}
return true;
}
@sebinsua
Copy link
Author

@sebinsua
Copy link
Author

sebinsua commented Oct 21, 2020

If we are dealing with extremely large or infinite strings streamed in a character (or a few) at a time, it would probably be a good idea to use an async-generator.

// Check if triangle strings are valid.
//
// e.g.
//
// Valid: abb, abbccc, abbcccdddd, xyyzzz
// Invalid: ab, abbbccc, abbcccddddd, abbccddd
async function isTriangleString(str) {
  let first = true;
  let acceptedChunkSize = 1;
  let currentChunkSize;
  let previousChar;
  for await (let currentChar of str) {
    // For the first character we do nothing apart from
    // describing the current state.
    if (first) {
      previousChar = currentChar;
      currentChunkSize = 1;

      // And ensuring that this block never executes again.
      first = false;
      continue;
    }

    // When the current character is different from the previous
    // character it implies that a chunk has been finished.
    if (previousChar !== currentChar) {
      // If the previous chunk was smaller than the accepted size it is invalid.
      if (currentChunkSize < acceptedChunkSize) {
        return false;
      }

      // Otherwise, the previous chunk was valid and so we silently
      // move to setting up our expectations of the next chunk.
      acceptedChunkSize++;
      // And describing the current state.
      previousChar = currentChar;
      currentChunkSize = 1;
      continue;
    }

    // When the current character is the same as the previous character
    // we increment the current chunk size, and if it goes higher than
    // the accepted chunk size bail as invalid.
    currentChunkSize++;
    if (currentChunkSize > acceptedChunkSize) {
      return false;
    }
  }

  // We must cover the case in which the string appears to be a
  // triangle string so far but then completes early.
  //
  // We only check this if the loop has had at least one iteration.
  if (!first && currentChunkSize < acceptedChunkSize) {
    return false;
  }

  return true;
}

async function* string(str) {
  for (let char of str) {
    yield char;
  }
}

async function test() {
  console.log("isTriangleString", await isTriangleString(string("abbccc")));
}

test();

CodeSandbox

@sebinsua
Copy link
Author

sebinsua commented Oct 21, 2020

It seems there are heuristics to check whether a string can even be a triangle string and that these would speed everything up (e.g. detect invalid lengths given the chunking increments, check first and last elements of each chunk which we should expect to be positioned in particular places, etc.)

Doing extra work in the situation in which the heuristic is not able to invalidate the string would worsen the performance, but I reckon in most situations it'd improve the performance when a string has an incorrectly sized/positioned chunk or a bad total length (which might be common).

To be more clear, there is a sequence here, and on closer inspection it seems to be a triangular number sequence. It should be possible to use the inverse of this to be more efficient at validation when there are likely to be lots of invalid strings.

// Check if triangle strings are valid.
//
// e.g.
//
// Valid: abb, abbccc, abbcccdddd, xyyzzz
// Invalid: ab, abbbccc, abbcccddddd, abbccddd
function isTriangleString(str) {
  const length = str.length;

  // An empty string or a single character are valid.
  if (length === 0 || length === 1) {
    return true;
  }

  // There are some optimizations for checking invalid strings.
  // 1. We can check to see if the length of the string is a triangular number
  //    and fail if it isn't.
  // 2. We can check the first and last character of each chunk in reverse order
  //    since if these are different it implies that either an earlier chunk was
  //    the wrong size or the string has been tampered.
  //
  // The optimizations are to detect invalid strings and mean that:
  // 1. Validating a string which is the wrong size is O(1).
  // 2. Validating a string with bad chunk edges is O(N log N).
  // 3. Validating any other string is O(N + N log N).

  // We can calculate triangular numbers using the equation:
  // m = (n * (n + 1)) / 2
  // And the inverse of this is:
  // n = Math.sqrt(2 * n + 1 / 4) - 1 / 2;
  //
  // See: https://math.stackexchange.com/questions/2041988/how-to-get-inverse-of-formula-for-sum-of-integers-from-1-to-n/2041993#2041993
  const chunkCount = Math.sqrt(2 * length + 1 / 4) - 1 / 2;

  // If the number of chunks is not a whole number, then the length
  // would be impossible to achieve if the string was valid.
  if (chunkCount % 1 !== 0) {
    return false;
  }

  // If the expected chunks for a given length do not have first and last characters
  // that match and are in the expected place then the string is invalid.
  let chunk = chunkCount;
  for (let idx = length - 1; chunk >= 1 && idx >= 0; idx -= chunk && chunk--) {
    if (str[idx] !== str[idx - chunk + 1]) {
      return false;
    }
  }

  let acceptedChunkSize;
  let currentChunkSize;
  let previousChar;
  for (let idx = 0; idx < str.length; idx++) {
    const currentChar = str[idx];

    // For the first character we do nothing apart from
    // setting up our expectations of the first chunk.
    if (idx === 0) {
      acceptedChunkSize = 1;
      // And describing the current state.
      previousChar = currentChar;
      currentChunkSize = 1;
      continue;
    }

    // When the current character is different from the previous
    // character it implies that a chunk has been finished.
    if (previousChar !== currentChar) {
      // If the previous chunk was smaller than the accepted size it is invalid.
      if (currentChunkSize < acceptedChunkSize) {
        return false;
      }

      // Otherwise, the previous chunk was valid and so we silently
      // move to setting up our expectations of the next chunk.
      acceptedChunkSize++;
      // And describing the current state.
      previousChar = currentChar;
      currentChunkSize = 1;
      continue;
    }

    // When the current character is the same as the previous character
    // we increment the current chunk size, and if it goes higher than
    // the accepted chunk size bail as invalid.
    currentChunkSize++;
    if (currentChunkSize > acceptedChunkSize) {
      return false;
    }
  }

  // We must cover the case in which the string appears to be a
  // triangle string so far but then completes early.
  if (currentChunkSize < acceptedChunkSize) {
    return false;
  }

  return true;
}

console.log("isTriangleString", isTriangleString("abbcccdddd"));

CodeSandbox

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment