@crutchcorn
Created November 28, 2023 08:36
A method to generate ignored indexes and do partial replacement in a regex
// Given an input
const input = "header 123 {#custom-id}";
// Provide a transformation of said input that keeps the same length
// e.g. "capitalizing" a title in a markdown file
const transformedInput = input.toUpperCase();
// However, we don't want to transform anything matched by these regexes
// e.g. a custom ID
// Each entry is compiled as-written with the `u` flag, where a bare `{` or `}`
// is a syntax error, so the braces are escaped in the pattern itself
const ignored = ["\\{#custom-id\\}"];
// From this, our output should be:
// "HEADER 123 {#custom-id}"

// Below is the implementation
// Collect the index of every character covered by an ignored match
const ignoredIndexes = new Set();
for (const regex of ignored) {
  const matches = input.match(new RegExp(regex, "gmu"));
  if (!matches) continue;
  let lastIndex = 0;
  for (const match of matches) {
    // Search from the end of the previous match so repeated matches
    // resolve to distinct positions
    const matchedIndex = input.indexOf(match, lastIndex);
    lastIndex = matchedIndex + match.length;
    for (let indexOfMatchLength = 0; indexOfMatchLength < match.length; indexOfMatchLength++) {
      ignoredIndexes.add(matchedIndex + indexOfMatchLength);
    }
  }
}

// Rebuild the string: original characters at ignored indexes,
// transformed characters everywhere else
let output = "";
for (let inputIndex = 0; inputIndex < input.length; inputIndex++) {
  if (ignoredIndexes.has(inputIndex)) {
    output += input[inputIndex];
  } else {
    output += transformedInput[inputIndex];
  }
}

// Finally, output the result:
console.log(output);
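
The snippet above recovers each match position with `indexOf`. As a rough alternative sketch (not part of the gist), `String.prototype.matchAll` reports the index of every match directly. The helper name `transformExcept` and its parameters are hypothetical, chosen only for illustration, and the length check is an added assumption based on the "keeps the same length" requirement.

```javascript
// Hypothetical helper: applies `transform` to `input`, but keeps the original
// characters wherever one of `ignoredPatterns` (regex source strings) matches.
function transformExcept(input, transform, ignoredPatterns) {
  const transformed = transform(input);
  // The index-by-index merge below only works if lengths line up
  if (transformed.length !== input.length) {
    throw new Error("transform must preserve the input length");
  }

  const ignoredIndexes = new Set();
  for (const source of ignoredPatterns) {
    // matchAll requires the "g" flag; each match exposes its own `index`
    for (const match of input.matchAll(new RegExp(source, "gmu"))) {
      for (let offset = 0; offset < match[0].length; offset++) {
        ignoredIndexes.add(match.index + offset);
      }
    }
  }

  let output = "";
  for (let i = 0; i < input.length; i++) {
    output += ignoredIndexes.has(i) ? input[i] : transformed[i];
  }
  return output;
}

const result = transformExcept(
  "header 123 {#custom-id}",
  (text) => text.toUpperCase(),
  ["\\{\\s*#.*?\\}\\s*$"]
);
console.log(result); // "HEADER 123 {#custom-id}"
```

Taking the position from `match.index` avoids a second search through the string, which may matter for patterns (such as ones with lookarounds) where the matched text also occurs earlier at a position the regex did not match.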
@crutchcorn
Author

I plan on contributing this to:

https://github.com/Xunnamius/unified-utils/tree/main/packages/remark-capitalize-headings

As a feature that suits my needs.

@tobySolutions

This is awesome @crutchcorn, why Regex though? It kinda looks like there could have been another way or maybe it's just cos I'm scared of Regex.

@crutchcorn
Author

@tobySolutions good question! There are a few reasons:

  1. The package in question already uses regexes extensively for this style of customization
  2. Regexes can be easily serialized into JSON files (as pattern strings), which is where many Remark config files live
  3. While the demo uses a hardcoded custom ID, I need to be able to catch all custom IDs in my headers via this regex: /\{\s*#.*?\}\s*$/
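
To illustrate points 2 and 3, here is a minimal sketch of a pattern stored as a string in JSON and compiled back into a regex. The config key `ignoredPatterns` and the sample headings are made up for this example; they are not the package's actual option names.

```javascript
// Hypothetical config shape; the key name "ignoredPatterns" is an assumption.
// The pattern is the custom-ID regex from point 3, stored as a source string.
const configJson = JSON.stringify({
  ignoredPatterns: ["\\{\\s*#.*?\\}\\s*$"],
});

// Round-trip through JSON, then compile the string back into a RegExp
const config = JSON.parse(configJson);
const customIdPattern = new RegExp(config.ignoredPatterns[0], "u");

// Made-up sample headings
const headings = [
  "Getting started {#start}",
  "Install { #install-steps }  ",
  "No custom ID here",
];

const matches = headings.map((heading) => {
  const match = heading.match(customIdPattern);
  return match ? match[0] : null;
});

console.log(matches); // ["{#start}", "{ #install-steps }  ", null]
```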

Regexes aren't all that scary once you get used to them :) I even wrote a guide to them here:

https://unicorn-utterances.com/posts/the-complete-guide-to-regular-expressions-regex

@tobySolutions

Also, sorry for the disturbance, ChatGPT also pointed out that there might be an issue with how the ignored characters were specified in the code:

https://chat.openai.com/share/fe811ff8-199d-48b1-b064-45144ca4151c

@crutchcorn
Author

@tobySolutions That's intentional behavior 😊

We don't want to escape the regexes; we want to capture them as written.
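
A small sketch of the difference, as I understand it: if each entry were escaped before compiling, it would only match its own literal text, and regex syntax inside it would stop working. The `escapeRegExp` helper is hypothetical and shown only for contrast.

```javascript
const input = "intro {#a} and {#b}";
// A regex source string, meant to be compiled as-written
const pattern = "\\{#.*?\\}";

// Hypothetical helper, for contrast only: escapes every regex metacharacter
function escapeRegExp(source) {
  return source.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compiled as-written: `.*?` acts as a lazy wildcard, so every custom ID matches
const asWritten = input.match(new RegExp(pattern, "gu"));

// Escaped first: the pattern now only matches the literal text `\{#.*?\}`
const escaped = input.match(new RegExp(escapeRegExp(pattern), "gu"));

console.log(asWritten); // ["{#a}", "{#b}"]
console.log(escaped); // null
```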

@tobySolutions

Thank you very much!! @crutchcorn
