@Purrrpley
Last active June 14, 2022 11:34
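def despamify(text: str, threshold: int = 5) -> tuple[bool, str]:
    """Cap each run of a repeated character at `threshold` characters.

    Returns (despammed, output); despammed is True if anything was removed.
    """
    last_char = ""
    repeat_count = 0  # identical characters immediately preceding the current one
    despammed = False
    output = ""
    for char in text:
        if char == last_char:
            repeat_count += 1
        else:
            repeat_count = 0
        if repeat_count < threshold:
            output += char
        else:
            # The run is already `threshold` long, so drop this character.
            despammed = True
        last_char = char
    return despammed, output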

Purrrpley commented Nov 10, 2021

Some examples:

despamify('Hello')  # (False, 'Hello')
despamify('Hellooooo')  # (False, 'Hellooooo')
despamify('Helloooooo')  # (True, 'Hellooooo')
despamify('Whhhhhhhereees my wooooood')  # (True, 'Whhhhhereees my woooood')
despamify('Hello!!!!!', 3)  # (True, 'Hello!!!')


Purrrpley commented Nov 10, 2021

If you really want to use regular expressions though, you can, with something like this:

import re
re.sub(r'(.)\1{5,}', r'\1\1\1\1\1', 'aaaaaabaaaaaa')  # 'aaaaabaaaaa'

may or may not have spent like 10 minutes debugging to save a few seconds of reading the documentation while making that... :D
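
For anyone who wants the regex route with a configurable threshold, here is a rough sketch rather than a tested drop-in. The name `despamify_re` is made up for illustration, and the `re.DOTALL` flag is my own assumption, added so that runs of newlines get capped the same way the loop version caps them (the one-liner above leaves them alone).

```python
import re


def despamify_re(text: str, threshold: int = 5) -> tuple[bool, str]:
    # Hypothetical regex counterpart to despamify(); not part of the original gist.
    # Match one character followed by at least `threshold` more copies of it,
    # i.e. a run that is longer than `threshold`.
    pattern = re.compile(r'(.)\1{%d,}' % threshold, re.DOTALL)
    # subn() returns the new string and the number of substitutions made.
    output, count = pattern.subn(lambda m: m.group(1) * threshold, text)
    return count > 0, output
```

It returns the same `(despammed, output)` pair as the loop version, so `despamify_re('Hello!!!!!', 3)` should give `(True, 'Hello!!!')`.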