Last active
July 11, 2024 14:41
Convert Pascal Case to Snake Case with Python and Regex.
import re

def to_snake(pascal: str) -> str:
    """Converts a Pascal case string to snake case.
    """
    magic = re.findall('[A-Z]+[a-z]*', pascal)
    snake = '_'.join(magic)
    snake = snake.lower()
    return snake

# Working examples:
assert to_snake("ID") == "id"
assert to_snake("HelloWorld") == "hello_world"
assert to_snake("EndsWithAllCapsXML") == "ends_with_all_caps_xml"

# Broken Examples:
assert to_snake("STARTINGWithAllCapsDoesntWork") == "startingwith_all_caps_doesnt_work"
assert to_snake("startingWithLowerCaseDoesntWork") == "with_lower_case_doesnt_work"
Careful with this implementation: one would expect to_snake() to be idempotent, but to_snake("already_snake") == ''.
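To see where the empty string comes from, here is a minimal sketch that isolates the `re.findall` step of the original function. The helper name `pascal_chunks` is made up for illustration and is not part of the gist.

```python
import re

# Pattern used by the original to_snake(): a run of uppercase letters
# followed by zero or more lowercase letters.
PATTERN = '[A-Z]+[a-z]*'

def pascal_chunks(s: str) -> list:
    """Hypothetical helper: return the chunks the original pattern matches."""
    # findall keeps only the matching substrings; everything else is dropped.
    return re.findall(PATTERN, s)

print(pascal_chunks("HelloWorld"))             # ['Hello', 'World']
print(pascal_chunks("already_snake"))          # [] (no uppercase letter, so no chunks)
print(pascal_chunks("startingWithLowerCase"))  # ['With', 'Lower', 'Case']

# Joining an empty list gives the empty string, hence to_snake("already_snake") == ''.
print(repr('_'.join(pascal_chunks("already_snake"))))  # ''
```

Anything that is not part of an uppercase-led chunk (a leading lowercase word, underscores, digits) never reaches the join, which is why the result is lossy rather than just unsplit.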
Don't shoot yourself in the foot here. You can find an improved version below. It is idempotent on snake_case input and handles leading all-caps runs and camelCase correctly (afaik).
👁️ I'm not aware of any rule that leading lower-case letters should be discarded when converting from PascalCase to snake_case, i.e. assert to_snake("startingWithLowerCaseDoesntWork") == "with_lower_case_doesnt_work", so the implementation below does NOT do that.
import re

def to_snake(s: str) -> str:
    # Split a run of uppercase letters from a following capitalized word,
    # e.g. "STARTINGWith" -> "STARTING_With"
    s = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', s)
    # Add an underscore between a lowercase letter or digit and a following
    # uppercase letter, e.g. "camelCase" -> "camel_Case"
    s = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', s)
    # Convert the entire string to lowercase
    s = s.lower()
    return s

# Example usage
print(to_snake("PascalCase"))  # Output: pascal_case
print(to_snake("pascal_case"))  # Output: pascal_case  # works, idempotent
print(to_snake("camelCaseExample"))  # Output: camel_case_example
print(to_snake("SomeID"))  # Output: some_id
print(to_snake("ThisIsASnakeCaseExample"))  # Output: this_is_a_snake_case_example
print(to_snake("STARTINGWithAllCapsDoesntWork"))  # Output: starting_with_all_caps_doesnt_work  # works :)
print(to_snake("startingWithLowerCaseDoesntWork"))  # Output: starting_with_lower_case_doesnt_work  # see disclaimer
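The printed examples can also be turned into assertions, which makes the idempotency claim checkable. This is only a sketch: the function body repeats the two substitutions from the improved version, the `CASES` table is an assumed name, and the last entry is an extra case that does not appear in the gist.

```python
import re

def to_snake(s: str) -> str:
    # Same two substitutions as the improved version, then lowercase.
    s = re.sub(r'([A-Z]+)([A-Z][a-z])', r'\1_\2', s)
    s = re.sub(r'([a-z\d])([A-Z])', r'\1_\2', s)
    return s.lower()

# Inputs and expected outputs taken from the examples in this gist.
CASES = {
    "ID": "id",
    "HelloWorld": "hello_world",
    "EndsWithAllCapsXML": "ends_with_all_caps_xml",
    "pascal_case": "pascal_case",
    "camelCaseExample": "camel_case_example",
    "ThisIsASnakeCaseExample": "this_is_a_snake_case_example",
    "STARTINGWithAllCapsDoesntWork": "starting_with_all_caps_doesnt_work",
    "startingWithLowerCaseDoesntWork": "starting_with_lower_case_doesnt_work",
    # Extra case, not from the gist: a digit stays attached to the word before it.
    "Version2Alpha": "version2_alpha",
}

for source, expected in CASES.items():
    result = to_snake(source)
    assert result == expected, (source, result)
    # Idempotency: converting an already converted string changes nothing.
    assert to_snake(result) == result, (source, result)

print("all cases pass")
```

Whether digits should get their own underscore (`version_2_alpha`) is a convention choice; this version leaves them attached to the preceding word.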
I wrote this because the other regex examples I found in this StackOverflow question didn't capture the right scenarios, particularly ID to id, and a string ending with all caps (XML).