Last active
March 9, 2020 09:41
Utility function to clean a string for use as a Python variable name or SQL column name
import re


def clean_python_name(s):
    """
    https://gist.github.com/dkapitan/89ff20eeed38e6d9757fef9e09e23c3d

    Convert a string to a clean string for use in dataframe column names,
    such that:
      i) it complies with the Python 2.x identifier standard:
         (letter|'_')(letter|digit|'_')*
      ii) it is lowercase, following my preference and the practice of
          case-insensitive column names for data

    Based on
    https://stackoverflow.com/questions/3303312/how-do-i-convert-a-string-to-a-valid-variable-name-in-python
    """
    # Strip surrounding whitespace, then remove leading characters
    # until we find a letter or underscore
    s = re.sub('^[^a-zA-Z_]+', '', s.strip())
    # Replace each remaining invalid character with an underscore
    s = re.sub('[^0-9a-zA-Z_]', '_', s)
    return s.lower()
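A short usage sketch may help show what the two substitutions do in practice. The function body is copied here only so the example runs on its own; the sample header strings in `raw_headers` are made up for illustration and do not come from the gist.

```python
import re


def clean_python_name(s):
    # Copy of the gist's function so this example is self-contained.
    s = re.sub('^[^a-zA-Z_]+', '', s.strip())
    s = re.sub('[^0-9a-zA-Z_]', '_', s)
    return s.lower()


# Hypothetical raw column headers, e.g. as read from a spreadsheet.
raw_headers = ["  Total Sales (EUR) ", "2020 Revenue", "Customer-ID", "_id"]

cleaned = [clean_python_name(h) for h in raw_headers]
print(cleaned)
# ['total_sales__eur_', 'revenue', 'customer_id', '_id']

# A string with no letters or underscores is reduced to an empty string,
# so callers may want to check for that case.
print(repr(clean_python_name("123")))
# ''
```

Note that each invalid character becomes its own underscore, so runs of punctuation produce runs of underscores, and two different inputs can map to the same cleaned name. Checking for empty or duplicate results before using them as column names is left to the caller.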