@dkapitan
Last active March 9, 2020 09:41
Utility function to clean a string for use as a Python variable name or SQL column name
import re


def clean_python_name(s):
    """
    https://gist.github.com/dkapitan/89ff20eeed38e6d9757fef9e09e23c3d

    Convert a string to a clean string for use in dataframe
    column names, such that:
      i) it complies with the Python 2.x object name standard:
         (letter|'_')(letter|digit|'_')*
     ii) it is lowercase, following my preference and the practice
         of case-insensitive column names for data

    Based on
    https://stackoverflow.com/questions/3303312/how-do-i-convert-a-string-to-a-valid-variable-name-in-python
    """
    # Strip surrounding whitespace, then remove leading characters
    # until we find a letter or underscore
    s = re.sub(r'^[^a-zA-Z_]+', '', s.strip())
    # Replace each remaining invalid character with an underscore
    s = re.sub(r'[^0-9a-zA-Z_]', '_', s)
    return s.lower()
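A short usage sketch may help show what the two substitutions do in practice. The function body is repeated so the snippet runs on its own; the sample headers in `raw_headers` are hypothetical and chosen only for illustration.

```python
import re


def clean_python_name(s):
    # Same logic as the function above, repeated so this snippet is self-contained.
    s = re.sub(r'^[^a-zA-Z_]+', '', s.strip())
    s = re.sub(r'[^0-9a-zA-Z_]', '_', s)
    return s.lower()


# Hypothetical raw column headers (not from the original gist).
raw_headers = ["Customer ID", "  2nd Quarter Sales (EUR) ", "Total-Amount%", "123"]

# Map each raw header to its cleaned form.
cleaned = {header: clean_python_name(header) for header in raw_headers}

for raw, clean in cleaned.items():
    print(repr(raw), "->", repr(clean))
```

Two behaviours are worth noticing here. Leading digits are dropped rather than prefixed, so "2nd Quarter Sales (EUR)" becomes "nd_quarter_sales__eur_", and an input with no letters or underscores, such as "123", becomes an empty string. Callers that clean many headers at once may want to check for empty or duplicate results.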