@kshirsagarsiddharth
Created December 27, 2022 09:10
import string


def clean_text(text):
    # Build a translation table that deletes punctuation and every whitespace
    # character except the ordinary space (tab, newline, and so on).
    # string.printable is digits + letters + punctuation + whitespace, so after
    # dropping the space, [62:] skips the 10 digits and 52 letters.
    translator = str.maketrans('', '', string.punctuation + string.printable.replace(' ', '')[62:])
    # Use the translate method to remove the characters
    cleaned = text.translate(translator)
    # Remove leading and trailing whitespace
    return cleaned.strip()


# Test the function
text = "This is a sample text with punctuation (like commas and exclamation points)! It also includes letters (both uppercase and lowercase), numbers (like 123 and 456), and special characters (like # and $). Some people might find it confusing or difficult to read, but with the right tools and techniques, it's easy to clean and analyze this text."
print(clean_text(text))
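
The slice `string.printable.replace(' ', '')[62:]` is easy to misread, so here is a minimal sketch that checks what it evaluates to. It relies only on the standard library's layout of `string.printable` (digits, then ASCII letters, then punctuation, then whitespace). The names `removed`, `extra_whitespace`, `table_from_gist`, and `table_from_slice` are illustrative and not part of the gist.

```python
import string

# string.printable = digits (10) + ascii_letters (52) + punctuation + whitespace.
# Dropping the space and skipping the first 62 characters leaves the
# punctuation followed by the remaining whitespace characters.
removed = string.printable.replace(' ', '')[62:]
extra_whitespace = string.whitespace.replace(' ', '')

assert removed == string.punctuation + extra_whitespace

# The gist prepends string.punctuation to the slice, which lists the
# punctuation twice; the slice alone builds an identical table.
table_from_gist = str.maketrans('', '', string.punctuation + removed)
table_from_slice = str.maketrans('', '', removed)
assert table_from_gist == table_from_slice
```

Because tabs and newlines are deleted rather than replaced with a space, words separated only by a newline are joined together. If that matters for your input, one option is to replace those characters with spaces before calling `translate`.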