@kshirsagarsiddharth
Created December 27, 2022 09:14
import re


def clean_text(text):
    """Return text with punctuation and special characters removed."""
    # Drop every character that is neither a word character nor whitespace.
    # Note: \w matches letters, digits, and the underscore, so "_" is kept.
    cleaned = re.sub(r'[^\w\s]', '', text)
    # Remove leading and trailing whitespace
    return cleaned.strip()


# Test the function
text = "This is a sample text with punctuation (like commas and exclamation points)! It also includes letters (both uppercase and lowercase), numbers (like 123 and 456), and special characters (like # and $). Some people might find it confusing or difficult to read, but with the right tools and techniques, it's easy to clean and analyze this text."
# Print the result so it is visible when the file is run as a script
print(clean_text(text))
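
The pattern above leaves two things untouched: underscores (because `\w` matches `_`) and the doubled spaces that appear where a standalone symbol such as `#` or `$` was removed. A minimal sketch of a stricter variant follows. The name `clean_text_strict` is hypothetical and not part of the original gist, and whether underscores should be removed depends on the use case.

```python
import re


def clean_text_strict(text):
    # Hypothetical variant of clean_text (not from the original gist).
    # Remove punctuation/special characters and, additionally, underscores.
    no_punct = re.sub(r'[^\w\s]|_', '', text)
    # Collapse runs of whitespace left behind by removed symbols, then trim.
    return re.sub(r'\s+', ' ', no_punct).strip()


print(clean_text_strict("special characters (like # and $), snake_case!"))
```

Collapsing whitespace after the removal step, rather than before it, matters here because the gaps only appear once the symbols are gone.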