Last active
December 18, 2018 13:35
Python Regex to Preprocess Deidentified Sections of MIMIC-III Notes
import re

regex1 = re.compile(r"\[\*\*(\d*\-\d*\-\d*)\*\*\]")  # shifted yyyy-MM-dd date; group 1 holds the date without the [** **] markers
regex2 = re.compile(r"\[\*\*(\d*\-\d*)\*\*\]")  # shifted MM-dd date; group 1 holds the date without the markers
regex3 = re.compile(r"\[\*\*(\d*)\*\*\]")  # shifted MM or dd value; group 1 holds the digits without the markers
regex4 = re.compile(r"\[\*\*[^\*]+\*\*\]")  # any remaining de-identified field, matched whole so it can be removed
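
The gist defines the patterns but does not show how they are applied. The sketch below is one plausible way to use them, not necessarily the author's: the three date patterns are substituted with their captured group, and whatever de-identified fields remain are replaced with a placeholder. The function name `preprocess_note`, its `placeholder` parameter, and the sample note text are all assumptions made up for illustration.

```python
import re

# Same four patterns as above, written as raw strings.
regex1 = re.compile(r"\[\*\*(\d*\-\d*\-\d*)\*\*\]")  # shifted yyyy-MM-dd
regex2 = re.compile(r"\[\*\*(\d*\-\d*)\*\*\]")       # shifted MM-dd
regex3 = re.compile(r"\[\*\*(\d*)\*\*\]")            # shifted MM or dd
regex4 = re.compile(r"\[\*\*[^\*]+\*\*\]")           # remaining de-id fields


def preprocess_note(text, placeholder=""):
    """Hypothetical helper: unwrap shifted dates, then drop other de-id fields."""
    # Keep the date or number itself and discard the [** **] markers.
    text = regex1.sub(r"\1", text)
    text = regex2.sub(r"\1", text)
    text = regex3.sub(r"\1", text)
    # regex4 also matches the date fields, so it has to run last.
    text = regex4.sub(placeholder, text)
    return text


# Invented sample text in the MIMIC-III bracket style (not real data).
sample = "Admitted [**2151-07-16**] by Dr. [**Last Name (STitle) 1234**]; seen again on [**7-20**]."
print(preprocess_note(sample))
# prints: Admitted 2151-07-16 by Dr. ; seen again on 7-20.
```

Because the first three patterns each require the closing `**]` right after the digits, they cannot match one another's fields, so their relative order does not matter. Only `regex4` needs to come last. Passing a non-empty `placeholder` keeps a visible token where a name or location was removed.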