@msoutopico
Last active February 18, 2021 22:04
tagValidation_removePattern
# PISA/PIAAC HTML TAGS BLOCK
(?:<)
# followed by anything that is not
(?!
\s
# whitespace
|(?:<)
# or another opening angle bracket
|(/?(EMPTY|[bui]|sup|sub|strong|strike|small|ul|ol|li|a|em|br\s*/?|img|var|span|div|p|run\d|style|center)(\s+[^>]+)?(?:>))
# or an HTML tag
|/?[tgxi]\d+/?(?:>)
# or an OmegaT tag
).*?
(?:>)
| # EMAIL TAG BLOCK:
(?:<)
[\w.-]+@[\w.-]+
(?:>)
| # HASHES BLOCK:
###+
| # BLABLA BLOCK:
[xX]{3,}
msoutopico commented Feb 16, 2021

((\s*|^)#\s.*)?(\n|\Z)|\s+
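The expression in this comment looks like it is meant to collapse the annotated pattern above into a single line by deleting the `# ...` comments and all whitespace. That reading is an assumption, since the gist does not say so. A minimal Python sketch under that assumption follows; the names `VERBOSE_PATTERN`, `STRIP_COMMENTS`, `compact` and `find_fragments` are illustrative and not part of the gist.

```python
import re

# The annotated pattern from the gist, kept as a raw string.
VERBOSE_PATTERN = r"""
# PISA/PIAAC HTML TAGS BLOCK
(?:<)
# followed by anything that is not
(?!
\s
# whitespace
|(?:<)
# or another opening angle bracket
|(/?(EMPTY|[bui]|sup|sub|strong|strike|small|ul|ol|li|a|em|br\s*/?|img|var|span|div|p|run\d|style|center)(\s+[^>]+)?(?:>))
# or an HTML tag
|/?[tgxi]\d+/?(?:>)
# or an OmegaT tag
).*?
(?:>)
| # EMAIL TAG BLOCK:
(?:<)
[\w.-]+@[\w.-]+
(?:>)
| # HASHES BLOCK:
###+
| # BLABLA BLOCK:
[xX]{3,}
"""

# The expression from the comment: it removes "# comment" runs
# (a hash followed by whitespace), line breaks, and any other whitespace.
STRIP_COMMENTS = re.compile(r"((\s*|^)#\s.*)?(\n|\Z)|\s+")

# One-line form of the pattern, with comments and layout removed.
compact = STRIP_COMMENTS.sub("", VERBOSE_PATTERN)
compiled = re.compile(compact)


def find_fragments(text):
    """Return every substring of `text` matched by the compacted pattern."""
    return [m.group(0) for m in compiled.finditer(text)]


print(compact)
print(find_fragments("Click <b>here</b> or <foo> ### xxx"))
```

Because the stripping expression only treats `#` as a comment marker when whitespace follows it, the literal `###+` in the hashes block survives. Python's built-in `re.VERBOSE` flag would instead read that line as a comment, so it is not a drop-in substitute here.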
