@Alphadelta14
Created January 2, 2017 18:52
html entities regex
In [18]: expr
Out[18]: re.compile(r'&(zwnj|aring|gt|yen|ograve|Chi|bull|pound|Egrave|trade|Ntilde|upsih|Yacute|Atilde|radic|otimes|aelig|oelig|equiv|Psi|auml|cup|Epsilon|otilde|Eta|lt|rsquo|Icirc|Eacute|Lambda|yacute|Prime|prime|psi|Kappa|rsaquo|Tau|uacute|sigmaf|lrm|lceil|Alpha|cedil|atilde|theta|not|kappa|AElig|oslash|acute|zwj|laquo|dArr|rdquo|ge|Igrave|hArr|micro|lsaquo|euro|shy|sdot|supe|nbsp|lfloor|lArr|Auml|larr|brvbar|Otilde|szlig|clubs|agrave|Ocirc|ndash|Theta|Pi|OElig|Scaron|thetasym|egrave|sub|iexcl|frac12|ordf|sum|prop|Uuml|ntilde|sup|asymp|uml|prod|nsub|reg|Oslash|THORN|yuml|infin|Mu|le|thinsp|ecirc|bdquo|Sigma|Aring|tilde|nabla|mdash|uarr|permil|tau|Ugrave|fnof|Agrave|chi|forall|circ|eth|rceil|iuml|gamma|lambda|harr|rang|xi|dagger|divide|Ouml|Ograve|image|alefsym|rlm|igrave|Yuml|sube|alpha|frasl|ETH|lowast|Nu|plusmn|Euml|sup1|sup2|frac34|Aacute|cent|oline|Beta|perp|emsp|loz|pi|iota|empty|euml|notin|Upsilon|para|epsilon|Delta|weierp|there4|part|icirc|delta|omicron|upsilon|copy|Iuml|Oacute|Xi|ensp|ccedil|Ucirc|cap|ocirc|mu|rarr|scaron|lsquo|isin|Zeta|minus|deg|and|real|ang|hellip|curren|int|ucirc|rfloor|crarr|ugrave|exist|cong|Dagger|oplus|sup3|Acirc|piv|iacute|ni|Phi|Iacute|quot|Uacute|Omicron|ne|iquest|eta|sbquo|Rho|darr|Ecirc|zeta|Omega|nu|sim|sect|phi|diams|macr|frac14|Ccedil|ordm|uArr|beta|rArr|rho|aacute|eacute|Iota|omega|middot|Gamma|times|lang|spades|amp|uuml|thorn|ouml|or|raquo|acirc|ldquo|hearts|sigma|oacute);')
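
The gist shows only the compiled pattern, not how it was produced or used. As a rough sketch, a pattern of the same shape can be rebuilt in Python 3 from the standard library's `html.entities.name2codepoint` table. This assumes the names above come from the HTML 4 named-entity list, which the gist does not state. The `replace_entities` helper and the longest-first sort are illustrative choices, not part of the original.

```python
import re
from html.entities import name2codepoint

# Assumption: the gist's alternation was generated from the HTML 4 entity
# table. Sorting (longest name first, then alphabetical) only makes the
# pattern deterministic; the trailing ';' already disambiguates names such
# as 'sup' and 'sup1'.
entity_names = sorted(name2codepoint, key=lambda n: (-len(n), n))
expr = re.compile(r'&(' + '|'.join(map(re.escape, entity_names)) + r');')


def replace_entities(text):
    """Replace each known named entity with the character it stands for.

    Hypothetical helper for illustration; unknown names are left untouched
    because the pattern never matches them.
    """
    return expr.sub(lambda m: chr(name2codepoint[m.group(1)]), text)


print(replace_entities('Fish &amp; chips &copy; 2017'))  # Fish & chips © 2017
```

If the goal is simply to decode entities, `html.unescape` in the standard library already does that, including numeric references; an explicit pattern like this is mainly useful for finding or counting named entities in text.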