@abevieiramota
Created July 4, 2019 12:26
# Naive approach: removes every &, ignoring its role as a special character in XML
with open('meu_arquivo.xml', 'r') as f:
    content = f.read()
with open('meu_novo_arquivo.xml', 'w') as f:
    f.write(content.replace('&', ''))
# Careful approach, see https://www.tjohearn.com/2018/01/24/safe-ampersand-parsing-in-xml-files/
import re
# Matches each & that is not followed by amp;, lt; or gt;, i.e. an & that is not already part of one of those escapes
# Other cases exist (for example &quot;, &apos; and numeric references such as &#38;), so this regex may be incomplete; see https://stackoverflow.com/a/1091953/3662965
c = re.compile(r'&(?!amp;|lt;|gt;)')
with open('meu_novo_novo_arquivo.xml', 'w') as f:
    # Replaces each & matched by the regex with &amp;, the escaped form of &
    f.write(c.sub('&amp;', content))
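The comments above note that the regex may be incomplete. The following is a minimal sketch of one way to widen it, not the gist author's method: the names `BARE_AMPERSAND` and `escape_bare_ampersands` are hypothetical. It assumes the file only relies on the five predefined XML entities and on numeric character references, so any other named entity (such as `&copy;`) would also be escaped. The final parse with `xml.etree.ElementTree` is only there to check that the result is well-formed.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern, not from the original gist: an & counts as "bare"
# unless it starts one of the five predefined XML entities or a numeric
# character reference (decimal such as &#38; or hexadecimal such as &#x26;).
BARE_AMPERSAND = re.compile(
    r'&(?!(?:amp|lt|gt|quot|apos);|#[0-9]+;|#x[0-9A-Fa-f]+;)'
)


def escape_bare_ampersands(text):
    """Replace each bare & with &amp; and leave existing escapes untouched."""
    return BARE_AMPERSAND.sub('&amp;', text)


broken = '<title>Tom & Jerry &amp; friends &#38; &quot;co&quot;</title>'
fixed = escape_bare_ampersands(broken)
print(fixed)
# <title>Tom &amp; Jerry &amp; friends &#38; &quot;co&quot;</title>

# The escaped string now parses, and the parser decodes the escapes again.
print(ET.fromstring(fixed).text)
# Tom & Jerry & friends & "co"
```

Because already-escaped sequences are skipped by the lookahead, running the function twice on the same text gives the same result as running it once.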