@allanj
Last active March 15, 2022 11:49
Convert the IOB2 (BIO) tagging scheme to the BIOES (IOBES) tagging scheme
def iob_iobes(tags):
    """
    IOB2 (BIO) -> IOBES
    """
    new_tags = []
    for i, tag in enumerate(tags):
        prefix = tag.split('-')[0]
        # True when the next tag continues the current entity.
        next_is_inside = (i + 1 < len(tags) and
                          tags[i + 1].split('-')[0] == 'I')
        if tag == 'O':
            new_tags.append(tag)
        elif prefix == 'B':
            if next_is_inside:
                new_tags.append(tag)
            else:
                # A B- tag with no following I- tag is a single-token entity.
                new_tags.append(tag.replace('B-', 'S-', 1))
        elif prefix == 'I':
            if next_is_inside:
                new_tags.append(tag)
            else:
                # The last I- tag of an entity marks its end.
                new_tags.append(tag.replace('I-', 'E-', 1))
        else:
            raise ValueError('Invalid IOB format!')
    return new_tags
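
As a worked example, `iob_iobes(['B-PER', 'I-PER', 'O', 'B-LOC'])` returns `['B-PER', 'E-PER', 'O', 'S-LOC']`: the last `I-` tag of an entity becomes `E-`, and a `B-` tag with no following `I-` tag becomes `S-`. One way to sanity-check a conversion is to map the result back to IOB2. The sketch below does that; `iobes_iob` is a hypothetical name and is not part of the original gist, and it assumes tags have the form `PREFIX-LABEL` or `O`.

```python
def iobes_iob(tags):
    """
    IOBES -> IOB2 (BIO)
    Hypothetical inverse of iob_iobes; not part of the original gist.
    """
    new_tags = []
    for tag in tags:
        # Split only on the first hyphen so labels may contain hyphens.
        prefix, sep, label = tag.partition('-')
        if tag == 'O' or prefix in ('B', 'I'):
            new_tags.append(tag)
        elif prefix == 'S' and sep:
            # Single-token entity: back to a lone B- tag.
            new_tags.append('B-' + label)
        elif prefix == 'E' and sep:
            # End of entity: back to an I- tag.
            new_tags.append('I-' + label)
        else:
            raise ValueError('Invalid IOBES tag: %r' % tag)
    return new_tags


print(iobes_iob(['B-PER', 'E-PER', 'O', 'S-LOC']))
# ['B-PER', 'I-PER', 'O', 'B-LOC']
```

For a well-formed IOB2 sequence, `iobes_iob(iob_iobes(tags))` should give back `tags`.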