@mikegerber
Created April 13, 2021 13:20
Extract metrics from ocrevalUAtion HTML reports
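# Requires the third-party package beautifulsoup4.
from bs4 import BeautifulSoup
import glob

# Metric labels exactly as they appear in the ocrevalUAtion report.
METRICS = ["CER", "WER", "WER (order independent)"]
FILES = glob.glob("*.html")

# Header row of the semicolon-separated output.
print("file;" + ";".join(METRICS))
for f in FILES:
    with open(f) as fp:
        soup = BeautifulSoup(fp, "html.parser")
    print(f, end="")
    for metric in METRICS:
        # The value is in the node right after the element holding the label.
        label = soup.find(string=metric).parent
        value = label.next_sibling.text.strip()
        print(";" + value, end="")
    print()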
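
The script prints one semicolon-separated row per report, preceded by a header row. As a rough sketch of how that output could be consumed afterwards, the snippet below reads it back with the standard `csv` module and averages each metric column. The helper name `mean_metrics` and the sample rows are hypothetical (the numbers are made up), and it assumes the extracted values are plain decimal numbers with a dot as decimal separator; if the reports contain percent signs or decimal commas, the values would need cleaning first.

```python
import csv
import io
from statistics import mean

# Hypothetical sample of the script's output; the numbers are made up.
SAMPLE = """file;CER;WER;WER (order independent)
page1.html;1.50;4.00;3.50
page2.html;2.50;6.00;5.50
"""


def mean_metrics(stream):
    """Return the mean of every metric column in semicolon-separated output."""
    reader = csv.DictReader(stream, delimiter=";")
    rows = list(reader)
    # Every column except "file" is treated as a numeric metric (assumption).
    metrics = [name for name in reader.fieldnames if name != "file"]
    return {m: mean(float(row[m]) for row in rows) for m in metrics}


print(mean_metrics(io.StringIO(SAMPLE)))
```

To use it on real data, redirect the script's output to a file and pass the opened file object to `mean_metrics` instead of the `io.StringIO` sample.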