@ShuffleBox
Last active February 22, 2023
Quick Benchmark for Whisper by OpenAI

Whisper by OpenAI ad-hoc benchmarks

Tests run on February 21, 2023

PTH benchmark

Compared against VPhill.

File used for captions: https://texashistory.unt.edu/ark:/67531/metapth845109/

Language: English

File length = 00:15:50.52

Computer     Device                   Model     Time
ArcStation   CPU AMD 7900X 12-core    small.en  1m56s
ArcStation   CPU AMD 7900X 12-core    medium    5m10s
ArcStation   CPU AMD 7900X 12-core    large     10m09s
ArcStation   Intel Arc A770 LE 16GB   small.en  1m25s
ArcStation   Intel Arc A770 LE 16GB   medium    2m51s
ArcStation   Intel Arc A770 LE 16GB   large     4m38s
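As a quick sanity check on the table above, a short script can convert these wall-clock times into real-time factors (audio duration divided by transcription time). The parsing helper and device labels are illustrative, not part of the original runs:

```python
# Real-time factor for the PTH benchmark: 950 s of audio (00:15:50,
# fractional second dropped) divided by each transcription time above.

def to_seconds(t: str) -> int:
    """Parse an 'XmYs' duration string (e.g. '10m09s') into seconds."""
    minutes, seconds = t.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)

audio_seconds = 15 * 60 + 50  # file length 00:15:50.52

times = {
    ("CPU 7900X", "small.en"): "1m56s",
    ("CPU 7900X", "medium"):   "5m10s",
    ("CPU 7900X", "large"):    "10m09s",
    ("Arc A770",  "small.en"): "1m25s",
    ("Arc A770",  "medium"):   "2m51s",
    ("Arc A770",  "large"):    "4m38s",
}

for (device, model), t in times.items():
    rtf = audio_seconds / to_seconds(t)
    print(f"{device:10} {model:9} {rtf:4.1f}x real time")
```

Even the large model on the A770 runs at roughly 3.4x real time.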

Oliver Wehrens benchmark

Compared against the OpenAI Whisper benchmark on Nvidia Tesla T4 / A100.

File used for captions: https://adn.podigee.com/adswizz/media/podcast_1652_was_jetzt_episode_818856_update_ein_notstand_aber_keine_zweite_pandemie.mp3

Language: German

File length = 00:10:00

Computer     Device                   Model    Time
ArcStation   CPU AMD 7900X 12-core    small    1m53s
ArcStation   CPU AMD 7900X 12-core    medium   4m56s
ArcStation   CPU AMD 7900X 12-core    large    9m26s
ArcStation   Intel Arc A770 LE 16GB   small    1m41s
ArcStation   Intel Arc A770 LE 16GB   medium   3m14s
ArcStation   Intel Arc A770 LE 16GB   large    5m18s
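From the same table, the A770's speedup over the 7900X can be computed per model. The times are copied from above; the helper function is illustrative:

```python
# Speedup of the Arc A770 over the 7900X CPU for the German-language file.

def to_seconds(t: str) -> int:
    """Parse an 'XmYs' duration string into seconds."""
    minutes, seconds = t.rstrip("s").split("m")
    return int(minutes) * 60 + int(seconds)

cpu = {"small": "1m53s", "medium": "4m56s", "large": "9m26s"}
arc = {"small": "1m41s", "medium": "3m14s", "large": "5m18s"}

for model in cpu:
    speedup = to_seconds(cpu[model]) / to_seconds(arc[model])
    print(f"{model:7} A770 is {speedup:.2f}x faster than the CPU")
```

The gap widens with model size: about 1.12x for small, 1.78x for large.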

Whisper w/ XPU enabled PyTorch

Based on Oliver Wehrens' script at https://owehrens.com/openai-whisper-benchmark-on-nvidia-tesla-t4-a100/

import torch
import intel_extension_for_pytorch as ipex
import whisper
import datetime
import json


t0 = datetime.datetime.now()
print(f"Load Model at  {t0}")
model = whisper.load_model('large')
t1 = datetime.datetime.now()
print(f"Loading took {t1-t0}")

print(f"started at {t1}")
model = model.xpu()  # move the model to the Intel XPU device (Arc GPU) via IPEX
output = model.transcribe("audio.mp3")


# show time elapsed after transcription is complete.
t2 = datetime.datetime.now()
print(f"ended at {t2}")
print(f"time elapsed: {t2 - t1}")

with open("transcription.json", "w") as f:
    f.write(json.dumps(output))

CPU-only script on vanilla Python 3.9

import torch
import whisper
import datetime
import json


t0 = datetime.datetime.now()
print(f"Load Model at  {t0}")
model = whisper.load_model('large')
t1 = datetime.datetime.now()
print(f"Loading took {t1-t0}")

print(f"started at {t1}")
output = model.transcribe("audio.mp3")


# show time elapsed after transcription is complete.
t2 = datetime.datetime.now()
print(f"ended at {t2}")
print(f"time elapsed: {t2 - t1}")

with open("transcription.json", "w") as f:
    f.write(json.dumps(output))
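Both scripts repeat the same load/transcribe timing pattern. One way to factor it out (a sketch; the `timed` helper is not part of the original gist) is a small context manager:

```python
import contextlib
import datetime

@contextlib.contextmanager
def timed(label: str):
    """Print wall-clock start time and elapsed time for the enclosed block."""
    t0 = datetime.datetime.now()
    print(f"{label} started at {t0}")
    try:
        yield
    finally:
        print(f"{label} took {datetime.datetime.now() - t0}")

# With the scripts above this would read:
#   with timed("load model"):
#       model = whisper.load_model("large")
#   with timed("transcribe"):
#       output = model.transcribe("audio.mp3")
```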

Specs

ArcStation specs - AMD R9 7900X, 32GB RAM, Intel Arc A770 LE 16GB

  • OS - Fedora 38 (in preview)
  • Kernel - 6.2.0-0.rc8.57.fc38.x86_64
  • Intel Compute Runtime - 22.53.25242.13
  • Intel oneAPI Level Zero - 1.9.4
  • torch - 1.13.0
  • Intel Extension for PyTorch - intel_extension_for_pytorch-1.13.10+xpu-cp39-cp39-linux_x86_64