Performance evaluation is based on the huggingface repo; the actual benchmark run script is placed at repo. To reproduce the performance numbers:
- prepare dataset according to link.
- update `GLUE_DIR` to the actual dataset path in `run_inference.sh`.
- change the env settings; the default setting uses 20 cores.
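
As a rough illustration of the env settings step, the sketch below shows one way the thread-related variables might be set for a 20-core run. The dataset path and the specific values are assumptions for illustration, not taken from `run_inference.sh`; it is also not stated here whether the script reads these from the environment or sets them itself. `OMP_NUM_THREADS` is the standard OpenMP variable, while `KMP_AFFINITY` and `KMP_BLOCKTIME` apply only when the Intel OpenMP runtime is in use.

```shell
# Sketch only: values below are assumptions, not copied from run_inference.sh.
CORES=20                                  # cores on one socket of a Xeon 6148

export GLUE_DIR="$HOME/glue_data"         # hypothetical dataset location
export OMP_NUM_THREADS="$CORES"           # OpenMP worker threads
export KMP_AFFINITY="granularity=fine,compact,1,0"  # thread pinning (Intel OpenMP runtime)
export KMP_BLOCKTIME=1                    # ms a thread spins before sleeping (Intel OpenMP runtime)

echo "GLUE_DIR=$GLUE_DIR OMP_NUM_THREADS=$OMP_NUM_THREADS"

# To keep the run on a single socket, the script could be launched under numactl:
#   numactl --cpunodebind=0 --membind=0 bash run_inference.sh
# For a single-thread measurement, set OMP_NUM_THREADS=1 instead.
```

Binding both CPU and memory to the same NUMA node avoids cross-socket memory traffic, which is usually what a "single socket" measurement intends.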
Inference performance result on Xeon 6148 (2x20 cores), single socket and single thread.