Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
---|---|---|---|---|---|
zephyr-7b-alpha | 38 | 72.24 | 56.06 | 40.57 | 51.72 |
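
The Average column appears to be the unweighted mean of the four suite scores. A minimal sketch that reproduces the figure from the row above, assuming equal weights and half-up rounding to two decimals (the helper name `average_score` is hypothetical, and neither assumption is stated by the source):

```python
from decimal import Decimal, ROUND_HALF_UP

# Suite scores copied from the zephyr-7b-alpha row above.
scores = {
    "AGIEval": Decimal("38"),
    "GPT4All": Decimal("72.24"),
    "TruthfulQA": Decimal("56.06"),
    "Bigbench": Decimal("40.57"),
}


def average_score(suite_scores):
    """Unweighted mean of the suite scores, rounded half-up to 2 decimals.

    Equal weighting and the rounding mode are assumptions; they happen to
    match the reported Average for this row.
    """
    mean = sum(suite_scores.values()) / len(suite_scores)
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(average_score(scores))  # 51.72
```

`Decimal` is used instead of `float` so that the exact midpoint 51.7175 rounds predictably.
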

Task | Version | Metric | Value | | Stderr |
---|---|---|---|---|---|
agieval_aqua_rat | 0 | acc | 20.47 | ± | 2.54 |
| | | acc_norm | 19.69 | ± | 2.50 |
agieval_logiqa_en | 0 | acc | 31.49 | ± | 1.82 |