Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
---|---|---|---|---|---|
CultMerge-7B-v1 | 45.2 | 77.1 | 78.22 | 49.87 | 62.6 |
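
The Average column appears to be the unweighted mean of the four benchmark scores. That is an assumption inferred from the numbers in the row, not something stated here. A minimal sketch to check it, with the scores copied by hand from the table:

```python
# Benchmark scores copied from the CultMerge-7B-v1 row above.
scores = {
    "AGIEval": 45.2,
    "GPT4All": 77.1,
    "TruthfulQA": 78.22,
    "Bigbench": 49.87,
}

# Assumption: "Average" is the plain (unweighted) mean of the four suites.
average = sum(scores.values()) / len(scores)

print(f"{average:.4f}")   # unrounded mean
print(round(average, 1))  # rounded to one decimal, as in the table
```

Rounded to one decimal place, the result matches the reported 62.6.
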

| Task | Version | Metric | Value | | Stderr |
|---|---|---|---|---|---|
| agieval_aqua_rat | 0 | acc | 27.17 | ± | 2.80 |
| | | acc_norm | 25.59 | ± | 2.74 |
| agieval_logiqa_en | 0 | acc | 39.48 | ± | 1.92 |