Leaderboard made with 🧐 LLM AutoEval (https://github.com/mlabonne/llm-autoeval) using the Nous benchmark suite.

| Model | Average | AGIEval | GPT4All | TruthfulQA | Bigbench |
|---|---:|---:|---:|---:|---:|
| FelixChao/WestSeverus-7B-DPO-v2 📄 | 60.98 | 45.29 | 77.2 | 72.72 | 48.71 |
| CultriX/CombinaTrix-7B 📄 | 60.58 | 45.52 | 77.42 | 71.12 | 48.24 |
| CultriX/OmniTrixAI 📄 | 60.35 | 44.94 | 77.31 | 70.62 | 48.52 |
| mlabonne/NeuralBeagle14-7B 📄 | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
| jsfs11/TurdusTrixBeagle-DARETIES-7B 📄 | 59.99 | 44.46 | 77.81 | 69.15 | 48.54 |
| CultriX/SevereNeuralBeagleTrix-7B 📄 | 59.82 | 44.37 | 77.38 | 69.59 | 47.95 |
| CultriX/MergeTrix-7B-v2 📄 | 59.53 | 44.7 | 77.66 | 67.52 | 48.23 |
| senseable/WestLake-7B-v2 📄 | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
| fblgit/UNA-TheBeagle-7b-v1 📄 | 59.17 | 42.73 | 77.12 | 70.82 | 46.01 |
| CultriX/MergeTrix-7B 📄 | 58.88 | 44.93 | 76.85 | 66.56 | 47.18 |
| mlabonne/Marcoro14-7B-slerp 📄 | 57.67 | 44.66 | 76.24 | 64.15 | 45.64 |
| microsoft/phi-2 📄 | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |
| TheBloke/guanaco-7B-HF 📄 | 40.38 | 23.12 | 66.85 | 38.92 | 32.64 |
| TinyLlama/TinyLlama-1.1B-Chat-v1.0 📄 | 36.32 | 20.77 | 54.28 | 37.84 | 32.4 |
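
The Average column above appears to be the plain arithmetic mean of the four benchmark scores (AGIEval, GPT4All, TruthfulQA, Bigbench), rounded to two decimals. The sketch below checks that assumption against three rows copied from the table. The names `rows` and `mean_score` are illustrative and are not part of LLM AutoEval.

```python
# Scores copied from three rows of the leaderboard above:
# (reported Average, [AGIEval, GPT4All, TruthfulQA, Bigbench])
rows = {
    "FelixChao/WestSeverus-7B-DPO-v2": (60.98, [45.29, 77.2, 72.72, 48.71]),
    "microsoft/phi-2": (44.61, [27.96, 70.84, 44.46, 35.17]),
    "TinyLlama/TinyLlama-1.1B-Chat-v1.0": (36.32, [20.77, 54.28, 37.84, 32.4]),
}


def mean_score(scores):
    """Unweighted mean of the benchmark scores (assumed averaging rule)."""
    return sum(scores) / len(scores)


for model, (reported, scores) in rows.items():
    computed = mean_score(scores)
    # Tolerance covers rounding to two decimals in the published table.
    matches = abs(computed - reported) < 0.006
    print(f"{model}: computed {computed:.2f}, reported {reported}, match={matches}")
```

Because the mean is unweighted, a one-point gain on any single benchmark moves the Average by 0.25, which is roughly the size of the gaps between the top few 7B models.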