@gbaptista
Last active December 31, 2023 21:51
LBPE Score Heatmap Table and CSV
| Model | Back-and-Forth Conversations | Tools (Functions) | Polyglotism | MMLU | ENEM | Streaming | Latency | Pricing |
| --- | ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
| Cohere Command | 67.25% | 0.00% | 13.33% | 51.31% | 30.56% | 95.19% | 35.72% | 22.50% |
| Cohere Command Light | 61.25% | 0.00% | 13.33% | 38.86% | 11.39% | 91.84% | 99.91% | 63.55% |
| Google Gemini Pro | 39.50% | 65.00% | 100.00% | 63.98% | 58.33% | 11.53% | 50.31% | 25.30% |
| Maritaca MariTalk | 80.25% | 0.00% | 66.67% | 60.45% | 56.39% | 8.34% | 21.48% | 57.45% |
| Mistral Medium | 85.25% | 0.00% | 83.33% | 71.99% | 66.67% | 87.54% | 24.82% | 15.10% |
| Mistral Small | 71.75% | 0.00% | 70.00% | 68.86% | 60.56% | 79.33% | 52.88% | 33.08% |
| Mistral Tiny | 80.50% | 0.00% | 56.67% | 56.36% | 45.28% | 74.24% | 78.94% | 95.55% |
| OpenAI GPT-3.5 Turbo | 86.75% | 82.00% | 100.00% | 63.75% | 64.44% | 74.37% | 74.87% | 37.19% |
| OpenAI GPT-4 Turbo | 87.00% | 90.50% | 100.00% | 85.91% | 88.89% | 93.97% | 13.41% | 10.24% |

Extracted from LBPE Score Report 1.0.0

Model,Back-and-Forth Conversations,ENEM,Polyglotism,Latency,MMLU,Pricing,Streaming,Tools (Functions)
Cohere Command Light,0.6125,0.11388888888888889,0.13333333333333333,0.9990532739147086,0.3886363636363636,0.6354787012698454,0.9184179374477416,0.0
Cohere Command,0.6725,0.3055555555555556,0.13333333333333333,0.3571500627851347,0.5130681818181818,0.2249974589372398,0.9518729302872222,0.0
Google Gemini Pro,0.395,0.5833333333333334,1.0,0.5031268581129563,0.6397727272727273,0.2529788160967991,0.11533169496989419,0.65
Maritaca MariTalk,0.8025,0.5638888888888889,0.6666666666666666,0.21476252213454625,0.6045454545454545,0.5744808335329065,0.0834039302809052,0.0
Mistral Medium,0.8525,0.6666666666666666,0.8333333333333334,0.24815208399487462,0.7198863636363636,0.1509973685225832,0.8754364214293453,0.0
Mistral Small,0.7175,0.6055555555555555,0.7,0.5288256912597489,0.6886363636363636,0.3308497682956469,0.7932658051385039,0.0
Mistral Tiny,0.8049999999999999,0.4527777777777778,0.5666666666666667,0.7894134059571646,0.5636363636363636,0.9554676109568064,0.7424483518935384,0.0
OpenAI GPT-3.5 Turbo,0.8675,0.6444444444444445,1.0,0.7486852708583881,0.6375,0.37188174525275197,0.7437334564417286,0.82
OpenAI GPT-4 Turbo,0.87,0.8888888888888888,1.0,0.1340939471260517,0.8590909090909091,0.10244268882293754,0.9396831632202628,0.905
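
The percentages in the heatmap table appear to be the CSV fractions multiplied by 100 and rounded to two decimal places. The sketch below illustrates that conversion on two rows copied from the CSV. It assumes the data is stored comma-separated with the header row shown above. The names `CSV_TEXT`, `to_percent`, and `load_percentages` are hypothetical helpers for this example and are not part of the LBPE Score report.

```python
import csv
import io

# Two rows copied from the CSV above; the remaining rows follow the same layout.
# Assumption: fields are comma-separated and the first column is named "Model".
CSV_TEXT = """\
Model,Back-and-Forth Conversations,ENEM,Polyglotism,Latency,MMLU,Pricing,Streaming,Tools (Functions)
Cohere Command Light,0.6125,0.11388888888888889,0.13333333333333333,0.9990532739147086,0.3886363636363636,0.6354787012698454,0.9184179374477416,0.0
OpenAI GPT-4 Turbo,0.87,0.8888888888888888,1.0,0.1340939471260517,0.8590909090909091,0.10244268882293754,0.9396831632202628,0.905
"""


def to_percent(score: str) -> str:
    """Format a fractional score (0.0 to 1.0) as a percentage with two decimals."""
    return f"{float(score) * 100:.2f}%"


def load_percentages(text: str) -> dict[str, dict[str, str]]:
    """Map each model name to its per-category scores formatted as percentages."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["Model"]: {
            column: to_percent(value)
            for column, value in row.items()
            if column != "Model"
        }
        for row in reader
    }


percentages = load_percentages(CSV_TEXT)
print(percentages["Cohere Command Light"]["ENEM"])  # 11.39%
```

Looking scores up by column name rather than by position matters here because the CSV orders its columns alphabetically while the heatmap table does not.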