Model | Back-and-Forth Conversations | Tools (Functions) | Polyglotism | MMLU | ENEM | Streaming | Latency | Pricing |
---|---|---|---|---|---|---|---|---|
Cohere Command | 67.25% | 0.00% | 13.33% | 51.31% | 30.56% | 95.19% | 35.72% | 22.50% |
Cohere Command Light | 61.25% | 0.00% | 13.33% | 38.86% | 11.39% | 91.84% | 99.91% | 63.55% |
Google Gemini Pro | 39.50% | 65.00% | 100.00% | 63.98% | 58.33% | 11.53% | 50.31% | 25.30% |
Maritaca MariTalk | 80.25% | 0.00% | 66.67% | 60.45% | 56.39% | 8.34% | 21.48% | 57.45% |
Mistral Medium | 85.25% | 0.00% | 83.33% | 71.99% | 66.67% | 87.54% | 24.82% | 15.10% |
Mistral Small | 71.75% | 0.00% | 70.00% | 68.86% | 60.56% | 79.33% | 52.88% | 33.08% |
Mistral Tiny | 80.50% | 0.00% | 56.67% | 56.36% | 45.28% | 74.24% | 78.94% | 95.55% |
OpenAI GPT-3.5 Turbo | 86.75% | 82.00% | 100.00% | 63.75% | 64.44% | 74.37% | 74.87% | 37.19% |
OpenAI GPT-4 Turbo | 87.00% | 90.50% | 100.00% | 85.91% | 88.89% | 93.97% | 13.41% | 10.24% |
Data extracted from the LBPE Score Report 1.0.0.
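For a quick overall comparison, the table above can be aggregated programmatically. The sketch below ranks each model by its unweighted mean across the eight metrics; this assumes every column is a normalized 0-100% score where higher is better (including Latency and Pricing, which the report expresses as scores rather than raw milliseconds or dollars), and it does not reproduce any weighting the actual LBPE report may apply.

```python
# Scores copied from the LBPE Score Report 1.0.0 table above, in column order:
# Back-and-Forth, Tools, Polyglotism, MMLU, ENEM, Streaming, Latency, Pricing.
scores = {
    "Cohere Command":       [67.25, 0.00, 13.33, 51.31, 30.56, 95.19, 35.72, 22.50],
    "Cohere Command Light": [61.25, 0.00, 13.33, 38.86, 11.39, 91.84, 99.91, 63.55],
    "Google Gemini Pro":    [39.50, 65.00, 100.00, 63.98, 58.33, 11.53, 50.31, 25.30],
    "Maritaca MariTalk":    [80.25, 0.00, 66.67, 60.45, 56.39, 8.34, 21.48, 57.45],
    "Mistral Medium":       [85.25, 0.00, 83.33, 71.99, 66.67, 87.54, 24.82, 15.10],
    "Mistral Small":        [71.75, 0.00, 70.00, 68.86, 60.56, 79.33, 52.88, 33.08],
    "Mistral Tiny":         [80.50, 0.00, 56.67, 56.36, 45.28, 74.24, 78.94, 95.55],
    "OpenAI GPT-3.5 Turbo": [86.75, 82.00, 100.00, 63.75, 64.44, 74.37, 74.87, 37.19],
    "OpenAI GPT-4 Turbo":   [87.00, 90.50, 100.00, 85.91, 88.89, 93.97, 13.41, 10.24],
}

# Rank models by the simple mean of their eight metric scores.
ranking = sorted(scores.items(), key=lambda kv: sum(kv[1]) / len(kv[1]), reverse=True)

for model, vals in ranking:
    print(f"{model}: {sum(vals) / len(vals):.2f}%")
```

Note that an unweighted mean is only one possible summary: with these numbers, GPT-3.5 Turbo's strong Latency and Pricing scores lift its average above GPT-4 Turbo's, even though GPT-4 Turbo leads on the capability metrics.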