Huanzhi Mao
committed on
Commit 6569764 · 1 Parent(s): e8e0a80
update data.csv
data.csv
CHANGED
@@ -1,50 +1,58 @@
1 |
Rank,Overall Acc,Model,Model Link,Organization,License,AST Summary,Exec Summary,Simple Function AST,Python Simple Function AST,Java Simple Function AST,JavaScript Simple Function AST,Multiple Functions AST,Parallel Functions AST,Parallel Multiple AST,Simple Function Exec,Python Simple Function Exec,REST Simple Function Exec,Multiple Functions Exec,Parallel Functions Exec,Parallel Multiple Exec,Relevance Detection,Cost ($ Per 1k Function Calls),Latency Mean (s),Latency Standard Deviation (s),Latency 95th Percentile (s)
|
2 |
-
1,90.18%,Claude-3.5-Sonnet-20240620 (Prompt),https://www.anthropic.com/news/claude-3-5-sonnet,Anthropic,Proprietary,91.68%,89.50%,86.73%,94.00%,65.00%,72.00%,95.50%,92.50%,92.00%,100.00%,100.00%,100.00%,96.00%,82.00%,80.00%,85.42%,2.2,1.
|
3 |
-
2,88.
|
4 |
-
3,
|
5 |
-
4,
|
6 |
-
5,86.
|
7 |
-
6,
|
8 |
-
7,
|
9 |
-
8,85.
|
10 |
-
9,
|
11 |
-
10,84.
|
12 |
-
11,84.
|
13 |
-
12,
|
14 |
-
13,
|
15 |
-
14,
|
16 |
-
15,
|
17 |
-
16,
|
18 |
-
17,
|
19 |
-
18,
|
20 |
-
19,
|
21 |
-
20,80.
|
22 |
-
21,80.
|
23 |
-
22,
|
24 |
-
23,79.
|
25 |
-
24,
|
26 |
-
25,
|
27 |
-
26,
|
28 |
-
27,
|
29 |
-
28,
|
30 |
-
29,
|
31 |
-
30,
|
32 |
-
31,
|
33 |
-
32,
|
34 |
-
33,
|
35 |
-
34,
|
36 |
-
35,
|
37 |
-
36,
|
38 |
-
37,
|
39 |
-
38,
|
40 |
-
39,
|
41 |
-
40,
|
42 |
-
41,
|
43 |
-
42,
|
44 |
-
43,
|
45 |
-
44,
|
46 |
-
45,
|
47 |
-
46,
|
48 |
-
47,
|
49 |
-
48,
|
50 |
-
49,
|
1 |
Rank,Overall Acc,Model,Model Link,Organization,License,AST Summary,Exec Summary,Simple Function AST,Python Simple Function AST,Java Simple Function AST,JavaScript Simple Function AST,Multiple Functions AST,Parallel Functions AST,Parallel Multiple AST,Simple Function Exec,Python Simple Function Exec,REST Simple Function Exec,Multiple Functions Exec,Parallel Functions Exec,Parallel Multiple Exec,Relevance Detection,Cost ($ Per 1k Function Calls),Latency Mean (s),Latency Standard Deviation (s),Latency 95th Percentile (s)
|
2 |
+
1,90.18%,Claude-3.5-Sonnet-20240620 (Prompt),https://www.anthropic.com/news/claude-3-5-sonnet,Anthropic,Proprietary,91.68%,89.50%,86.73%,94.00%,65.00%,72.00%,95.50%,92.50%,92.00%,100.00%,100.00%,100.00%,96.00%,82.00%,80.00%,85.42%,2.2,1.38,1.22,2.12
|
3 |
+
2,88.35%,xLAM-7b-fc-r (FC),https://huggingface.co/Salesforce/xLAM-7b-fc-r,Salesforce,cc-by-nc-4.0,89.45%,87.12%,85.82%,94.75%,60.00%,66.00%,93.00%,91.50%,87.50%,96.47%,99.00%,92.86%,88.00%,84.00%,80.00%,85.00%,N/A,N/A,N/A,N/A
|
4 |
+
3,88.18%,GPT-4-0125-Preview (Prompt),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,91.75%,88.10%,88.00%,94.75%,68.00%,74.00%,95.00%,92.00%,92.00%,99.41%,100.00%,98.57%,94.00%,84.00%,75.00%,70.42%,5.26,1.98,1.33,4.47
|
5 |
+
4,87.41%,Claude-3-Opus-20240229 (Prompt),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,88.83%,86.16%,85.82%,93.75%,63.00%,68.00%,94.00%,86.50%,89.00%,97.65%,98.00%,97.14%,92.00%,80.00%,75.00%,80.42%,10.84,4.47,1.6,7.32
|
6 |
+
5,86.53%,Nemotron-4-340b-instruct (Prompt),https://huggingface.co/nvidia/nemotron-4-340b-instruct,NVIDIA,nvidia-open-model-license,87.99%,88.43%,83.45%,90.75%,58.00%,76.00%,92.50%,90.50%,85.50%,98.24%,99.00%,97.14%,96.00%,82.00%,77.50%,78.33%,N/A,7.65,5.65,18.46
|
7 |
+
6,86.24%,yi-large (FC),https://platform.01.ai/,01.AI,Proprietary,88.81%,88.82%,82.73%,92.50%,57.00%,56.00%,92.50%,91.50%,88.50%,95.29%,99.00%,90.00%,94.00%,86.00%,80.00%,75.83%,0.51,4.04,3.7,12.93
|
8 |
+
7,86.00%,GPT-4-turbo-2024-04-09 (Prompt),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,90.73%,86.04%,86.91%,94.00%,66.00%,72.00%,95.00%,91.00%,90.00%,97.65%,97.00%,98.57%,94.00%,80.00%,72.50%,62.50%,5.26,2.53,2.2,5.71
|
9 |
+
8,85.53%,GPT-4-1106-Preview (FC),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,88.35%,81.73%,82.91%,93.00%,56.00%,56.00%,91.50%,92.50%,86.50%,89.41%,95.00%,81.43%,92.00%,78.00%,67.50%,80.42%,5.07,5.92,6.22,17.64
|
10 |
+
9,85.41%,Granite-20b-FunctionCalling (FC),https://huggingface.co/ibm-granite/granite-20b-functioncalling,IBM,Apache-2.0,85.18%,86.56%,82.73%,90.50%,65.00%,56.00%,90.50%,85.50%,82.00%,88.24%,98.00%,74.29%,90.00%,88.00%,80.00%,87.50%,N/A,N/A,N/A,N/A
|
11 |
+
10,84.82%,GPT-4-0125-Preview (FC),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,87.23%,84.76%,80.91%,90.50%,59.00%,48.00%,93.00%,90.50%,84.50%,83.53%,98.00%,62.86%,92.00%,86.00%,77.50%,82.92%,4.81,4.69,5.69,18.17
|
12 |
+
11,84.65%,Meta-Llama-3-70B-Instruct (Prompt),https://llama.meta.com/llama3,Meta,Meta Llama 3 Community,88.49%,85.32%,83.45%,91.00%,60.00%,70.00%,93.00%,91.50%,86.00%,91.76%,95.00%,87.14%,88.00%,84.00%,77.50%,69.17%,1.1,0.18,N/A,N/A
|
13 |
+
12,84.47%,Gorilla-OpenFunctions-v2 (FC),https://gorilla.cs.berkeley.edu/blogs/7_open_functions_v2.html,Gorilla LLM,Apache 2.0,89.11%,81.55%,87.45%,94.75%,65.00%,74.00%,95.00%,87.50%,86.50%,94.71%,94.00%,95.71%,94.00%,70.00%,67.50%,61.25%,0.31,0.05,N/A,N/A
|
14 |
+
13,84.35%,Gemini-1.5-Pro-Preview-0514 (FC),https://deepmind.google/technologies/gemini/pro/,Google,Proprietary,86.38%,83.32%,74.00%,92.00%,22.00%,34.00%,92.00%,91.50%,88.00%,91.76%,96.00%,85.71%,88.00%,76.00%,77.50%,89.58%,0.87,1.93,0.77,3.45
|
15 |
+
14,84.06%,GPT-4-turbo-2024-04-09 (FC),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,86.55%,78.61%,78.18%,91.25%,45.00%,40.00%,90.00%,90.00%,88.00%,82.94%,93.00%,68.57%,88.00%,76.00%,67.50%,88.75%,4.58,4.87,5.26,15.31
|
16 |
+
15,83.82%,Gemini-1.5-Pro-Preview-0409 (FC),https://deepmind.google/technologies/gemini/#introduction,Google,Proprietary,86.03%,82.88%,73.64%,91.75%,19.00%,38.00%,92.50%,90.50%,87.50%,90.00%,96.00%,81.43%,90.00%,74.00%,77.50%,88.75%,0.88,1.94,0.86,3.52
|
17 |
+
16,83.59%,GPT-4o-2024-05-13 (FC),https://openai.com/index/hello-gpt-4o/,OpenAI,Proprietary,85.81%,80.37%,80.73%,88.25%,60.00%,62.00%,90.00%,88.00%,84.50%,86.47%,94.00%,75.71%,78.00%,82.00%,75.00%,81.25%,2.3,2.07,2.51,6.95
|
18 |
+
17,82.47%,FireFunction-v2 (FC),https://huggingface.co/fireworks-ai/firefunction-v2,Fireworks,Apache 2.0,87.14%,80.26%,86.55%,94.50%,61.00%,74.00%,91.00%,89.50%,81.50%,93.53%,94.00%,92.86%,88.00%,72.00%,67.50%,56.67%,N/A,1.0,0.72,1.89
|
19 |
+
18,81.76%,Mistral-Medium-2312 (Prompt),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,84.48%,73.47%,80.91%,90.25%,56.00%,56.00%,92.00%,84.00%,81.00%,65.88%,96.00%,22.86%,76.00%,82.00%,70.00%,88.33%,1.76,2.69,2.14,6.31
|
20 |
+
19,81.76%,Claude-3-Sonnet-20240229 (Prompt),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,87.43%,86.76%,82.73%,92.00%,59.00%,56.00%,89.00%,88.50%,89.50%,93.53%,96.00%,90.00%,92.00%,84.00%,77.50%,51.25%,2.13,1.89,0.6,3.06
|
21 |
+
20,80.59%,Command-R-Plus (Prompt) (Optimized),https://txt.cohere.com/command-r-plus-microsoft-azure,Cohere For AI,cc-by-nc-4.0,84.10%,86.74%,82.91%,89.75%,64.00%,66.00%,88.50%,82.00%,83.00%,92.94%,97.00%,87.14%,90.00%,84.00%,80.00%,54.17%,N/A,N/A,N/A,N/A
|
22 |
+
21,80.47%,Command-R-Plus (Prompt) (Original),https://txt.cohere.com/command-r-plus-microsoft-azure,Cohere For AI,cc-by-nc-4.0,84.34%,86.24%,82.36%,89.75%,62.00%,64.00%,90.00%,81.00%,84.00%,92.94%,98.00%,85.71%,88.00%,84.00%,80.00%,53.75%,1.9,1.31,0.95,3.26
|
23 |
+
22,80.00%,GPT-4o-2024-05-13 (Prompt),https://openai.com/index/hello-gpt-4o/,OpenAI,Proprietary,76.74%,77.62%,83.45%,90.75%,61.00%,70.00%,84.00%,78.50%,61.00%,90.00%,95.00%,82.86%,78.00%,70.00%,72.50%,82.50%,2.68,1.14,0.78,2.66
|
24 |
+
23,79.94%,Functionary-Small-v2.4 (FC),https://huggingface.co/meetkai/functionary-small-v2.4,MeetKai,MIT,83.47%,76.31%,82.36%,91.75%,61.00%,50.00%,88.50%,82.00%,81.00%,78.24%,96.00%,52.86%,82.00%,80.00%,65.00%,67.92%,N/A,2.32,2.57,6.9
|
25 |
+
24,79.88%,Gemini-1.5-Flash-Preview-0514 (FC),https://deepmind.google/technologies/gemini/flash/,Google,Proprietary,81.03%,74.57%,79.64%,91.00%,51.00%,46.00%,93.50%,78.00%,73.00%,81.76%,94.00%,64.29%,90.00%,54.00%,72.50%,79.58%,0.07,1.02,0.55,1.55
|
26 |
+
25,78.88%,Command-R-Plus (FC) (Optimized),https://txt.cohere.com/command-r-plus-microsoft-azure,Cohere For AI,cc-by-nc-4.0,84.47%,77.17%,76.36%,89.75%,37.00%,48.00%,91.00%,88.50%,82.00%,81.18%,95.00%,61.43%,86.00%,74.00%,67.50%,63.75%,N/A,N/A,N/A,N/A
|
27 |
+
26,78.41%,xLAM-1b-fc-r (FC),https://huggingface.co/Salesforce/xLAM-1b-fc-r,Salesforce,cc-by-nc-4.0,80.41%,82.27%,81.64%,88.50%,63.00%,64.00%,87.50%,78.50%,74.00%,80.59%,99.00%,54.29%,92.00%,84.00%,72.50%,62.50%,N/A,N/A,N/A,N/A
|
28 |
+
27,77.53%,Claude-3-Opus-20240229 (FC tools-2024-04-04),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,73.50%,71.27%,82.00%,89.50%,62.00%,62.00%,91.50%,58.50%,62.00%,90.59%,97.00%,81.43%,94.00%,38.00%,62.50%,82.50%,31.3,13.01,4.13,20.66
|
29 |
+
28,76.65%,Claude-instant-1.2 (Prompt),https://www.anthropic.com/news/releasing-claude-instant-1-2,Anthropic,Proprietary,79.41%,77.93%,79.64%,87.00%,60.00%,60.00%,85.50%,83.00%,69.50%,84.71%,94.00%,71.43%,80.00%,82.00%,65.00%,57.50%,0.45,1.19,0.7,2.21
|
30 |
+
29,76.29%,Claude-3.5-Sonnet-20240620 (FC),https://www.anthropic.com/news/claude-3-5-sonnet,Anthropic,Proprietary,72.56%,59.51%,84.73%,91.50%,66.00%,68.00%,92.00%,59.00%,54.50%,97.06%,98.00%,95.71%,88.00%,18.00%,35.00%,78.33%,4.74,3.63,2.88,7.64
|
31 |
+
30,75.41%,Functionary-Medium-v2.4 (FC),https://huggingface.co/meetkai/functionary-medium-v2.4,MeetKai,MIT,81.62%,75.71%,64.00%,88.00%,0.00%,0.00%,90.50%,87.50%,84.50%,68.82%,85.00%,45.71%,84.00%,80.00%,70.00%,74.17%,N/A,4.03,6.03,20.39
|
32 |
+
31,74.65%,Claude-3-Haiku-20240307 (Prompt),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,79.45%,70.49%,85.82%,93.50%,62.00%,72.00%,91.50%,84.50%,56.00%,92.94%,100.00%,82.86%,94.00%,70.00%,25.00%,34.58%,0.18,1.0,0.5,1.72
|
33 |
+
32,73.12%,Hermes-2-Pro-Llama-3-70B (FC),https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-70B,NousResearch,apache-2.0,77.60%,71.15%,80.91%,92.00%,54.00%,46.00%,81.00%,74.00%,74.50%,74.12%,87.00%,55.71%,74.00%,64.00%,72.50%,47.92%,2.45,0.4,N/A,N/A
|
34 |
+
33,71.76%,Claude-2.1 (Prompt),https://www.anthropic.com/news/claude-2-1,Anthropic,Proprietary,66.40%,62.17%,81.09%,88.75%,57.00%,68.00%,76.00%,55.50%,53.00%,71.18%,90.00%,44.29%,84.00%,46.00%,47.50%,83.33%,4.81,3.19,2.16,7.38
|
35 |
+
34,71.24%,Hermes-2-Pro-Llama-3-8B (FC),https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B,NousResearch,apache-2.0,77.62%,70.08%,80.00%,89.00%,58.00%,52.00%,90.50%,73.50%,66.50%,78.82%,97.00%,52.86%,92.00%,62.00%,47.50%,33.33%,0.31,0.05,N/A,N/A
|
36 |
+
35,71.00%,Command-R-Plus (FC) (Original),https://txt.cohere.com/command-r-plus-microsoft-azure,Cohere For AI,cc-by-nc-4.0,80.82%,73.19%,75.27%,84.50%,51.00%,50.00%,90.00%,82.00%,76.00%,81.76%,92.00%,67.14%,88.00%,68.00%,55.00%,24.17%,1.17,2.07,1.65,4.17
|
37 |
+
36,70.12%,Hermes-2-Theta-Llama-3-8B (FC),https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-8B,NousResearch,apache-2.0,76.24%,69.59%,79.45%,88.75%,56.00%,52.00%,89.00%,67.00%,69.50%,82.35%,98.00%,60.00%,92.00%,44.00%,60.00%,30.00%,0.24,0.04,N/A,N/A
|
38 |
+
37,68.82%,Mistral-large-2402 (FC Auto),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,64.77%,60.01%,67.09%,89.50%,5.00%,12.00%,94.50%,25.50%,72.00%,83.53%,99.00%,61.43%,96.00%,8.00%,52.50%,84.17%,2.49,3.06,3.1,8.94
|
39 |
+
38,68.29%,Nexusflow-Raven-v2 (FC),https://huggingface.co/Nexusflow/NexusRaven-V2-13B,Nexusflow,Apache 2.0,66.88%,73.89%,76.00%,80.75%,61.00%,68.00%,86.00%,44.50%,61.00%,67.06%,95.00%,27.14%,92.00%,74.00%,62.50%,57.50%,N/A,2.01,1.26,4.48
|
40 |
+
39,67.29%,Gemini-1.0-Pro-001 (FC),https://deepmind.google/technologies/gemini/#introduction,Google,Proprietary,57.08%,56.74%,79.82%,93.00%,47.00%,40.00%,92.50%,30.50%,25.50%,86.47%,89.00%,82.86%,84.00%,44.00%,12.50%,80.00%,0.13,1.27,0.99,3.39
|
41 |
+
40,65.35%,DBRX-Instruct (Prompt),https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm,Databricks,Databricks Open Model,66.45%,74.92%,61.82%,75.75%,21.00%,32.00%,71.50%,72.50%,60.00%,71.18%,80.00%,58.57%,86.00%,80.00%,62.50%,55.83%,1.26,0.67,0.43,1.51
|
42 |
+
41,65.29%,Snowflake/snowflake-arctic-instruct (Prompt),https://huggingface.co/Snowflake/snowflake-arctic-instruct,Snowflake,apache-2.0,61.10%,80.04%,62.91%,67.50%,46.00%,60.00%,69.00%,59.00%,53.50%,87.65%,91.00%,82.86%,86.00%,74.00%,72.50%,59.58%,N/A,0.98,0.56,2.13
|
43 |
+
42,64.47%,Mistral-large-2402 (FC Any),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,71.50%,64.93%,82.00%,89.50%,61.00%,64.00%,93.50%,31.50%,79.00%,94.71%,95.00%,94.29%,92.00%,8.00%,65.00%,0.00%,1.97,2.07,1.31,4.9
|
44 |
+
43,64.41%,GPT-3.5-Turbo-0125 (FC),https://platform.openai.com/docs/models/gpt-3-5-turbo,OpenAI,Proprietary,75.23%,81.38%,62.91%,63.50%,60.00%,64.00%,66.00%,91.00%,81.00%,93.53%,95.00%,91.43%,80.00%,82.00%,70.00%,2.08%,0.19,1.26,0.74,2.46
|
45 |
+
44,60.76%,Mistral-small-2402 (FC Any),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,65.75%,52.62%,82.00%,90.50%,60.00%,58.00%,96.00%,39.00%,46.00%,96.47%,100.00%,91.43%,92.00%,12.00%,10.00%,0.00%,0.48,1.15,0.75,2.5
|
46 |
+
45,60.24%,Hermes-2-Pro-Mistral-7B (FC),https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B,NousResearch,apache-2.0,70.90%,55.62%,73.09%,81.75%,55.00%,40.00%,80.50%,67.00%,63.00%,56.47%,78.00%,25.71%,70.00%,56.00%,40.00%,10.83%,0.49,0.08,N/A,N/A
|
47 |
+
46,60.18%,Meta-Llama-3-8B-Instruct (Prompt),https://llama.meta.com/llama3,Meta,Meta Llama 3 Community,62.93%,69.95%,54.73%,58.50%,48.00%,38.00%,73.50%,59.00%,64.50%,75.29%,79.00%,70.00%,74.00%,68.00%,62.50%,43.33%,0.24,0.04,N/A,N/A
|
48 |
+
47,59.00%,Claude-3-Sonnet-20240229 (FC tools-2024-04-04),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,44.00%,43.32%,76.00%,86.00%,48.00%,52.00%,88.00%,6.00%,6.00%,85.29%,96.00%,70.00%,88.00%,0.00%,0.00%,81.67%,3.44,3.23,1.47,6.85
|
49 |
+
48,58.18%,FireFunction-v1 (FC),https://huggingface.co/fireworks-ai/firefunction-v1,Fireworks,Apache 2.0,44.25%,39.79%,84.00%,90.25%,61.00%,80.00%,93.00%,0.00%,0.00%,71.18%,95.00%,37.14%,88.00%,0.00%,0.00%,73.33%,N/A,1.58,1.44,4.26
|
50 |
+
49,56.29%,GLM-4-9b-Chat (FC),https://huggingface.co/THUDM/glm-4-9b-chat,THUDM,glm-4,39.78%,44.12%,67.64%,87.00%,14.00%,20.00%,91.50%,0.00%,0.00%,86.47%,90.00%,81.43%,90.00%,0.00%,0.00%,87.50%,N/A,0.13,N/A,N/A
|
51 |
+
50,54.29%,GPT-4-0613 (FC),https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turbo,OpenAI,Proprietary,39.75%,38.53%,66.00%,86.50%,13.00%,8.00%,93.00%,0.00%,0.00%,64.12%,95.00%,20.00%,90.00%,0.00%,0.00%,91.67%,10.29,3.51,3.37,11.24
|
52 |
+
51,53.88%,Claude-3-Haiku-20240307 (FC tools-2024-04-04),https://www.anthropic.com/news/claude-3-family,Anthropic,Proprietary,45.09%,46.79%,86.36%,95.50%,61.00%,64.00%,93.50%,0.50%,0.00%,91.18%,96.00%,84.29%,94.00%,2.00%,0.00%,20.83%,0.29,1.46,0.54,2.35
|
53 |
+
52,52.65%,Mistral-tiny-2312 (Prompt),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,49.82%,36.16%,55.27%,70.00%,22.00%,4.00%,56.50%,47.50%,40.00%,27.65%,46.00%,1.43%,20.00%,62.00%,35.00%,83.75%,0.13,1.37,1.33,3.66
|
54 |
+
53,44.00%,Gemma-7b-it (Prompt),https://blog.google/technology/developers/gemma-open-models/,Google,gemma-terms-of-use,41.43%,31.75%,42.73%,47.75%,37.00%,14.00%,48.00%,30.50%,44.50%,30.00%,44.00%,10.00%,32.00%,40.00%,25.00%,70.83%,0.37,0.06,N/A,N/A
|
55 |
+
54,42.65%,Mistral-Small-2402 (Prompt),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,41.94%,38.03%,11.27%,6.00%,21.00%,34.00%,8.00%,79.50%,69.00%,34.12%,6.00%,74.29%,20.00%,68.00%,30.00%,98.33%,0.64,1.11,0.89,2.58
|
56 |
+
55,40.12%,Deepseek-v1.5 (Prompt),https://huggingface.co/deepseek-ai/deepseek-coder-7b-instruct-v1.5,Deepseek,Deepseek License,38.22%,30.89%,38.36%,50.00%,4.00%,14.00%,49.00%,37.00%,28.50%,37.06%,38.00%,35.71%,38.00%,36.00%,12.50%,57.08%,3.24,0.53,N/A,N/A
|
57 |
+
56,23.88%,Mistral-small-2402 (FC Auto),https://docs.mistral.ai/guides/model-selection/,Mistral AI,Proprietary,2.92%,34.37%,2.18%,2.75%,1.00%,0.00%,2.50%,3.00%,4.00%,56.47%,79.00%,24.29%,70.00%,6.00%,5.00%,99.58%,0.94,3.03,1.78,6.21
|
58 |
+
57,14.88%,Hermes-2-Theta-Llama-3-70B (FC),https://huggingface.co/NousResearch/Hermes-2-Theta-Llama-3-70B,NousResearch,apache-2.0,0.44%,3.41%,1.27%,1.50%,0.00%,2.00%,0.50%,0.00%,0.00%,7.65%,6.00%,10.00%,6.00%,0.00%,0.00%,95.42%,2.57,0.42,N/A,N/A
|