Rank,Model,Organization,Middle School Biology,Middle School Physics,Middle School Math,Middle School Chemistry,Middle School Geography,Middle School History,Average Accuracy
🥇,Tongyi Qianwen 2 (qwen-max),Alibaba,93.33%,84.21%,60.78%,84.71%,89.53%,96.21%,84.80%
🥈,ERNIE Bot 4 (ERNIEBot-4),Baidu,85.33%,77.63%,56.86%,81.18%,80.23%,93.18%,79.07%
🥉,iFlytek Spark v3.0,iFlytek,88.00%,72.37%,42.16%,70.59%,79.07%,81.06%,72.21%
4,GPT4-Turbo,OpenAI,85.33%,71.05%,44.94%,57.89%,79.07%,85.61%,70.65%
5,SenseTime SenseNova (Sensenova),SenseTime,89.33%,68.42%,42.16%,61.18%,66.28%,81.06%,68.07%
6,GPT4,OpenAI,89.33%,51.32%,40.20%,56.47%,79.07%,83.33%,66.62%
7,MiniMax (abab5.5-chat),MiniMax,74.67%,59.21%,41.18%,51.76%,63.95%,83.33%,62.35%
8,Baichuan (baichuan2-13b-chat-v1),Baichuan AI,68.00%,42.11%,29.41%,54.12%,74.42%,78.03%,57.68%
9,ChatGLM3-6B,Tsinghua & Zhipu,74.67%,46.05%,23.53%,43.53%,63.95%,77.27%,54.83%
10,360 Zhinao (360GPT_S2_V9),360,65.33%,51.32%,34.31%,40.00%,69.77%,52.27%,52.17%
11,Qianfan-llama2,Meta/Baidu Qianfan,69.33%,43.42%,26.47%,34.12%,59.30%,75.00%,51.27%
12,BLOOMZ-7B,BigScience,36.00%,30.26%,23.53%,30.59%,34.88%,38.64%,32.32%
13,GPT3.5-Turbo,OpenAI,40.00%,28.95%,29.41%,21.18%,17.44%,17.42%,25.73%
14,Wudao Aquila (AquilaChat-7B),BAAI,24.00%,25.00%,20.59%,22.35%,20.93%,25.00%,22.98%
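
The final column (average accuracy) appears to be the unweighted mean of the six subject accuracies in each row; the page does not state this, but the figures are consistent with it. A minimal sketch that checks this for the first-ranked row follows. The function name `mean_accuracy` is illustrative, and half-up rounding to two decimal places is an assumption rather than something the leaderboard documents.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean_accuracy(scores):
    """Unweighted mean of per-subject accuracies (given as strings, in percent).

    Decimal arithmetic avoids binary floating-point surprises when the mean
    lands exactly on a rounding boundary. Half-up rounding is an assumption.
    """
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Subject accuracies copied from the first-ranked row of the table, in column order.
top_row = ["93.33", "84.21", "60.78", "84.71", "89.53", "96.21"]
print(mean_accuracy(top_row))  # prints 84.80, matching the table's last column
```

Because the per-subject figures are themselves rounded, a mean recomputed this way could in principle differ from a published average by 0.01 on some rows.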