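Rank,Model,Organization,Middle School Biology,Middle School Physics,Middle School Mathematics,Middle School Chemistry,Middle School Geography,Middle School History,Average Accuracy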
🥇,Tongyi Qianwen 2 (qwen-max),Alibaba,93.33%,84.21%,60.78%,84.71%,89.53%,96.21%,84.80%
🥈,Wenxin Yiyan 4 (ERNIEBot-4),Baidu,85.33%,77.63%,56.86%,81.18%,80.23%,93.18%,79.07%
🥉,iFLYTEK Spark v3.0,iFLYTEK,88.00%,72.37%,42.16%,70.59%,79.07%,81.06%,72.21%
4,GPT4-Turbo,OpenAI,85.33%,71.05%,44.94%,57.89%,79.07%,85.61%,70.65%
5,SenseTime SenseNova (Sensenova),SenseTime,89.33%,68.42%,42.16%,61.18%,66.28%,81.06%,68.07%
6,GPT4,OpenAI,89.33%,51.32%,40.20%,56.47%,79.07%,83.33%,66.62%
7,MiniMax (abab5.5-chat),MiniMax,74.67%,59.21%,41.18%,51.76%,63.95%,83.33%,62.35%
8,Baichuan (baichuan2-13b-chat-v1),Baichuan AI,68.00%,42.11%,29.41%,54.12%,74.42%,78.03%,57.68%
9,ChatGLM3-6B,Tsinghua & Zhipu AI,74.67%,46.05%,23.53%,43.53%,63.95%,77.27%,54.83%
10,360 Zhinao (360GPT_S2_V9),360,65.33%,51.32%,34.31%,40.00%,69.77%,52.27%,52.17%
11,Qianfan-llama2,Meta/Baidu Qianfan,69.33%,43.42%,26.47%,34.12%,59.30%,75.00%,51.27%
12,BLOOMZ-7B,BigScience,36.00%,30.26%,23.53%,30.59%,34.88%,38.64%,32.32%
13,GPT3.5-Turbo,OpenAI,40.00%,28.95%,29.41%,21.18%,17.44%,17.42%,25.73%
14,Wudao Aquila (AquilaChat-7B),BAAI,24.00%,25.00%,20.59%,22.35%,20.93%,25.00%,22.98%