Section,Text comprehension,Text comprehension,Text comprehension,Reflection on language,Reflection on language,Reflection on language,Reflection on language,Reflection on language,Reflection on language
MacroAspect,Locate and identify information within the text,"Reconstruct the meaning of the text, at a local or global level","Reflect on and evaluate the content or form of the text, at a local or global level",Word formation,Lexicon and semantics,Morphology,Spelling,Syntax,Textuality and pragmatics
Model,,,,,,,,,
LLaMAntino-3-ANITA-8B-Inst-DPO-ITA,60.2,63.1,78.8,28.6,37.9,16.7,0.0,26.3,50.0
Minerva-3B-base-v1.0,4.6,3.9,9.1,28.6,3.4,4.2,0.0,5.3,0.0
claude-3-haiku,78.7,86.0,75.8,71.4,65.5,62.5,0.0,57.9,83.3
claude-3-opus,91.7,91.6,78.8,100.0,82.8,75.0,50.0,89.5,83.3
claude-3-sonnet,87.0,90.5,75.8,100.0,62.1,75.0,0.0,52.6,100.0
command-r-plus,74.1,80.4,81.8,71.4,65.5,66.7,0.0,57.9,83.3
gemini-flash-1.5,83.3,85.5,81.8,85.7,62.1,83.3,25.0,63.2,66.7
gemini-pro,78.7,82.1,81.8,71.4,51.7,70.8,0.0,68.4,66.7
gemini-pro-1.5,90.7,87.7,84.8,57.1,55.2,58.3,25.0,63.2,33.3
gpt-3.5-turbo-0125,61.1,64.8,63.6,42.9,55.2,58.3,0.0,47.4,83.3
gpt-4-turbo,77.8,82.1,75.8,71.4,82.8,75.0,50.0,73.7,100.0
gpt-4o,64.8,69.8,51.5,100.0,69.0,87.5,0.0,89.5,100.0
llama-3-70b-instruct,83.3,85.5,75.8,71.4,55.2,33.3,0.0,47.4,50.0
llama-3-8b-instruct,48.2,53.6,63.6,14.3,34.5,29.2,0.0,31.6,50.0
mistral-7b-instruct:nitro,51.8,59.2,51.5,28.6,37.9,29.2,0.0,31.6,33.3
mixtral-8x7b-instruct,74.1,77.1,69.7,42.9,37.9,50.0,0.0,52.6,50.0
zefiro-7b-base-ITA,50.0,49.7,48.5,57.1,20.7,16.7,0.0,26.3,50.0