Update README.md
README.md
CHANGED
@@ -4,4 +4,88 @@ language:
- en
base_model:
- meta-llama/Llama-3.3-70B-Instruct
---

## Quantization : Q2_K (using Llama.cpp)
- llm_load_print_meta: model type = 70B
- llm_load_print_meta: model ftype = Q2_K - Medium
- llm_load_print_meta: model params = 70.55 B
- llm_load_print_meta: model size = 24.56 GiB (2.99 BPW)
- llama_model_loader: - type f32: 162 tensors
- llama_model_loader: - type q2_K: 321 tensors
- llama_model_loader: - type q3_K: 160 tensors
- llama_model_loader: - type q5_K: 80 tensors
- llama_model_loader: - type q6_K: 1 tensors
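
The reported bits-per-weight (BPW) figure can be cross-checked from the model size and parameter count in the log lines above. The following is a minimal sketch, assuming that "GiB" means 2^30 bytes and that BPW is total file bits divided by parameter count; the variable names are illustrative and not part of llama.cpp.

```python
# Sanity check of the reported BPW, using only figures from the log above.
GIB = 2**30                 # bytes per GiB (assumed binary gibibyte)

model_size_gib = 24.56      # "model size = 24.56 GiB"
model_params = 70.55e9      # "model params = 70.55 B"

# Bits per weight = total bits in the file / number of parameters.
bpw = model_size_gib * GIB * 8 / model_params

print(f"{bpw:.2f} BPW")     # prints "2.99 BPW"
```

The result agrees with the logged 2.99 BPW. The average sits above 2 bits because, per the tensor counts above, only some tensors are stored as q2_K while others use q3_K, q5_K, q6_K, or f32.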

## MMLU Result : 74.89%
Category STEM: 66.09% (18 subjects)
- high_school_chemistry: 64.04%
- high_school_mathematics: 46.67%
- abstract_algebra: 48.00%
- computer_security: 84.00%
- college_computer_science: 61.62%
- college_chemistry: 53.00%
- conceptual_physics: 74.89%
- high_school_statistics: 68.06%
- college_mathematics: 44.00%
- college_biology: 88.19%
- college_physics: 52.94%
- elementary_mathematics: 64.81%
- high_school_biology: 88.71%
- high_school_physics: 57.62%
- machine_learning: 56.25%
- astronomy: 88.16%
- electrical_engineering: 69.66%
- high_school_computer_science: 79.00%

Category humanities: 79.28% (13 subjects)
- world_religions: 84.80%
- high_school_us_history: 89.71%
- moral_disputes: 77.75%
- high_school_world_history: 88.61%
- formal_logic: 62.70%
- international_law: 85.12%
- jurisprudence: 76.85%
- professional_law: 59.58%
- logical_fallacies: 83.44%
- philosophy: 74.28%
- moral_scenarios: 78.66%
- prehistory: 84.26%
- high_school_european_history: 84.85%

Category social sciences: 82.11% (12 subjects)
- high_school_geography: 86.36%
- high_school_psychology: 91.19%
- sociology: 87.56%
- high_school_microeconomics: 86.55%
- professional_psychology: 76.80%
- security_studies: 77.55%
- us_foreign_policy: 91.00%
- public_relations: 70.91%
- high_school_government_and_politics: 93.78%
- econometrics: 61.40%
- human_sexuality: 81.68%
- high_school_macroeconomics: 80.51%

Category other (business, health, misc.): 75.95% (14 subjects)
- virology: 53.61%
- college_medicine: 72.25%
- global_facts: 62.00%
- miscellaneous: 87.36%
- medical_genetics: 84.00%
- human_aging: 78.48%
- nutrition: 83.33%
- marketing: 88.89%
- anatomy: 71.85%
- professional_medicine: 88.24%
- professional_accounting: 56.03%
- management: 82.52%
- clinical_knowledge: 80.75%
- business_ethics: 74.00%

Overall correct rate: 74.89%
Total subjects evaluated: 57
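
The README does not say how the category and overall scores were aggregated. The figures above are numerically consistent with an unweighted mean over subjects (each subject counts equally, regardless of how many questions it contains). The sketch below checks this for the STEM category and for the overall rate. It is an inference from the published numbers, not a description of the evaluation script that produced them.

```python
from statistics import mean

# Per-subject STEM accuracies (%) copied from the list above.
stem = [64.04, 46.67, 48.00, 84.00, 61.62, 53.00, 74.89, 68.06, 44.00,
        88.19, 52.94, 64.81, 88.71, 57.62, 56.25, 88.16, 69.66, 79.00]

# (category score %, number of subjects) as reported above.
categories = {
    "STEM": (66.09, 18),
    "humanities": (79.28, 13),
    "social sciences": (82.11, 12),
    "other": (75.95, 14),
}

# Unweighted mean of the 18 STEM subject scores.
stem_mean = mean(stem)

# Weighting each category mean by its subject count is equivalent to
# taking the unweighted mean over all subjects.
total_subjects = sum(n for _, n in categories.values())
overall = sum(score * n for score, n in categories.values()) / total_subjects

print(f"STEM mean: {stem_mean:.2f}%")   # prints "STEM mean: 66.09%"
print(f"subjects:  {total_subjects}")   # prints "subjects:  57"
print(f"overall:   {overall:.2f}%")     # prints "overall:   74.89%"
```

If the scores were instead weighted by question count, subjects with many questions (such as professional_law) would pull the overall figure in their direction, so the two conventions should not be compared directly.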

## Perplexity 6.6865 +/- 0.04336
(using wikitext-2-raw/wiki.test.raw)
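
For readers unfamiliar with the metric, perplexity is conventionally defined as the exponential of the mean negative log-likelihood per token. The sketch below shows that textbook definition only; the function name and the example log-probabilities are hypothetical, and it is not the chunked evaluation that llama.cpp runs over wiki.test.raw.

```python
import math

def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(nll)

# Hypothetical example: every token is predicted with probability 0.25,
# so the model is as uncertain as a uniform choice among 4 options.
example = [math.log(0.25)] * 4
ppl = perplexity(example)
print(ppl)

# Inverting the definition: the reported perplexity corresponds to this
# average negative log-likelihood (in nats) per token.
reported_nll = math.log(6.6865)
print(reported_nll)
```

Under this definition, lower values mean the model assigns higher probability to the reference text.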