leaderboard-pr-bot committed on
Commit
f6fbe43
1 Parent(s): 8a0609f

Adding Evaluation Results

This is an automated PR created with https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr

The purpose of this PR is to add evaluation results from the Open LLM Leaderboard to your model card.

If you encounter any issues, please report them to https://huggingface.co/spaces/Weyaxi/open-llm-leaderboard-results-pr/discussions

Files changed (1)
  1. README.md +48 -75
README.md CHANGED
@@ -1,86 +1,46 @@
  ---
- base_model: deepseek-ai/deepseek-math-7b-base
+ license: apache-2.0
  tags:
  - alignment-handbook
  - generated_from_trainer
+ base_model: deepseek-ai/deepseek-math-7b-base
  widget:
- - example_title: Math problem
- messages:
- - role: user
- content: >-
- For how many values of the constant $k$ will the polynomial
- $x^{2}+kx+36$ have two distinct integer roots?
- output:
- text: >-
- ### Solution:
-
- 1- For the polynomial \\( x^2 + kx + 36 \\) to have two
- distinct integer roots, let's denote these roots by \\( r_1 \\) and \\(
- r_2 \\).
-
-
-
- 2- According to Vieta's formulas, the sum of the roots \\( r_1 + r_2 \\)
- is equal to \\(-k\\), and the product of the roots \\( r_1 \\cdot r_2
- \\) is equal to 36.
-
-
-
- 3- To find the distinct integer pairs \\((r_1, r_2)\\) whose product is
- 36, we need to determine all pairs of integers \\((r_1, r_2)\\) such
- that \\( r_1 \\cdot r_2 = 36 \\) and \\( r_1 \\neq r_2 \\).
-
-
-
- 4- From the equation \\( r_1 + r_2 = -k \\), for each valid pair
- \\((r_1, r_2)\\), we can compute the corresponding value of \\( k \\).
-
-
-
- 5- Finally, since we need the polynomial to have two distinct integer
- roots, we need to ensure that \\( r_1 \\) and \\( r_2 \\) are distinct.
-
- Let's start by finding all pairs \\((r_1, r_2)\\) such that \\( r_1
- \\cdot r_2 = 36 \\). We'll then determine the values of \\( k \\) and
- ensure the roots are distinct.
-
- ```python import itertools
-
- # Find all pairs (r1, r2) such that r1 * r2 = 36 product_36 = 36
- factor_pairs = []
-
- for i in range(1, product_36 + 1):
- if product_36 % i == 0:
- pair = (i, product_36 // i)
- if pair[0] != pair[1]: # Ensure distinct pairs
- factor_pairs.append(pair)
-
- # Calculate k for each pair and ensure distinct integer roots
- valid_k_values = set()
- for r1, r2 in factor_pairs:
- if r1 != r2:
- k = -(r1 + r2)
- valid_k_values.add(k)
-
- print((len(valid_k_values), sorted(valid_k_values)))
- ```
-
- ```output
- (4, [-37, -20, -15,-13])
- ```
- The distinct integer values of \\( k \\) that make the
- polynomial \\( x^2 + kx + 36 \\) have two distinct integer roots are
- \\(-37, -20, -15, \\text{and} -13\\).
-
- Therefore, the number of such values of \\( k \\) is:
-
- [ \\boxed{4} \\]
-
+ - example_title: Math problem
+ messages:
+ - role: user
+ content: For how many values of the constant $k$ will the polynomial $x^{2}+kx+36$
+ have two distinct integer roots?
+ output:
+ text: "### Solution: \n1- For the polynomial \\\\( x^2 + kx + 36 \\\\) to have\
+ \ two distinct integer roots, let's denote these roots by \\\\( r_1 \\\\) and\
+ \ \\\\( r_2 \\\\).\n\n\n2- According to Vieta's formulas, the sum of the roots\
+ \ \\\\( r_1 + r_2 \\\\) is equal to \\\\(-k\\\\), and the product of the roots\
+ \ \\\\( r_1 \\\\cdot r_2 \\\\) is equal to 36.\n\n\n3- To find the distinct\
+ \ integer pairs \\\\((r_1, r_2)\\\\) whose product is 36, we need to determine\
+ \ all pairs of integers \\\\((r_1, r_2)\\\\) such that \\\\( r_1 \\\\cdot r_2\
+ \ = 36 \\\\) and \\\\( r_1 \\\\neq r_2 \\\\).\n\n\n4- From the equation \\\\\
+ ( r_1 + r_2 = -k \\\\), for each valid pair \\\\((r_1, r_2)\\\\), we can compute\
+ \ the corresponding value of \\\\( k \\\\).\n\n\n5- Finally, since we need the\
+ \ polynomial to have two distinct integer roots, we need to ensure that \\\\\
+ ( r_1 \\\\) and \\\\( r_2 \\\\) are distinct.\nLet's start by finding all pairs\
+ \ \\\\((r_1, r_2)\\\\) such that \\\\( r_1 \\\\cdot r_2 = 36 \\\\). We'll then\
+ \ determine the values of \\\\( k \\\\) and ensure the roots are distinct.\n\
+ ```python import itertools\n# Find all pairs (r1, r2) such that r1 * r2 = 36\
+ \ product_36 = 36 factor_pairs = []\nfor i in range(1, product_36 + 1):\n if\
+ \ product_36 % i == 0:\n pair = (i, product_36 // i)\n if pair[0] != pair[1]:\
+ \ # Ensure distinct pairs\n factor_pairs.append(pair)\n \n # Calculate\
+ \ k for each pair and ensure distinct integer roots\n valid_k_values = set()\n\
+ \ for r1, r2 in factor_pairs:\n if r1 != r2:\n k = -(r1 + r2)\n\
+ \ valid_k_values.add(k)\n \n print((len(valid_k_values), sorted(valid_k_values)))\n\
+ \ ```\n \n ```output\n (4, [-37, -20, -15,-13])\n ```\n The distinct integer\
+ \ values of \\\\( k \\\\) that make the\npolynomial \\\\( x^2 + kx + 36 \\\\\
+ ) have two distinct integer roots are \\\\(-37, -20, -15, \\\\text{and} -13\\\
+ \\).\nTherefore, the number of such values of \\\\( k \\\\) is:\n[ \\\\boxed{4}\
+ \ \\\\]"
  pipeline_tag: text-generation
  model-index:
  - name: NuminaMath-7B-TIR
  results: []
- license: apache-2.0
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -197,4 +157,17 @@ If you find NuminaMath 7B TIR is useful in your work, please cite it with:
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/AI-MO/NuminaMath-7B-TIR}}
  }
- ```
+ ```
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_AI-MO__NuminaMath-7B-TIR)
+
+ |      Metric       |Value|
+ |-------------------|----:|
+ |Avg.               |11.79|
+ |IFEval (0-Shot)    |27.56|
+ |BBH (3-Shot)       |16.87|
+ |MATH Lvl 5 (4-Shot)| 1.74|
+ |GPQA (0-shot)      | 1.12|
+ |MuSR (0-shot)      | 4.20|
+ |MMLU-PRO (5-shot)  |19.25|
+
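
The `Avg.` row in the results table appears to be the unweighted mean of the six benchmark scores. That is an assumption about how the leaderboard aggregates, not something this PR states. A minimal Python sketch that checks the arithmetic, with the values copied from the table (the `scores` dictionary is only an illustrative container):

```python
# Benchmark scores copied from the evaluation table in this PR.
scores = {
    "IFEval (0-Shot)": 27.56,
    "BBH (3-Shot)": 16.87,
    "MATH Lvl 5 (4-Shot)": 1.74,
    "GPQA (0-shot)": 1.12,
    "MuSR (0-shot)": 4.20,
    "MMLU-PRO (5-shot)": 19.25,
}

# Assumption: "Avg." is the plain arithmetic mean of the six scores.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")
```

Under that assumption the computed value agrees with the reported 11.79, so the table is internally consistent.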