JosephusCheung committed on
Commit
08818c1
1 Parent(s): 1357db5

Update index.html

Files changed (1)
  1. index.html +186 -19
index.html CHANGED
@@ -1,19 +1,186 @@
- <!doctype html>
- <html>
-   <head>
-     <meta charset="utf-8" />
-     <meta name="viewport" content="width=device-width" />
-     <title>My static Space</title>
-     <link rel="stylesheet" href="style.css" />
-   </head>
-   <body>
-     <div class="card">
-       <h1>Welcome to your static Space!</h1>
-       <p>You can modify this app directly by editing <i>index.html</i> in the Files and versions tab.</p>
-       <p>
-         Also don't forget to check the
-         <a href="https://huggingface.co/docs/hub/spaces" target="_blank">Spaces documentation</a>.
-       </p>
-     </div>
-   </body>
- </html>
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <style>
+         body {
+             background-color: #111;
+             font-family: Arial, sans-serif;
+             color: #fff;
+             display: flex;
+             justify-content: center;
+             align-items: center;
+             flex-direction: column;
+             height: 100vh;
+             margin: 0;
+         }
+
+         h1 {
+             font-size: 36px;
+             margin-bottom: 10px;
+         }
+
+         h2 {
+             font-size: 18px;
+             font-weight: normal;
+             margin-bottom: 30px;
+             color: #ccc;
+         }
+
+         table {
+             width: 90%;
+             border-collapse: separate;
+             border-spacing: 0;
+             background-color: #1b1b1b;
+             border-radius: 12px;
+             overflow: hidden;
+             margin: 20px 0;
+             table-layout: fixed;
+         }
+
+         th, td {
+             text-align: center;
+             padding: 12px;
+             border: 1px solid #333;
+             vertical-align: middle;
+         }
+
+         th {
+             background-color: #222;
+             font-weight: bold;
+             font-size: 14px;
+         }
+
+         td {
+             background-color: #1b1b1b;
+             font-size: 14px;
+             word-wrap: break-word;
+         }
+
+         .highlight-column {
+             border-left: 3px solid #0066ff;
+             border-right: 3px solid #0066ff;
+         }
+
+         .highlight-header {
+             border-top: 3px solid #0066ff;
+             border-top-left-radius: 12px;
+             border-top-right-radius: 12px;
+         }
+
+         .highlight-footer {
+             border-bottom: 3px solid #0066ff;
+             border-bottom-left-radius: 12px;
+             border-bottom-right-radius: 12px;
+         }
+
+         .bold {
+             font-weight: 900; /* Extra bold */
+         }
+
+         tr:first-child th:first-child {
+             border-top-left-radius: 12px;
+         }
+
+         tr:first-child th:last-child {
+             border-top-right-radius: 12px;
+         }
+
+         tr:last-child td:first-child {
+             border-bottom-left-radius: 12px;
+         }
+
+         tr:last-child td:last-child {
+             border-bottom-right-radius: 12px;
+         }
+
+         .footnote {
+             font-size: 12px;
+             color: #888;
+             text-align: left;
+             max-width: 90%;
+             margin-top: 20px;
+         }
+
+     </style>
+ </head>
+ <body>
+
+ <h1>Tian Ji's Horse Race (田忌赛马)</h1>
+ <h2>Goodhart's Law on Benchmarks</h2>
+
+ <table>
+     <tr>
+         <th>Capability</th>
+         <th>Description</th>
+         <th class="highlight-column highlight-header">miniG</th>
+         <th>Gemini-Flash</th>
+         <th>GLM-4-9B-Chat</th>
+         <th>Llama 3.1 8B Instruct</th>
+     </tr>
+     <tr>
+         <td class="bold">MMLU</td>
+         <td>Representation of questions in 57 subjects<br>(incl. STEM, humanities, and others)</td>
+         <td class="highlight-column bold">85.45</td>
+         <td>78.9</td>
+         <td>72.4</td>
+         <td>69.4</td>
+     </tr>
+     <tr>
+         <td class="bold">IFEval</td>
+         <td>Evaluation of instruction-following<br>using verifiable prompts</td>
+         <td class="highlight-column">74.22</td>
+         <td>-</td>
+         <td>69</td>
+         <td class="bold">80.4</td>
+     </tr>
+     <tr>
+         <td class="bold">GSM8K</td>
+         <td>Challenging math problems<br>(5-shot evaluation)</td>
+         <td class="highlight-column">75.89 (5-shot)</td>
+         <td class="bold">86.2 (11-shot)</td>
+         <td>79.6</td>
+         <td>84.5 (8-shot CoT)</td>
+     </tr>
+     <tr>
+         <td class="bold">HumanEval</td>
+         <td>Python code generation on a held-out dataset<br>(0-shot)</td>
+         <td class="highlight-column bold">79.88</td>
+         <td>74.3</td>
+         <td>71.8</td>
+         <td>72.6</td>
+     </tr>
+     <tr>
+         <td class="bold">GPQA</td>
+         <td>Challenging dataset of questions<br>from biology, physics, and chemistry</td>
+         <td class="highlight-column">37.37</td>
+         <td class="bold">39.5</td>
+         <td>34.3 (base)</td>
+         <td>34.2</td>
+     </tr>
+     <tr>
+         <td class="bold">Context Window</td>
+         <td>Maximum context length<br>the model can handle</td>
+         <td class="highlight-column bold">1M</td>
+         <td class="bold">1M</td>
+         <td>128K</td>
+         <td>128K</td>
+     </tr>
+     <tr>
+         <td class="bold">Input</td>
+         <td>Supported input modalities</td>
+         <td class="highlight-column highlight-footer">Text, image<br>(single model)</td>
+         <td>Text, image, audio, video</td>
+         <td>Text only</td>
+         <td>Text only</td>
+     </tr>
+ </table>
+
+ <div class="footnote">
+     1. miniG is a 14B-parameter model derived from the weights of the 9B-parameter glm-4-9b-chat-1m model. It was further pre-trained on a selected corpus of 20B tokens while retaining its long-context capabilities, then fine-tuned on a dataset of more than 120M conversation entries synthesized from that corpus through cross-page clustering, similar to RAG. In addition, miniG underwent two-stage multimodal training for single-image input, with the second stage reinitializing 5B parameters of a Vision Transformer from glm-4v-9b for Locked-Image Tuning.<br>
+     2. miniG's outputs are formatted similarly to those of Gemini 1.5 Flash, but the model was not trained on data generated by the Gemini models.
+ </div>
+
+ </body>
+ </html>
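
As a reading aid: the bolded score cells in the table mark the highest reported number in each benchmark row. The Python sketch below reproduces that marking from scores transcribed by hand from the table. The helper names `best_per_benchmark` and `win_counts` are hypothetical and are not part of the commit. The rows mix evaluation settings (GSM8K is reported at 5-shot, 11-shot, and 8-shot CoT, and the GLM-4-9B GPQA figure is for the base model), so the raw maxima are not like-for-like comparisons.

```python
from collections import Counter

# Scores transcribed from the table in index.html.
# None marks an entry the table reports as "-".
SCORES = {
    "MMLU": {
        "miniG": 85.45, "Gemini-Flash": 78.9,
        "GLM-4-9B-Chat": 72.4, "Llama 3.1 8B Instruct": 69.4,
    },
    "IFEval": {
        "miniG": 74.22, "Gemini-Flash": None,
        "GLM-4-9B-Chat": 69.0, "Llama 3.1 8B Instruct": 80.4,
    },
    "GSM8K": {
        "miniG": 75.89, "Gemini-Flash": 86.2,
        "GLM-4-9B-Chat": 79.6, "Llama 3.1 8B Instruct": 84.5,
    },
    "HumanEval": {
        "miniG": 79.88, "Gemini-Flash": 74.3,
        "GLM-4-9B-Chat": 71.8, "Llama 3.1 8B Instruct": 72.6,
    },
    "GPQA": {
        "miniG": 37.37, "Gemini-Flash": 39.5,
        "GLM-4-9B-Chat": 34.3, "Llama 3.1 8B Instruct": 34.2,
    },
}


def best_per_benchmark(scores):
    """Map each benchmark to the model with the highest reported score,
    ignoring models with no reported score."""
    best = {}
    for bench, row in scores.items():
        reported = {model: s for model, s in row.items() if s is not None}
        best[bench] = max(reported, key=reported.get)
    return best


def win_counts(scores):
    """Count how many benchmark rows each model tops."""
    return Counter(best_per_benchmark(scores).values())


if __name__ == "__main__":
    for bench, model in best_per_benchmark(SCORES).items():
        print(f"{bench}: {model}")
    print(dict(win_counts(SCORES)))
```

The per-row winners computed this way match the bolded score cells in the table. Whether they support any ranking claim depends on the evaluation settings noted above.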