updates: eval results and github repo link
README.md
CHANGED
@@ -2,9 +2,6 @@
|
|
2 |
pipeline_tag: text-generation
|
3 |
inference: false
|
4 |
license: apache-2.0
|
5 |
-
# datasets:
|
6 |
-
# metrics:
|
7 |
-
# - code_eval
|
8 |
library_name: transformers
|
9 |
tags:
|
10 |
- language
|
@@ -152,6 +149,16 @@ model-index:
|
|
152 |
type: pass@1
|
153 |
value: 38.93
|
154 |
veriefied: false
|
155 |
- task:
|
156 |
type: text-generation
|
157 |
dataset:
|
@@ -180,7 +187,7 @@ model-index:
|
|
180 |
metrics:
|
181 |
- name: pass@1
|
182 |
type: pass@1
|
183 |
-
value:
|
184 |
veriefied: false
|
185 |
- task:
|
186 |
type: text-generation
|
@@ -191,35 +198,25 @@ model-index:
|
|
191 |
- name: pass@1
|
192 |
type: pass@1
|
193 |
value: 17.40
|
194 |
-
veriefied: false
|
195 |
-
- task:
|
196 |
-
type: text-generation
|
197 |
-
dataset:
|
198 |
-
type: multilingual
|
199 |
-
name: MGSM
|
200 |
-
metrics:
|
201 |
-
- name: pass@1
|
202 |
-
type: pass@1
|
203 |
-
value: 25.13
|
204 |
-
veriefied: false
|
205 |
---
|
206 |
|
207 |
-
<!-- ![image/png](
|
208 |
|
209 |
# Granite-3.0-3B-A800M-Base
|
210 |
|
211 |
## Model Summary
|
212 |
**Granite-3.0-3B-A800M-Base** is an open-source decoder-only language model from IBM Research that supports a variety of text-to-text generation tasks (e.g., question-answering, text-completion). **Granite-3.0-3B-A800M-Base** is trained from scratch and follows a two-phase training strategy. In the first phase, it is trained on 8 trillion tokens sourced from diverse domains. During the second phase, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.
|
213 |
|
214 |
-
|
215 |
- **Developers:** IBM Research
|
216 |
-
- **GitHub Repository:** [ibm-granite/granite-language-models](https://github.com/ibm-granite/granite-language-models)
|
217 |
-
- **
|
218 |
- **Release Date**: October 21st, 2024
|
219 |
-
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
220 |
|
221 |
## Supported Languages
|
222 |
-
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
|
223 |
|
224 |
## Usage
|
225 |
### Intended use
|
@@ -301,4 +298,4 @@ The use of Large Language Models involves risks and ethical considerations peopl
|
|
301 |
year = {2024},
|
302 |
url = {https://arxiv.org/abs/0000.00000},
|
303 |
}
|
304 |
-
```
|
|
|
2 |
pipeline_tag: text-generation
|
3 |
inference: false
|
4 |
license: apache-2.0
|
5 |
library_name: transformers
|
6 |
tags:
|
7 |
- language
|
|
|
149 |
type: pass@1
|
150 |
value: 38.93
|
151 |
veriefied: false
|
152 |
+
- task:
|
153 |
+
type: text-generation
|
154 |
+
dataset:
|
155 |
+
type: reasoning
|
156 |
+
name: MUSR
|
157 |
+
metrics:
|
158 |
+
- name: pass@1
|
159 |
+
type: pass@1
|
160 |
+
value: 35.05
|
161 |
+
veriefied: false
|
162 |
- task:
|
163 |
type: text-generation
|
164 |
dataset:
|
|
|
187 |
metrics:
|
188 |
- name: pass@1
|
189 |
type: pass@1
|
190 |
+
value: 35.86
|
191 |
veriefied: false
|
192 |
- task:
|
193 |
type: text-generation
|
|
|
198 |
- name: pass@1
|
199 |
type: pass@1
|
200 |
value: 17.40
|
201 |
+
veriefied: false
|
202 |
---
|
203 |
|
204 |
+
<!-- ![image/png](granite-3_0-language-models_Group_1.png) -->
|
205 |
|
206 |
# Granite-3.0-3B-A800M-Base
|
207 |
|
208 |
## Model Summary
|
209 |
**Granite-3.0-3B-A800M-Base** is an open-source decoder-only language model from IBM Research that supports a variety of text-to-text generation tasks (e.g., question-answering, text-completion). **Granite-3.0-3B-A800M-Base** is trained from scratch and follows a two-phase training strategy. In the first phase, it is trained on 8 trillion tokens sourced from diverse domains. During the second phase, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.
|
210 |
|
|
|
211 |
- **Developers:** IBM Research
|
212 |
+
- **GitHub Repository:** [ibm-granite/granite-3.0-language-models](https://github.com/ibm-granite/granite-3.0-language-models)
|
213 |
+
- **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
|
214 |
+
- **Paper:** [Granite 3.0 Language Models]()
|
215 |
- **Release Date**: October 21st, 2024
|
216 |
+
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
|
217 |
|
218 |
## Supported Languages
|
219 |
+
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
|
220 |
|
221 |
## Usage
|
222 |
### Intended use
|
|
|
298 |
year = {2024},
|
299 |
url = {https://arxiv.org/abs/0000.00000},
|
300 |
}
|
301 |
+
```
|