---
base_model: ibm-granite/granite-3.0-2b-instruct
license: apache-2.0
pipeline_tag: text-generation
tags:
- language
- granite-3.0
quantized_model: AliNemati
inference: false
model-index:
- name: granite-3.0-2b-instruct
  results:
  - task:
      type: text-generation
    dataset:
      name: IFEval
      type: instruction-following
    metrics:
    - type: pass@1
      value: 52.27
      name: pass@1
    - type: pass@1
      value: 8.22
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: AGI-Eval
      type: human-exams
    metrics:
    - type: pass@1
      value: 40.52
      name: pass@1
    - type: pass@1
      value: 65.82
      name: pass@1
    - type: pass@1
      value: 34.45
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: OBQA
      type: commonsense
    metrics:
    - type: pass@1
      value: 46.6
      name: pass@1
    - type: pass@1
      value: 71.21
      name: pass@1
    - type: pass@1
      value: 82.61
      name: pass@1
    - type: pass@1
      value: 77.51
      name: pass@1
    - type: pass@1
      value: 60.32
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: BoolQ
      type: reading-comprehension
    metrics:
    - type: pass@1
      value: 88.65
      name: pass@1
    - type: pass@1
      value: 21.58
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: ARC-C
      type: reasoning
    metrics:
    - type: pass@1
      value: 64.16
      name: pass@1
    - type: pass@1
      value: 33.81
      name: pass@1
    - type: pass@1
      value: 51.55
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: HumanEvalSynthesis
      type: code
    metrics:
    - type: pass@1
      value: 64.63
      name: pass@1
    - type: pass@1
      value: 57.16
      name: pass@1
    - type: pass@1
      value: 65.85
      name: pass@1
    - type: pass@1
      value: 49.6
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: GSM8K
      type: math
    metrics:
    - type: pass@1
      value: 68.99
      name: pass@1
    - type: pass@1
      value: 30.94
      name: pass@1
  - task:
      type: text-generation
    dataset:
      name: PAWS-X (7 langs)
      type: multilingual
    metrics:
    - type: pass@1
      value: 64.94
      name: pass@1
    - type: pass@1
      value: 48.2
      name: pass@1
---

**osllm.ai Models Highlights Program**

**We believe there is no need to pay per token when you have a GPU on your own computer.**

Highlighting new and noteworthy models from the community. Join the conversation on Discord.

**Model creator**: ibm-granite

**Original model**: granite-3.0-2b-instruct

[**README**](https://huggingface.co/ibm-granite/granite-3.0-2b-instruct/blob/main/README.md)

<p align="center">
  <a href="https://osllm.ai">Official Website</a> • <a href="https://docs.osllm.ai/index.html">Documentation</a> • <a href="https://discord.gg/2fftQauwDD">Discord</a>
</p>

<p align="center">
  <b>NEW:</b> <a href="https://docs.google.com/forms/d/1CQXJvxLUqLBSXnjqQmRpOyZqD6nrKubLz2WTcIJ37fU/prefill">Subscribe to our mailing list</a> for updates and news!
</p>

Email: support@osllm.ai

**Model Summary:**

Granite-3.0-2B-Instruct is a 2B parameter model finetuned from *Granite-3.0-2B-Base* using a combination of open-source instruction datasets with permissive licenses and internally collected synthetic datasets. The model is developed using a diverse set of techniques with a structured chat format, including supervised finetuning, model alignment using reinforcement learning, and model merging. A short usage sketch follows the model details below.

- **Developers:** Granite Team, IBM
- **GitHub Repository:** [ibm-granite/granite-3.0-language-models](https://github.com/ibm-granite/granite-3.0-language-models)
- **Website:** [Granite Docs](https://www.ibm.com/granite/docs/)
- **Paper:** [Granite 3.0 Language Models](https://github.com/ibm-granite/granite-3.0-language-models/blob/main/paper.pdf)
- **Release Date:** October 21st, 2024
- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
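
The sketch below shows one way to run the original (unquantized) instruct checkpoint with the Hugging Face `transformers` library. It is a minimal example rather than an official recipe; the prompt, dtype, and generation settings are illustrative assumptions.

```python
# Minimal sketch: load ibm-granite/granite-3.0-2b-instruct with transformers
# and generate a reply via its chat template. Prompt and settings are examples only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3.0-2b-instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",           # place layers on the available GPU(s) or CPU
    torch_dtype=torch.bfloat16,  # reduce memory use on recent GPUs
)

messages = [{"role": "user", "content": "List three uses of a small instruct model."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```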

**Supported Languages:**
English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may finetune Granite 3.0 models for languages beyond these 12 languages.

**Intended use:**
The model is designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications. A local-inference sketch follows the capability list below.

*Capabilities*
* Summarization
* Text classification
* Text extraction
* Question-answering
* Retrieval Augmented Generation (RAG)
* Code-related tasks
* Function-calling tasks
* Multilingual dialog use cases
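
Because this repository provides a quantized build, a common way to run it locally is through llama.cpp. The sketch below uses the `llama-cpp-python` bindings; the GGUF filename is a placeholder, so substitute whichever `.gguf` file this repo actually ships, and treat the context size and GPU-offload settings as assumptions.

```python
# Minimal sketch: run a local GGUF quantization with llama-cpp-python.
# The model filename below is a placeholder; use the .gguf file from this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="granite-3.0-2b-instruct.Q4_K_M.gguf",  # placeholder filename
    n_ctx=4096,       # context window; adjust to your memory budget
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what Granite 3.0 models are good at."}],
    max_tokens=200,
)
print(response["choices"][0]["message"]["content"])
```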

**About [osllm.ai](https://osllm.ai)**:

[osllm.ai](https://osllm.ai) is a community-driven platform that provides access to a wide range of open-source language models.

1. **[IndoxJudge](https://github.com/indoxJudge)**: A free, open-source tool for evaluating large language models (LLMs). It provides key metrics to assess performance, reliability, and risks like bias and toxicity, helping ensure model safety.

1. **[inDox](https://github.com/inDox)**: An open-source retrieval augmentation tool for extracting data from various document formats (text, PDFs, HTML, Markdown, LaTeX). It handles structured and unstructured data and supports both online and offline LLMs.

1. **[IndoxGen](https://github.com/IndoxGen)**: A framework for generating high-fidelity synthetic data using LLMs and human feedback, designed for enterprise use with high flexibility and precision.

1. **[Phoenix](https://github.com/Phoenix)**: A multi-platform, open-source chatbot that interacts with documents locally, without internet or GPU. It integrates inDox and IndoxJudge to improve accuracy and prevent hallucinations, ideal for sensitive fields like healthcare.

1. **[Phoenix_cli](https://github.com/Phoenix_cli)**: A multi-platform command-line tool that runs LLaMA models locally, supporting up to eight concurrent tasks through multithreading, eliminating the need for cloud-based services.

**Special thanks**

🙏 Special thanks to [**Georgi Gerganov**](https://github.com/ggerganov) and the whole team working on [**llama.cpp**](https://github.com/ggerganov/llama.cpp) for making all of this possible.

**Disclaimers**

[osllm.ai](https://osllm.ai) is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. osllm.ai does not endorse, support, represent, or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate, or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. osllm.ai may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. osllm.ai disclaims all warranties or guarantees about the accuracy, reliability, or benefits of the Community Models. osllm.ai further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted, or available at any time or location, or error-free, virus-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through [osllm.ai](https://osllm.ai).