Muennighoff
commited on
Commit
•
028d295
1
Parent(s):
99b17ec
Update README.md
Browse files
README.md
CHANGED
@@ -1,7 +1,7 @@
|
|
1 |
---
|
2 |
-
license: bigscience-bloom-rail-1.0
|
3 |
datasets:
|
4 |
- bigscience/xP3
|
|
|
5 |
language:
|
6 |
- ak
|
7 |
- ar
|
@@ -83,32 +83,57 @@ widget:
|
|
83 |
example_title: "hi-en fable"
|
84 |
---
|
85 |
|
86 |
-
|
87 |
-
|
88 |
-
#
|
89 |
-
|
90 |
-
|
91 |
-
|
92 |
-
|
93 |
-
|
94 |
-
|
95 |
-
|
96 |
-
|
97 |
-
|
98 |
-
|
99 |
-
|
100 |
-
|
101 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
102 |
|
103 |
-
- [bloomz-mt](https://huggingface.co/bigscience/bloomz-mt): 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [xP3](https://huggingface.co/bigscience/xP3) & [xP3mt](https://huggingface.co/bigscience/xP3). **Better than [bloomz](https://huggingface.co/bigscience/bloomz) when prompting in non-english**
|
104 |
-
- [bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt): 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [xP3](https://huggingface.co/bigscience/xP3) & [xP3mt](https://huggingface.co/bigscience/xP3). **Better than [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) when prompting in non-english**
|
105 |
|
106 |
-
----
|
107 |
|
108 |
-
- [bloomz-p3](https://huggingface.co/bigscience/bloomz): 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [P3](https://huggingface.co/bigscience/P3). **Released for research purposes, performance is inferior to [bloomz](https://huggingface.co/bigscience/bloomz)**
|
109 |
-
- [bloomz-7b1-p3](https://huggingface.co/bigscience/bloomz-7b1): 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [P3](https://huggingface.co/bigscience/P3). **Released for research purposes, performance is inferior to [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)**
|
110 |
|
111 |
-
----
|
112 |
|
113 |
# Intended uses
|
114 |
|
|
|
1 |
---
|
|
|
2 |
datasets:
|
3 |
- bigscience/xP3
|
4 |
+
license: bigscience-bloom-rail-1.0
|
5 |
language:
|
6 |
- ak
|
7 |
- ar
|
|
|
83 |
example_title: "hi-en fable"
|
84 |
---
|
85 |
|
86 |
+
# Table of Contents
|
87 |
+
|
88 |
+
1. [Model Summary](#model=summary)
|
89 |
+
2. [Use](#use)
|
90 |
+
3. [Bias, Risks, and Limitations](#bias-risks-and-limitations)
|
91 |
+
4. [Training Details](#training-details)
|
92 |
+
5. [Evaluation](#evaluation)
|
93 |
+
6. [Environmental Impact](#environmental-impact)
|
94 |
+
7. [Citation](#citation)
|
95 |
+
8. [Model Card Authors](#model-card-authors)
|
96 |
+
9. [How To Get Started With the Model](#how-to-get-started-with-the-model)
|
97 |
+
|
98 |
+
# Model Summary
|
99 |
+
|
100 |
+
> We present BLOOMZ & mT0, a family of models capable of following human instructions in hundreds of languages. By finetuning large BLOOM & mT5 pretrained multilingual language models on our multilingual task mixture (xP3), we discover various generalization properties of our finetuned models acrosss tasks and languages.
|
101 |
+
|
102 |
+
- **Repository:** [bigscience-workshop/xmtf](https://github.com/bigscience-workshop/xmtf)
|
103 |
+
- **Paper:** [TODO]
|
104 |
+
- **Funded by:** The French government & Hugging Face
|
105 |
+
- **Point of Contact:** [Niklas Muennighoff](mailto:niklas@hf.co)
|
106 |
+
- **BLOOMZ & mT0 Model Family:**
|
107 |
+
|Name|Explanation|
|
108 |
+
|----|-----------|
|
109 |
+
|[bloomz-560m](https://huggingface.co/bigscience/bloomz-560m)| 560M parameter multitask finetuned version of [bloom-560m](https://huggingface.co/bigscience/bloom-560m) on [xP3](https://huggingface.co/bigscience/xP3)|
|
110 |
+
|[bloomz-1b1](https://huggingface.co/bigscience/bloomz-1b1)| 1.1B parameter multitask finetuned version of [bloom-1b1](https://huggingface.co/bigscience/bloom-1b1) on [xP3](https://huggingface.co/bigscience/xP3)|
|
111 |
+
|[bloomz-1b7](https://huggingface.co/bigscience/bloomz-1b7)| 1.7B parameter multitask finetuned version of [bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) on [xP3](https://huggingface.co/bigscience/xP3)|
|
112 |
+
|[bloomz-3b](https://huggingface.co/bigscience/bloomz-3b)| 3B parameter multitask finetuned version of [bloom-3b](https://huggingface.co/bigscience/bloom-3b) on [xP3](https://huggingface.co/bigscience/xP3)|
|
113 |
+
|[bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)|7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [xP3](https://huggingface.co/bigscience/xP3)|
|
114 |
+
|[bloomz](https://huggingface.co/bigscience/bloomz)|176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [xP3](https://huggingface.co/bigscience/xP3)|
|
115 |
+
|||
|
116 |
+
|[bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt)|7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [xP3](https://huggingface.co/bigscience/xP3) & [xP3mt](https://huggingface.co/bigscience/xP3mt). **Better than [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) when prompting in non-English**|
|
117 |
+
|[bloomz-mt](https://huggingface.co/bigscience/bloomz-mt)| 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [xP3](https://huggingface.co/bigscience/xP3) & [xP3mt](https://huggingface.co/bigscience/xP3mt). **Better than [bloomz](https://huggingface.co/bigscience/bloomz) when prompting in non-English**|
|
118 |
+
|||
|
119 |
+
|[bloomz-7b1-p3](https://huggingface.co/bigscience/bloomz-7b1)| 7.1B parameter multitask finetuned version of [bloom-7b1](https://huggingface.co/bigscience/bloom-7b1) on [P3](https://huggingface.co/bigscience/P3). **Released for research purposes, performance is inferior to [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1)**|
|
120 |
+
|[bloomz-p3](https://huggingface.co/bigscience/bloomz)| 176B parameter multitask finetuned version of [bloom](https://huggingface.co/bigscience/bloom) on [P3](https://huggingface.co/bigscience/P3). **Released for research purposes, performance is inferior to [bloomz](https://huggingface.co/bigscience/bloomz)**|
|
121 |
+
|||
|
122 |
+
|||
|
123 |
+
|[mt0-small](https://huggingface.co/bigscience/mt0-xxl)|300M parameter multitask finetuned version of [mt5-small](https://huggingface.co/google/mt5-small) on [xP3](https://huggingface.co/bigscience/xP3)|
|
124 |
+
|[mt0-base](https://huggingface.co/bigscience/mt0-xxl)|580M parameter multitask finetuned version of [mt5-base](https://huggingface.co/google/mt5-base) on [xP3](https://huggingface.co/bigscience/xP3)|
|
125 |
+
|[mt0-large](https://huggingface.co/bigscience/mt0-xxl)|1.2B parameter multitask finetuned version of [mt5-large](https://huggingface.co/google/mt5-large) on [xP3](https://huggingface.co/bigscience/xP3)|
|
126 |
+
|[mt0-xl](https://huggingface.co/bigscience/mt0-xxl)|3.7B parameter multitask finetuned version of [mt5-xl](https://huggingface.co/google/mt5-xl) on [xP3](https://huggingface.co/bigscience/xP3)|
|
127 |
+
|[mt0-xxl](https://huggingface.co/bigscience/mt0-xxl)|13B parameter multitask finetuned version of [mt5-xxl](https://huggingface.co/google/mt5-xxl) on [xP3](https://huggingface.co/bigscience/xP3)|
|
128 |
+
|||
|
129 |
+
|[mt0-xxl-mt](https://huggingface.co/bigscience/mt0-xxl-mt)|13B parameter multitask finetuned version of [mt5-xxl](https://huggingface.co/google/mt5-xxl) on [xP3](https://huggingface.co/bigscience/xP3) & [xP3mt](https://huggingface.co/bigscience/xP3mt). **Better than [mt0-xxl](https://huggingface.co/bigscience/mt0-xxl) when prompting in non-English**|
|
130 |
+
|||
|
131 |
+
|[mt0-xxl-p3](https://huggingface.co/bigscience/mt0-xxl-p3)| 13B parameter multitask finetuned version of [mt5-xxl](https://huggingface.co/google/mt5-xxl) on [P3](https://huggingface.co/bigscience/P3). **Released for research purposes, performance is inferior to [mt0-xxl](https://huggingface.co/bigscience/mt0-xxl)**|
|
132 |
+
|----|-----------|
|
133 |
|
|
|
|
|
134 |
|
|
|
135 |
|
|
|
|
|
136 |
|
|
|
137 |
|
138 |
# Intended uses
|
139 |
|