Skylion007
committed on
Commit
•
0ff90b2
1
Parent(s):
25ce5c5
Update README.md
README.md
CHANGED
@@ -1,12 +1,14 @@
|
|
1 |
---
|
2 |
library_name: transformers
|
3 |
-
|
4 |
---
|
5 |
|
6 |
# Model Card for Model ID
|
7 |
|
8 |
<!-- Provide a quick summary of what the model is/does. -->
|
9 |
-
|
10 |
|
11 |
|
12 |
## Model Details
|
@@ -20,22 +22,22 @@ This is the model card of a 🤗 transformers model that has been pushed on the
|
|
20 |
- **Developed by:** [More Information Needed]
|
21 |
- **Funded by [optional]:** [More Information Needed]
|
22 |
- **Shared by [optional]:** [More Information Needed]
|
23 |
-
- **Model type:**
|
24 |
-
- **Language(s) (NLP):**
|
25 |
-
- **License:**
|
26 |
-
- **Finetuned from model [optional]:** [More Information Needed]
|
27 |
|
28 |
-
### Model Sources
|
29 |
|
30 |
<!-- Provide the basic links for the model. -->
|
31 |
|
32 |
-
- **Repository:**
|
33 |
-
- **Paper [optional]:**
|
34 |
- **Demo [optional]:** [More Information Needed]
|
35 |
|
36 |
## Uses
|
37 |
|
38 |
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
|
39 |
|
40 |
### Direct Use
|
41 |
|
@@ -79,7 +81,7 @@ Use the code below to get started with the model.
|
|
79 |
|
80 |
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
|
81 |
|
82 |
-
|
83 |
|
84 |
### Training Procedure
|
85 |
|
@@ -174,11 +176,29 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
|
|
174 |
|
175 |
**BibTeX:**
|
176 |
|
177 |
-
|
178 |
|
179 |
**APA:**
|
180 |
|
181 |
-
|
182 |
|
183 |
## Glossary [optional]
|
184 |
|
|
|
1 |
---
|
2 |
library_name: transformers
|
3 |
+
license: apache-2.0
|
4 |
+
language:
|
5 |
+
- en
|
6 |
---
|
7 |
|
8 |
# Model Card for Model ID
|
9 |
|
10 |
<!-- Provide a quick summary of what the model is/does. -->
|
11 |
+
This is a masked diffusion model that generates text using a diffusion process trained on the OpenWebText dataset.
|
12 |
|
13 |
|
14 |
## Model Details
|
|
|
22 |
- **Developed by:** [More Information Needed]
|
23 |
- **Funded by [optional]:** [More Information Needed]
|
24 |
- **Shared by [optional]:** [More Information Needed]
|
25 |
+
- **Model type:** Masked Language Model
|
26 |
+
- **Language(s) (NLP):** en
|
27 |
+
- **License:** Apache 2.0
|
|
|
28 |
|
29 |
+
### Model Sources
|
30 |
|
31 |
<!-- Provide the basic links for the model. -->
|
32 |
|
33 |
+
- **Repository:** https://github.com/kuleshov-group/mdlm
|
34 |
+
- **Paper [optional]:** https://arxiv.org/abs/2406.07524
|
35 |
- **Demo [optional]:** [More Information Needed]
|
36 |
|
37 |
## Uses
|
38 |
|
39 |
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
|
40 |
+
* Research
|
41 |
|
42 |
### Direct Use
|
43 |
|
|
|
81 |
|
82 |
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
|
83 |
|
84 |
+
https://huggingface.co/datasets/Skylion007/openwebtext
|
85 |
|
86 |
### Training Procedure
|
87 |
|
|
|
176 |
|
177 |
**BibTeX:**
|
178 |
|
179 |
+
```
|
180 |
+
@misc{sahoo2024simple,
|
181 |
+
title={Simple and Effective Masked Diffusion Language Models},
|
182 |
+
author={Subham Sekhar Sahoo and Marianne Arriola and Yair Schiff and Aaron Gokaslan and Edgar Marroquin and Justin T Chiu and Alexander Rush and Volodymyr Kuleshov},
|
183 |
+
year={2024},
|
184 |
+
eprint={2406.07524},
|
185 |
+
archivePrefix={arXiv},
|
186 |
+
primaryClass={cs.CL}
|
187 |
+
}
|
188 |
+
```
|
189 |
|
190 |
**APA:**
|
191 |
|
192 |
+
```
|
193 |
+
@software{Sahoo_Simple_and_Effective_2024,
|
194 |
+
author = {Sahoo, Subham Sekhar and Arriola, Marianne and Schiff, Yair and Gokaslan, Aaron and Marroquin, Edgar and Chiu, Justin T and Rush, Alexander and Kuleshov, Volodymyr},
|
195 |
+
doi = {10.48550/arXiv.2406.07524},
|
196 |
+
month = jun,
|
197 |
+
title = {{Simple and Effective Masked Diffusion Language Models}},
|
198 |
+
version = {arXiv:2406.07524v1},
|
199 |
+
year = {2024}
|
200 |
+
}
|
201 |
+
```
|
202 |
|
203 |
## Glossary [optional]
|
204 |
|