Update README.md
README.md
CHANGED
@@ -32,7 +32,7 @@ The pre-training dataset consists of documents from different domains:
 | Legal | OpenLegalData: German cases and laws | 5.4GB | 308,228 | 1B |
 | Medical | Smaller public datasets | 253MB | 179,776 | 50M |
 | Medical | CC medical texts | 3.6GB | 2,000,000 | 682M |
-| Medical |
+| Medical | Medicine Dissertations | 1.4GB | 14,496 | 295M |
 | Medical | Pubmed abstracts (translated) | 8.5GB | 21,044,382 | 1.7B |
 | Medical | MIMIC III (translated) | 2.6GB | 24,221,834 | 695M |
 | Medical | PMC-Patients-ReCDS (translated) | 2.1GB | 1,743,344 | 414M |