amindada committed on
Commit
8c118cf
1 Parent(s): 1c1ea71

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -32,12 +32,12 @@ The pre-training dataset consists of documents from different domains:
  | Legal | OpenLegalData: German cases and laws | 5.4GB | 308,228 | 1B |
  | Medical | Smaller public datasets | 253MB | 179,776 | 50M |
  | Medical | CC medical texts | 3.6GB | 2,000,000 | 682M |
- | Medical | Medical Dissertations | 1.4GB | 14,496 | 295M |
- | Medical | Pubmed abstracts | 8.5GB | 21,044,382 | 1.7B |
- | Medical | MIMIC III | 2.6GB | 24,221,834 | 695M |
- | Medical | PMC-Patients-ReCDS | 2.1GB | 1,743,344 | 414M |
+ | Medical | Medicine Dissertations | 1.4GB | 14,496 | 295M |
+ | Medical | Pubmed abstracts (translated) | 8.5GB | 21,044,382 | 1.7B |
+ | Medical | MIMIC III (translated) | 2.6GB | 24,221,834 | 695M |
+ | Medical | PMC-Patients-ReCDS (translated) | 2.1GB | 1,743,344 | 414M |
  | Literature | German Fiction | 1.1GB | 3,219 | 243M |
- | Literature | English books | 7.1GB | 11,038 | 1.6B |
+ | Literature | English books (translated) | 7.1GB | 11,038 | 1.6B |
  | - | Total | 167GB | 116,079,769 | 35.8B |
 
 