wubingheng committed • Commit bca8d24 • Parent(s): c18a51f
Update README.md

README.md CHANGED
@@ -12,7 +12,7 @@ pipeline_tag: text-generation
 library_name: transformers
 ---
 
-
+## **Base model: Doge 197M**
 
 Doge is an ongoing research project where we aim to train a series of small language models to further explore whether the Transformer framework allows for more complex feedforward network structures, enabling the model to have fewer cache states and larger knowledge capacity.
 
@@ -68,10 +68,11 @@ In addition, Doge uses Inner Function Attention with Dynamic Mask as sequence transformation
 ...     tokenizer=tokenizer,
 ...     generation_config=generation_config,
 ...     streamer=steamer
-... )
+... )
+```
 
 **Fine-tune Task**:
-We selected an open-source Chinese medical question answering dataset for fine-tuning.
+- We selected an open-source Chinese medical question answering dataset for fine-tuning.
 
 
 **Fine-tune Environment**:
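The `... )` lines in the hunk above are the tail of a standard `transformers` generation call rendered in Python-console style. As context for the change, here is a minimal, self-contained sketch of what the full snippet plausibly looks like; the checkpoint id, prompt, and generation settings are assumptions rather than details from the diff, and the variable is named `steamer` only to match the fragment's spelling.

```python
# Hedged sketch of the generation call the diff fragment belongs to.
# The model id, prompt, and sampling settings below are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig, TextStreamer

model_id = "SmallDoge/Doge-197M"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

generation_config = GenerationConfig(
    max_new_tokens=128,
    do_sample=True,
    temperature=0.8,
)
# Prints tokens to stdout as they are generated; named `steamer`
# only to mirror the README fragment.
steamer = TextStreamer(tokenizer, skip_prompt=True)

prompt = "What are the common symptoms of a cold?"
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(
    **inputs,
    tokenizer=tokenizer,
    generation_config=generation_config,
    streamer=steamer,
)
```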
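The diff names the fine-tuning task but not the dataset or the training setup. Below is a hedged sketch of how such a supervised fine-tune is commonly wired up with `datasets` and the `transformers` `Trainer`; the dataset id, column names, and hyperparameters are placeholders, not details taken from this commit.

```python
# Hedged sketch of supervised fine-tuning on a Chinese medical QA dataset.
# The dataset id, column schema, and hyperparameters are placeholders; the
# diff only states that an open-source Chinese medical QA dataset was used.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "SmallDoge/Doge-197M"            # hypothetical base checkpoint
dataset_id = "some-org/chinese-medical-qa"  # placeholder dataset id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # the collator needs a pad token

raw = load_dataset(dataset_id, split="train")

def format_and_tokenize(example):
    # Assumes "question"/"answer" columns; adjust to the real schema.
    text = f"问题：{example['question']}\n回答：{example['answer']}"
    return tokenizer(text, truncation=True, max_length=1024)

train_ds = raw.map(format_and_tokenize, remove_columns=raw.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="doge-197m-medical-sft",
        per_device_train_batch_size=8,
        num_train_epochs=2,
        learning_rate=3e-5,
        logging_steps=50,
    ),
    train_dataset=train_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```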