Update README.md
README.md
CHANGED
@@ -34,6 +34,8 @@ RWKV-4-Pile-430M-20220808-8066.pth : Trained on the Pile for 333B tokens.
 * SC2016 acc 63.87%
 * Hellaswag acc_norm 40.90%
 
+## Warning: 4 / 4a / 4b models ARE NOT compatible!!! Use RWKV-4 unless you know what you are doing.
+
 With tiny attention (--tiny_att_dim 512 --tiny_att_layer 18):
 RWKV-4a-Pile-433M-20221223-8039.pth
 * Pile loss 2.2394