correct link for featherless inference + one typo (#4)
opened by wxgeorge
README.md CHANGED
@@ -37,7 +37,7 @@ This approach demonstrates the architecture design and scalability of RWKV, rein

 One downside to this technique is that the model's inherent knowledge and dataset training are inherited from its "parent" model. Consequently, unlike previous RWKV models trained on over 100+ languages, the QRWKV model is limited to approximately 30 languages supported by the Qwen line of models.

-Due to the the lack of RWKV-based channel mix and feedforward layers,
+Due to the the lack of RWKV-based channel mix and feedforward layers, separate inference code is needed for this specific model.

 Furthermore, due to compute constraints, we were only able to train up to 16K token context length. While the model is stable beyond this limit, additional training might be required to support longer context lengths.

@@ -53,8 +53,8 @@ Lastly, we intend to provide details on the conversion along with our paper afte
 ## Links
 - [Our wiki](https://wiki.rwkv.com)
 - [TensorWave - The AMD Cloud](https://tensorwave.com) - Access MI300X today!
-- [Recursal.AI Cloud Platform](https://recursal.ai)
-- [Featherless Inference](https://featherless.ai/
+- [Recursal.AI Cloud Platform](https://platform.recursal.ai)
+- [Featherless Inference](https://featherless.ai/model-families/rwkv6/)

 ## Acknowledgement
 We are grateful for the help and support from the following key groups: