Last updated: 2023-06-07
This is BlinkDL/rwkv-4-pileplus converted to GGML for use with rwkv.cpp and KoboldCpp. rwkv.cpp's conversion instructions were followed.
RAM USAGE (KoboldCpp)
Model | RAM usage (with OpenBLAS) |
---|---|
Unloaded | 41.3 MiB |
169M q4_0 | 232.2 MiB |
169M q5_0 | 243.3 MiB |
169M q5_1 | 249.2 MiB |
430M q4_0 | 413.2 MiB |
430M q5_0 | 454.4 MiB |
430M q5_1 | 471.8 MiB |
1.5B q4_0 | 1.1 GiB |
1.5B q5_0 | 1.3 GiB |
1.5B q5_1 | 1.3 GiB |
3B q4_0 | 2.0 GiB |
3B q5_0 | 2.3 GiB |
3B q5_1 | 2.4 GiB |
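As a rough aid for choosing a file, the table above can be turned into a small lookup. This is only a sketch: the function name `models_that_fit` and the `headroom_mib` parameter are hypothetical, the GiB rows are converted at 1 GiB = 1024 MiB, and the figures are the KoboldCpp with OpenBLAS measurements listed here, so usage on other setups may differ.

```python
# RAM usage in MiB, copied from the table above.
# Rows given in GiB are converted at 1 GiB = 1024 MiB (rounded source values).
RAM_USAGE_MIB = {
    "169M q4_0": 232.2,
    "169M q5_0": 243.3,
    "169M q5_1": 249.2,
    "430M q4_0": 413.2,
    "430M q5_0": 454.4,
    "430M q5_1": 471.8,
    "1.5B q4_0": 1.1 * 1024,
    "1.5B q5_0": 1.3 * 1024,
    "1.5B q5_1": 1.3 * 1024,
    "3B q4_0": 2.0 * 1024,
    "3B q5_0": 2.3 * 1024,
    "3B q5_1": 2.4 * 1024,
}


def models_that_fit(budget_mib, headroom_mib=0.0):
    """Return the models whose listed RAM usage fits the budget, in table order.

    headroom_mib is a hypothetical safety margin subtracted from the budget.
    """
    limit = budget_mib - headroom_mib
    return [name for name, usage in RAM_USAGE_MIB.items() if usage <= limit]


if __name__ == "__main__":
    # Example: list every model that fits in 1 GiB with 100 MiB kept free.
    for name in models_that_fit(1024, headroom_mib=100):
        print(name)
```

Because the dictionary keeps the table's order (smallest to largest), the last entry of the returned list is the largest model that fits.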
Original model card by BlinkDL is below.
RWKV-4 PilePlus
Model Description
RWKV-4-pile models finetuned on [RedPajama + some of Pile v2 = 1.7T tokens]. Updated with 2020+2021+2022 data, and better at all European languages.
Although some of these are intermediate checkpoints (XXXGtokens means finetuned for XXXG tokens), you can already use them, because I am finetuning from the Pile models instead of retraining.
Note: these models are not instruction-tuned yet. They are recommended as replacements for the vanilla Pile models.
7B and 14B coming soon.
See https://github.com/BlinkDL/RWKV-LM for details.
Use https://github.com/BlinkDL/ChatRWKV to run it.