---
language:
- en
- zh
- fr
- es
- de
- pt
- ru
- it
- ja
- ko
- vi
- ar
tags:
- pytorch
- text-generation
- causal-lm
- rwkv
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
- mlfoundations/dclm-baseline-1.0
- cerebras/SlimPajama-627B
- EleutherAI/pile
- bigcode/starcoderdata
- oscar-corpus/OSCAR-2301
---
# RWKV-7 World
Use rwkv pip package 0.8.28+ for RWKV-7 inference: https://pypi.org/project/rwkv/
https://www.rwkv.com/
For developers: https://github.com/BlinkDL/RWKV-LM/tree/main/RWKV-v7
Chat demo: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py
## Model Description
RWKV-7 trained on 100+ world languages (80% English, 10% multilingual, 10% code).
World-v3 = 3.1T tokens
World-v2.9 = subsampled 2T tokens
World-v2.8 = subsampled 1T tokens
Recommended fine-tuning format (use \n for newlines):
```
User: xxxxxxxxxxxxxxx
Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
User: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
Assistant: xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
xxxxxxxxxxxxxxx
```
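As a minimal sketch of producing text in the format above, the helper below joins `(role, text)` turns into one string. The function name `format_conversation` and the `sep` parameter are hypothetical, not part of the rwkv package. The separator between turns defaults to a single newline, which is an assumption based on how the template is displayed here; pass a different `sep` if your data uses another one.

```python
import re


def format_conversation(turns, sep="\n"):
    """Join (role, text) pairs into fine-tuning text.

    `sep` is the separator placed between turns (assumed, see lead-in).
    Blank lines inside a turn are collapsed to a single "\n".
    """
    lines = []
    for role, text in turns:
        if role not in ("User", "Assistant"):
            raise ValueError(f"unexpected role: {role!r}")
        # Normalize Windows newlines, trim the ends, collapse blank lines.
        body = text.replace("\r\n", "\n").strip()
        body = re.sub(r"\n{2,}", "\n", body)
        lines.append(f"{role}: {body}")
    return sep.join(lines)


if __name__ == "__main__":
    sample = [
        ("User", "What is RWKV?"),
        ("Assistant", "An RNN-style language model.\n\nIt runs in linear time."),
    ]
    print(format_conversation(sample))
```
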
A good chat prompt (replace any \n\n in xxx with \n, so that the response never contains an extra \n\n):
```
User: hi
Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
User: xxx
Assistant:
```
QA prompt (replace any \n\n in xxx with \n, so that the response never contains an extra \n\n):
```
Question: xxx
Answer:
```
and
```
Instruction: xxx
Input: xxx
Response:
```
!!! There should not be any space after your final ":" or you will upset the tokenizer and see non-English response !!!
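The prompt rules above (collapse \n\n in the user text, and leave nothing after the final ":") can be sketched in plain Python. All function names here are hypothetical helpers, not part of the rwkv package, and the `sep` parameter is an assumption that defaults to the single newline shown in the templates.

```python
import re

# Opening assistant line, copied from the chat prompt template.
GREETING = (
    "Assistant: Hi. I am your assistant and I will provide expert full "
    "response in full details. Please feel free to ask any question and "
    "I will always answer it."
)


def clean(text):
    """Trim the text and collapse any run of blank lines to one newline."""
    text = text.replace("\r\n", "\n").strip()
    return re.sub(r"\n{2,}", "\n", text)


def chat_prompt(message, sep="\n"):
    """Chat template; ends exactly at 'Assistant:' with no trailing space."""
    return sep.join(["User: hi", GREETING, f"User: {clean(message)}", "Assistant:"])


def qa_prompt(question, sep="\n"):
    """Question/Answer template; ends exactly at 'Answer:'."""
    return sep.join([f"Question: {clean(question)}", "Answer:"])


def instruction_prompt(instruction, input_text, sep="\n"):
    """Instruction/Input/Response template; ends exactly at 'Response:'."""
    return sep.join(
        [f"Instruction: {clean(instruction)}", f"Input: {clean(input_text)}", "Response:"]
    )


if __name__ == "__main__":
    prompt = chat_prompt("Tell me about RWKV.\n\nKeep it short.")
    assert prompt.endswith(":")  # no space after the final colon
    print(prompt)
```

Each returned string can be passed to the model as the prompt as-is; appending a space or newline afterwards would defeat the rule above.
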