RWKV-x070-2B9-CJE-Instruct Model Card
Model Overview
- Model Name: RWKV-x070-2B9-CJE-Instruct
- Description: An instruction-tuned model specialized for Japanese, Chinese, and English
- Base Model: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth
- Architecture: RWKV x070 "Goose"
- Parameters: 2.9B
- Model Dimension: 2560
- Number of Layers: 32
Fine-tuning Details
Training Configuration
- Trainer: RWKV-LM-RLHF (https://github.com/OpenMOSE/RWKV-LM-RLHF)
- PEFT Mode: Hybrid training combining frozen embeddings, Bone (Block Affine Transformation) adapters, and full-parameter training
- SFT Method: SmoothingLoss SFT
- Context Window: 5120 (trained with a 1024-token overlap)
- Compute: 2x AMD Instinct MI100, 60 hours (100% solar energy)
Dataset Specifications
- Size: 800k pairs
- Content:
  - Mixed data in Japanese, Chinese, and English
  - Conversations
  - Programming code
  - Translation tasks
  - Chain-of-Thought reasoning tasks
How to use
- Install the latest RWKV-Infer (Linux, WSL): https://github.com/OpenMOSE/RWKV-Infer
- Create a 'models' folder
- Move rwkv-x070-2b9-cje-instruct-1.pth into the 'models' folder
- Load the model:

```bash
curl http://127.0.0.1:9000/loadmodel -X POST -H "Content-Type: application/json" -d '{"model_filename":"models/rwkv-x070-2b9-cje-instruct-1.pth","model_viewname":"RWKV x070 2B9 CJE Instruct-1","model_strategy":"fp16","endtoken":"\\n\\n\\x17"}'
```
- Enjoy the OpenAI-compatible API at http://127.0.0.1:9000/v1 :)
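
As a quick check, you can query the loaded model with an OpenAI-style chat request. This is a minimal sketch: it assumes RWKV-Infer serves the standard /v1/chat/completions route and accepts the model_viewname set above as the "model" field; adjust parameters to taste.

```bash
# Minimal OpenAI-style chat request (assumes the standard
# /v1/chat/completions route; model name matches model_viewname above)
curl http://127.0.0.1:9000/v1/chat/completions -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "RWKV x070 2B9 CJE Instruct-1",
    "messages": [{"role": "user", "content": "who are you?"}],
    "max_tokens": 200,
    "temperature": 1.0
  }'
```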
Important Note
- Set the end token to '\n\n\x17'. Every turn, including the user's, ends with this token:

```
User: who are you?\n\n\x17
Assistant: gooday i'm rwkv\n\n\x17
```
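
When calling through the API, the same terminator can also be passed as a stop sequence. Note that JSON has no \x escape, so the 0x17 byte is written as \u0017. Whether RWKV-Infer honors the "stop" field in addition to the endtoken set at load time is an assumption here.

```bash
# Pass the end token as a stop sequence (JSON escapes 0x17 as \u0017);
# "stop" support on the server side is assumed, not confirmed
curl http://127.0.0.1:9000/v1/chat/completions -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "RWKV x070 2B9 CJE Instruct-1",
    "messages": [{"role": "user", "content": "Say hello in Japanese."}],
    "stop": ["\n\n\u0017"]
  }'
```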
Limitations and Considerations
- This is an experimental model; inference stability is not fully guaranteed
- Unexpected behaviors may occur
- Continuous improvements are being made; feedback is welcome
License
Apache License 2.0
Acknowledgments
We are grateful to the authors of the RWKV base model and to the RWKV community for their support in developing this model.