RWKV-x070-2B9-CJE-Instruct Model Card

Model Overview

  • Model Name: RWKV-x070-2B9-CJE-Instruct
  • Description: An instruction-tuned model specialized for Japanese, Chinese, and English
  • Base Model: rwkv-x070-2b9-world-v3-40%trained-20250113-ctx4k.pth
  • Architecture: RWKV x070 "Goose"
  • Parameters: 2.9B
  • Model Dimension: 2560
  • Number of Layers: 32

Fine-tuning Details

Training Configuration

  • Trainer: RWKV-LM-RLHF (https://github.com/OpenMOSE/RWKV-LM-RLHF)
  • PEFT Mode: Hybrid training that combines frozen embeddings, Bone (Block Affine Transformation), and full-parameter training
  • SFT Method: SmoothingLoss SFT
  • Context Window: 5120, trained with a 1024-token overlap between chunks (see the sketch after this list)
  • Compute: 2× AMD Instinct MI100, 60 hours (100% solar energy)
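
The 5120-token window with a 1024-token overlap implies that long training samples were split into overlapping chunks. A minimal sketch of such chunking, assuming a flat list of token IDs (the function name is illustrative, not taken from the trainer):

def chunk_with_overlap(tokens, window=5120, overlap=1024):
    # Each chunk starts (window - overlap) tokens after the previous one,
    # so consecutive chunks share `overlap` tokens of context.
    stride = window - overlap
    return [tokens[start:start + window]
            for start in range(0, max(len(tokens) - overlap, 1), stride)]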

Dataset Specifications

  • Size: 800k pairs
  • Content:
    • Mixed data in Japanese, Chinese, and English
    • Conversations
    • Programming code
    • Translation tasks
    • Chain-of-Thought reasoning tasks

How to Use

curl http://127.0.0.1:9000/loadmodel -X POST \
  -H "Content-Type: application/json" \
  -d '{"model_filename":"models/rwkv-x070-2b9-cje-instruct-1.pth","model_viewname":"RWKV x070 2B9 CJE Instruct-1","model_strategy":"fp16","endtoken":"\\n\\n\\x17"}'
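
The same loadmodel call from Python, as a direct translation of the curl command above (assuming the inference server is listening on 127.0.0.1:9000):

import requests

payload = {
    "model_filename": "models/rwkv-x070-2b9-cje-instruct-1.pth",
    "model_viewname": "RWKV x070 2B9 CJE Instruct-1",
    "model_strategy": "fp16",
    # Literal backslash escapes, exactly as in the curl payload;
    # the server is expected to interpret them.
    "endtoken": "\\n\\n\\x17",
}
resp = requests.post("http://127.0.0.1:9000/loadmodel", json=payload)
print(resp.status_code, resp.text)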

Important Note

  • Set the end token to '\n\n\x17'. Every conversation turn is terminated with this token, as in the example below:
User: who are you?\n\n\x17
Assistant: gooday i'm rwkv\n\n\x17
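
Reading '\x17' as the ETB control character (0x17), a minimal sketch of building a prompt in this format and cutting the reply at the end token (helper names are illustrative):

END_TOKEN = "\n\n\x17"  # two newlines followed by the ETB control character (0x17)

def build_prompt(user_message):
    # Terminate the user turn with the end token and leave the
    # assistant tag open so the model completes the reply.
    return f"User: {user_message}{END_TOKEN}Assistant:"

def trim_reply(generated_text):
    # Keep only the text before the first end token the model emits.
    return generated_text.split(END_TOKEN, 1)[0].strip()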

Limitations and Considerations

  • This is an experimental model; inference stability is not fully guaranteed
  • Unexpected behaviors may occur
  • Continuous improvements are being made; feedback is welcome

License

Apache License 2.0

Acknowledgments

We are grateful to the authors of the RWKV base model and to the RWKV community for their support in developing this model.
