Bronya-Qwen2.5-14B-Instruct-Pissa - A Bronya Dialogue Model with PiSSA Fine-tuning
Overview
Bronya-Qwen2.5-14B-Instruct-Pissa is a dialogue model fine-tuned from the Qwen2.5-14B-Instruct base model using PiSSA (Principal Singular Values and Singular Vectors Adaptation), a parameter-efficient fine-tuning method developed at Peking University. PiSSA adapts the principal components of the pre-trained model's weights, and the PiSSA paper reports that it outperforms traditional LoRA at a similar number of trainable parameters.
This model aims to:
- Faithfully Replicate Bronya's Dialogue Style: Trained on a carefully curated Bronya dialogue dataset to capture her unique personality and speaking patterns.
- Enhance Conversational Abilities: Integrates a humorous and slightly absurd question-answering dataset (ruozhiba_qa2449_gpt4o.json) to make the model more versatile and engaging in conversations, able to handle unexpected or quirky inquiries while providing informative responses.
- Provide Basic Cybersecurity Awareness: Includes a cybersecurity dataset to allow the model to discuss security topics.
- Mitigate Catastrophic Forgetting: Leverages the Alpaca 51k dataset to maintain the model's pre-existing knowledge during fine-tuning.
By using PiSSA, this model aims for better performance than a LoRA fine-tune with a similar number of trainable parameters.
Training Details
Base Model
- Qwen2.5-14B-Instruct: A robust pre-trained language model that serves as the foundational base for dialogue generation. (https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
Datasets
- Bronya Dialogue Dataset (8,000 entries): A meticulously curated collection of Bronya dialogues designed to instill her distinctive speech characteristics.
- Cognitive Dataset (91 entries): Enhances the model's basic cognitive understanding and improves response accuracy.
- Humorous & Quirky Q&A Dataset (ruozhiba_qa2449_gpt4o.json): This dataset, featuring a variety of unusual and amusing questions, bolsters the model's conversational capabilities, allowing it to handle unexpected prompts while still providing informative and contextually appropriate responses.
- Cybersecurity Dataset: Provides the model with domain-specific knowledge related to network security, enhancing its ability to discuss security concepts.
- Alpaca 51k: Used to mitigate catastrophic forgetting, preserving the general capabilities of the base model during fine-tuning.
Training Parameters
- Fine-tuning Method: PiSSA (Principal Singular Values and Singular Vectors Adaptation)
  - PiSSA initializes the LoRA adapter matrices using the principal singular values and singular vectors of the base model's weight matrix. This method focuses on adapting the most significant parameters of the base model, leading to better fine-tuning results.
  - As shown in the PiSSA paper, this approach achieves better results than traditional LoRA with similar parameter counts.
- LoRA Rank: 8
- Training Framework: Llama Factory
- Training Epochs: 6
- Learning Rate: 5e-4
Training Hardware
- GPU: NVIDIA A800
- GPU Count: 1
Training Author
- Sole Author & Publisher: biluo
Model Capabilities
This model demonstrates strong capabilities in:
- Conversational Fluency: Exhibits fluid, natural, and multi-turn dialogue capabilities.
- Bronya-Like Style: Closely replicates Bronya's distinctive speaking patterns, including tone and word choices.
- Cybersecurity Knowledge: Answers basic cybersecurity questions, reflecting an introductory understanding of security concepts.
- Engaging Interactions: Capable of creating engaging and enjoyable dialogues, adapting responses to the conversational context.
PiSSA Method Highlights
PiSSA, developed at Peking University, is a parameter-efficient fine-tuning method that the PiSSA paper reports as improving on traditional LoRA. Its main characteristics are:
- Leverages Intrinsic Dimensionality: PiSSA builds on the principle that pre-trained model weights have low-rank properties. Instead of training adapters that start from random noise and zeros, PiSSA directly adapts the principal components (largest singular values and singular vectors) of the pre-trained weight matrices, while the remaining residual components stay frozen.
- Faster Convergence and Better Performance: PiSSA converges faster and achieves better performance than LoRA using a similar number of trainable parameters, as demonstrated in the PiSSA paper. This makes it an efficient fine-tuning option.
- Initialization Method: LoRA initializes one adapter matrix with Gaussian noise and the other with zeros. PiSSA instead computes the Singular Value Decomposition (SVD) of the pre-trained weights and uses their largest singular values and singular vectors to initialize the adapter matrices.
- Efficient Implementation: PiSSA is an initialization scheme for standard LoRA adapters, so it can be enabled in the peft library simply by changing the adapter initialization method.
- Fast SVD: PiSSA uses a fast SVD method that significantly speeds up initialization compared to the standard SVD while maintaining comparable results.
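The initialization described above can be sketched in a few lines of NumPy. This is a minimal illustration on a small random matrix, not the peft implementation: the function name `pissa_init` and the matrix sizes are hypothetical, and the exact (not fast) SVD is used for simplicity.

```python
import numpy as np

def pissa_init(weight, rank):
    """Split `weight` into a trainable rank-`rank` adapter (A @ B) and a frozen residual."""
    # Thin SVD: weight = U @ diag(S) @ Vt, singular values sorted in descending order.
    U, S, Vt = np.linalg.svd(weight, full_matrices=False)
    sqrt_s = np.sqrt(S[:rank])
    A = U[:, :rank] * sqrt_s         # (d_out, rank), scaled principal left vectors
    B = sqrt_s[:, None] * Vt[:rank]  # (rank, d_in), scaled principal right vectors
    residual = weight - A @ B        # remaining components, kept frozen during training
    return A, B, residual

# Hypothetical 64 x 48 weight matrix standing in for one pre-trained layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))
A, B, W_res = pissa_init(W, rank=8)

# At initialization the adapted layer reproduces the original weights.
reconstruction_error = np.abs(A @ B + W_res - W).max()
```

Because `A @ B + W_res` equals the original matrix up to floating-point error, the model's outputs are unchanged at the start of training, while the trainable parameters begin at the principal components rather than at noise and zeros.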
License
This model is released under the Apache 2.0 license, allowing for commercial use and modification.
Disclaimer
This model is intended for research and experimental purposes only. It must not be used to generate any illegal or harmful content. The author assumes no responsibility for any loss or damage caused by the use of this model. The views and information generated by this model do not represent the views of the author.
Contribution
Contributions are welcome! If you have any suggestions, ideas, or have identified issues, feel free to submit a pull request or open an issue.
Acknowledgements: Thanks to the open-source community, the team at Peking University for their work on PiSSA, the creators of the datasets used in this project, and Alibaba for the Qwen models.