---
language:
- en
base_model: mistralai/Mistral-7B-v0.1
---

# Summary

The name is self-explanatory. This LoRA was trained on 50 MB of text taken from Archive of Our Own (AO3). In total, 1,441 stories were selected from the Furry fandom category. I don't remember which filters I used. This LoRA is meant to improve a model's roleplaying capabilities, but I'll let you be the judge of that. Feel free to leave feedback; I'd like to hear your opinions on this LoRA.

# Dataset Settings

- Context length: 4096
- Epochs: 3

# LoRA Settings

- Rank: 128
- Alpha: 256
- Targeted modules: Q, K, V, O, Gate, Up, Down
- NEFTune alpha: 10 (to try to reduce overfitting)
- Learning rate: 1e-4
- Dropout: 0 (Unsloth doesn't support LoRA dropout)
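
For a sense of scale, the settings above can be turned into two numbers: the LoRA scaling factor and the number of trainable adapter weights. This is a back-of-the-envelope sketch, not the training script. It assumes standard LoRA scaling (alpha / rank), and the layer shapes are assumptions taken from the public Mistral 7B v0.1 configuration rather than anything stated in this card. The helper names are made up for illustration.

```python
# Back-of-the-envelope LoRA sizing.
# Layer shapes below are ASSUMED from the public Mistral 7B v0.1 config;
# they are not stated in this model card.
RANK = 128
ALPHA = 256

HIDDEN = 4096          # assumed hidden size
INTERMEDIATE = 14336   # assumed MLP (feed-forward) size
KV_DIM = 8 * 128       # assumed: 8 key/value heads x head dimension 128
NUM_LAYERS = 32        # assumed number of transformer layers

# (in_features, out_features) of each targeted projection in one layer.
TARGET_SHAPES = {
    "q": (HIDDEN, HIDDEN),
    "k": (HIDDEN, KV_DIM),
    "v": (HIDDEN, KV_DIM),
    "o": (HIDDEN, HIDDEN),
    "gate": (HIDDEN, INTERMEDIATE),
    "up": (HIDDEN, INTERMEDIATE),
    "down": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank): rank * (in + out) weights."""
    return rank * (in_features + out_features)


def total_lora_params(rank=RANK):
    """Sum the adapter weights over all targeted modules and all layers."""
    per_layer = sum(lora_params(i, o, rank) for i, o in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


# Standard LoRA multiplies the low-rank update by alpha / rank.
scaling = ALPHA / RANK

print(f"LoRA scaling (alpha / rank): {scaling}")
print(f"Trainable LoRA parameters: {total_lora_params():,}")
```

Under these assumptions the low-rank update is scaled by 2, and the adapter holds roughly 335 million trainable weights.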

# Model Settings

- Base model: Mistral 7B
- Data type: BF16, 4-bit quantization (thanks, bitsandbytes)
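
If you want to try the adapter with a setup similar to the one above, something like the following might work. Treat it as a sketch under assumptions: it uses `transformers`, `peft`, `bitsandbytes`, and `accelerate`, none of which are prescribed by this card. The `adapter_id` argument is a placeholder for wherever this LoRA is hosted, and the 4-bit quantization type is left at the library default because the card does not say which one was used.

```python
def load_with_adapter(adapter_id, base_id="mistralai/Mistral-7B-v0.1"):
    """Load the base model in 4-bit with BF16 compute and attach a LoRA adapter.

    adapter_id is a placeholder: pass the repo ID or local path of the adapter.
    """
    # Imports are kept inside the function so the heavy libraries are only
    # needed when the model is actually loaded.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # 4-bit weights with BF16 compute, mirroring the "Data type" line above.
    # The quantization type is an assumption (library default).
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",  # requires the accelerate package
    )

    # Wrap the quantized base model with the LoRA weights.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer
```

Calling `load_with_adapter("path/or/repo-of-this-lora")` returns the wrapped model and its tokenizer; the argument shown is hypothetical and needs to be replaced with the real location.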

# Misc Settings

- Batch size: 2
- Gradient accumulation steps: 16
- LR scheduler: Linear
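
To make these numbers concrete, here is a small sketch of what they imply on a single GPU: the effective batch size per optimizer step, the maximum tokens per step at the 4096 context length, and the shape of a linear learning-rate decay from the 1e-4 peak. The total step count and warmup length are not given in this card, so `total_steps` and `warmup_steps` are hypothetical parameters, and the schedule is a generic linear decay rather than the exact one the trainer used.

```python
BATCH_SIZE = 2          # per-device batch size (from this card)
GRAD_ACCUM_STEPS = 16   # gradient accumulation steps (from this card)
CONTEXT_LENGTH = 4096   # from "Dataset Settings"
PEAK_LR = 1e-4          # from "LoRA Settings"


def effective_batch_size():
    """Sequences contributing to one optimizer step on a single GPU."""
    return BATCH_SIZE * GRAD_ACCUM_STEPS


def max_tokens_per_update():
    """Upper bound on tokens per optimizer step (every sequence full length)."""
    return effective_batch_size() * CONTEXT_LENGTH


def linear_lr(step, total_steps, peak_lr=PEAK_LR, warmup_steps=0):
    """Generic linear schedule: optional linear warmup, then decay to zero.

    total_steps and warmup_steps are HYPOTHETICAL; the card does not state them.
    """
    if warmup_steps and step < warmup_steps:
        return peak_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print("Effective batch size:", effective_batch_size())
print("Max tokens per update:", max_tokens_per_update())
```

With a batch size of 2 and 16 accumulation steps, each optimizer step sees 32 sequences, or at most 131,072 tokens.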

# Software and Hardware

- Unsloth was used to speed up training.
- Training was done on 1x RTX 3090 (with 24 GB of VRAM) and took 11 hours.

# Warnings

- Obviously, having been trained on AO3 fanfics, this LoRA will probably increase the chances of a model generating 18+ content. Furthermore, if prompted to do so, the LoRA may help generate illegal content. So, y'know, don't ask it to do that.
- Additionally, there is a chance this LoRA will output training data. The training graph seems to suggest that the LoRA was overfitting.

# Training Graph

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64504e9be1d7a97f3b698682/0Zv-e-d3C4hwsWWZJbyB9.png)