Furry-AO3-LoRA / README.md
metadata
language:
  - en
base_model: mistralai/Mistral-7B-v0.1

Summary

The name is self-explanatory. This LoRA was trained on 50 MB of text taken from Archive of Our Own (AO3): 1,441 stories selected from the Furry fandom category. I don't remember which filters I used. The LoRA is meant to improve a model's roleplaying capabilities, but I'll let you be the judge of that. Feel free to leave feedback; I'd like to hear your opinions on this LoRA.

Dataset Settings

  • Context length: 4096
  • Epochs: 3

LoRA Settings

  • Rank: 128
  • Alpha: 256
  • Targeted modules: Q, K, V, O, Gate, Up, Down
  • NEFTune alpha: 10 (to try to reduce overfitting)
  • Learning rate: 1e-4
  • Dropout: 0 (Unsloth doesn't support LoRA dropout)
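
For a sense of scale, the rank, alpha, and targeted modules listed above determine the LoRA scaling factor and the number of trainable parameters. The snippet below is a rough sketch, not taken from the training script: the layer shapes are assumed from the public Mistral-7B-v0.1 config (hidden size 4096, MLP size 14336, 32 layers, key/value projections of width 1024), the module names follow the usual Hugging Face naming for Q, K, V, O, Gate, Up, and Down, and the function names are made up for illustration.

```python
# Back-of-the-envelope LoRA sizing.
# ASSUMPTION: the shapes below come from the public Mistral-7B-v0.1 config,
# not from this README.
RANK = 128
ALPHA = 256
NUM_LAYERS = 32

# (input features, output features) of each targeted linear layer
MODULE_SHAPES = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}


def lora_scaling(rank: int, alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / rank."""
    return alpha / rank


def lora_param_count(rank: int) -> int:
    """Each adapted layer adds two matrices: A (rank x in) and B (out x rank)."""
    per_layer = sum(
        rank * (fan_in + fan_out) for fan_in, fan_out in MODULE_SHAPES.values()
    )
    return per_layer * NUM_LAYERS


if __name__ == "__main__":
    print("scaling factor:", lora_scaling(RANK, ALPHA))
    print("trainable parameters:", f"{lora_param_count(RANK):,}")
```

Under those assumptions, the scaling factor is 2.0 and the adapter holds roughly 335 million trainable parameters, which is large for a LoRA and may be part of why overfitting showed up.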

Model Settings

  • Base model: Mistral 7B (mistralai/Mistral-7B-v0.1)
  • Data type: BF16, with 4-bit quantization (thanks, bitsandbytes)

Misc Settings

  • Batch size: 2
  • Gradient accumulation steps: 16
  • LR Scheduler: Linear
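
The batch size and gradient accumulation steps combine into an effective batch size, and a linear scheduler decays the learning rate toward zero over the run. Here is a minimal sketch of both. It assumes no warmup (the settings above don't say whether any was used) and uses a hypothetical total step count, since the real number of optimizer steps isn't listed.

```python
BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 16
CONTEXT_LENGTH = 4096
PEAK_LR = 1e-4


def effective_batch_size(batch_size: int, grad_accum_steps: int) -> int:
    """Number of sequences that contribute to one optimizer step."""
    return batch_size * grad_accum_steps


def linear_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Linear decay from peak_lr at step 0 to 0 at total_steps.

    ASSUMPTION: no warmup phase.
    """
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / total_steps


if __name__ == "__main__":
    eff = effective_batch_size(BATCH_SIZE, GRAD_ACCUM_STEPS)
    print("sequences per optimizer step:", eff)
    print("max tokens per optimizer step:", eff * CONTEXT_LENGTH)

    total_steps = 300  # hypothetical value, for illustration only
    for step in (0, total_steps // 2, total_steps):
        print(step, linear_lr(step, total_steps))
```

With these settings, each optimizer step sees 32 sequences, or at most 131,072 tokens if every sequence fills the 4096-token context.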

Software and Hardware

  • Unsloth was used to speed up training.
  • Training was done on 1x RTX 3090 (with 24 GB of VRAM) and took 11 hours.

Warnings

  • Having been trained on AO3 fanfics, this LoRA will probably increase the chances of a model generating 18+ content. Furthermore, if prompted to do so, the LoRA may help generate illegal content. So, you know, don't ask it to do that.
  • Additionally, there is a chance this LoRA will output training data verbatim, since the training graph seems to suggest that the LoRA was overfitting.

Training Graph

[Image: training graph]