Please make a 13B version!
If I figure out how to get some A100 GPUs :) I tried training the 13B version on an A10G but ran out of GPU memory. I might look around on vast.ai or runpod.io for those A100s...
I was able to train the 13B model on two RTX 3090s:
{'train_runtime': 95229.7197, 'train_samples_per_second': 0.363, 'train_steps_per_second': 0.091, 'train_loss': 0.5828390517308127, 'epoch': 1.0}
100%|██████████████████████████████████████████████████████████████████████████████████████████████| 8649/8649 [26:27:09<00:00, 11.01s/it]
Not to be greedy, but an uncensored version fine-tuned on LLongMA-2-13b would be even better. ;)
https://huggingface.co/conceptofmind/LLongMA-2-13b
It should be mentioned that these uncensored models still retain some safety bias from Llama 2. When used for storytelling in a dark fantasy setting, for example, they will still resist the well-established social mores of the setting, generating replies that fall outside the prompt's context. As much as I'd love a 13B uncensored version of Llama 2 chat, it might take some time to work out the ideal datasets.
@georgesung
I can't imagine how you managed to fine-tune the 7B model on a single 24 GB GPU...
I tried but got OOM, so I had to run on two GPUs, and after one hour it consumed more than 27 GB of VRAM:
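For what it's worth, a back-of-envelope calculation (my own rough sketch, using standard per-parameter byte counts, not numbers from this thread) shows why plain fp16 fine-tuning of a 7B model can't fit in 24 GB, while a 4-bit quantized approach like QLoRA can:

```python
params = 7e9  # ~7 billion parameters

# Full fp16 fine-tuning with AdamW keeps, per parameter:
#   2 B fp16 weights + 2 B fp16 grads + 4 B fp32 master copy + 4 B + 4 B moments
full_ft_gb = params * (2 + 2 + 4 + 4 + 4) / 1e9
print(f"full fp16 fine-tune: ~{full_ft_gb:.0f} GB")  # ~112 GB, far beyond 24 GB

# QLoRA: base weights quantized to 4 bits (0.5 B/param) and frozen;
# gradients and optimizer states exist only for the tiny LoRA adapters.
qlora_base_gb = params * 0.5 / 1e9
print(f"QLoRA 4-bit base weights: ~{qlora_base_gb:.1f} GB")  # ~3.5 GB
```

This ignores activations and the adapter weights themselves, but the order of magnitude is the point: the quantized frozen base leaves plenty of headroom on a 24 GB card.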
@arogov Did you use QLoRA for fine-tuning? You can reproduce my model like this:
git clone https://github.com/georgesung/llm_qlora
cd llm_qlora
pip install -r requirements.txt
python train.py configs/llama2_7b_chat_uncensored.yaml
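For context, the trick that makes this fit on a single 24 GB GPU is QLoRA's 4-bit quantization of the frozen base model, with gradients flowing only through small LoRA adapters. A minimal sketch using transformers + peft, with illustrative hyperparameters (the actual values live in configs/llama2_7b_chat_uncensored.yaml, not here):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# Load the frozen base model with 4-bit NF4 quantization (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 despite 4-bit storage
    bnb_4bit_use_double_quant=True,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base model
    quantization_config=bnb_config,
    device_map="auto",
)

# Attach small trainable LoRA adapters; values below are illustrative
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of params are trainable
```

The resulting model is then passed to a regular Hugging Face Trainer; only the adapter weights receive gradients and optimizer states.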
Yes, but with a few code improvements.