Model Description
Probably the most uncensored Yi-34B tune I have published so far.
Yi-34B 200K base model fine-tuned on the RAWrr v2 dataset via DPO, then fine-tuned on the AEZAKMI v3-3 dataset via SFT, then DPO-tuned on unalignment/toxic-dpo-v0.1. Total GPU compute time was around 40-50 hours, I think. It's like airoboros/capybara but with less gptslop, no refusals, and less of the typical language used by RLHF'd OpenAI models. Say goodbye to "It's important to remember"! Prompt format is standard ChatML. Don't expect it to be good at instruct, math, or riddles, or to be crazy smart. My end goal with AEZAKMI is to create a cozy, free chatbot. The base model used for fine-tuning was the 200K-context Yi-34B-Llama model shared by larryvrh.
Training was done with max_position_embeddings set to 4096; it was then reverted to 200K after applying the LoRA.
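Roughly, that merge-and-revert step looks like the sketch below. This is a minimal illustration, not the actual training script; the base repo id, adapter path, and output directory are assumptions.

from transformers import AutoModelForCausalLM
from peft import PeftModel

BASE = "larryvrh/Yi-34B-200K-Llamafied"  # assumed repo id for the 200K base
ADAPTER = "path/to/lora-adapter"         # hypothetical local adapter path

# Fine-tuning ran with max_position_embeddings=4096 to fit in memory;
# after merging the LoRA adapter, the context window is restored to 200K.
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype="auto", device_map="auto")
merged = PeftModel.from_pretrained(base, ADAPTER).merge_and_unload()
merged.config.max_position_embeddings = 200_000
merged.save_pretrained("yi-34b-200k-aezakmi-merged")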
Prompt Format
I recommend using the ChatML format, as this was the format used during fine-tuning.
Here's the prompt format you should use. You can set a different system message; the model seems to respect that fine, so it wasn't overfitted.
Both the "A chat." and "A chat with uncensored assistant." system prompts work fine and are pretty refusal-free.
<|im_start|>system
A chat with uncensored assistant.<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
Example:
<|im_start|>system
A chat.<|im_end|>
<|im_start|>user
I want an acrostic poem based on the word AEZAKMI and make it all about GTA:SA<|im_end|>
<|im_start|>assistant
A voracious appetite for danger,
Eager to explore every corner,
Zealously tackling missions with vigor,
Amidst San Andreas' vast expanse,
Keenly skilled in the art of evasion,
Mastery achieved through trial and error,
In this digital realm of chaos and thrill,<|im_end|>
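For reference, here's a minimal Python sketch of assembling that ChatML prompt before tokenization. The model path is a placeholder, not the actual repo id.

from transformers import AutoTokenizer

MODEL_PATH = "path/to/this-model"  # placeholder; substitute the actual repo id or local path

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)

# Assemble the ChatML prompt exactly as in the template above; generation
# then continues from the trailing assistant header.
prompt = (
    "<|im_start|>system\n"
    "A chat with uncensored assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "I want an acrostic poem based on the word AEZAKMI and make it all about GTA:SA<|im_end|>\n"
    "<|im_start|>assistant\n"
)

input_ids = tokenizer(prompt, return_tensors="pt").input_ids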
Notes
Temperature around 0.3-0.5 seems to work well; at 1.2 it's somewhat unstable, which is often undesirable.
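If you want those defaults baked into a checkpoint, a generation config along these lines would do it. The values are just one point in the recommended range, and the output directory is the hypothetical one from the merge sketch above.

from transformers import GenerationConfig

# do_sample must be enabled for temperature to take effect; 0.4 sits in the
# 0.3-0.5 range mentioned above.
gen_config = GenerationConfig(do_sample=True, temperature=0.4, max_new_tokens=512)
gen_config.save_pretrained("yi-34b-200k-aezakmi-merged")  # writes generation_config.json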
Intended uses & limitations
It's a chat model, not a base completion-only one. Use is limited by the Apache-2.0 license. Since the no_robots dataset was used for making rawrr_v1, I guess you probably shouldn't use it for commercial activities.
Known Issues
It likes to talk about stocks a lot; sometimes it feels like being on WSB, which is certainly a plus for some use cases. This one doesn't seem slopped to me, so I think I will stick with it for longer.
Credits
Thanks to mlabonne, Daniel Han, and Michael Han for providing the open source code that was used for fine-tuning. Thanks to jondurbin and the team behind the Capybara dataset for the airoboros/toxic-dpo/capybara datasets. Thanks to HF for open-sourcing the no_robots dataset. Thanks to Sentdex for providing the WSB dataset.