Lyra4-Gutenberg-12B - EXL2 8bpw max rpcal_mk2

This is a 8bpw EXL2 quant of nbeerbower/Lyra4-Gutenberg-12B

This quant was made using exllamav2-0.2.1 with Fullmoon-light dataset for RP. I used a slightly modified quantization script to force use of highest bpw methods for all layers in the model (which is usually "1:8b_128g s4") to ensure max quality.

I also added a small fix in config file to set max default context at 128k as original Mistral-Nemo should have.

I tested this quant shortly in some random RPs (including ones over 8k context) and it seems to work fine.

Prompt Templates

Uses ChatML or modified mistral format like mentioned in original Lyra v4. I tested it with ChatML.

Original readme below

Lyra4-Gutenberg-12B

Sao10K/MN-12B-Lyra-v4 finetuned on jondurbin/gutenberg-dpo-v0.1.

Method

ORPO Finetuned using an RTX 3090 + 4060 Ti for 3 epochs.

Fine-tune Llama 3 with ORPO

DeusImperator
/

Lyra4-Gutenberg-12B_exl2_8bpw_max_rpcal_mk2

Lyra4-Gutenberg-12B - EXL2 8bpw max rpcal_mk2

Prompt Templates

Original readme below

Lyra4-Gutenberg-12B

Method

Model tree for DeusImperator/Lyra4-Gutenberg-12B_exl2_8bpw_max_rpcal_mk2

Dataset used to train DeusImperator/Lyra4-Gutenberg-12B_exl2_8bpw_max_rpcal_mk2

Collection including DeusImperator/Lyra4-Gutenberg-12B_exl2_8bpw_max_rpcal_mk2

Exl2 8bpw MAX quants