5.0 bpw (bits per weight) quantization of Venus-103b-v1.1, for use with exllamav2.
The calibration dataset was the cleaned PIPPA dataset (https://huggingface.co/datasets/royallab/PIPPA-cleaned), the same one used on the original model card.
You can reuse the measurement.json from this repo to produce your own quants at other sizes, as sketched below.
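For example, here is a minimal sketch of producing another quant size with exllamav2's `convert.py`, reusing this repo's measurement.json to skip the measurement pass. All paths and the 4.0 bpw target are placeholders, not values from this repo:

```python
# Illustrative sketch: invoke exllamav2's convert.py to re-quantize the
# original fp16 weights at a different bpw, reusing an existing
# measurement.json. Every path below is a placeholder for your own layout.
import subprocess

subprocess.run(
    [
        "python", "convert.py",                   # script from the exllamav2 repo
        "-i", "/models/Venus-103b-v1.1",          # original fp16 weights
        "-o", "/tmp/exl2-work",                   # scratch/working directory
        "-cf", "/models/Venus-103b-v1.1-4.0bpw",  # compiled quant output
        "-m", "measurement.json",                 # reuse measurement from this repo
        "-b", "4.0",                              # target bits per weight
    ],
    check=True,
)
```

Reusing the measurement file only skips the calibration measurement pass; the quantization pass itself still has to run for each target size.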
# Original model card
## Venus 103b - version 1.1

### Model Details
- A result of interleaving layers of Sao10K/Euryale-1.3-L2-70B, migtissera/SynthIA-70B-v1.5, and Xwin-LM/Xwin-LM-70B-V0.1 using mergekit.
- The resulting model has 120 layers and 103 billion parameters.
- See mergekit-config.yml for details on the merge method used.
- See the `exl2-*` branches for exllamav2 quantizations. The 5.65 bpw quant should fit in 80 GB of VRAM and the 3.35 bpw quant in 48 GB (a back-of-envelope estimate and a loading sketch follow this list).
- Inspired by Goliath-120b
Warning: This model will produce NSFW content!
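Below is a minimal sketch of loading an exl2 quant of this model with the exllamav2 Python API (the pattern used by exllamav2's own examples at the time this card was written). The model path and sampler values are placeholders:

```python
# Illustrative loader for an exl2 quant; the directory path is a placeholder.
from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

config = ExLlamaV2Config()
config.model_dir = "/models/Venus-103b-v1.1-5.0bpw-exl2"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate cache while layers load
model.load_autosplit(cache)               # split layers across available GPUs
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
generator.warmup()

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8                # illustrative sampler values
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, 200))
```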
### Results
- Seems to be more "talkative" than Venus-103b-v1.0 (i.e., characters speak more often in roleplays)
- Sometimes struggles to pay attention to small details in the scenes
- Prose seems pretty creative and more logical than Venus-120b-v1.0's