5.0 bpw (bits per weight) quantization of Venus-103b-v1.1, for use with exllamav2.
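
For reference, loading the quant for inference with the exllamav2 Python API looks roughly like the sketch below. This is based on exllamav2's example scripts; the local model path is a placeholder, and sampler settings are just illustrative.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path to the downloaded 5.0 bpw exl2 weights
config = ExLlamaV2Config()
config.model_dir = "/models/Venus-103b-v1.1-5.0bpw-exl2"
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # lazy cache so the 103b model can be auto-split across GPUs
model.load_autosplit(cache)
tokenizer = ExLlamaV2Tokenizer(config)

generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

print(generator.generate_simple("Once upon a time,", settings, num_tokens=200))
```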

The calibration dataset was a cleaned PIPPA dataset (https://huggingface.co/datasets/royallab/PIPPA-cleaned), the same one used on the original model card.

You can use the measurement.json from this repo to make your own quant sizes without rerunning the measurement pass.
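
For example, building a quant at a different bitrate while reusing the measurement file might look like the following rough sketch. All paths are placeholders, and the flags are assumed to match exllamav2's convert.py (-i input model, -o working dir, -cf output dir, -m measurement file, -b target bpw).

```python
import subprocess

# Reuse the existing measurement.json so only the quantization pass runs,
# this time at a 4.0 bpw target instead of 5.0.
subprocess.run([
    "python", "convert.py",
    "-i", "/models/Venus-103b-v1.1",                 # original fp16 model
    "-o", "/tmp/exl2-work",                          # working directory for intermediate files
    "-cf", "/models/Venus-103b-v1.1-4.0bpw-exl2",    # where the finished quant is written
    "-m", "/models/measurement.json",                # measurement file from this repo
    "-b", "4.0",                                     # target bits per weight
], check=True, cwd="/opt/exllamav2")                 # placeholder exllamav2 checkout
```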

Original model card

Venus 103b - version 1.1

Model Details

Warning: This model will produce NSFW content!

Results

  1. Seems to be more "talkative" than Venus-103b-v1.0 (i.e. characters speak more often in roleplays)
  2. Sometimes struggles to pay attention to small details in the scenes
  3. Prose seems pretty creative and more logical than Venus-120b-v1.0