Aetheria-L2-70B-exl2

An ExLlamaV2 quantization of royallab/Aetheria-L2-70B.

Branches:

  • main: the measurement.json, calculated with 2048-token calibration rows on PIPPA
  • 5.0bpw-h6: 5.0 decoder bits per weight, 6 head bits
    • ideal for 2x 24 GB GPUs at 8192 context, or 1x 48 GB GPU at 8192 context with CFG cache (see the loading sketch after this list)
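
Since each quant lives on its own branch, you download by branch name. Below is a minimal, unofficial sketch of pulling the 5.0bpw-h6 branch with huggingface_hub and loading it with the exllamav2 Python package; the repo and branch names come from this card, while the context length, lazy cache, and autosplit flow are illustrative assumptions rather than a prescribed setup.

```python
# Minimal sketch: download the 5.0bpw-h6 branch and load it with ExLlamaV2.
from huggingface_hub import snapshot_download

from exllamav2 import ExLlamaV2, ExLlamaV2Cache, ExLlamaV2Config, ExLlamaV2Tokenizer

# Each quant is a separate git branch, so pass the branch as `revision`.
model_dir = snapshot_download(
    repo_id="royallab/Aetheria-L2-70B-exl2",
    revision="5.0bpw-h6",
)

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()
config.max_seq_len = 8192  # context length the quant is sized for (assumption: default fits)

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)   # lazy cache is required for autosplit loading
model.load_autosplit(cache)                # fills each GPU in turn, e.g. across 2x 24 GB

tokenizer = ExLlamaV2Tokenizer(config)
```

load_autosplit allocates layers to each visible GPU until it fills, then moves to the next, which is how the 5.0bpw quant fits the 2x 24 GB layout noted above.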