---
license: mit
language:
- en
---
For the bandwidth-limited ones <3
# GGUFs for [HanNayeoniee/LHK_DPO_v1](https://huggingface.co/HanNayeoniee/LHK_DPO_v1)
For a general representation of how the quantization level influences output quality, check any model card from TheBloke, or [see this table](https://docs.faraday.dev/models/choose-model#size-vs-perplexity-tradeoff). Note that those benchmarks were done on Llama models and are probably not recent. Also, I don't know how the MoE architecture influences those results, but you get the idea!
So about the model: I have only played with it for about 40 minutes so far (Q5_K_M, ChatML template, [TGWUI](https://github.com/oobabooga/text-generation-webui), rather short context size), but from what I saw, this model is really impressive 👏 I should rather say quite astonishing!
[Edit: all quants are now tested and validated]
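If you want to try one of these GGUFs outside TGWUI, here is a minimal sketch using the llama-cpp-python bindings. The local filename and the prompt are hypothetical, and the context size just mirrors the short-context quick test above:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The GGUF filename is hypothetical; point it at whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="LHK_DPO_v1.Q5_K_M.gguf",  # e.g. the Q5_K_M quant tested above
    n_ctx=2048,                            # a rather short context, as in the quick test
    chat_format="chatml",                  # the model expects the ChatML template
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello! Tell me something interesting."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```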
The coherence seems remarkably well maintained. To illustrate, [see this sequence of interactions](https://bin.0xfc.de/?3110c74187a4b1f6#9qZMtmnmqeTrVrsoUsf37a7H39uXJvizRcpFdCf2yokS) with the model.
[HanNayeoniee/LHK_DPO_v1](https://huggingface.co/HanNayeoniee/LHK_DPO_v1) was trained via Direct Preference Optimization (DPO) from [TomGrc/FusionNet_7Bx2_MoE_14B](https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_14B).
Thanks to the community, and sincere congrats to HanNayeoniee and TomGrc!