---
license: mit
language:
- en
---
For the bandwidth-limited ones <3
# GGUFs for [HanNayeoniee/LHK_DPO_v1](https://huggingface.co/HanNayeoniee/LHK_DPO_v1)
For a general representation of how the quantization level influences output quality, check any model card from TheBloke, or [see this table](https://docs.faraday.dev/models/choose-model#size-vs-perplexity-tradeoff). Note that those benchmarks were done on Llama models and are probably not recent. Also, I don't know how the MoE architecture influences those results, but you get the idea!
So about the model: I have only played with it for about 40 minutes so far (Q5_K_M, ChatML template, [TGWUI](https://github.com/oobabooga/text-generation-webui), rather short context size), but from what I saw, this model is really impressive 👏 I should rather say quite astonishing!
[Edit: all quants are now tested and validated]
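If you want to try one of these GGUFs outside TGWUI, here is a minimal sketch using the llama-cpp-python bindings. The local filename and the prompt are hypothetical, and the context size just mirrors the short-context quick test above:

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The GGUF filename is hypothetical; point it at whichever quant you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="LHK_DPO_v1.Q5_K_M.gguf",  # e.g. the Q5_K_M quant tested above
    n_ctx=2048,                            # a rather short context, as in the quick test
    chat_format="chatml",                  # the model expects the ChatML template
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello! Tell me something interesting."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```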
The coherence seems remarkably well maintained. To illustrate, [see this sequence of interactions](https://bin.0xfc.de/?3110c74187a4b1f6#9qZMtmnmqeTrVrsoUsf37a7H39uXJvizRcpFdCf2yokS) with the model.
[HanNayeoniee/LHK_DPO_v1](https://huggingface.co/HanNayeoniee/LHK_DPO_v1) was trained via Direct Preference Optimization (DPO) from [TomGrc/FusionNet_7Bx2_MoE_14B](https://huggingface.co/TomGrc/FusionNet_7Bx2_MoE_14B).
Thanks to the community, and sincere congrats to HanNayeoniee and TomGrc!