---
license: apache-2.0
language:
- en
tags:
- nsfw
- not-for-all-audiences
- roleplay
---

## InfinityKumon-2x7B

![InfinityKumon-2x7B](https://cdn.discordapp.com/attachments/843160171676565508/1222560876578476103/00000-3033963009.png?ex=6616a98b&is=6604348b&hm=6434f8a16f22a3515728ab38bf7230a01448b00e6136729d42d75ae0374e5802&)

GGUF - Imatrix quant of [InfinityKumon-2x7B](https://huggingface.co/R136a1/InfinityKumon-2x7B)

Another MoE merge of [Endevor/InfinityRP-v1-7B](https://huggingface.co/Endevor/InfinityRP-v1-7B) and [grimjim/kukulemon-7B](https://huggingface.co/grimjim/kukulemon-7B).

The reason? Because I like InfinityRP-v1-7B so much and wondered whether I could improve it even further by merging two great models into a MoE.
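
If you want to try a quant locally, a minimal sketch with llama-cpp-python could look like the following. The exact `.gguf` filename is an assumption; check the GGUF repo's file list for the quant you want.

```python
# Minimal sketch of loading one of these quants with llama-cpp-python.
# The .gguf filename below is hypothetical; pick a real one from the repo.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="R136a1/InfinityKumon-2x7B-GGUF",
    filename="InfinityKumon-2x7B.Q5_K_M.gguf",  # assumed filename
)

llm = Llama(model_path=model_path, n_ctx=4096)

# Alpaca-style prompt, one of the two formats listed below.
prompt = "### Instruction:\nGreet the user in character.\n\n### Response:\n"
out = llm(prompt, max_tokens=128, stop=["### Instruction:"])
print(out["choices"][0]["text"])
```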

## Perplexity

Measured using llama.cpp/perplexity with a private roleplay dataset.
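
For reference, a run like the ones below can be reproduced roughly as sketched here. The binary is named `perplexity` in older llama.cpp builds, and the dataset behind this table is private, so the text file here is a stand-in.

```python
# Sketch of a llama.cpp perplexity run; paths and filenames are assumptions.
import subprocess

subprocess.run(
    [
        "./llama-perplexity",                    # "./perplexity" in older builds
        "-m", "InfinityKumon-2x7B.Q5_K_M.gguf",  # hypothetical local model path
        "-f", "roleplay.txt",                    # stand-in plain-text corpus
    ],
    check=True,
)
```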

| Format | PPL |
| --- | --- |
| FP16 | 3.1748 +/- 0.11928 |
| Q8_0 | 3.1734 +/- 0.11935 |
| Q6_K | 3.1752 +/- 0.11899 |
| Q5_K_M | 3.1731 +/- 0.11892 |
| IQ4_NL | 3.1752 +/- 0.11943 |
| IQ3_M | 3.1773 +/- 0.11528 |
| Q2_K | 3.2309 +/- 0.11996 |

I don't really recommend using Q2_K based on the PPL; the other quants are fine.

### Prompt format:

Alpaca or ChatML
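
The card only names the formats, so here is a sketch of the usual community templates for both; treat these as the standard layouts rather than card-specific strings.

```python
# Standard Alpaca and ChatML prompt layouts (assumed, not taken from the card).
def alpaca(instruction: str) -> str:
    """Bare Alpaca-style template."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def chatml(system: str, user: str) -> str:
    """ChatML template with a system message."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```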

Switch: [FP16](https://huggingface.co/R136a1/InfinityKumon-2x7B) - [GGUF](https://huggingface.co/R136a1/InfinityKumon-2x7B-GGUF)