---
license: apache-2.0
datasets:
- ssmi153/Capybara-ShareGPT
- jondurbin/airoboros-3.2
---

QLoRA fine-tune of [Mixtral-8x22B-v0.1](https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1) on a combination of the Capybara and Airoboros datasets.

Uses Mistral instruct formatting, like this: `[INST] Describe quantum computing to a layperson. [/INST]`
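For instance, a minimal prompt-building helper (purely illustrative; the function name is mine and not part of any released code) could look like:

```python
def build_prompt(instruction: str) -> str:
    # Wrap a single user turn in Mistral's [INST] ... [/INST] tags.
    return f"[INST] {instruction} [/INST]"

print(build_prompt("Describe quantum computing to a layperson."))
# -> [INST] Describe quantum computing to a layperson. [/INST]
```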

Model details:
- Trained with QLoRA on four RTX 4090s, using my own [qlora-pipe](https://github.com/tdrussell/qlora-pipe) training script (a rough PEFT equivalent is sketched below)
- LoRA rank 64
- 4096 sequence length
- 2 epochs
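The training itself used qlora-pipe, but as a rough sketch of an equivalent QLoRA setup with the Hugging Face PEFT and bitsandbytes libraries (an assumption on my part; only the rank-64 value comes from this card, the other hyperparameters are illustrative defaults):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the frozen base weights -- the "Q" in QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mistral-community/Mixtral-8x22B-v0.1",
    quantization_config=bnb_config,
    device_map="auto",
)

# Rank-64 LoRA adapters, matching the rank reported above; alpha, dropout,
# and target modules are illustrative, not the values used in training.
lora_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```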

You can find the LoRA adapter files [here](https://huggingface.co/tdrussell/Mixtral-8x22B-Capyboros-v1-lora). I have also uploaded a single quant (GGUF q4_k_s) [here](https://huggingface.co/tdrussell/Mixtral-8x22B-Capyboros-v1-GGUF-q4_k_s) if you want to try the model without quantizing it yourself or waiting for someone else to make all the quants. It fits on 96 GB of VRAM with at least 16k of context.
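To try the q4_k_s GGUF locally, a hypothetical llama-cpp-python snippet (the local filename is an assumption; point it at wherever you saved the quant):

```python
from llama_cpp import Llama

# Assumed local path to the q4_k_s quant downloaded from the link above.
llm = Llama(
    model_path="Mixtral-8x22B-Capyboros-v1-q4_k_s.gguf",
    n_ctx=16384,      # 16k context, per the VRAM note above
    n_gpu_layers=-1,  # offload all layers to GPU
)

out = llm("[INST] Describe quantum computing to a layperson. [/INST]", max_tokens=256)
print(out["choices"][0]["text"])
```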