Update README.md
README.md CHANGED
@@ -11,11 +11,11 @@ fp16 merged weights can be found here: https://huggingface.co/bhenrym14/airoboro
 
 ## Overview
 
-This is [Jon Durbin's Airoboros 13B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-13b-gpt4-1.4) (
+This is [Jon Durbin's Airoboros 13B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-13b-gpt4-1.4) (LoRA weights) with several key modifications:
 - Context length extended to 8192 by RoPE Scaled Embeddings, but NOT via the superHOT LoRA. I started with base Llama-13b.
 - Training sequences beyond 2048 have the target truncated to equal 2048.
 - Used airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4
 - **This is a QLoRA fine-tune**. The original 13b model is a full fine-tune.
 
-It was trained on 1x RTX 6000 Ada for ~
+It was trained on 1x RTX 6000 Ada for ~17 hours.
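For readers unfamiliar with the technique, the sketch below illustrates what "RoPE Scaled Embeddings" means in the README: position indices are compressed by a fixed factor before the rotary sin/cos tables are built, so an 8192-token sequence maps into the 2048-position range the base Llama-13b model was pretrained on. This is a minimal illustration, not the training code from this repo; the class name `ScaledRotaryEmbedding` and the `scale=4.0` value (8192 / 2048) are assumptions for the example.

```python
import torch

class ScaledRotaryEmbedding(torch.nn.Module):
    """Minimal sketch of linearly scaled RoPE (assumed scale = 8192 / 2048 = 4)."""

    def __init__(self, dim, base=10000, scale=4.0):
        super().__init__()
        # Standard RoPE inverse frequencies, one per pair of head dimensions.
        inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
        self.register_buffer("inv_freq", inv_freq)
        self.scale = scale

    def forward(self, seq_len, device="cpu"):
        # Compress the position indices by `scale` before building the
        # sin/cos tables; everything else is vanilla RoPE.
        t = torch.arange(seq_len, device=device, dtype=torch.float32) / self.scale
        freqs = torch.einsum("i,j->ij", t, self.inv_freq.to(device))
        emb = torch.cat((freqs, freqs), dim=-1)
        return emb.cos(), emb.sin()

# Example: build tables covering an 8192-token context for a 128-dim head.
cos, sin = ScaledRotaryEmbedding(dim=128)(seq_len=8192)
```

The same scale factor has to be applied at inference time; otherwise positions beyond 2048 fall outside the distribution the fine-tune saw.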