bhenrym14 committed
Commit: f73a957
Parent: 7254cf0

Update README.md

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -11,11 +11,11 @@ fp16 merged weights can be found here: https://huggingface.co/bhenrym14/airoboro
## Overview

- This is [Jon Durbin's Airoboros 13B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-13b-gpt4-1.4) (merged model with GPTQ Quantization) with several key modifications:
+ This is [Jon Durbin's Airoboros 13B GPT4 1.4](https://huggingface.co/jondurbin/airoboros-13b-gpt4-1.4) (LoRA weights) with several key modifications:
  - Context length extended to 8192 by RoPE Scaled Embeddings, but NOT via the superHOT LoRA. I started with base Llama-13b.
  - Training sequences beyond 2048 have the target truncated to equal 2048.
  - Used airoboros-gpt4-1.4.1 dataset instead of airoboros-gpt4-1.4
  - **This is a QLoRA fine-tune**. The original 13b model is a full fine-tune.

- It was trained on 1x RTX 6000 Ada for ~18 hours.
+ It was trained on 1x RTX 6000 Ada for ~17 hours.
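For context, the "RoPE Scaled Embeddings" change referenced in the README is commonly implemented by interpolating rotary position indices so that an 8192-token window maps back onto the base model's original 2048-token range. The sketch below shows that general idea in PyTorch; it is an assumption about the mechanism, not the code behind this commit, and the function names (`build_scaled_rope_cache`, `apply_rope`) are purely illustrative.

```python
# Illustrative sketch only -- NOT the training code from this commit.
# Assumes "RoPE Scaled Embeddings" means linear position interpolation:
# positions for an 8192-token window are compressed by 2048/8192 so they
# stay inside the range the base Llama-13b model was trained on.
import torch

def build_scaled_rope_cache(seq_len: int, head_dim: int,
                            base: float = 10000.0,
                            scale: float = 2048 / 8192):
    """Precompute cos/sin tables for rotary embeddings with scaled positions."""
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float() * scale   # interpolated positions
    angles = torch.outer(positions, inv_freq)            # (seq_len, head_dim/2)
    emb = torch.cat((angles, angles), dim=-1)            # (seq_len, head_dim)
    return emb.cos(), emb.sin()

def rotate_half(x):
    """Swap and negate halves of the last dimension, as in standard RoPE."""
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(q, k, cos, sin):
    """Apply rotary embedding to query/key tensors of shape (..., seq_len, head_dim)."""
    return q * cos + rotate_half(q) * sin, k * cos + rotate_half(k) * sin
```

With `scale = 1.0` this reduces to standard RoPE; the 2048/8192 factor is what lets positions beyond the original training length reuse rotation angles the base model has already seen.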