---
license: other
---

- [wandb](https://wandb.ai/open-assistant/supervised-finetuning/runs/2jfazjt9) (still internal, needs to be moved to public-sft)
- checkpoint: 3319 steps

## Note

To load this model, you need to install a pre-release version of the Hugging Face `transformers` library.
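
A minimal loading sketch, assuming the checkpoint loads through the standard `AutoModelForCausalLM` / `AutoTokenizer` classes once a recent enough `transformers` build is installed (for example from source with `pip install git+https://github.com/huggingface/transformers`). The `load_model` helper and the `model_path` argument are hypothetical names used for illustration; this card does not state the repository id or the exact minimum version.

```python
def load_model(model_path: str):
    """Load tokenizer and model in fp16 (the dtype used for training).

    `model_path` is a placeholder: pass the Hub repo id or a local
    directory containing this checkpoint.
    """
    # Imported lazily so the helper can be defined without the libraries installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,  # matches `dtype: fp16` in the training config
    )
    return tokenizer, model


# Example call (not executed here; requires the weights):
# tokenizer, model = load_model("path/to/this-checkpoint")
```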


## Model Configuration
```
llama2_13b_orca_8k:
  rng_seed: 0xe1291f1a
  use_custom_sampler: true
  sort_by_length: false
  dtype: fp16
  log_dir: "llama2_log_13b_orca_8k"
  learning_rate: 1e-5
  model_name: /mnt/data/llama2/Llama-2-13b-hf/
  output_dir: llama2_13b_orca_8k
  deepspeed_config: configs/zero_config_pretrain.json
  weight_decay: 0.0
  max_length: 8192
  warmup_steps: 100
  use_flash_attention: true
  gradient_checkpointing: true
  gradient_accumulation_steps: 8
  per_device_train_batch_size: 2
  per_device_eval_batch_size: 1
  residual_dropout: 0.0
  eval_steps: 200
  save_steps: 1000  # (total steps: 3319)
  num_train_epochs: 1
  save_total_limit: 4
  superhot: true
  superhot_config:
    type: linear
    scale: 2
  datasets:
    # Dataset Composition:
    # Train (sampled):
    #   orca-chat: 100.00% (188842)
    #   fanfics: 100.00% (47760)
    #   red_pajama: 25.00% (188262)
    # Valid:
    #   orca-chat: 5000 (71.43%)
    #   fanfics: 1000 (14.29%)
    #   red_pajama: 1000 (14.29%)
    - orca-chat:
        max_val_set: 5000
    - fanfics:
        max_chunk_size: 65535
        max_val_set: 1000
    - red_pajama:
        fraction: 0.25
        max_val_set: 1000
        max_chunk_size: 65535
  peft_model: false
```
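
The `superhot` settings above (`type: linear`, `scale: 2`) together with `max_length: 8192` suggest linear position interpolation of the rotary embeddings: position indices are divided by the scale factor so that an 8192-token sequence falls inside the 4096-position range Llama 2 was pretrained on. The sketch below illustrates that idea only; it is not the training code. The `rope_angles` helper, the head dimension of 128, and the base of 10000 are assumptions used for illustration.

```python
NATIVE_CONTEXT = 4096  # Llama 2 pretraining context length
SCALE = 2              # superhot_config.scale
MAX_LENGTH = 8192      # max_length


def rope_angles(position: int, head_dim: int = 128,
                scale: float = 1.0, base: float = 10000.0) -> list[float]:
    """Rotary angles for one position under linear position interpolation.

    Assumed illustration: the position is divided by `scale` before the
    usual per-frequency rotation angles are computed.
    """
    scaled_pos = position / scale
    return [scaled_pos / base ** (2 * i / head_dim) for i in range(head_dim // 2)]


# With scale 2, position 8190 gets the same angles as unscaled position 4095.
same = rope_angles(8190, scale=SCALE) == rope_angles(4095, scale=1.0)

# The last position of an 8192-token sequence stays below the native range.
last_scaled_position = (MAX_LENGTH - 1) / SCALE
```

Under this reading, inference code has to apply the same scale factor of 2; running the checkpoint with unscaled positions would feed the model rotary angles it was not fine-tuned on.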