Heralax committed
Commit 83711a8
1 Parent(s): 1beed45

Update README.md

Files changed (1)
  1. README.md +1 -93
README.md CHANGED
@@ -9,100 +9,8 @@ model-index:
  results: []
  ---
 
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
 
- [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
- <details><summary>See axolotl config</summary>
-
- axolotl version: `0.4.1`
- ```yaml
- base_model: alpindale/Mistral-7B-v0.2-hf
- tokenizer_type: AutoTokenizer
- is_mistral_derived_model: true
- load_in_8bit: false
- load_in_4bit: false
- strict: false
-
- datasets:
-   - path: json
-     data_files: hidden_pretraining-us-army.jsonl
-     ds_type: json
-     type: completion
-
-
- dataset_prepared_path: last_run_prepared
- output_dir: ./army-pretraining
-
- sequence_len: 4096
- sample_packing: false
- pad_to_sequence_len: true
- shuffle_merged_datasets: true
-
- wandb_project: mistral-army
- wandb_entity:
- wandb_watch:
- wandb_run_id:
- wandb_log_model:
-
- gradient_accumulation_steps: 6
- micro_batch_size: 2
- eval_batch_size: 1
- num_epochs: 11
- optimizer: paged_adamw_8bit
- lr_scheduler: cosine
- learning_rate: 0.000020
- weight_decay: 0
- # Gradient clipping max norm
- max_grad_norm: 1.0
- noisy_embedding_alpha: 0
- train_on_inputs: false
- group_by_length: false
- bf16: true
- fp16: false
- tf32: false
-
- gradient_checkpointing: unsloth
- early_stopping_patience:
- resume_from_checkpoint:
- logging_steps: 1
- xformers_attention:
- flash_attention: true
-
- chat_template: chatml
-
- warmup_ratio: 0.5
- auto_resume_from_checkpoints: false
- #warmup_ratio: 0.5
- eval_steps: 10
- saves_per_epoch: 1
- eval_sample_packing: false
- save_total_limit: 3
- debug:
- deepspeed: deepspeed_configs/zero2.json
- special_tokens:
-   pad_token: "<|end_of_text|>"
- ```
-
- </details><br>
-
- # army-pretraining
-
- This model is a fine-tuned version of [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) on the None dataset.
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
- More information needed
-
- ## Training procedure
+ The pretrained base of the Mistrillitary model.
 
  ### Training hyperparameters
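
Since the updated card reduces to a single line, a minimal loading sketch for the resulting checkpoint may help. This is an illustration, not part of the commit: the Hub id `Heralax/army-pretraining` is an assumption inferred from the deleted card's `# army-pretraining` heading, and the bf16 dtype mirrors `bf16: true` in the deleted axolotl config.

```python
# Minimal loading sketch (illustration only, not part of this commit).
# The repo id below is an assumption taken from the deleted card's
# "# army-pretraining" heading; substitute the actual Hub id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Heralax/army-pretraining"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches bf16: true in the deleted config
    device_map="auto",           # requires the accelerate package
)

# The deleted config trained with type: completion (plain continued pretraining),
# so prompt with raw text rather than a chat template.
prompt = "The primary mission of the infantry squad is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

For reference, the deleted config implies an effective batch of `micro_batch_size` 2 × `gradient_accumulation_steps` 6 = 12 sequences per optimizer step per device.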