ZhangShenao committed
Commit b561367
1 Parent(s): fc720f3

Update README.md

Files changed (1)
  1. README.md +1 -9
README.md CHANGED
@@ -36,7 +36,7 @@ This model is a fine-tuned version of [ZhangShenao/SELM-Llama-3-8B-Instruct-iter
 
 
 
- - Model type: A 8B parameter Llama3-based Self-Exploring Language Models (SELM).
+ - Model type: A 8B parameter Llama3-instruct-based Self-Exploring Language Models (SELM).
  - License: MIT
 
 
@@ -59,22 +59,14 @@ The following hyperparameters were used during training:
  - alpha: 0.0001
  - beta: 0.01
  - train_batch_size: 4
- - eval_batch_size: 4
  - seed: 42
  - distributed_type: multi-GPU
  - num_devices: 8
  - gradient_accumulation_steps: 4
  - total_train_batch_size: 128
- - total_eval_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_ratio: 0.1
  - num_epochs: 1
 
- ### Training results
-
-
-
  ### Framework versions
 
  - Transformers 4.40.2
 
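For context on the hyperparameters this commit retains: they are mutually consistent, since train_batch_size × num_devices × gradient_accumulation_steps = 4 × 8 × 4 = 128, which matches total_train_batch_size. The sketch below shows one plausible way these values map onto Hugging Face `TrainingArguments`. The field names are real `transformers` arguments, but the mapping itself (and the output path) is an assumption, not taken from this repo's training script; the SELM-specific `alpha` and `beta` have no standard `TrainingArguments` counterpart and are omitted.

```python
from transformers import TrainingArguments

# Sketch only: mapping the README's hyperparameters onto standard
# transformers TrainingArguments fields is an assumption for illustration.
# The SELM-specific alpha/beta values are method parameters and are omitted.
args = TrainingArguments(
    output_dir="selm-llama-3-8b-instruct",  # hypothetical output path
    per_device_train_batch_size=4,  # train_batch_size: 4
    gradient_accumulation_steps=4,  # gradient_accumulation_steps: 4
    seed=42,                        # seed: 42
    num_train_epochs=1,             # num_epochs: 1
    adam_beta1=0.9,                 # optimizer: Adam with betas=(0.9,0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,              # and epsilon=1e-08
)

# With num_devices: 8, the effective batch size is
# 4 per device * 8 devices * 4 accumulation steps = 128,
# matching total_train_batch_size: 128.
assert 4 * 8 * 4 == 128
```

Since the commit removes the eval batch settings and the scheduler overrides, only the training-side values above remain documented in the README.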