sumuks committed on
Commit 3daac94 · verified · 1 Parent(s): 3b4552f

Model save

Files changed (1)
  1. README.md +7 -10
README.md CHANGED
@@ -16,9 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 # full_review
 
-This model is a fine-tuned version of [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) on the openreview_full_review dataset.
+This model is a fine-tuned version of [Qwen/Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6107
+- Loss: 1.6306
 
 ## Model description
 
@@ -42,9 +42,9 @@ The following hyperparameters were used during training:
 - eval_batch_size: 1
 - seed: 42
 - distributed_type: multi-GPU
-- num_devices: 4
-- total_train_batch_size: 32
-- total_eval_batch_size: 4
+- num_devices: 8
+- total_train_batch_size: 64
+- total_eval_batch_size: 8
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.1
@@ -54,11 +54,8 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 1.6658 | 0.5089 | 600 | 1.6607 |
-| 1.5992 | 1.0178 | 1200 | 1.6405 |
-| 1.6182 | 1.5267 | 1800 | 1.6241 |
-| 1.5463 | 2.0356 | 2400 | 1.6182 |
-| 1.5356 | 2.5445 | 3000 | 1.6117 |
+| 1.6261 | 1.0169 | 600 | 1.6485 |
+| 1.5922 | 2.0339 | 1200 | 1.6306 |
 
 
 ### Framework versions
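A quick consistency check on the numbers in this diff may help when comparing the two runs. The sketch below assumes the usual Trainer convention that the reported total batch size is the per-device batch size times the number of devices times any gradient accumulation steps, and that each logged step consumes one total batch. The per-device training batch size and accumulation setting are not visible in the diff, so only their product is recovered. `RunConfig` and its method names are hypothetical, introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Values copied from the model card; the class itself is illustrative."""
    num_devices: int
    total_train_batch_size: int
    total_eval_batch_size: int
    logged_step: int      # step of the first row in the results table
    logged_epoch: float   # epoch reported at that step

    def train_examples_per_device_step(self) -> float:
        # Assumed: total = per_device * num_devices * grad_accum.
        # Only per_device * grad_accum can be recovered from the card.
        return self.total_train_batch_size / self.num_devices

    def eval_batch_per_device(self) -> float:
        return self.total_eval_batch_size / self.num_devices

    def steps_per_epoch(self) -> float:
        return self.logged_step / self.logged_epoch

    def approx_examples_per_epoch(self) -> float:
        # Assumed: every optimizer step consumes one full total batch.
        return self.steps_per_epoch() * self.total_train_batch_size


before = RunConfig(4, 32, 4, logged_step=600, logged_epoch=0.5089)
after = RunConfig(8, 64, 8, logged_step=600, logged_epoch=1.0169)

for name, run in (("before", before), ("after", after)):
    print(
        f"{name}: per-device train product={run.train_examples_per_device_step():.0f}, "
        f"per-device eval batch={run.eval_batch_per_device():.0f}, "
        f"steps/epoch={run.steps_per_epoch():.1f}, "
        f"examples/epoch={run.approx_examples_per_epoch():.0f}"
    )
```

Under these assumptions, both runs imply the same per-device workload and roughly the same number of training examples per epoch. That suggests the commit doubled the device count and the total batch size, which halves the number of steps per epoch, rather than changing the amount of data seen per epoch.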