---
license: cc-by-nc-4.0
base_model: davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1
tags:
- generated_from_trainer
model-index:
- name: ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter2

This model is a fine-tuned version of [davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1](https://huggingface.co/davidberenstein1957/ultra-feedback-dutch-cleaned-hq-spin-geitje-7b-ultra-sft_iter1) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0162
- Rewards/real: -8.1731
- Rewards/generated: -31.3826
- Rewards/accuracies: 0.9917
- Rewards/margins: 23.2095
- Logps/generated: -956.3063
- Logps/real: -525.1735
- Logits/generated: -1.5719
- Logits/real: -1.7813
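
As a sanity check on the metrics above, the reward margin is simply the gap between the reward on real and generated responses. A quick sketch of that arithmetic, using the final evaluation values reported here:

```python
# Final evaluation rewards reported above.
reward_real = -8.1731
reward_generated = -31.3826

# The margin is the reward gap between real and model-generated responses.
reward_margin = reward_real - reward_generated
print(round(reward_margin, 4))  # 23.2095
```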

## Model description

More information needed
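
The metric names above (`Rewards/real`, `Rewards/generated`, `Rewards/margins`) suggest a SPIN-style, DPO-like pairwise objective that prefers real responses over the model's own generations. A minimal, hypothetical sketch of such a per-pair loss (an illustration of the general technique, not the exact training code):

```python
import math

def pairwise_loss(reward_real: float, reward_generated: float) -> float:
    """DPO/SPIN-style logistic loss: pushes the real response's
    reward above the generated response's reward."""
    margin = reward_real - reward_generated
    # -log(sigmoid(margin)); approaches 0 as the margin grows.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A larger margin yields a smaller loss, matching the trend in the
# training results below (rewards taken from the first and last eval rows).
print(pairwise_loss(-0.62, -1.43) > pairwise_loss(-8.17, -31.38))  # True
```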

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 2
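
The total batch sizes listed above follow from the per-device batch size, the number of devices, and (for training) the gradient accumulation steps. A minimal sketch of that relationship:

```python
# Hyperparameters listed above.
train_batch_size = 8             # per-device
num_devices = 4                  # multi-GPU
gradient_accumulation_steps = 2

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 64

# Evaluation does no gradient accumulation.
total_eval_batch_size = train_batch_size * num_devices
print(total_eval_batch_size)  # 32
```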

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|:-------------:|:-----:|:----:|:---------------:|:------------:|:-----------------:|:------------------:|:---------------:|:---------------:|:----------:|:----------------:|:-----------:|
| 0.6097        | 0.04  | 25   | 0.4147          | -0.6192      | -1.4312           | 0.9250             | 0.8120          | -656.7919       | -449.6341  | -2.0004          | -2.0773     |
| 0.2137        | 0.08  | 50   | 0.1745          | -2.0300      | -5.0060           | 0.9519             | 2.9761          | -692.5404       | -463.7422  | -1.9306          | -2.0237     |
| 0.1292        | 0.12  | 75   | 0.1012          | -2.8227      | -7.4967           | 0.9685             | 4.6740          | -717.4471       | -471.6697  | -1.8843          | -1.9887     |
| 0.0665        | 0.16  | 100  | 0.0676          | -3.2936      | -9.3177           | 0.9778             | 6.0240          | -735.6567       | -476.3786  | -1.8508          | -1.9628     |
| 0.0429        | 0.21  | 125  | 0.0477          | -3.7328      | -11.2722          | 0.9824             | 7.5395          | -755.2025       | -480.7701  | -1.8123          | -1.9332     |
| 0.0299        | 0.25  | 150  | 0.0369          | -4.2161      | -13.2599          | 0.9870             | 9.0437          | -775.0787       | -485.6039  | -1.7938          | -1.9226     |
| 0.0252        | 0.29  | 175  | 0.0320          | -4.7201      | -15.0489          | 0.9880             | 10.3288         | -792.9691       | -490.6432  | -1.7758          | -1.9116     |
| 0.0249        | 0.33  | 200  | 0.0301          | -5.0757      | -16.3570          | 0.9880             | 11.2813         | -806.0497       | -494.1995  | -1.7515          | -1.8923     |
| 0.0175        | 0.37  | 225  | 0.0273          | -5.4299      | -17.6751          | 0.9880             | 12.2451         | -819.2310       | -497.7419  | -1.7362          | -1.8821     |
| 0.0183        | 0.41  | 250  | 0.0254          | -5.4183      | -18.3899          | 0.9889             | 12.9715         | -826.3791       | -497.6259  | -1.7300          | -1.8793     |
| 0.0182        | 0.45  | 275  | 0.0245          | -6.0900      | -20.5760          | 0.9889             | 14.4860         | -848.2401       | -504.3426  | -1.6961          | -1.8564     |
| 0.0253        | 0.49  | 300  | 0.0224          | -5.9239      | -20.7184          | 0.9898             | 14.7944         | -849.6640       | -502.6819  | -1.6938          | -1.8573     |
| 0.0075        | 0.53  | 325  | 0.0234          | -7.0436      | -24.1126          | 0.9898             | 17.0691         | -883.6064       | -513.8781  | -1.6522          | -1.8252     |
| 0.0141        | 0.58  | 350  | 0.0212          | -5.5696      | -20.9714          | 0.9898             | 15.4017         | -852.1937       | -499.1387  | -1.7082          | -1.8693     |
| 0.0135        | 0.62  | 375  | 0.0182          | -5.2646      | -20.3901          | 0.9907             | 15.1254         | -846.3809       | -496.0890  | -1.7285          | -1.8897     |
| 0.014         | 0.66  | 400  | 0.0182          | -5.5057      | -21.1579          | 0.9907             | 15.6522         | -854.0594       | -498.4994  | -1.7137          | -1.8783     |
| 0.0122        | 0.7   | 425  | 0.0172          | -5.3398      | -20.7520          | 0.9907             | 15.4122         | -849.9997       | -496.8405  | -1.7231          | -1.8857     |
| 0.0144        | 0.74  | 450  | 0.0164          | -4.6606      | -19.3766          | 0.9917             | 14.7160         | -836.2463       | -490.0483  | -1.7465          | -1.9042     |
| 0.0103        | 0.78  | 475  | 0.0160          | -4.8739      | -20.1058          | 0.9907             | 15.2319         | -843.5385       | -492.1819  | -1.7445          | -1.9064     |
| 0.0147        | 0.82  | 500  | 0.0156          | -5.1220      | -20.9607          | 0.9917             | 15.8387         | -852.0875       | -494.6623  | -1.7434          | -1.9092     |
| 0.0154        | 0.86  | 525  | 0.0155          | -5.1481      | -21.3994          | 0.9917             | 16.2513         | -856.4740       | -494.9235  | -1.7357          | -1.9040     |
| 0.0158        | 0.91  | 550  | 0.0151          | -5.6088      | -22.9532          | 0.9917             | 17.3444         | -872.0123       | -499.5304  | -1.7139          | -1.8881     |
| 0.0053        | 0.95  | 575  | 0.0149          | -5.7209      | -23.5217          | 0.9917             | 17.8008         | -877.6972       | -500.6515  | -1.7113          | -1.8888     |
| 0.008         | 0.99  | 600  | 0.0147          | -5.7523      | -23.7474          | 0.9917             | 17.9952         | -879.9544       | -500.9651  | -1.7086          | -1.8878     |
| 0.0049        | 1.03  | 625  | 0.0154          | -6.1839      | -24.8883          | 0.9907             | 18.7044         | -891.3632       | -505.2818  | -1.6731          | -1.8585     |
| 0.0057        | 1.07  | 650  | 0.0155          | -6.4947      | -25.8924          | 0.9917             | 19.3977         | -901.4037       | -508.3892  | -1.6592          | -1.8484     |
| 0.0076        | 1.11  | 675  | 0.0158          | -6.8543      | -26.9217          | 0.9917             | 20.0674         | -911.6970       | -511.9859  | -1.6407          | -1.8339     |
| 0.004         | 1.15  | 700  | 0.0158          | -7.1325      | -27.7743          | 0.9917             | 20.6418         | -920.2236       | -514.7678  | -1.6269          | -1.8236     |
| 0.0168        | 1.19  | 725  | 0.0157          | -6.9019      | -26.2791          | 0.9917             | 19.3772         | -905.2711       | -512.4611  | -1.6566          | -1.8448     |
| 0.0022        | 1.23  | 750  | 0.0163          | -6.9586      | -26.5145          | 0.9917             | 19.5559         | -907.6251       | -513.0281  | -1.6533          | -1.8423     |
| 0.0039        | 1.28  | 775  | 0.0165          | -7.5386      | -28.2224          | 0.9917             | 20.6837         | -924.7038       | -518.8289  | -1.6369          | -1.8327     |
| 0.002         | 1.32  | 800  | 0.0165          | -7.6568      | -28.6441          | 0.9907             | 20.9872         | -928.9208       | -520.0109  | -1.6365          | -1.8344     |
| 0.002         | 1.36  | 825  | 0.0165          | -7.7989      | -29.2028          | 0.9917             | 21.4038         | -934.5078       | -521.4318  | -1.6348          | -1.8352     |
| 0.0019        | 1.4   | 850  | 0.0165          | -7.8978      | -29.5958          | 0.9917             | 21.6980         | -938.4382       | -522.4203  | -1.6166          | -1.8169     |
| 0.0041        | 1.44  | 875  | 0.0162          | -7.9696      | -29.7930          | 0.9917             | 21.8234         | -940.4100       | -523.1380  | -1.6165          | -1.8176     |
| 0.0023        | 1.48  | 900  | 0.0164          | -8.2086      | -30.6909          | 0.9917             | 22.4823         | -949.3892       | -525.5286  | -1.6045          | -1.8093     |
| 0.0038        | 1.52  | 925  | 0.0166          | -8.1217      | -30.6727          | 0.9917             | 22.5510         | -949.2076       | -524.6597  | -1.5919          | -1.7978     |
| 0.0096        | 1.56  | 950  | 0.0162          | -7.8257      | -30.1144          | 0.9917             | 22.2887         | -943.6237       | -521.6992  | -1.5909          | -1.7956     |
| 0.0057        | 1.6   | 975  | 0.0166          | -8.0335      | -30.6654          | 0.9917             | 22.6319         | -949.1342       | -523.7775  | -1.5854          | -1.7919     |
| 0.0046        | 1.65  | 1000 | 0.0165          | -8.1757      | -31.0139          | 0.9917             | 22.8382         | -952.6191       | -525.2000  | -1.5768          | -1.7852     |
| 0.0009        | 1.69  | 1025 | 0.0165          | -8.0553      | -30.7565          | 0.9917             | 22.7012         | -950.0453       | -523.9951  | -1.5757          | -1.7830     |
| 0.002         | 1.73  | 1050 | 0.0164          | -8.1838      | -31.3365          | 0.9917             | 23.1528         | -955.8453       | -525.2800  | -1.5692          | -1.7790     |
| 0.0069        | 1.77  | 1075 | 0.0163          | -8.1908      | -31.4118          | 0.9917             | 23.2210         | -956.5981       | -525.3508  | -1.5749          | -1.7850     |
| 0.0029        | 1.81  | 1100 | 0.0166          | -8.4138      | -32.0830          | 0.9917             | 23.6692         | -963.3098       | -527.5802  | -1.5624          | -1.7752     |
| 0.0047        | 1.85  | 1125 | 0.0166          | -8.4223      | -32.1526          | 0.9917             | 23.7304         | -964.0065       | -527.6652  | -1.5631          | -1.7759     |
| 0.0037        | 1.89  | 1150 | 0.0163          | -8.1563      | -31.3209          | 0.9917             | 23.1646         | -955.6895       | -525.0057  | -1.5739          | -1.7832     |
| 0.0026        | 1.93  | 1175 | 0.0163          | -8.2107      | -31.5009          | 0.9917             | 23.2901         | -957.4888       | -525.5498  | -1.5708          | -1.7807     |
| 0.0058        | 1.98  | 1200 | 0.0162          | -8.1731      | -31.3826          | 0.9917             | 23.2095         | -956.3063       | -525.1735  | -1.5719          | -1.7813     |


### Framework versions

- Transformers 4.37.0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2