End of training

Files changed:
- README.md (+20 -64)
- model-00001-of-00002.safetensors (+1 -1)
- model-00002-of-00002.safetensors (+1 -1)
- training_args.bin (+1 -1)
README.md
CHANGED
@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Num Input Tokens Seen:
+- Loss: 1.4374
+- Num Input Tokens Seen: 5219176
 
 ## Model description
 
@@ -53,68 +53,24 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
 |:-------------:|:------:|:----:|:---------------:|:-----------------:|
 | No log        | 0      | 0    | 1.3956          | 0                 |
-| 1.
-| 1.
-| 1.
-| 1.
-
-
-
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.3516 | 0.3035 | 95 | 1.5590 | 5225616 |
-| 0.3268 | 0.3195 | 100 | 1.5276 | 5501384 |
-| 0.2797 | 0.3355 | 105 | 1.5182 | 5780224 |
-| 0.2672 | 0.3514 | 110 | 1.5094 | 6059920 |
-| 0.2777 | 0.3674 | 115 | 1.5643 | 6340400 |
-| 0.2453 | 0.3834 | 120 | 1.5118 | 6613416 |
-| 0.2118 | 0.3994 | 125 | 1.5538 | 6880920 |
-| 0.2029 | 0.4153 | 130 | 1.4862 | 7158296 |
-| 0.2346 | 0.4313 | 135 | 1.5006 | 7438416 |
-| 0.2331 | 0.4473 | 140 | 1.5101 | 7717112 |
-| 0.2371 | 0.4633 | 145 | 1.4671 | 7999472 |
-| 0.1374 | 0.4792 | 150 | 1.5032 | 8276904 |
-| 0.2166 | 0.4952 | 155 | 1.5048 | 8559152 |
-| 0.17 | 0.5112 | 160 | 1.4610 | 8835280 |
-| 0.119 | 0.5272 | 165 | 1.4604 | 9116904 |
-| 0.1284 | 0.5431 | 170 | 1.4638 | 9391744 |
-| 0.1899 | 0.5591 | 175 | 1.4611 | 9663568 |
-| 0.1606 | 0.5751 | 180 | 1.4262 | 9931184 |
-| 0.1786 | 0.5911 | 185 | 1.4451 | 10209616 |
-| 0.1505 | 0.6070 | 190 | 1.4156 | 10486896 |
-| 0.1507 | 0.6230 | 195 | 1.4259 | 10767808 |
-| 0.1546 | 0.6390 | 200 | 1.4544 | 11046344 |
-| 0.1364 | 0.6550 | 205 | 1.4106 | 11327440 |
-| 0.1248 | 0.6709 | 210 | 1.4525 | 11606960 |
-| 0.1672 | 0.6869 | 215 | 1.4773 | 11880848 |
-| 0.1577 | 0.7029 | 220 | 1.4349 | 12161672 |
-| 0.1245 | 0.7188 | 225 | 1.4439 | 12442664 |
-| 0.1302 | 0.7348 | 230 | 1.4415 | 12713624 |
-| 0.0654 | 0.7508 | 235 | 1.4390 | 12990952 |
-| 0.1354 | 0.7668 | 240 | 1.4475 | 13266200 |
-| 0.2104 | 0.7827 | 245 | 1.4559 | 13541912 |
-| 0.1072 | 0.7987 | 250 | 1.4351 | 13819320 |
-| 0.1398 | 0.8147 | 255 | 1.4095 | 14098680 |
-| 0.1013 | 0.8307 | 260 | 1.4447 | 14379424 |
-| 0.1324 | 0.8466 | 265 | 1.4160 | 14657904 |
-| 0.1668 | 0.8626 | 270 | 1.4443 | 14933688 |
-| 0.1894 | 0.8786 | 275 | 1.4766 | 15212440 |
-| 0.1551 | 0.8946 | 280 | 1.4633 | 15493152 |
-| 0.1221 | 0.9105 | 285 | 1.4492 | 15772512 |
-| 0.1201 | 0.9265 | 290 | 1.4642 | 16049992 |
-| 0.1003 | 0.9425 | 295 | 1.4544 | 16322976 |
-| 0.1042 | 0.9585 | 300 | 1.4601 | 16601528 |
-| 0.1254 | 0.9744 | 305 | 1.4701 | 16877264 |
-| 0.1313 | 0.9904 | 310 | 1.4481 | 17156976 |
+| 1.7668 | 0.0533 | 5 | 1.3663 | 278592 |
+| 1.5612 | 0.1065 | 10 | 1.2589 | 555472 |
+| 1.3399 | 0.1598 | 15 | 1.1962 | 837344 |
+| 1.1995 | 0.2130 | 20 | 1.2034 | 1118136 |
+| 0.902 | 0.2663 | 25 | 1.2406 | 1401192 |
+| 0.7229 | 0.3196 | 30 | 1.3380 | 1684336 |
+| 0.4769 | 0.3728 | 35 | 1.4243 | 1969096 |
+| 0.4301 | 0.4261 | 40 | 1.4694 | 2250056 |
+| 0.3838 | 0.4794 | 45 | 1.4889 | 2523440 |
+| 0.3293 | 0.5326 | 50 | 1.4766 | 2802544 |
+| 0.2194 | 0.5859 | 55 | 1.4711 | 3086432 |
+| 0.2283 | 0.6391 | 60 | 1.4290 | 3359744 |
+| 0.1524 | 0.6924 | 65 | 1.4481 | 3638328 |
+| 0.2296 | 0.7457 | 70 | 1.4577 | 3917992 |
+| 0.1773 | 0.7989 | 75 | 1.4764 | 4199640 |
+| 0.1202 | 0.8522 | 80 | 1.4139 | 4486008 |
+| 0.178 | 0.9055 | 85 | 1.4641 | 4764120 |
+| 0.1187 | 0.9587 | 90 | 1.4421 | 5045664 |
 
 
 ### Framework versions
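The training log above shows training loss falling steadily while validation loss bottoms out early (1.1962 at step 15) and then drifts back up, a common overfitting signature. As an illustrative sketch (not part of this repo's training script), the logged metrics can be scanned to pick the checkpoint with the lowest validation loss rather than the lowest training loss:

```python
# A few (step, training_loss, validation_loss, input_tokens_seen) rows
# copied from the model card table above.
log = [
    (5, 1.7668, 1.3663, 278592),
    (10, 1.5612, 1.2589, 555472),
    (15, 1.3399, 1.1962, 837344),
    (20, 1.1995, 1.2034, 1118136),
    (25, 0.902, 1.2406, 1401192),
    (30, 0.7229, 1.3380, 1684336),
    (35, 0.4769, 1.4243, 1969096),
    (40, 0.4301, 1.4694, 2250056),
]

# Select the best checkpoint by validation loss (index 2),
# since training loss keeps improving even while the model overfits.
best = min(log, key=lambda row: row[2])
print(f"best checkpoint: step {best[0]}, validation loss {best[2]}")
# → best checkpoint: step 15, validation loss 1.1962
```

This is what `transformers.Trainer`'s `load_best_model_at_end` / `metric_for_best_model` options automate when checkpoints are saved at each evaluation step; the run here appears to have kept the final weights instead.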
model-00001-of-00002.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:bd6a415e8ad7afdec23196036e9e4d520a081e7cb33aab93e52b647068a3e32d
 size 4988025760
model-00002-of-00002.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:d9d4b2ce53e6d37c9d9c438a6d19c511ff58a653bed78fabb8a8cbb7b35d3c6f
 size 240691728
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:3db2d0637a474da96b24158284fa06930ec866026fbff2b760b39c8d6be77d95
 size 5560
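The three binary files above are stored as Git LFS pointers: a three-line text stub (spec version, SHA-256 object id, byte size) that Git tracks in place of the actual weights. A minimal sketch of how such a pointer is derived from a blob, using only the standard library (`lfs_pointer` is an illustrative helper, not part of any repo tooling):

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build a Git LFS pointer file for a blob: the same three-line
    layout (version, oid sha256, size) seen in the diffs above."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


print(lfs_pointer(b"hello"))
```

Because the oid is a content hash, a changed pointer line in a diff (as in each `-oid`/`+oid` pair above) means the underlying file's bytes changed, even when its size stayed identical.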