C0uchP0tat0/my_rugpt3medium_finetune

@@ -12,9 +12,9 @@ should probably proofread and complete it, then remove this comment. -->
 # my_rugpt3medium_finetune
-This model is a fine-tuned version of [ai-forever/rugpt3medium_based_on_gpt2](https://huggingface.co/ai-forever/rugpt3medium_based_on_gpt2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.9269
 ## Model description
@@ -42,28 +42,88 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 1000
-- num_epochs: 25
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.601         | 1.6   | 25   | 3.6157          |
-| 3.601         | 3.19  | 50   | 3.6010          |
-| 3.5542        | 4.79  | 75   | 3.5621          |
-| 3.5309        | 6.38  | 100  | 3.5117          |
-| 3.496         | 7.98  | 125  | 3.4615          |
-| 3.446         | 9.57  | 150  | 3.4173          |
-| 3.34          | 11.17 | 175  | 3.3699          |
-| 3.3581        | 12.77 | 200  | 3.3214          |
-| 3.3136        | 14.36 | 225  | 3.2743          |
-| 3.214         | 15.96 | 250  | 3.2227          |
-| 3.2098        | 17.55 | 275  | 3.1738          |
-| 3.1348        | 19.15 | 300  | 3.1153          |
-| 3.0931        | 20.74 | 325  | 3.0561          |
-| 3.0383        | 22.34 | 350  | 2.9922          |
-| 2.9739        | 23.94 | 375  | 2.9269          |
 ### Framework versions

 # my_rugpt3medium_finetune
+This model is a fine-tuned version of [ai-forever/rugpt3medium_based_on_gpt2](https://huggingface.co/ai-forever/rugpt3medium_based_on_gpt2) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.9955
 ## Model description
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 1000
+- num_epochs: 35
 - mixed_precision_training: Native AMP
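The scheduler settings above (cosine decay with 1000 warmup steps) can be sketched in plain Python. This is a minimal illustration of the usual linear-warmup-then-half-cosine shape, not the code the training run used. The base learning rate is not visible in this diff, so `BASE_LR` is a hypothetical value. `TOTAL_STEPS` is an assumption derived from the results table (1350 steps logged at epoch 25.0, so roughly 54 steps per epoch over 35 epochs).

```python
import math

def cosine_lr_with_warmup(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

BASE_LR = 1e-5       # hypothetical: the learning rate is not shown in this diff
WARMUP_STEPS = 1000  # from the hyperparameter list
TOTAL_STEPS = 1890   # assumption: 35 epochs x ~54 steps per epoch

for step in (0, 500, 1000, 1445, 1890):
    lr = cosine_lr_with_warmup(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS)
    print(f"step {step:>4}: lr = {lr:.3e}")
```

Under these assumptions, warmup covers roughly the first half of the 35-epoch run, and the learning rate peaks around step 1000 before decaying.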
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.5373        | 0.46  | 25   | 3.4828          |
+| 3.5265        | 0.93  | 50   | 3.4708          |
+| 3.478         | 1.39  | 75   | 3.4398          |
+| 3.4851        | 1.85  | 100  | 3.3995          |
+| 3.4407        | 2.31  | 125  | 3.3609          |
+| 3.3731        | 2.78  | 150  | 3.3241          |
+| 3.3584        | 3.24  | 175  | 3.2886          |
+| 3.3267        | 3.7   | 200  | 3.2540          |
+| 3.3043        | 4.17  | 225  | 3.2200          |
+| 3.229         | 4.63  | 250  | 3.1853          |
+| 3.2618        | 5.09  | 275  | 3.1508          |
+| 3.1823        | 5.56  | 300  | 3.1164          |
+| 3.172         | 6.02  | 325  | 3.0779          |
+| 3.1354        | 6.48  | 350  | 3.0395          |
+| 3.0899        | 6.94  | 375  | 2.9987          |
+| 3.0741        | 7.41  | 400  | 2.9577          |
+| 3.009         | 7.87  | 425  | 2.9140          |
+| 2.9598        | 8.33  | 450  | 2.8737          |
+| 2.9187        | 8.8   | 475  | 2.8294          |
+| 2.9378        | 9.26  | 500  | 2.7842          |
+| 2.8396        | 9.72  | 525  | 2.7374          |
+| 2.8608        | 10.19 | 550  | 2.6889          |
+| 2.7296        | 10.65 | 575  | 2.6405          |
+| 2.7452        | 11.11 | 600  | 2.5926          |
+| 2.6882        | 11.57 | 625  | 2.5389          |
+| 2.6463        | 12.04 | 650  | 2.4893          |
+| 2.572         | 12.5  | 675  | 2.4356          |
+| 2.5384        | 12.96 | 700  | 2.3788          |
+| 2.5246        | 13.43 | 725  | 2.3296          |
+| 2.4055        | 13.89 | 750  | 2.2747          |
+| 2.3759        | 14.35 | 775  | 2.2155          |
+| 2.3351        | 14.81 | 800  | 2.1606          |
+| 2.286         | 15.28 | 825  | 2.1061          |
+| 2.2694        | 15.74 | 850  | 2.0504          |
+| 2.1745        | 16.2  | 875  | 1.9967          |
+| 2.1053        | 16.67 | 900  | 1.9411          |
+| 2.1184        | 17.13 | 925  | 1.8878          |
+| 2.0107        | 17.59 | 950  | 1.8362          |
+| 2.027         | 18.06 | 975  | 1.7854          |
+| 1.9153        | 18.52 | 1000 | 1.7304          |
+| 1.9267        | 18.98 | 1025 | 1.6854          |
+| 1.8131        | 19.44 | 1050 | 1.6331          |
+| 1.8405        | 19.91 | 1075 | 1.5839          |
+| 1.7294        | 20.37 | 1100 | 1.5370          |
+| 1.7154        | 20.83 | 1125 | 1.4971          |
+| 1.6573        | 21.3  | 1150 | 1.4476          |
+| 1.6391        | 21.76 | 1175 | 1.4130          |
+| 1.5497        | 22.22 | 1200 | 1.3727          |
+| 1.5194        | 22.69 | 1225 | 1.3378          |
+| 1.535         | 23.15 | 1250 | 1.3000          |
+| 1.4514        | 23.61 | 1275 | 1.2714          |
+| 1.4711        | 24.07 | 1300 | 1.2388          |
+| 1.4105        | 24.54 | 1325 | 1.2136          |
+| 1.4202        | 25.0  | 1350 | 1.1890          |
+| 1.3351        | 25.46 | 1375 | 1.1679          |
+| 1.3575        | 25.93 | 1400 | 1.1440          |
+| 1.2882        | 26.39 | 1425 | 1.1202          |
+| 1.3378        | 26.85 | 1450 | 1.1074          |
+| 1.3094        | 27.31 | 1475 | 1.0864          |
+| 1.2793        | 27.78 | 1500 | 1.0743          |
+| 1.2377        | 28.24 | 1525 | 1.0626          |
+| 1.2693        | 28.7  | 1550 | 1.0468          |
+| 1.2157        | 29.17 | 1575 | 1.0368          |
+| 1.2007        | 29.63 | 1600 | 1.0263          |
+| 1.2376        | 30.09 | 1625 | 1.0221          |
+| 1.2216        | 30.56 | 1650 | 1.0136          |
+| 1.1923        | 31.02 | 1675 | 1.0102          |
+| 1.2143        | 31.48 | 1700 | 1.0039          |
+| 1.1764        | 31.94 | 1725 | 1.0014          |
+| 1.1654        | 32.41 | 1750 | 0.9990          |
+| 1.2031        | 32.87 | 1775 | 0.9976          |
+| 1.1952        | 33.33 | 1800 | 0.9965          |
+| 1.1852        | 33.8  | 1825 | 0.9961          |
+| 1.1737        | 34.26 | 1850 | 0.9959          |
+| 1.1609        | 34.72 | 1875 | 0.9955          |
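To make the final validation loss easier to interpret, it can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language-model training but is not stated in this model card. The two inputs are the final validation losses from the previous and the updated README.

```python
import math

def perplexity(mean_nll_nats):
    """Perplexity for a mean per-token negative log-likelihood given in nats."""
    return math.exp(mean_nll_nats)

# Final validation losses taken from the two versions of the model card.
previous_loss = 2.9269  # 25-epoch run
updated_loss = 0.9955   # 35-epoch run

print(f"previous run: perplexity = {perplexity(previous_loss):.2f}")
print(f"updated run:  perplexity = {perplexity(updated_loss):.2f}")
```

Because the evaluation set is not identified in the card, these figures are only comparable between the two runs if both used the same evaluation data.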
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:720095fc544006d19dd75064e06645c36f75a32cf3956b55faab42c04a85a284
 size 1423517184

 version https://git-lfs.github.com/spec/v1
+oid sha256:2cc3c3c6dc63dea0e60a4f481edcbe973fad68350d039223ce490a333f92225a
 size 1423517184
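The file shown here is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the content hash, and the size in bytes. As a rough sketch, such a pointer can be parsed as follows. The helper name `parse_lfs_pointer` is hypothetical and the parser handles only the three keys that appear in this diff.

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:2cc3c3c6dc63dea0e60a4f481edcbe973fad68350d039223ce490a333f92225a
size 1423517184
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], f"{info['size'] / 2**30:.2f} GiB")
```

In this commit only the `oid` changes while `size` stays at 1423517184 bytes, which is consistent with the weights being retrained without a change in file size.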

runs/Dec28_09-55-46_cd8a8367193a/events.out.tfevents.1703757350.cd8a8367193a.207.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:792683a36556972dec13aca2a22237e80d0244162719553d1c37c6ebb6e3d61b
-size 21636

 version https://git-lfs.github.com/spec/v1
+oid sha256:28a91ff56dcfd391f3205ff84df870f3a887ce4b07558a2e26fb976d7765d70a
+size 36970