End of training

Files changed (7)

README.md CHANGED

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [ai-forever/rugpt3medium_based_on_gpt2](https://huggingface.co/ai-forever/rugpt3medium_based_on_gpt2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 7.6538
 ## Model description
@@ -49,21 +49,21 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 10.9229       | 1.6   | 25   | 10.5844         |
-| 10.3434       | 3.19  | 50   | 9.8909          |
-| 9.8567        | 4.79  | 75   | 9.5807          |
-| 9.6367        | 6.38  | 100  | 9.4535          |
-| 9.5384        | 7.98  | 125  | 9.3692          |
-| 9.4568        | 9.57  | 150  | 9.2894          |
-| 9.3755        | 11.17 | 175  | 9.1763          |
-| 9.2723        | 12.77 | 200  | 9.0257          |
-| 9.1276        | 14.36 | 225  | 8.8652          |
-| 9.0174        | 15.96 | 250  | 8.6803          |
-| 8.8551        | 17.55 | 275  | 8.5146          |
-| 8.7365        | 19.15 | 300  | 8.2912          |
-| 8.5503        | 20.74 | 325  | 8.0806          |
-| 8.4153        | 22.34 | 350  | 7.8582          |
-| 8.266         | 23.94 | 375  | 7.6538          |
 ### Framework versions

 This model is a fine-tuned version of [ai-forever/rugpt3medium_based_on_gpt2](https://huggingface.co/ai-forever/rugpt3medium_based_on_gpt2) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.9269
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.601         | 1.6   | 25   | 3.6157          |
+| 3.601         | 3.19  | 50   | 3.6010          |
+| 3.5542        | 4.79  | 75   | 3.5621          |
+| 3.5309        | 6.38  | 100  | 3.5117          |
+| 3.496         | 7.98  | 125  | 3.4615          |
+| 3.446         | 9.57  | 150  | 3.4173          |
+| 3.34          | 11.17 | 175  | 3.3699          |
+| 3.3581        | 12.77 | 200  | 3.3214          |
+| 3.3136        | 14.36 | 225  | 3.2743          |
+| 3.214         | 15.96 | 250  | 3.2227          |
+| 3.2098        | 17.55 | 275  | 3.1738          |
+| 3.1348        | 19.15 | 300  | 3.1153          |
+| 3.0931        | 20.74 | 325  | 3.0561          |
+| 3.0383        | 22.34 | 350  | 2.9922          |
+| 2.9739        | 23.94 | 375  | 2.9269          |
 ### Framework versions
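
For a sense of scale, the final validation losses in the two tables can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats (the usual convention for causal language-model training in `transformers`, but not stated in this diff); the variable and function names are illustrative.

```python
import math

# Final validation losses copied from the README diff above.
old_loss = 7.6538  # removed row (step 375)
new_loss = 2.9269  # added row (step 375)


def perplexity(loss: float) -> float:
    """Perplexity, assuming `loss` is mean per-token cross-entropy in nats."""
    return math.exp(loss)


old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)
print(f"old perplexity: {old_ppl:.1f}")
print(f"new perplexity: {new_ppl:.1f}")
print(f"ratio: {old_ppl / new_ppl:.0f}x")
```

This commit also changes `n_ctx` and the pad token, so the two losses may not be directly comparable.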

config.json CHANGED

@@ -17,7 +17,7 @@
   },
   "layer_norm_epsilon": 1e-05,
   "model_type": "gpt2",
-  "n_ctx": 64,
   "n_embd": 1024,
   "n_head": 16,
   "n_inner": null,

   },
   "layer_norm_epsilon": 1e-05,
   "model_type": "gpt2",
+  "n_ctx": 2048,
   "n_embd": 1024,
   "n_head": 16,
   "n_inner": null,

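One check this hunk suggests is whether the new `n_ctx` agrees with the tokenizer's `model_max_length` (2048 in `tokenizer_config.json`). A minimal sketch over plain dicts follows. `context_mismatches` is a hypothetical helper, and the key list is an assumption: `n_positions` is included because GPT-2 configs commonly carry it, although it is not visible in this diff.

```python
def context_mismatches(config: dict, tokenizer_config: dict,
                       keys=("n_ctx", "n_positions")) -> dict:
    """Return {key: value} for context-length keys present in `config`
    whose value differs from the tokenizer's model_max_length."""
    max_len = tokenizer_config["model_max_length"]
    return {k: config[k] for k in keys if k in config and config[k] != max_len}


# Values copied from the diff; only the relevant keys are reproduced.
tokenizer_config = {"model_max_length": 2048}
old_config = {"n_ctx": 64, "n_embd": 1024, "n_head": 16}
new_config = {"n_ctx": 2048, "n_embd": 1024, "n_head": 16}

print(context_mismatches(old_config, tokenizer_config))  # old value disagrees
print(context_mismatches(new_config, tokenizer_config))  # new value agrees
```
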
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d7d7152fff8469263506d42775bf9045e13af057a81141ec5727524516896b24
 size 1423517184

 version https://git-lfs.github.com/spec/v1
+oid sha256:62e98ff50b88cae36ff0f22421aee04a4b36e773e59aa6d4f5c71d2a1af32894
 size 1423517184

runs/Dec28_08-01-03_29de5df9d89b/events.out.tfevents.1703750470.29de5df9d89b.2425.1 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:0676afda0cb7e652a3af237f4609160e555c1dc1497b3122615acf2604fb8488
+size 11290

special_tokens_map.json CHANGED

@@ -20,7 +20,13 @@
     "rstrip": false,
     "single_word": false
   },
-  "pad_token": "</s>",
   "unk_token": {
     "content": "<unk>",
     "lstrip": false,

     "rstrip": false,
     "single_word": false
   },
+  "pad_token": {
+    "content": "<pad>",
+    "lstrip": false,
+    "normalized": true,
+    "rstrip": false,
+    "single_word": false
+  },
   "unk_token": {
     "content": "<unk>",
     "lstrip": false,

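The pad token appears in two shapes across these files: a bare string (the old `"</s>"` here, and `"<pad>"` in `tokenizer_config.json`) and a dict with a `content` field (the new entry above). A small sketch for reading either shape follows; `token_content` is a hypothetical helper, not a `transformers` function.

```python
from typing import Union


def token_content(entry: Union[str, dict]) -> str:
    """Return the token text from a special-token entry in either shape."""
    if isinstance(entry, str):
        return entry
    return entry["content"]


# Both entries copied from the special_tokens_map.json diff.
old_pad = "</s>"
new_pad = {
    "content": "<pad>",
    "lstrip": False,
    "normalized": True,
    "rstrip": False,
    "single_word": False,
}

print(token_content(old_pad))
print(token_content(new_pad))
```
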
tokenizer_config.json CHANGED

@@ -49,7 +49,7 @@
   "errors": "replace",
   "mask_token": "<mask>",
   "model_max_length": 2048,
-  "pad_token": "</s>",
   "padding_side": "left",
   "tokenizer_class": "GPT2Tokenizer",
   "truncation_side": "left",

   "errors": "replace",
   "mask_token": "<mask>",
   "model_max_length": 2048,
+  "pad_token": "<pad>",
   "padding_side": "left",
   "tokenizer_class": "GPT2Tokenizer",
   "truncation_side": "left",

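With `padding_side` set to `"left"`, shorter sequences in a batch receive pad tokens at the front, and the attention mask marks those positions as ignored. The following is a plain-Python illustration of that layout, not the tokenizer's own implementation; `left_pad` and `PAD_ID` are hypothetical, and the real id of `<pad>` is not shown in this diff.

```python
def left_pad(batch, pad_id):
    """Left-pad token-id lists to a common width and build attention masks
    (0 for padding positions, 1 for real tokens)."""
    width = max(len(seq) for seq in batch)
    input_ids = [[pad_id] * (width - len(seq)) + list(seq) for seq in batch]
    attention_mask = [[0] * (width - len(seq)) + [1] * len(seq) for seq in batch]
    return input_ids, attention_mask


PAD_ID = 0  # hypothetical id, chosen only for illustration

ids, mask = left_pad([[5, 6, 7], [8]], PAD_ID)
print(ids)
print(mask)
```
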
training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:48b3d54437b058f9d9ad195c4d902156fff1d1e014be87376947a56f46efa886
 size 4600

 version https://git-lfs.github.com/spec/v1
+oid sha256:2729ac8f55c02f07d5e0fc8c44d41d7a749f69b0b71acb223f9cae006ed7bfef
 size 4600