End of training

Files changed (5)

README.md +58 -8
config.json +1 -1
model.safetensors +1 -1
runs/Jun02_11-20-13_ae7d5d7e9b83/events.out.tfevents.1717327217.ae7d5d7e9b83.34.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -1,17 +1,67 @@
 ---
-language:
-- en
 license: mit
-library_name: transformers
 tags:
 - generated_from_trainer
-base_model: gpt2
-datasets:
-- Vezora/10k-Python-2048-Max
-pipeline_tag: text-generation
 model-index:
 - name: gpt2coder-8epochs
   results: []
 ---
-This model is pre-trained for code-generation tasks specifically for python. This is just a pre-trained model, fine-tunning is on-going

 ---
 license: mit
+base_model: Aravindan/gpt2out
 tags:
 - generated_from_trainer
 model-index:
 - name: gpt2coder-8epochs
   results: []
 ---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# gpt2coder-8epochs
+This model is a fine-tuned version of [Aravindan/gpt2out](https://huggingface.co/Aravindan/gpt2out) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.9718
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 10
+- total_train_batch_size: 80
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 8
+### Training results
+| Training Loss | Epoch | Step | Validation Loss |
+|:-------------:|:-----:|:----:|:---------------:|
+| No log        | 1.0   | 125  | 3.1181          |
+| No log        | 2.0   | 250  | 2.6411          |
+| No log        | 3.0   | 375  | 2.4035          |
+| 2.9711        | 4.0   | 500  | 2.2375          |
+| 2.9711        | 5.0   | 625  | 2.1289          |
+| 2.9711        | 6.0   | 750  | 2.0475          |
+| 2.9711        | 7.0   | 875  | 1.9931          |
+| 2.1959        | 8.0   | 1000 | 1.9718          |
+### Framework versions
+- Transformers 4.41.1
+- Pytorch 2.1.2
+- Datasets 2.19.1
+- Tokenizers 0.19.1
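
Reviewer note: the hyperparameters and results table in the new README can be cross-checked with some quick arithmetic. The sketch below is illustrative only. It assumes a single training device, that warmup steps are the warmup ratio times the total step count rounded up, and that the reported loss is a mean per-token cross-entropy in nats, so that its exponential can be read as perplexity. None of these assumptions is stated in the card itself.

```python
import math

# Values copied from the model card added in this commit.
train_batch_size = 8
gradient_accumulation_steps = 10
warmup_ratio = 0.1
steps_per_epoch = 125      # step count at epoch 1.0 in the results table
total_steps = 1000         # step count at epoch 8.0
final_eval_loss = 1.9718

# Assumption: one device, so the effective batch is the per-device
# batch size multiplied by the number of accumulation steps.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Rough number of training examples consumed per epoch.
approx_samples_per_epoch = steps_per_epoch * effective_batch_size

# Assumption: warmup length is ratio * total steps, rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Assumption: the loss is mean per-token cross-entropy in nats.
perplexity = math.exp(final_eval_loss)

print(f"effective batch size: {effective_batch_size}")
print(f"approx. samples per epoch: {approx_samples_per_epoch}")
print(f"warmup steps: {warmup_steps}")
print(f"eval perplexity: {perplexity:.2f}")
```

The effective batch size matches the `total_train_batch_size: 80` line in the card. The per-epoch sample estimate of roughly 10,000 is consistent with the `Vezora/10k-Python-2048-Max` dataset entry that this commit removes from the metadata, although the regenerated card itself only says "an unknown dataset".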

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "gpt2",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

 {
+  "_name_or_path": "Aravindan/gpt2out",
   "activation_function": "gelu_new",
   "architectures": [
     "GPT2LMHeadModel"

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9e7646390bf9e93424f606063f889306b319ffb5c6e3534cf9534aceb74f492e
 size 497774208

 version https://git-lfs.github.com/spec/v1
+oid sha256:a1be145ec86717b1f33f14ab714d56a62366e204342bda2d6f6af94ceae0e1ec
 size 497774208

runs/Jun02_11-20-13_ae7d5d7e9b83/events.out.tfevents.1717327217.ae7d5d7e9b83.34.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d1684771a766c00e44b81a956e5c5194c16ac2b71836123c77083ddc431efb79
+size 8008

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5980876d519391750dedffd367b98344d37b11d489c07832e40f84b6c89dbfbc
 size 5112

 version https://git-lfs.github.com/spec/v1
+oid sha256:6e9096652f666a5a9037023053ae1fa3ff8e596df2b1c46d324831ec3b6db17f
 size 5112
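
Reviewer note: the `model.safetensors`, `events.out.tfevents` and `training_args.bin` diffs above are Git LFS pointer files, not the binary payloads. Each pointer is three `key value` lines: the spec version, the object id (hash algorithm and digest), and the size in bytes. A minimal sketch of reading one follows; `parse_lfs_pointer` is a hypothetical helper written for this note and is not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by one space.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New model.safetensors pointer, copied from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:a1be145ec86717b1f33f14ab714d56a62366e204342bda2d6f6af94ceae0e1ec
size 497774208
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["digest"][:12], info["size"])
```

For `model.safetensors` and `training_args.bin` the `oid` changes while `size` stays the same, which is what one would expect if the file contents were rewritten with the same layout, for example updated weights for an unchanged architecture. The pointer alone cannot confirm that.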