Aravindan committed
Commit 71b13b2
1 Parent(s): eaff5f8

End of training

README.md CHANGED
@@ -1,17 +1,67 @@
  ---
- language:
- - en
  license: mit
- library_name: transformers
  tags:
  - generated_from_trainer
- base_model: gpt2
- datasets:
- - Vezora/10k-Python-2048-Max
- pipeline_tag: text-generation
  model-index:
  - name: gpt2coder-8epochs
    results: []
  ---

- This model is pre-trained for code-generation tasks, specifically for Python. This is only a pre-trained model; fine-tuning is ongoing.

  ---
  license: mit
+ base_model: Aravindan/gpt2out
  tags:
  - generated_from_trainer
  model-index:
  - name: gpt2coder-8epochs
    results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # gpt2coder-8epochs
+
+ This model is a fine-tuned version of [Aravindan/gpt2out](https://huggingface.co/Aravindan/gpt2out) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.9718
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 10
+ - total_train_batch_size: 80
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 8
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss |
+ |:-------------:|:-----:|:----:|:---------------:|
+ | No log        | 1.0   | 125  | 3.1181          |
+ | No log        | 2.0   | 250  | 2.6411          |
+ | No log        | 3.0   | 375  | 2.4035          |
+ | 2.9711        | 4.0   | 500  | 2.2375          |
+ | 2.9711        | 5.0   | 625  | 2.1289          |
+ | 2.9711        | 6.0   | 750  | 2.0475          |
+ | 2.9711        | 7.0   | 875  | 1.9931          |
+ | 2.1959        | 8.0   | 1000 | 1.9718          |
+
+ ### Framework versions
+
+ - Transformers 4.41.1
+ - Pytorch 2.1.2
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
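
For anyone reproducing the run from the card above, the hyperparameter list maps directly onto `transformers.TrainingArguments`. A minimal sketch under that assumption follows; `output_dir` is a placeholder, and only the values shown in the card come from the source:

```python
# Sketch: TrainingArguments mirroring the hyperparameters in the card above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="gpt2coder-8epochs",   # placeholder path
    learning_rate=5e-05,
    per_device_train_batch_size=8,    # "train_batch_size: 8"
    per_device_eval_batch_size=8,     # "eval_batch_size: 8"
    seed=42,
    gradient_accumulation_steps=10,   # 8 x 10 = total_train_batch_size of 80
    lr_scheduler_type="linear",
    warmup_ratio=0.1,                 # "lr_scheduler_warmup_ratio: 0.1"
    num_train_epochs=8,
)
```

Note that the Adam settings in the card (betas=(0.9,0.999), epsilon=1e-08) are the library defaults and need no explicit arguments, and the total train batch size of 80 is derived (8 per device × 10 accumulation steps) rather than set directly.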
config.json CHANGED
@@ -1,5 +1,5 @@
  {
- "_name_or_path": "gpt2",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"

  {
+ "_name_or_path": "Aravindan/gpt2out",
  "activation_function": "gelu_new",
  "architectures": [
    "GPT2LMHeadModel"
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:9e7646390bf9e93424f606063f889306b319ffb5c6e3534cf9534aceb74f492e
  size 497774208

  version https://git-lfs.github.com/spec/v1
+ oid sha256:a1be145ec86717b1f33f14ab714d56a62366e204342bda2d6f6af94ceae0e1ec
  size 497774208
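
The weights file is stored as a Git LFS pointer: the three lines in each pane are the pointer spec version, the SHA-256 of the actual blob, and its size in bytes. A downloaded copy can be checked against the new pointer; a small sketch (the local path is a placeholder):

```python
# Sketch: verify a downloaded file against a Git LFS pointer
# (sha256 oid and byte size, as shown in the diff above).
import hashlib
import os

def verify_lfs_blob(path: str, expected_sha256: str, expected_size: int) -> bool:
    """Return True iff the file matches the pointer's oid and size."""
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256

# Values copied from the new pointer above; the path is a placeholder.
print(verify_lfs_blob(
    "model.safetensors",
    "a1be145ec86717b1f33f14ab714d56a62366e204342bda2d6f6af94ceae0e1ec",
    497774208,
))
```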
runs/Jun02_11-20-13_ae7d5d7e9b83/events.out.tfevents.1717327217.ae7d5d7e9b83.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d1684771a766c00e44b81a956e5c5194c16ac2b71836123c77083ddc431efb79
+ size 8008
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5980876d519391750dedffd367b98344d37b11d489c07832e40f84b6c89dbfbc
  size 5112

  version https://git-lfs.github.com/spec/v1
+ oid sha256:6e9096652f666a5a9037023053ae1fa3ff8e596df2b1c46d324831ec3b6db17f
  size 5112