---
library_name: transformers
base_model: bowphs/pythia-70m-multi
tags:
  - generated_from_trainer
datasets:
  - allenai/c4
metrics:
  - accuracy
model-index:
  - name: c4-model
    results:
      - task:
          name: Causal Language Modeling
          type: text-generation
        dataset:
          name: allenai/c4 en
          type: allenai/c4
          args: en
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.3716248289345064
---

# c4-model

This model is a fine-tuned version of [bowphs/pythia-70m-multi](https://huggingface.co/bowphs/pythia-70m-multi) on the [allenai/c4](https://huggingface.co/datasets/allenai/c4) `en` dataset. It achieves the following results on the evaluation set:

- Loss: 3.5532
- Accuracy: 0.3716
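
The task metadata above lists text generation, so the checkpoint should load with the standard `transformers` causal-LM classes. A minimal sketch; the repository id `bowphs/c4-model` is an assumption, since the card does not state the Hub path explicitly:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; adjust to the actual Hub path of this checkpoint.
model_id = "bowphs/c4-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from a prompt.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```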

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
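
The metadata identifies the corpus as the `en` configuration of allenai/c4. A minimal sketch of loading it with the `datasets` library (streaming, since the full corpus is very large); the preprocessing actually used for this run is not documented here:

```python
from datasets import load_dataset

# Stream the English C4 configuration rather than downloading it eagerly.
dataset = load_dataset("allenai/c4", "en", streaming=True)

# Peek at the first training document.
print(next(iter(dataset["train"]))["text"][:200])
```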

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- training_steps: 30000
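
For reference, the listed values map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a reconstruction from the list above, not the actual training script; `output_dir` is an assumption:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above; output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="c4-model",
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    max_steps=30000,
)
```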

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|
| No log        | 0.0000 | 1     | 10.7029         | 0.0164   |
| No log        | 0.0001 | 2     | 10.5331         | 0.0496   |
| No log        | 0.0001 | 4     | 10.3022         | 0.0533   |
| No log        | 0.0003 | 8     | 10.0235         | 0.0536   |
| No log        | 0.0005 | 16    | 9.6536          | 0.0635   |
| No log        | 0.0011 | 32    | 9.0284          | 0.0759   |
| No log        | 0.0021 | 64    | 8.0249          | 0.0832   |
| No log        | 0.0043 | 128   | 6.9172          | 0.1129   |
| No log        | 0.0085 | 256   | 6.1629          | 0.1558   |
| No log        | 0.0171 | 512   | 5.5805          | 0.1817   |
| No log        | 0.0341 | 1024  | 5.1235          | 0.2028   |
| 5.4529        | 0.0667 | 2000  | 4.7613          | 0.2264   |
| 5.4529        | 0.0683 | 2048  | 4.7481          | 0.2281   |
| 4.5765        | 0.1333 | 4000  | 4.4123          | 0.2610   |
| 4.5765        | 0.1365 | 4096  | 4.4043          | 0.2625   |
| 4.3252        | 0.2    | 6000  | 4.2221          | 0.2827   |
| 4.146         | 0.2667 | 8000  | 4.0350          | 0.3098   |
| 4.146         | 0.2731 | 8192  | 4.0134          | 0.3129   |
| 3.9652        | 0.3333 | 10000 | 3.8860          | 0.3304   |
| 3.8441        | 0.4    | 12000 | 3.8005          | 0.3418   |
| 3.7739        | 0.4667 | 14000 | 3.7315          | 0.3503   |
| 3.72          | 0.5333 | 16000 | 3.6880          | 0.3553   |
| 3.72          | 0.5461 | 16384 | 3.6777          | 0.3564   |
| 3.6718        | 0.6    | 18000 | 3.6533          | 0.3593   |
| 3.6527        | 0.6667 | 20000 | 3.6212          | 0.3633   |
| 3.6201        | 0.7333 | 22000 | 3.5985          | 0.3660   |
| 3.593         | 0.8    | 24000 | 3.5819          | 0.3679   |
| 3.5857        | 0.8667 | 26000 | 3.5683          | 0.3697   |
| 3.5801        | 0.9333 | 28000 | 3.5582          | 0.3711   |
| 3.5649        | 1.0    | 30000 | 3.5532          | 0.3716   |
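
Assuming the validation loss is the usual per-token cross-entropy in nats (the Trainer default for causal language modeling), it converts directly to perplexity:

```python
import math

# Final evaluation loss from the table above.
final_loss = 3.5532
print(f"Perplexity: {math.exp(final_loss):.1f}")  # ≈ 34.9
```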

### Framework versions

- Transformers 4.48.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0