---
library_name: transformers
base_model: bowphs/pythia-70m-multi
tags:
  - generated_from_trainer
datasets:
  - allenai/c4
metrics:
  - accuracy
model-index:
  - name: c4-model
    results:
      - task:
          name: Causal Language Modeling
          type: text-generation
        dataset:
          name: allenai/c4 en
          type: allenai/c4
          args: en
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.3716248289345064
---

# c4-model

This model is a fine-tuned version of [bowphs/pythia-70m-multi](https://huggingface.co/bowphs/pythia-70m-multi) on the [allenai/c4](https://huggingface.co/datasets/allenai/c4) `en` dataset. It achieves the following results on the evaluation set:

- Loss: 3.5532
- Accuracy: 0.3716
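
The task metadata above lists text generation, so the checkpoint should load with the standard `transformers` causal-LM classes. A minimal sketch; the repository id `bowphs/c4-model` is an assumption, since the card does not state the Hub path explicitly:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository id; adjust to the actual Hub path of this checkpoint.
model_id = "bowphs/c4-model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation from a prompt.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```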

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
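
The metadata identifies the corpus as the `en` configuration of allenai/c4. A minimal sketch of loading it with the `datasets` library (streaming, since the full corpus is very large); the preprocessing actually used for this run is not documented here:

```python
from datasets import load_dataset

# Stream the English C4 configuration rather than downloading it eagerly.
dataset = load_dataset("allenai/c4", "en", streaming=True)

# Peek at the first training document.
print(next(iter(dataset["train"]))["text"][:200])
```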

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent `TrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- training_steps: 30000
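
For reference, the listed values map onto a `transformers` `TrainingArguments` configuration roughly as follows. This is a reconstruction from the list above, not the actual training script; `output_dir` is an assumption:

```python
from transformers import TrainingArguments

# Reconstructed from the hyperparameter list above; output_dir is an assumption.
training_args = TrainingArguments(
    output_dir="c4-model",
    learning_rate=5e-05,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    max_steps=30000,
)
```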

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Accuracy |
|:-------------:|:------:|:-----:|:---------------:|:--------:|
| No log        | 0.0000 | 1     | 10.7029         | 0.0164   |
| No log        | 0.0001 | 2     | 10.5331         | 0.0496   |
| No log        | 0.0001 | 4     | 10.3022         | 0.0533   |
| No log        | 0.0003 | 8     | 10.0235         | 0.0536   |
| No log        | 0.0005 | 16    | 9.6536          | 0.0635   |
| No log        | 0.0011 | 32    | 9.0284          | 0.0759   |
| No log        | 0.0021 | 64    | 8.0249          | 0.0832   |
| No log        | 0.0043 | 128   | 6.9172          | 0.1129   |
| No log        | 0.0085 | 256   | 6.1629          | 0.1558   |
| No log        | 0.0171 | 512   | 5.5805          | 0.1817   |
| No log        | 0.0341 | 1024  | 5.1235          | 0.2028   |
| 5.4529        | 0.0667 | 2000  | 4.7613          | 0.2264   |
| 5.4529        | 0.0683 | 2048  | 4.7481          | 0.2281   |
| 4.5765        | 0.1333 | 4000  | 4.4123          | 0.2610   |
| 4.5765        | 0.1365 | 4096  | 4.4043          | 0.2625   |
| 4.3252        | 0.2    | 6000  | 4.2221          | 0.2827   |
| 4.146         | 0.2667 | 8000  | 4.0350          | 0.3098   |
| 4.146         | 0.2731 | 8192  | 4.0134          | 0.3129   |
| 3.9652        | 0.3333 | 10000 | 3.8860          | 0.3304   |
| 3.8441        | 0.4    | 12000 | 3.8005          | 0.3418   |
| 3.7739        | 0.4667 | 14000 | 3.7315          | 0.3503   |
| 3.72          | 0.5333 | 16000 | 3.6880          | 0.3553   |
| 3.72          | 0.5461 | 16384 | 3.6777          | 0.3564   |
| 3.6718        | 0.6    | 18000 | 3.6533          | 0.3593   |
| 3.6527        | 0.6667 | 20000 | 3.6212          | 0.3633   |
| 3.6201        | 0.7333 | 22000 | 3.5985          | 0.3660   |
| 3.593         | 0.8    | 24000 | 3.5819          | 0.3679   |
| 3.5857        | 0.8667 | 26000 | 3.5683          | 0.3697   |
| 3.5801        | 0.9333 | 28000 | 3.5582          | 0.3711   |
| 3.5649        | 1.0    | 30000 | 3.5532          | 0.3716   |
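
Assuming the validation loss is the usual per-token cross-entropy in nats (the Trainer default for causal language modeling), it converts directly to perplexity:

```python
import math

# Final evaluation loss from the table above.
final_loss = 3.5532
print(f"Perplexity: {math.exp(final_loss):.1f}")  # ≈ 34.9
```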

### Framework versions

- Transformers 4.48.0.dev0
- PyTorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0