RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter19_sftsd1

Generated from Trainer

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.2055
Num Input Tokens Seen: 4907024
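
For a more intuitive reading of the evaluation loss, it can be converted to perplexity. This is a minimal sketch; it assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning but is not stated in the card.

```python
import math

# Final evaluation loss copied from the card.
eval_loss = 1.2055

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```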

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 1
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
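
The list above can be sketched as keyword arguments together with two derived quantities: the effective batch size and the warmup schedule. The keyword names are the `transformers.TrainingArguments` names these entries usually map to; that mapping, the `output_dir` value, the single-device setup, and the total step count of about 95 (estimated from the results table, where 5 steps correspond to about 0.0527 epochs) are assumptions, not facts stated in the card.

```python
import math

# Values copied from the hyperparameter list. The keys are the
# TrainingArguments names they are assumed to correspond to.
training_kwargs = {
    "output_dir": "out",  # hypothetical path, not from the card
    "learning_rate": 8e-06,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 16,
    "seed": 1,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_ratio": 0.05,
    "num_train_epochs": 1,
}

# Assumption: one device, since 8 * 16 already equals the reported 128.
NUM_DEVICES = 1
effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)

# Assumption: about 95 optimizer steps in the single epoch.
TOTAL_STEPS = 95
warmup_steps = math.ceil(TOTAL_STEPS * training_kwargs["warmup_ratio"])


def lr_at(step: int) -> float:
    """Linear warmup from 0 to the base rate, then constant."""
    base = training_kwargs["learning_rate"]
    if step < warmup_steps:
        return base * step / max(1, warmup_steps)
    return base


print(effective_batch, warmup_steps, lr_at(1), lr_at(warmup_steps))
```

Under these assumptions the learning rate reaches 8e-06 after 5 steps and stays there for the rest of the epoch. With `transformers` installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`.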

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.3427	0.0527	5	1.2782	258072
1.0971	0.1053	10	1.2131	521696
0.9209	0.1580	15	1.2167	782872
0.7304	0.2107	20	1.2697	1039040
0.6214	0.2633	25	1.2589	1307632
0.5449	0.3160	30	1.3018	1568000
0.521	0.3687	35	1.2918	1824608
0.4267	0.4213	40	1.2783	2087280
0.4484	0.4740	45	1.2457	2348744
0.403	0.5267	50	1.2346	2610176
0.3899	0.5793	55	1.2224	2873528
0.3705	0.6320	60	1.2227	3133328
0.3662	0.6847	65	1.2187	3395112
0.3322	0.7373	70	1.2076	3656104
0.3614	0.7900	75	1.2070	3917544
0.3462	0.8427	80	1.2021	4174120
0.3258	0.8953	85	1.2061	4437136
0.3069	0.9480	90	1.2061	4699512
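
A few quantities can be read off the table with a short script. This is a sketch that uses only the rows above and the final token count reported at the top of the card; the per-step and per-sequence token figures assume tokens accumulate roughly uniformly across steps and that each optimizer step consumes 128 sequences.

```python
# (step, validation_loss, input_tokens_seen), copied from the table above.
rows = [
    (0, 1.3909, 0), (5, 1.2782, 258072), (10, 1.2131, 521696),
    (15, 1.2167, 782872), (20, 1.2697, 1039040), (25, 1.2589, 1307632),
    (30, 1.3018, 1568000), (35, 1.2918, 1824608), (40, 1.2783, 2087280),
    (45, 1.2457, 2348744), (50, 1.2346, 2610176), (55, 1.2224, 2873528),
    (60, 1.2227, 3133328), (65, 1.2187, 3395112), (70, 1.2076, 3656104),
    (75, 1.2070, 3917544), (80, 1.2021, 4174120), (85, 1.2061, 4437136),
    (90, 1.2061, 4699512),
]

# Evaluation step with the lowest validation loss.
best_step, best_loss, _ = min(rows, key=lambda r: r[1])

# Average tokens consumed per optimizer step, from the last logged row.
last_step, _, last_tokens = rows[-1]
tokens_per_step = last_tokens / last_step

# Assumption: 128 sequences per optimizer step (total_train_batch_size).
tokens_per_sequence = tokens_per_step / 128

# Rough total step count implied by the final token count in the card.
FINAL_TOKENS = 4907024
est_total_steps = FINAL_TOKENS / tokens_per_step

print(best_step, best_loss)
print(round(tokens_per_step), round(tokens_per_sequence), round(est_total_steps))
```

The training loss falls steadily while the validation loss rises between steps 15 and 30 before recovering, so the lowest logged validation loss (step 80) is only slightly below the value at step 10.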

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 2.61B params
Tensor type: BF16
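
Since the weights are stored in BF16, a loading sketch might look like the following. It assumes the `transformers` and `torch` packages are installed, that the repository includes tokenizer files, and that the checkpoint loads through the generic `Auto` classes; none of this is confirmed by the card. The imports are deferred into the function so that defining it has no side effects.

```python
# Repository ID taken from the card header.
REPO_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter19_sftsd1"


def load_model(repo_id: str = REPO_ID):
    """Load tokenizer and model in bfloat16 (downloads weights on first call)."""
    # Deferred imports: only needed when the function is actually called.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # matches the tensor type listed above
    )
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` would fetch the checkpoint from the Hub.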
