RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter8_sftsd1

Generated from Trainer

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.1796
Num Input Tokens Seen: 5038648
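
A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository id shown on this page and that `transformers` and `torch` are installed. The helper names `load_model` and `generate` are illustrative and not part of the original card; the BF16 dtype is taken from the tensor type listed for the checkpoint.

```python
# Repository id as shown on the model page.
MODEL_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_accumulatesubsample_iter8_sftsd1"


def load_model(model_id: str = MODEL_ID):
    """Hypothetical helper: load the tokenizer and model (downloads weights on first call)."""
    # Imported lazily so that defining the helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Hypothetical helper: greedy-decode a short continuation of `prompt`."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Since the card does not describe the training data or intended use, any generations from this checkpoint should be treated as exploratory.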

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 1
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
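
The relationship between these values can be sketched as follows. This is an illustration under stated assumptions, not the training script: the device count is assumed to be 1 (the card does not state it), and the total step count of 94 is inferred from the results table (step 90 corresponds to epoch 0.9562). The warmup-then-constant shape is what the scheduler name suggests; the function `lr_at` is hypothetical.

```python
import math

# Values copied from the hyperparameter list.
train_batch_size = 8
gradient_accumulation_steps = 16
learning_rate = 8e-06
warmup_ratio = 0.05

# Assumptions (not stated in the card).
num_devices = 1   # single device assumed
total_steps = 94  # inferred: 90 / 0.9562 is roughly 94 optimizer steps per epoch

# Effective batch size per optimizer step: 8 * 16 * 1 = 128.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Warmup length as a rounded-up fraction of the total steps.
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Linear warmup from 0 to the peak rate, then constant (hypothetical helper)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


if __name__ == "__main__":
    print(total_train_batch_size, warmup_steps, lr_at(2), lr_at(50))
```

Under these assumptions the warmup covers only the first few optimizer steps, so nearly the whole epoch runs at the constant rate of 8e-06.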

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.381	0.0531	5	1.2701	271832
1.1988	0.1062	10	1.1934	544104
1.0069	0.1594	15	1.1852	802480
0.86	0.2125	20	1.1991	1070600
0.8632	0.2656	25	1.2041	1347848
0.7896	0.3187	30	1.2265	1614736
0.6294	0.3718	35	1.2269	1874656
0.5769	0.4250	40	1.2244	2140272
0.4774	0.4781	45	1.2120	2408424
0.5257	0.5312	50	1.2074	2680984
0.501	0.5843	55	1.2010	2949464
0.4729	0.6375	60	1.1885	3214888
0.4757	0.6906	65	1.1828	3485080
0.4514	0.7437	70	1.1845	3751008
0.4081	0.7968	75	1.1793	4016424
0.4307	0.8499	80	1.1869	4289272
0.4335	0.9031	85	1.1811	4558712
0.3815	0.9562	90	1.1835	4822832
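
To read the table programmatically, the rows can be transcribed and summarized as below. This is a small sketch for inspection only; the helper names are illustrative, and the data is copied from the table above without modification.

```python
# (step, training_loss, validation_loss, input_tokens_seen), copied from the table.
ROWS = [
    (0, None, 1.3909, 0),
    (5, 1.381, 1.2701, 271832),
    (10, 1.1988, 1.1934, 544104),
    (15, 1.0069, 1.1852, 802480),
    (20, 0.86, 1.1991, 1070600),
    (25, 0.8632, 1.2041, 1347848),
    (30, 0.7896, 1.2265, 1614736),
    (35, 0.6294, 1.2269, 1874656),
    (40, 0.5769, 1.2244, 2140272),
    (45, 0.4774, 1.2120, 2408424),
    (50, 0.5257, 1.2074, 2680984),
    (55, 0.501, 1.2010, 2949464),
    (60, 0.4729, 1.1885, 3214888),
    (65, 0.4757, 1.1828, 3485080),
    (70, 0.4514, 1.1845, 3751008),
    (75, 0.4081, 1.1793, 4016424),
    (80, 0.4307, 1.1869, 4289272),
    (85, 0.4335, 1.1811, 4558712),
    (90, 0.3815, 1.1835, 4822832),
]


def best_validation(rows):
    """Return (step, validation_loss) for the row with the lowest validation loss."""
    step, _, val_loss, _ = min(rows, key=lambda row: row[2])
    return step, val_loss


def train_val_gap(rows):
    """Return validation loss minus training loss at the last logged step."""
    _, train_loss, val_loss, _ = rows[-1]
    return val_loss - train_loss


def mean_tokens_per_step(rows):
    """Average number of input tokens consumed per optimizer step."""
    last_step, _, _, tokens = rows[-1]
    return tokens / last_step


if __name__ == "__main__":
    print(best_validation(ROWS))   # (75, 1.1793)
    print(train_val_gap(ROWS))
    print(mean_tokens_per_step(ROWS))
```

The lowest logged validation loss is 1.1793 at step 75, while the training loss keeps falling to 0.3815 by step 90. The widening gap between the two is consistent with the model fitting the training set more closely than the evaluation set.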

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1
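
When trying to reproduce the environment, a quick check of installed package versions against the ones listed above may help. The sketch below uses only the standard library; the mapping of "Pytorch" to the `torch` distribution name and the helper `check_versions` are assumptions made for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in the card; "Pytorch" is assumed to be the `torch` distribution.
EXPECTED = {
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, matches_expected)}."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: installed={installed} expected={EXPECTED[package]} ({status})")
```

A mismatch does not necessarily prevent loading the model; the listed versions are simply the ones recorded at training time.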

Safetensors

Model size: 2.61B params
Tensor type: BF16
