RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter7_sftsd0

Generated from Trainer

This model is a fine-tuned version of google/gemma-2-2b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.4308
Num Input Tokens Seen: 4841952
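
Since the checkpoint is described as a fine-tune of google/gemma-2-2b, it can presumably be loaded with the standard causal-LM classes from Transformers. The following is a minimal sketch, not a documented usage recipe: the helper name `load_checkpoint` is hypothetical, and it assumes that `transformers` and `torch` are installed and that the repository includes tokenizer files.

```python
# Repository id as shown on this model card.
REPO_ID = "RylanSchaeffer/collapse_gemma-2-2b_hs2_replace_iter7_sftsd0"


def load_checkpoint(repo_id: str = REPO_ID):
    """Hypothetical helper: load the tokenizer and model (downloads weights on first use)."""
    # Imported lazily so that defining this helper does not require the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repository ships tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # bfloat16 matches the BF16 tensor type listed for the stored weights.
    model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    return tokenizer, model
```

Calling `load_checkpoint()` returns a `(tokenizer, model)` pair. Nothing is downloaded until the function is actually called.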

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 8
eval_batch_size: 16
seed: 0
gradient_accumulation_steps: 16
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
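
The relationship between these values can be sketched in a few lines. The example below assumes a single training device (which is what makes 8 × 16 equal the reported 128) and approximates the constant_with_warmup schedule as a linear ramp followed by a flat rate. The total step count used in the loop is hypothetical, since the card does not state it.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 8e-06
train_batch_size = 8               # per-device batch size
gradient_accumulation_steps = 16
warmup_ratio = 0.05

# Assumption: one training device, so the effective batch size is 8 * 16.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 128


def lr_at_step(step: int, num_training_steps: int) -> float:
    """Learning rate under a linear warmup followed by a constant rate."""
    warmup_steps = math.ceil(num_training_steps * warmup_ratio)
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


# Hypothetical total step count, for illustration only.
example_total_steps = 100
for step in (0, 1, 5, 50):
    print(step, lr_at_step(step, example_total_steps))
```

With these assumptions the rate ramps up over the first 5% of optimizer steps and then stays at 8e-06 for the rest of the epoch.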

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.3909	0
1.5346	0.0516	5	1.2786	260208
0.9281	0.1032	10	1.2957	512616
0.5962	0.1547	15	1.5282	766320
0.2905	0.2063	20	1.7105	1018352
0.2044	0.2579	25	1.9340	1270808
0.1128	0.3095	30	2.0841	1527056
0.0634	0.3611	35	2.2219	1778088
0.042	0.4126	40	2.3269	2022528
0.0308	0.4642	45	2.3926	2280096
0.0302	0.5158	50	2.4223	2535736
0.0273	0.5674	55	2.4248	2778432
0.0238	0.6190	60	2.4126	3034168
0.0253	0.6705	65	2.4444	3284112
0.0236	0.7221	70	2.4540	3539984
0.0261	0.7737	75	2.4524	3791064
0.0253	0.8253	80	2.4419	4033320
0.0264	0.8769	85	2.4440	4285000
0.0253	0.9284	90	2.4517	4543120
0.0253	0.9800	95	2.4359	4791856
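
In the table, training loss falls to roughly 0.025 while validation loss bottoms out early and then climbs, a pattern consistent with overfitting. The sketch below copies the step, validation loss, and token columns from the table and locates the best evaluation step. The variable names are illustrative and not part of the training code.

```python
# (step, validation loss, input tokens seen), copied from the results table above.
rows = [
    (0, 1.3909, 0),
    (5, 1.2786, 260208),
    (10, 1.2957, 512616),
    (15, 1.5282, 766320),
    (20, 1.7105, 1018352),
    (25, 1.9340, 1270808),
    (30, 2.0841, 1527056),
    (35, 2.2219, 1778088),
    (40, 2.3269, 2022528),
    (45, 2.3926, 2280096),
    (50, 2.4223, 2535736),
    (55, 2.4248, 2778432),
    (60, 2.4126, 3034168),
    (65, 2.4444, 3284112),
    (70, 2.4540, 3539984),
    (75, 2.4524, 3791064),
    (80, 2.4419, 4033320),
    (85, 2.4440, 4285000),
    (90, 2.4517, 4543120),
    (95, 2.4359, 4791856),
]

# Evaluation step with the lowest validation loss.
best_step, best_loss, _ = min(rows, key=lambda r: r[1])
print(best_step, best_loss)  # 5 1.2786

# Gap between the last logged validation loss and the best one.
last_step, last_loss, last_tokens = rows[-1]
print(round(last_loss - best_loss, 4))

# Average number of input tokens consumed per optimizer step.
tokens_per_step = last_tokens / last_step
print(round(tokens_per_step))
```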

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 2.61B params
Tensor type: BF16
