RylanSchaeffer/collapse_gemma-2-9b_hs2_accumulate_iter1_sftsd0

Generated from Trainer


collapse_gemma-2-9b_hs2_accumulate_iter1_sftsd0

This model is a fine-tuned version of google/gemma-2-9b on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.9317
Num Input Tokens Seen: 5253020

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-06
train_batch_size: 4
eval_batch_size: 16
seed: 0
gradient_accumulation_steps: 32
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1
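
The relationship between these values can be checked with a short sketch. It assumes a single training device (the card does not state the device count, but 4 × 32 = 128 matches the reported total only with one device) and assumes the warmup length is the ceiling of `warmup_ratio × total_steps`, with a linear ramp from zero. The function names and the `total_steps` argument are illustrative and do not come from the card.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 8e-06
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 32
WARMUP_RATIO = 0.05


def effective_batch_size(per_device, accumulation_steps, num_devices=1):
    """Sequences contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not report it.
    """
    return per_device * accumulation_steps * num_devices


def lr_at_step(step, total_steps, base_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Constant-with-warmup schedule: linear ramp, then a flat learning rate.

    total_steps is a hypothetical input; the card does not list it.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 1, 5, 50):
        print(step, lr_at_step(step, total_steps=100))
```

With `total_steps=100` chosen purely for illustration, the learning rate ramps up over the first 5 steps and stays at 8e-06 afterwards.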

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
No log	0	0	1.2335	0
1.1387	0.0511	5	1.0668	272284
1.0001	0.1021	10	0.9837	541544
0.973	0.1532	15	0.9711	815788
0.9408	0.2043	20	0.9643	1087364
0.9645	0.2553	25	0.9589	1353816
0.9649	0.3064	30	0.9552	1626852
0.9603	0.3575	35	0.9514	1893740
0.9398	0.4086	40	0.9476	2168380
0.9474	0.4596	45	0.9456	2438380
0.9416	0.5107	50	0.9432	2708844
0.9821	0.5618	55	0.9417	2978352
0.9701	0.6128	60	0.9409	3244148
0.9409	0.6639	65	0.9382	3518100
1.0424	0.7150	70	0.9367	3784544
0.8923	0.7660	75	0.9348	4055360
0.9918	0.8171	80	0.9339	4326216
0.9415	0.8682	85	0.9338	4595508
0.8826	0.9192	90	0.9325	4869076
0.9146	0.9703	95	0.9320	5142272
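
As a rough reading aid for the table above, the sketch below derives two summary figures from a few of its rows: the average number of input tokens per optimizer step and the drop in validation loss from step 0 to step 95. The per-sequence estimate assumes that each optimizer step covers 128 sequences (the reported total_train_batch_size) and that "Input Tokens Seen" counts every token in those sequences; neither detail is confirmed by the card. The helper names are illustrative.

```python
# (step, validation_loss, input_tokens_seen) copied from selected table rows.
ROWS = [
    (0, 1.2335, 0),
    (5, 1.0668, 272284),
    (50, 0.9432, 2708844),
    (95, 0.9320, 5142272),
]

TOTAL_TRAIN_BATCH_SIZE = 128  # from the hyperparameter list


def tokens_per_step(step, tokens_seen):
    """Average input tokens consumed per optimizer step up to `step`."""
    return tokens_seen / step


def loss_reduction(rows):
    """Absolute drop in validation loss between the first and last row."""
    return rows[0][1] - rows[-1][1]


if __name__ == "__main__":
    last_step, _, last_tokens = ROWS[-1]
    per_step = tokens_per_step(last_step, last_tokens)
    print(f"tokens per step: {per_step:.0f}")
    # Assumes 128 sequences per step; see the note above.
    print(f"tokens per sequence (estimate): {per_step / TOTAL_TRAIN_BATCH_SIZE:.0f}")
    print(f"validation loss reduction: {loss_reduction(ROWS):.4f}")
```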

Framework versions

Transformers 4.44.0
Pytorch 2.4.0+cu121
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 9.24B params
Tensor type: BF16
