
llama-real-and-synthetic-sftsd2

This model is a fine-tuned version of meta-llama/Llama-3.2-3B-Instruct on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9722
  • Num Input Tokens Seen: 1907704
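The card does not include usage instructions, so the following is a minimal sketch of loading the model with transformers for chat-style inference. The repo id is an assumption taken from the page's model listing, and bfloat16 matches the tensor type reported for this checkpoint.

```python
# Minimal inference sketch, not an official usage guide.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jkazdan/llama3b-real-only-sftsd2"  # assumed repo id; verify before use

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the checkpoint ships BF16 tensors
    device_map="auto",           # requires accelerate
)

# Llama-3.2-Instruct models expect the chat template, so format prompts with it.
messages = [{"role": "user", "content": "Briefly explain supervised fine-tuning."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```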

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the configuration sketch after this list):

  • learning_rate: 8e-06
  • train_batch_size: 8
  • eval_batch_size: 16
  • seed: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.05
  • num_epochs: 1
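These settings map directly onto transformers TrainingArguments. The sketch below mirrors them; the output_dir value and the bf16 flag are assumptions (the card does not state them), while the Adam betas and epsilon listed above match the library's AdamW defaults, so no explicit optimizer arguments are needed.

```python
# A sketch reproducing the reported hyperparameters with TrainingArguments.
# output_dir and bf16 are assumptions; everything else is taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="llama-real-and-synthetic-sftsd2",  # assumed
    learning_rate=8e-6,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=16,
    seed=2,
    gradient_accumulation_steps=16,  # 8 per device x 16 steps = 128 effective (assuming one device)
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1,
    bf16=True,  # assumption, consistent with the BF16 checkpoint
)
```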

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:------:|:----:|:---------------:|:-----------------:|
| No log        | 0      | 0    | 1.5881          | 0                 |
| 1.3233        | 0.0856 | 5    | 1.2853          | 161264            |
| 1.1841        | 0.1711 | 10   | 1.1799          | 323576            |
| 1.0739        | 0.2567 | 15   | 1.1151          | 485960            |
| 1.0499        | 0.3422 | 20   | 1.0835          | 648472            |
| 1.0249        | 0.4278 | 25   | 1.0623          | 813392            |
| 1.0022        | 0.5134 | 30   | 1.0431          | 978584            |
| 1.0058        | 0.5989 | 35   | 1.0266          | 1144056           |
| 0.9862        | 0.6845 | 40   | 1.0096          | 1304592           |
| 0.93          | 0.7701 | 45   | 0.9925          | 1474856           |
| 0.9372        | 0.8556 | 50   | 0.9802          | 1637096           |
| 0.914         | 0.9412 | 55   | 0.9751          | 1807632           |
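As a point of reference, a mean cross-entropy loss converts to perplexity via exp(loss), so the reported evaluation loss of 0.9722 corresponds to a perplexity of roughly 2.64:

```python
import math

# Perplexity is the exponential of the mean cross-entropy loss.
print(math.exp(0.9722))  # ~2.64
```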

Framework versions

  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1