mistral-sum-r16a16longer

This model is a fine-tuned version of mistralai/Mistral-7B-v0.1. The training dataset is not specified (the card metadata records it as "None"). It achieves the following results on the evaluation set:

  • Loss: 1.6498
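
For a rough sense of scale, the evaluation loss can be converted to a perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language model fine-tuning with Transformers but is not stated in this card.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 1.6498

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```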

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 20
  • training_steps: 500
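
The relationship between these values can be sketched in a few lines of plain Python. This is an illustration, not the training script: it assumes a single device (so the total batch size is the per-device batch size times the accumulation steps) and the common linear schedule shape of a linear warmup followed by a linear decay to zero. The helper `linear_lr` is a hypothetical name introduced here.

```python
# Values taken from the hyperparameter list above.
learning_rate = 2e-4
train_batch_size = 4
gradient_accumulation_steps = 4
warmup_steps = 20
training_steps = 500

# Assumption: one device, so the effective batch size is the
# per-device batch size times the gradient accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    remaining = max(0, training_steps - step)
    return learning_rate * remaining / (training_steps - warmup_steps)


print("total_train_batch_size:", total_train_batch_size)
for step in (0, 10, 20, 260, 500):
    print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate peaks at 0.0002 at step 20, is back to half of that at step 260, and reaches zero at step 500.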

Training results

Training Loss   Epoch    Step   Validation Loss
2.196           0.0331   20     1.7916
1.6952          0.0663   40     1.7285
1.6458          0.0994   60     1.7016
1.6835          0.1326   80     1.6930
1.636           0.1657   100    1.6898
1.6294          0.1988   120    1.6890
1.67            0.2320   140    1.6813
1.6695          0.2651   160    1.6818
1.6442          0.2983   180    1.6784
1.6376          0.3314   200    1.6720
1.6094          0.3645   220    1.6692
1.6205          0.3977   240    1.6700
1.6372          0.4308   260    1.6663
1.6511          0.4640   280    1.6675
1.6071          0.4971   300    1.6705
1.6609          0.5302   320    1.6615
1.6039          0.5634   340    1.6576
1.6411          0.5965   360    1.6545
1.6363          0.6297   380    1.6551
1.6341          0.6628   400    1.6512
1.6123          0.6959   420    1.6492
1.6368          0.7291   440    1.6508
1.6265          0.7622   460    1.6513
1.6028          0.7954   480    1.6504
1.6229          0.8285   500    1.6498
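
Two things can be read off this table with a little arithmetic: which evaluation step had the lowest validation loss, and roughly how large the training set was. The sketch below copies the step, epoch, and validation loss columns from the table. The size estimate assumes the epoch column counts optimizer steps against steps per epoch and that each step consumes 16 examples on a single device; because the epoch values are rounded, the result is only approximate.

```python
# (step, epoch, validation_loss) rows copied from the table above.
rows = [
    (20, 0.0331, 1.7916), (40, 0.0663, 1.7285), (60, 0.0994, 1.7016),
    (80, 0.1326, 1.6930), (100, 0.1657, 1.6898), (120, 0.1988, 1.6890),
    (140, 0.2320, 1.6813), (160, 0.2651, 1.6818), (180, 0.2983, 1.6784),
    (200, 0.3314, 1.6720), (220, 0.3645, 1.6692), (240, 0.3977, 1.6700),
    (260, 0.4308, 1.6663), (280, 0.4640, 1.6675), (300, 0.4971, 1.6705),
    (320, 0.5302, 1.6615), (340, 0.5634, 1.6576), (360, 0.5965, 1.6545),
    (380, 0.6297, 1.6551), (400, 0.6628, 1.6512), (420, 0.6959, 1.6492),
    (440, 0.7291, 1.6508), (460, 0.7622, 1.6513), (480, 0.7954, 1.6504),
    (500, 0.8285, 1.6498),
]

# Evaluation step with the lowest validation loss.
best_step, _, best_loss = min(rows, key=lambda r: r[2])

# Assumption: epoch = step / steps_per_epoch, and each optimizer step
# consumes total_train_batch_size = 16 examples on one device.
last_step, last_epoch, _ = rows[-1]
steps_per_epoch = last_step / last_epoch
approx_train_examples = round(steps_per_epoch * 16)

print("best step:", best_step, "validation loss:", best_loss)
print("approx. steps per epoch:", round(steps_per_epoch, 1))
print("approx. training examples:", approx_train_examples)
```

The lowest validation loss in the table is at step 420, slightly below the final value at step 500, and training stopped before completing one full pass over the data (epoch 0.8285).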

Framework versions

  • PEFT 0.11.2.dev0
  • Transformers 4.42.0.dev0
  • PyTorch 2.1.2+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1
