Model Card for Model ID
Model Details
Model Description
This model was produced by supervised fine-tuning (SFT) of the lmsys/vicuna-7b-v1.5 model on the HuggingFaceH4/deita-10k-v0-sft dataset.
- Model type: Llama2 Decoder-Only
- Language(s) (NLP): English
- License: llama2
- Finetuned from model: lmsys/vicuna-7b-v1.5
Training Details
Training Data
HuggingFaceH4/deita-10k-v0-sft
Training Procedure
SFT
Notice: do_sample in generation_config.json was set to True to avoid this error: https://github.com/huggingface/transformers/issues/29988
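The edit described in the notice can be sketched as a small change to the JSON file. This is a minimal sketch rather than the script actually used for this model: the helper name `enable_sampling` and the file path in the usage note are hypothetical, and only Python's standard `json` module is assumed.

```python
import json
from pathlib import Path


def enable_sampling(config_path):
    """Set do_sample to True in a generation_config.json file.

    Hypothetical helper for illustration; all other keys are left untouched.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["do_sample"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `enable_sampling("path/to/model/generation_config.json")` rewrites the file in place and returns the updated dictionary.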
Training Hyperparameters
- Precision: BFloat16
- Chat Template: Vicuna 1.1
- Global Batch Size: 128
- Learning Rate: 2.0e-5
- Num Epochs: 3
- Max Length: 2048
- Packing: True
- Training Steps: 1047
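Since the chat template listed above is Vicuna 1.1, prompts at inference time should follow the same layout. The sketch below assumes the format used by FastChat's `vicuna_v1.1` conversation template (a system message, `USER` and `ASSISTANT` roles, a space after user turns and `</s>` after assistant turns). The function name `build_vicuna_prompt` is hypothetical, and the result should be checked against the template shipped with the tokenizer.

```python
# Default system message, assumed to match FastChat's vicuna_v1.1 template.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(turns, system=SYSTEM):
    """Build a Vicuna 1.1 style prompt (hypothetical helper).

    turns: list of (role, message) pairs alternating "USER" and "ASSISTANT".
    Pass None as the last assistant message to leave the turn open for generation.
    """
    seps = (" ", "</s>")  # separator after user turns, then after assistant turns
    prompt = system + seps[0]
    for i, (role, message) in enumerate(turns):
        if message:
            prompt += f"{role}: {message}{seps[i % 2]}"
        else:
            prompt += f"{role}:"
    return prompt
```

Calling `build_vicuna_prompt([("USER", "Hello"), ("ASSISTANT", None)])` yields the system message followed by ` USER: Hello ASSISTANT:`, which is the point where the model is expected to continue.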
Evaluation
The final model achieved a loss of 0.8375901579856873 on the eval set of HuggingFaceH4/deita-10k-v0-sft.
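To make the eval loss easier to compare across models, it can be converted to perplexity. This is a minimal sketch that assumes the reported value is a mean per-token cross-entropy in nats, which the card does not state explicitly; the function name `loss_to_perplexity` is hypothetical.

```python
import math


def loss_to_perplexity(mean_nll):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_nll)


eval_loss = 0.8375901579856873  # value reported in this card
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity = {perplexity:.2f}")
```

Under that assumption the reported loss corresponds to a perplexity of roughly 2.31.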
Testing Data, Factors & Metrics