Fine-tuning the llama3-8b-instruct model using the msagent-pro dataset and the loss_scale technique with swift, the script is as follows:

NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=29500 \
swift sft \
    --model_type llama3-8b-instruct \
    --learning_rate 2e-5 \
    --sft_type lora \
    --dataset msagent-pro \
    --gradient_checkpointing true \
    --gradient_accumulation_steps 8 \
    --deepspeed default-zero3 \
    --lora_target_modules ALL \
    --use_loss_scale true \
    --save_strategy epoch \
    --batch_size 1 \
    --num_train_epochs 2 \
    --max_length 4096 \
    --preprocess_num_proc 4 \
    --use_loss_scale true \
    --loss_scale_config_path agent-flan \
    --ddp_backend nccl \

Comparison with the Original Model on the ToolBench Evaluation Set

Model ToolBench (in-domain) ToolBench (out-of-domain)
Plan.EM Act.EM HalluRate (lower is better) Avg.F1 R-L Plan.EM Act.EM HalluRate (lower is better) Avg.F1
llama3-8b-instruct 74.22 36.17 15.68 20.0 12.14 69.47 34.21 14.72 20.25
llama3-8b-agent-instruct-v2 85.15 58.1 1.57 52.10 26.02 85.79 59.43 2.56 52.19

For detailed explanations of the evaluation metrics, please refer to document

Deploy this model:

USE_HF=True swift deploy \
  --model_id_or_path modelscope/llama3-8b-agent-instruct-v2 \
  --model_type llama3-8b-instruct \
  --infer_backend vllm \
  --tools_prompt toolbench
Downloads last month
17
Safetensors
Model size
8.03B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.