Fine-tuning the qwen2-7b-instruct model using the msagent-pro dataset and the loss_scale technique with swift, the script is as follows:

NPROC_PER_NODE=8 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MASTER_PORT=29500 \
swift sft \
    --model_type qwen2-7b-instruct \
    --learning_rate 2e-6 \
    --sft_type full \
    --dataset msagent-pro \
    --gradient_checkpointing true \
    --gradient_accumulation_steps 8 \
    --deepspeed default-zero3 \
    --use_loss_scale true \
    --save_strategy epoch \
    --batch_size 1 \
    --num_train_epochs 1 \
    --max_length 4096 \
    --preprocess_num_proc 4 \
    --use_loss_scale true \
    --loss_scale_config_path agent-flan \
    --ddp_backend nccl \

Comparison with the Original Model on the ToolBench Evaluation Set

Model ToolBench (in-domain) ToolBench (out-of-domain)
Plan.EM Act.EM HalluRate (lower is better) Avg.F1 R-L Plan.EM Act.EM HalluRate (lower is better) Avg.F1
llama3-8b-instruct 74.11 54.74 4.16 46.53 8.51 73.17 57.67 3.84 48.58
llama3-8b-agent-instruct-v2 83.37 60.01 2.58 54.41 26.34 82.57 60.14 1.79 55.25

For detailed explanations of the evaluation metrics, please refer to document

deploy this model:

USE_HF=True swift deploy \
  --model_id_or_path modelscope/qwen2-7b-agent-instruct \
  --model_type qwen2-7b-instruct \
  --infer_backend vllm \
  --tools_prompt toolbench
Downloads last month
33
Safetensors
Model size
7.62B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.