Repository: modelscope/ms-swift Branch: main Commit: fe928a9f1464 Files: 1228 Total size: 7.2 MB Directory structure: gitextract_9q3kx9l7/ ├── .dev_scripts/ │ ├── build_docs.sh │ ├── ci_container_test.sh │ ├── dockerci.sh │ └── dockerci_npu.sh ├── .github/ │ ├── ISSUE_TEMPLATE/ │ │ ├── 1-bug-report.yml │ │ ├── 2-feature-request.yml │ │ ├── 3-question-discussion.yml │ │ └── config.yml │ ├── PULL_REQUEST_TEMPLATE.md │ ├── SECURITY.md │ └── workflows/ │ ├── citest.yaml │ ├── citest_npu.yaml │ ├── close_tale_issue.yaml │ ├── lint.yaml │ └── publish.yaml ├── .gitignore ├── .pre-commit-config.yaml ├── .pre-commit-config_local.yaml ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── CONTRIBUTING_CN.md ├── LICENSE ├── MANIFEST.in ├── Makefile ├── README.md ├── README_CN.md ├── docs/ │ ├── Makefile │ ├── README.md │ ├── make.bat │ ├── source/ │ │ ├── .readthedocs.yaml │ │ ├── BestPractices/ │ │ │ ├── Elastic.md │ │ │ ├── Embedding.md │ │ │ ├── GRPO-Code-Training.md │ │ │ ├── GRPO-Multi-Modal-Training.md │ │ │ ├── GRPO.md │ │ │ ├── MLLM-Registration.md │ │ │ ├── Metax-support.md │ │ │ ├── More-Best-Practices.md │ │ │ ├── NPU-support.md │ │ │ ├── Qwen3-Best-Practice.md │ │ │ ├── Qwen3-VL-Best-Practice.md │ │ │ ├── Qwen3_5-Best-Practice.md │ │ │ ├── Rapidly-Training-VL-model.md │ │ │ └── Reranker.md │ │ ├── Customization/ │ │ │ ├── Architecture.md │ │ │ ├── Custom-dataset.md │ │ │ └── Custom-model.md │ │ ├── GetStarted/ │ │ │ ├── Quick-start.md │ │ │ ├── SWIFT-installation.md │ │ │ └── Web-UI.md │ │ ├── Instruction/ │ │ │ ├── Agent-support.md │ │ │ ├── Command-line-parameters.md │ │ │ ├── Evaluation.md │ │ │ ├── Export-and-push.md │ │ │ ├── Frequently-asked-questions.md │ │ │ ├── GKD.md │ │ │ ├── GRPO/ │ │ │ │ ├── AdvancedResearch/ │ │ │ │ │ ├── CHORD.md │ │ │ │ │ ├── CISPO.md │ │ │ │ │ ├── DAPO.md │ │ │ │ │ ├── GSPO.md │ │ │ │ │ ├── REINFORCEPP.md │ │ │ │ │ ├── RLOO.md │ │ │ │ │ ├── SAPO.md │ │ │ │ │ ├── deepeyes.md │ │ │ │ │ ├── entropy_mask.md │ │ │ │ │ ├── index.rst │ │ │ │ │ 
├── training_inference_mismatch.md │ │ │ │ │ └── treepo.md │ │ │ │ ├── DeveloperGuide/ │ │ │ │ │ ├── gym_env.md │ │ │ │ │ ├── index.rst │ │ │ │ │ ├── loss_types.md │ │ │ │ │ ├── multi_task.md │ │ │ │ │ ├── multi_turn.md │ │ │ │ │ ├── reward_function.md │ │ │ │ │ └── reward_model.md │ │ │ │ ├── GetStarted/ │ │ │ │ │ ├── GRPO.md │ │ │ │ │ └── index.rst │ │ │ │ └── index.rst │ │ │ ├── Inference-and-deployment.md │ │ │ ├── Pre-training-and-Fine-tuning.md │ │ │ ├── RLHF.md │ │ │ ├── Ray.md │ │ │ ├── Reinforced-Fine-tuning.md │ │ │ ├── Sample.md │ │ │ ├── Supported-models-and-datasets.md │ │ │ └── Use-tuners.md │ │ ├── Megatron-SWIFT/ │ │ │ ├── Ascend.md │ │ │ ├── Command-line-parameters.md │ │ │ ├── GKD.md │ │ │ ├── GRPO.md │ │ │ ├── LoRA-Training.md │ │ │ ├── Mcore-Bridge.md │ │ │ ├── Multimodal-Model.md │ │ │ └── Quick-start.md │ │ ├── _templates/ │ │ │ ├── autosummary/ │ │ │ │ └── class.rst │ │ │ ├── classtemplate.rst │ │ │ └── sobolengine.rst │ │ ├── conf.py │ │ └── index.rst │ └── source_en/ │ ├── .readthedocs.yaml │ ├── BestPractices/ │ │ ├── Elastic.md │ │ ├── Embedding.md │ │ ├── GRPO-Code-Training.md │ │ ├── GRPO-Multi-Modal-Training.md │ │ ├── GRPO.md │ │ ├── MLLM-Registration.md │ │ ├── Metax-support.md │ │ ├── More-Best-Practices.md │ │ ├── NPU-support.md │ │ ├── Qwen3-Best-Practice.md │ │ ├── Qwen3-VL-Best-Practice.md │ │ ├── Qwen3_5-Best-Practice.md │ │ ├── Rapidly-Training-VL-model.md │ │ └── Reranker.md │ ├── Customization/ │ │ ├── Architecture.md │ │ ├── Custom-dataset.md │ │ └── Custom-model.md │ ├── GetStarted/ │ │ ├── Quick-start.md │ │ ├── SWIFT-installation.md │ │ └── Web-UI.md │ ├── Instruction/ │ │ ├── Agent-support.md │ │ ├── Command-line-parameters.md │ │ ├── Evaluation.md │ │ ├── Export-and-push.md │ │ ├── Frequently-asked-questions.md │ │ ├── GKD.md │ │ ├── GRPO/ │ │ │ ├── AdvancedResearch/ │ │ │ │ ├── CHORD.md │ │ │ │ ├── CISPO.md │ │ │ │ ├── DAPO.md │ │ │ │ ├── GSPO.md │ │ │ │ ├── REINFORCEPP.md │ │ │ │ ├── RLOO.md │ │ │ │ ├── SAPO.md │ │ │ 
│ ├── deepeyes.md │ │ │ │ ├── entropy_mask.md │ │ │ │ ├── index.rst │ │ │ │ ├── training_inference_mismatch.md │ │ │ │ └── treepo.md │ │ │ ├── DeveloperGuide/ │ │ │ │ ├── gym_env.md │ │ │ │ ├── index.rst │ │ │ │ ├── loss_types.md │ │ │ │ ├── multi_task.md │ │ │ │ ├── multi_turn.md │ │ │ │ ├── reward_function.md │ │ │ │ └── reward_model.md │ │ │ ├── GetStarted/ │ │ │ │ ├── GRPO.md │ │ │ │ └── index.rst │ │ │ └── index.rst │ │ ├── Inference-and-deployment.md │ │ ├── Pre-training-and-Fine-tuning.md │ │ ├── RLHF.md │ │ ├── Ray.md │ │ ├── Reinforced-Fine-tuning.md │ │ ├── Sample.md │ │ ├── Supported-models-and-datasets.md │ │ └── Use-tuners.md │ ├── Megatron-SWIFT/ │ │ ├── Ascend.md │ │ ├── Command-line-parameters.md │ │ ├── GKD.md │ │ ├── GRPO.md │ │ ├── LoRA-Training.md │ │ ├── Mcore-Bridge.md │ │ ├── Multimodal-Model.md │ │ └── Quick-start.md │ ├── _templates/ │ │ ├── autosummary/ │ │ │ └── class.rst │ │ ├── classtemplate.rst │ │ └── sobolengine.rst │ ├── conf.py │ └── index.rst ├── examples/ │ ├── README.md │ ├── app/ │ │ ├── base_url/ │ │ │ ├── demo.py │ │ │ └── demo.sh │ │ ├── llm/ │ │ │ ├── sglang.sh │ │ │ └── vllm.sh │ │ └── mllm.sh │ ├── ascend/ │ │ ├── activation_cpu_offload/ │ │ │ ├── fsdp2.json │ │ │ └── train.sh │ │ ├── deploy/ │ │ │ └── vllm.sh │ │ ├── infer/ │ │ │ └── vllm/ │ │ │ └── dp_tp.sh │ │ ├── megatron/ │ │ │ └── train_sft_full.sh │ │ ├── multi-node/ │ │ │ └── megatron/ │ │ │ ├── node1.sh │ │ │ └── node2.sh │ │ └── train/ │ │ ├── qwen3/ │ │ │ ├── qwen3_lora_deepspeed.sh │ │ │ ├── qwen3_lora_fsdp/ │ │ │ │ ├── fsdp.json │ │ │ │ └── train.sh │ │ │ └── qwen3_lora_megatron.sh │ │ ├── qwen3_next/ │ │ │ └── qwen3_next_megatron.sh │ │ ├── qwen3_omni/ │ │ │ └── qwen3_omni_full_mindspeed.sh │ │ └── qwen3_vl/ │ │ └── moe_full_mindspeed.sh │ ├── custom/ │ │ ├── dataset.py │ │ ├── infer.sh │ │ ├── model.py │ │ ├── model_hf.py │ │ ├── my_qwen2_5_omni/ │ │ │ ├── my_register.py │ │ │ ├── test_register.py │ │ │ └── train.py │ │ └── sft.sh │ ├── deploy/ │ │ ├── 
README.md │ │ ├── agent/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── bert/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── client/ │ │ │ ├── llm/ │ │ │ │ ├── base/ │ │ │ │ │ ├── openai_client.py │ │ │ │ │ └── swift_client.py │ │ │ │ └── chat/ │ │ │ │ ├── openai_client.py │ │ │ │ └── swift_client.py │ │ │ └── mllm/ │ │ │ ├── openai_client.py │ │ │ └── swift_client.py │ │ ├── embedding/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── lora/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── reranker/ │ │ │ ├── client.py │ │ │ ├── client_generative.py │ │ │ └── server.sh │ │ ├── reward_model/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── seq_cls/ │ │ │ ├── client.py │ │ │ └── server.sh │ │ ├── sglang.sh │ │ ├── vllm.sh │ │ └── vllm_dp.sh │ ├── eval/ │ │ ├── eval_url/ │ │ │ ├── demo.py │ │ │ └── eval.sh │ │ ├── llm/ │ │ │ ├── sglang.sh │ │ │ └── vllm.sh │ │ ├── train_eval/ │ │ │ └── train.sh │ │ └── vlm/ │ │ └── eval.sh │ ├── export/ │ │ ├── merge_lora.sh │ │ ├── ollama.sh │ │ ├── push_to_hub.sh │ │ └── quantize/ │ │ ├── awq.sh │ │ ├── bert/ │ │ │ ├── bnb.sh │ │ │ └── gptq.sh │ │ ├── bnb.sh │ │ ├── fp8.sh │ │ ├── gptq.sh │ │ ├── gptq_v2.sh │ │ ├── mllm/ │ │ │ ├── awq.sh │ │ │ ├── bnb.sh │ │ │ ├── fp8.sh │ │ │ └── gptq.sh │ │ ├── moe/ │ │ │ ├── awq.sh │ │ │ ├── bnb.sh │ │ │ ├── fp8.sh │ │ │ └── gptq.sh │ │ ├── omni/ │ │ │ └── gptq.sh │ │ └── reward_model/ │ │ ├── bnb.sh │ │ └── gptq.sh │ ├── infer/ │ │ ├── cli_demo.sh │ │ ├── demo.py │ │ ├── demo_agent.py │ │ ├── demo_bert.py │ │ ├── demo_embedding.py │ │ ├── demo_grounding.py │ │ ├── demo_hf.py │ │ ├── demo_lora.py │ │ ├── demo_mllm.py │ │ ├── demo_reranker.py │ │ ├── demo_reward_model.py │ │ ├── demo_vllm_reasoning_parser.py │ │ ├── lmdeploy/ │ │ │ ├── batch_ddp.sh │ │ │ └── mllm_tp.sh │ │ ├── sglang/ │ │ │ ├── demo.sh │ │ │ ├── distill_qwen3_235b.sh │ │ │ ├── mtp.sh │ │ │ └── tp.sh │ │ ├── transformers/ │ │ │ ├── batch_ddp.sh │ │ │ ├── bert.sh │ │ │ ├── lora.sh │ │ │ ├── mllm_device_map.sh │ │ │ ├── prm.sh │ │ │ 
└── reward_model.sh │ │ └── vllm/ │ │ ├── dp_tp.sh │ │ ├── mllm_ddp.sh │ │ ├── mllm_tp.sh │ │ └── mtp.sh │ ├── megatron/ │ │ ├── base_to_chat.sh │ │ ├── benchmark/ │ │ │ └── deepspeed.sh │ │ ├── dense/ │ │ │ ├── 72b_offload.sh │ │ │ └── qwen3_32b.sh │ │ ├── embedding/ │ │ │ ├── qwen3_emb.sh │ │ │ └── qwen3_vl_emb.sh │ │ ├── export/ │ │ │ ├── full.sh │ │ │ └── lora.sh │ │ ├── fp8/ │ │ │ ├── benchmark.sh │ │ │ ├── llm.sh │ │ │ └── vlm.sh │ │ ├── grpo/ │ │ │ ├── dense_colocate.sh │ │ │ ├── dense_server.sh │ │ │ ├── moe_colocate_full.sh │ │ │ ├── moe_colocate_lora.sh │ │ │ └── sapo.sh │ │ ├── long_text.sh │ │ ├── lora/ │ │ │ ├── dense.sh │ │ │ ├── dpo.sh │ │ │ ├── loss_scale.sh │ │ │ ├── moe.sh │ │ │ ├── mtp.sh │ │ │ ├── new_special_tokens.sh │ │ │ └── qwen3_235b.sh │ │ ├── mcore_bridge/ │ │ │ ├── full/ │ │ │ │ ├── dense.sh │ │ │ │ └── moe.sh │ │ │ └── lora/ │ │ │ ├── moe.sh │ │ │ ├── new_special_tokens.sh │ │ │ └── seq_cls.sh │ │ ├── moe/ │ │ │ ├── deepseek_v3.sh │ │ │ ├── moe.sh │ │ │ ├── qwen3_moe.sh │ │ │ └── qwen3_moe_offload.sh │ │ ├── multi-node/ │ │ │ ├── node1.sh │ │ │ └── node2.sh │ │ ├── multimodal/ │ │ │ ├── dense/ │ │ │ │ ├── dpo.sh │ │ │ │ ├── full.sh │ │ │ │ └── lora.sh │ │ │ ├── lora_llm_vit_full/ │ │ │ │ └── sft.sh │ │ │ ├── moe/ │ │ │ │ ├── full_dpo_offload.sh │ │ │ │ └── lora.sh │ │ │ └── omni/ │ │ │ ├── dense.sh │ │ │ └── moe.sh │ │ ├── pretrain.sh │ │ ├── reranker/ │ │ │ ├── qwen3_reranker.sh │ │ │ └── qwen3_vl_reranker.sh │ │ ├── rlhf/ │ │ │ ├── dpo/ │ │ │ │ ├── dense.sh │ │ │ │ ├── group_by_length.sh │ │ │ │ ├── moe.sh │ │ │ │ └── packing.sh │ │ │ ├── gkd/ │ │ │ │ ├── dense.sh │ │ │ │ ├── opsd.sh │ │ │ │ └── teacher_server.sh │ │ │ ├── kto/ │ │ │ │ ├── dense.sh │ │ │ │ └── moe.sh │ │ │ └── rm/ │ │ │ ├── dense.sh │ │ │ └── moe.sh │ │ ├── seq_cls/ │ │ │ ├── full.sh │ │ │ └── lora/ │ │ │ ├── infer.sh │ │ │ └── train.sh │ │ └── sft.sh │ ├── models/ │ │ ├── deepseek_ocr/ │ │ │ ├── infer.py │ │ │ └── train.sh │ │ ├── deepseek_vl2/ │ │ │ └── train.sh │ 
│ ├── glm-4.6v/ │ │ │ ├── flash.sh │ │ │ └── mcore.sh │ │ ├── gpt_oss/ │ │ │ ├── internvl3_5_gpt.sh │ │ │ ├── mcore.sh │ │ │ └── train.sh │ │ ├── hunyuan_ocr/ │ │ │ └── train.sh │ │ ├── internvl3/ │ │ │ └── train.sh │ │ ├── keye/ │ │ │ └── train.sh │ │ ├── llama4/ │ │ │ └── mcore.sh │ │ ├── minicpmv/ │ │ │ └── train.sh │ │ ├── ovis2/ │ │ │ └── train.sh │ │ ├── qwen3_5/ │ │ │ ├── mcore.sh │ │ │ ├── mcore_full.sh │ │ │ ├── mcore_grpo_moe.sh │ │ │ ├── packing.sh │ │ │ └── transformers.sh │ │ ├── qwen3_next/ │ │ │ ├── mcore.sh │ │ │ ├── mtp.sh │ │ │ ├── non_padding_free.sh │ │ │ └── transformers.sh │ │ ├── qwen3_omni/ │ │ │ ├── transformers.sh │ │ │ └── zero3.sh │ │ └── qwen3_vl/ │ │ ├── mcore.sh │ │ ├── mcore_full.sh │ │ ├── mixed.sh │ │ ├── transformers.sh │ │ └── zero3.sh │ ├── notebook/ │ │ ├── qwen2_5-self-cognition/ │ │ │ ├── infer.ipynb │ │ │ ├── infer.sh │ │ │ ├── self-cognition-sft.ipynb │ │ │ └── sft.sh │ │ ├── qwen2_5-vl-grounding/ │ │ │ └── zh.ipynb │ │ └── qwen2vl-ocr/ │ │ ├── infer.ipynb │ │ └── ocr-sft.ipynb │ ├── sampler/ │ │ ├── distill/ │ │ │ ├── distill.sh │ │ │ └── distill.yaml │ │ └── sample/ │ │ ├── sample.sh │ │ └── sampling.yaml │ ├── train/ │ │ ├── agent/ │ │ │ ├── deepseek_r1.sh │ │ │ ├── glm4.sh │ │ │ ├── loss_scale/ │ │ │ │ ├── infer_lora.py │ │ │ │ └── train.sh │ │ │ └── qwen2_5.sh │ │ ├── all_to_all/ │ │ │ ├── infer.sh │ │ │ └── train.sh │ │ ├── base_to_chat/ │ │ │ ├── full.sh │ │ │ ├── lora.sh │ │ │ └── lora2.sh │ │ ├── cached_dataset/ │ │ │ ├── dpo.sh │ │ │ ├── mcore.sh │ │ │ ├── pretrained.sh │ │ │ ├── reranker.sh │ │ │ ├── seq_cls.sh │ │ │ ├── sft.sh │ │ │ └── vlm.sh │ │ ├── early_stop/ │ │ │ └── lora_sft.sh │ │ ├── embedding/ │ │ │ ├── qwen3/ │ │ │ │ ├── infer.py │ │ │ │ ├── qwen3_emb.sh │ │ │ │ └── qwen3_vl_emb.sh │ │ │ └── train_gme.sh │ │ ├── flash_attention_3/ │ │ │ ├── mcore.sh │ │ │ └── transformers.sh │ │ ├── full/ │ │ │ ├── dft.sh │ │ │ ├── infer.sh │ │ │ ├── qwen2_5_32b.sh │ │ │ └── train.sh │ │ ├── grpo/ │ │ │ ├── external/ 
│ │ │ │ ├── README.md │ │ │ │ ├── agent.sh │ │ │ │ ├── grpo_32b_full.sh │ │ │ │ ├── grpo_7b.sh │ │ │ │ ├── moe_full.sh │ │ │ │ ├── moe_lora.sh │ │ │ │ ├── vllm_gym.sh │ │ │ │ └── vllm_multi_turn.sh │ │ │ ├── internal/ │ │ │ │ ├── README.md │ │ │ │ ├── chord.sh │ │ │ │ ├── full_lmdeploy.sh │ │ │ │ ├── gspo.sh │ │ │ │ ├── moe_full.sh │ │ │ │ ├── moe_lora.sh │ │ │ │ ├── qlora.sh │ │ │ │ ├── reinforce_plus_plus.sh │ │ │ │ ├── rloo.sh │ │ │ │ ├── sapo.sh │ │ │ │ ├── transformers.sh │ │ │ │ ├── vllm_72b_4gpu.sh │ │ │ │ ├── vllm_lora_qwenvl72b.sh │ │ │ │ ├── vllm_multi_turn.sh │ │ │ │ └── vllm_vl7b.sh │ │ │ ├── multi_node/ │ │ │ │ ├── Qwen2_5_32B_full.sh │ │ │ │ ├── colocate_multi_node1.sh │ │ │ │ ├── colocate_multi_node2.sh │ │ │ │ ├── server_multi_node.sh │ │ │ │ └── train_dlc.sh │ │ │ ├── plugin/ │ │ │ │ ├── deepeyes/ │ │ │ │ │ ├── deepeyes.sh │ │ │ │ │ └── deepeyes_plugin.py │ │ │ │ ├── gsm8k/ │ │ │ │ │ ├── gsm8k.sh │ │ │ │ │ └── gsm8k_plugin.py │ │ │ │ ├── plugin.py │ │ │ │ ├── run_external_reward_func.sh │ │ │ │ ├── run_external_reward_model.sh │ │ │ │ ├── run_external_scheduler.sh │ │ │ │ └── treepo/ │ │ │ │ ├── tree_rollout.py │ │ │ │ ├── tree_rollout.sh │ │ │ │ └── tree_rollout_plugin.py │ │ │ ├── prompt.txt │ │ │ └── qwen2_5_omni/ │ │ │ ├── grpo.sh │ │ │ └── infer.sh │ │ ├── infer.sh │ │ ├── liger/ │ │ │ └── sft.sh │ │ ├── lora_sft.sh │ │ ├── moe/ │ │ │ ├── llama4.sh │ │ │ └── qwen3_moe.sh │ │ ├── multi-gpu/ │ │ │ ├── ddp/ │ │ │ │ └── train.sh │ │ │ ├── ddp_device_map/ │ │ │ │ └── train.sh │ │ │ ├── deepspeed/ │ │ │ │ ├── train_zero2.sh │ │ │ │ └── train_zero3.sh │ │ │ ├── device_map/ │ │ │ │ └── train.sh │ │ │ ├── fsdp2_lora/ │ │ │ │ ├── fsdp2.json │ │ │ │ └── train.sh │ │ │ └── fsdp_qlora/ │ │ │ ├── fsdp_offload.json │ │ │ └── train.sh │ │ ├── multi-node/ │ │ │ ├── accelerate/ │ │ │ │ ├── multi_node.yaml │ │ │ │ ├── train_node1.sh │ │ │ │ └── train_node2.sh │ │ │ ├── deepspeed/ │ │ │ │ ├── README.md │ │ │ │ ├── host.txt │ │ │ │ └── train.sh │ │ │ ├── dlc/ │ │ 
│ │ └── train.sh │ │ │ ├── ray/ │ │ │ │ ├── sft.sh │ │ │ │ └── sft.yaml │ │ │ ├── swift/ │ │ │ │ ├── train_node1.sh │ │ │ │ └── train_node2.sh │ │ │ └── torchrun/ │ │ │ ├── train_node1.sh │ │ │ └── train_node2.sh │ │ ├── multimodal/ │ │ │ ├── audio.sh │ │ │ ├── caption.sh │ │ │ ├── grounding.sh │ │ │ ├── infer.sh │ │ │ ├── lora_llm_full_vit/ │ │ │ │ ├── infer.sh │ │ │ │ ├── merge_lora.sh │ │ │ │ ├── seq_cls.sh │ │ │ │ └── sft.sh │ │ │ ├── ocr.sh │ │ │ ├── omni/ │ │ │ │ ├── infer.sh │ │ │ │ └── sft.sh │ │ │ ├── rlhf/ │ │ │ │ ├── dpo/ │ │ │ │ │ ├── full.sh │ │ │ │ │ └── lora.sh │ │ │ │ ├── gkd/ │ │ │ │ │ ├── fast.sh │ │ │ │ │ └── full.sh │ │ │ │ └── kto.sh │ │ │ ├── video.sh │ │ │ └── vit_gradient_checkpointing.sh │ │ ├── new_special_tokens/ │ │ │ ├── infer.sh │ │ │ ├── merge_lora.sh │ │ │ ├── tokens.txt │ │ │ └── train.sh │ │ ├── on_policy_distillation.sh │ │ ├── optimizer/ │ │ │ ├── muon.sh │ │ │ └── muonclip.sh │ │ ├── packing/ │ │ │ ├── dpo.sh │ │ │ ├── dpo_vlm.sh │ │ │ ├── liger_kernel.sh │ │ │ ├── llm.sh │ │ │ ├── qwen2_5_omni.sh │ │ │ ├── qwen2_5_vl.sh │ │ │ └── streaming.sh │ │ ├── padding_free/ │ │ │ ├── dpo_vlm.sh │ │ │ └── sft.sh │ │ ├── plugins/ │ │ │ ├── loss_scale.sh │ │ │ └── tuner_phi4_mm.sh │ │ ├── predict_with_generate/ │ │ │ └── train.sh │ │ ├── pretrain/ │ │ │ └── train.sh │ │ ├── qlora/ │ │ │ ├── awq/ │ │ │ │ ├── merge_lora.sh │ │ │ │ └── train.sh │ │ │ ├── bnb/ │ │ │ │ ├── merge_lora.sh │ │ │ │ └── train.sh │ │ │ ├── gptq.sh │ │ │ └── hqq.sh │ │ ├── reranker/ │ │ │ ├── qwen3/ │ │ │ │ ├── infer.py │ │ │ │ ├── qwen3_reranker.sh │ │ │ │ └── qwen3_vl_reranker.sh │ │ │ ├── train_generative_reranker.sh │ │ │ ├── train_generative_reranker_listwise.sh │ │ │ ├── train_reranker.sh │ │ │ ├── train_reranker_auto_patch.sh │ │ │ ├── train_reranker_listwise.sh │ │ │ └── train_reranker_mm.sh │ │ ├── rft/ │ │ │ ├── math.json │ │ │ └── rft.py │ │ ├── rlhf/ │ │ │ ├── README.md │ │ │ ├── cpo.sh │ │ │ ├── dpo/ │ │ │ │ ├── full.sh │ │ │ │ └── lora.sh │ │ │ ├── gkd/ │ 
│ │ │ ├── fast.sh │ │ │ │ ├── full.sh │ │ │ │ ├── teacher_server.sh │ │ │ │ ├── think_model.sh │ │ │ │ ├── vllm_colocate.sh │ │ │ │ └── vllm_server.sh │ │ │ ├── kto.sh │ │ │ ├── mpo.sh │ │ │ ├── opsd/ │ │ │ │ ├── opsd.sh │ │ │ │ └── opsd_plugin.py │ │ │ ├── orpo.sh │ │ │ ├── ppo/ │ │ │ │ ├── full.sh │ │ │ │ └── lora.sh │ │ │ ├── rm.sh │ │ │ └── simpo.sh │ │ ├── seq_cls/ │ │ │ ├── bert/ │ │ │ │ ├── deploy.sh │ │ │ │ ├── infer.sh │ │ │ │ └── sft.sh │ │ │ ├── multi_label/ │ │ │ │ ├── infer.py │ │ │ │ ├── infer.sh │ │ │ │ ├── sft.sh │ │ │ │ └── vlm.sh │ │ │ ├── qwen2_5/ │ │ │ │ ├── deploy.sh │ │ │ │ ├── infer.sh │ │ │ │ └── sft.sh │ │ │ ├── qwen2_5_omni/ │ │ │ │ ├── infer.py │ │ │ │ ├── infer.sh │ │ │ │ └── sft.sh │ │ │ └── regression/ │ │ │ ├── deploy.sh │ │ │ ├── infer.sh │ │ │ └── sft.sh │ │ ├── sequence_parallel/ │ │ │ ├── sequence_parallel.sh │ │ │ ├── sequence_parallel_512k.sh │ │ │ ├── sequence_parallel_dpo.sh │ │ │ ├── sequence_parallel_emb.sh │ │ │ ├── sequence_parallel_grpo.sh │ │ │ ├── sequence_parallel_reranker.sh │ │ │ └── sequence_parallel_seq_cls.sh │ │ ├── streaming/ │ │ │ ├── lazy_tokenize.sh │ │ │ └── streaming.sh │ │ ├── think_model/ │ │ │ ├── deepseek_r1.sh │ │ │ ├── qwen3_demo1.sh │ │ │ └── qwen3_demo2.sh │ │ └── tuners/ │ │ ├── adalora/ │ │ │ └── train.sh │ │ ├── adapter/ │ │ │ └── train.sh │ │ ├── boft/ │ │ │ └── train.sh │ │ ├── bone/ │ │ │ └── train.sh │ │ ├── dora/ │ │ │ └── train.sh │ │ ├── galore/ │ │ │ ├── train_galore.sh │ │ │ └── train_qgalore.sh │ │ ├── lisa/ │ │ │ └── train.sh │ │ ├── llamapro/ │ │ │ └── train.sh │ │ ├── longlora/ │ │ │ └── train.sh │ │ ├── lora/ │ │ │ └── train.sh │ │ ├── lora-ga/ │ │ │ └── train.sh │ │ ├── neftune/ │ │ │ └── train.sh │ │ ├── olora/ │ │ │ └── train.sh │ │ ├── pissa/ │ │ │ └── train.sh │ │ ├── qlora/ │ │ │ └── train.sh │ │ ├── reft/ │ │ │ └── train.sh │ │ └── unsloth/ │ │ └── train.sh │ └── yaml/ │ ├── sft.sh │ └── sft.yaml ├── requirements/ │ ├── docs.txt │ ├── eval.txt │ ├── framework.txt │ ├── 
install_all.sh │ ├── ray.txt │ ├── swanlab.txt │ └── tests.txt ├── requirements.txt ├── scripts/ │ ├── benchmark/ │ │ ├── config/ │ │ │ └── tuner.json │ │ ├── exp.py │ │ ├── exp_utils.py │ │ └── generate_report.py │ └── utils/ │ ├── plot_loss.py │ ├── run_dataset_info.py │ ├── run_model_info.py │ ├── run_template.py │ └── test_link_valid.py ├── setup.cfg ├── setup.py ├── swift/ │ ├── __init__.py │ ├── agent_template/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── deepseek_v3_1.py │ │ ├── extra.py │ │ ├── glm4.py │ │ ├── hermes.py │ │ ├── llama.py │ │ ├── mapping.py │ │ ├── minimax_m2.py │ │ ├── mistral.py │ │ ├── qwen.py │ │ ├── qwen3_coder.py │ │ ├── react.py │ │ ├── seed_oss.py │ │ ├── toolbench.py │ │ └── youtu.py │ ├── arguments/ │ │ ├── __init__.py │ │ ├── app_args.py │ │ ├── base_args/ │ │ │ ├── __init__.py │ │ │ ├── base_args.py │ │ │ ├── data_args.py │ │ │ ├── generation_args.py │ │ │ ├── model_args.py │ │ │ ├── quant_args.py │ │ │ └── template_args.py │ │ ├── deploy_args.py │ │ ├── eval_args.py │ │ ├── export_args.py │ │ ├── infer_args.py │ │ ├── merge_args.py │ │ ├── pretrain_args.py │ │ ├── rlhf_args.py │ │ ├── sampling_args.py │ │ ├── sft_args.py │ │ ├── tuner_args.py │ │ └── webui_args.py │ ├── callbacks/ │ │ ├── __init__.py │ │ ├── activation_cpu_offload.py │ │ ├── adalora.py │ │ ├── base.py │ │ ├── deepspeed_elastic.py │ │ ├── early_stop.py │ │ ├── lisa.py │ │ ├── mapping.py │ │ └── perf_log.py │ ├── cli/ │ │ ├── __init__.py │ │ ├── _megatron/ │ │ │ ├── __init__.py │ │ │ ├── export.py │ │ │ ├── main.py │ │ │ ├── pt.py │ │ │ ├── rlhf.py │ │ │ └── sft.py │ │ ├── app.py │ │ ├── deploy.py │ │ ├── eval.py │ │ ├── export.py │ │ ├── infer.py │ │ ├── main.py │ │ ├── merge_lora.py │ │ ├── pt.py │ │ ├── rlhf.py │ │ ├── rollout.py │ │ ├── sample.py │ │ ├── sft.py │ │ ├── utils.py │ │ └── web_ui.py │ ├── config/ │ │ ├── fsdp2.json │ │ ├── zero0.json │ │ ├── zero1.json │ │ ├── zero2.json │ │ ├── zero2_offload.json │ │ ├── zero3.json │ │ └── zero3_offload.json │ ├── 
dataloader/ │ │ ├── __init__.py │ │ ├── dispatcher.py │ │ └── shard.py │ ├── dataset/ │ │ ├── __init__.py │ │ ├── data/ │ │ │ └── dataset_info.json │ │ ├── dataset/ │ │ │ ├── __init__.py │ │ │ ├── llm.py │ │ │ └── mllm.py │ │ ├── dataset_meta.py │ │ ├── dataset_syntax.py │ │ ├── indexed_dataset.py │ │ ├── loader.py │ │ ├── media.py │ │ ├── packing.py │ │ ├── preprocessor/ │ │ │ ├── __init__.py │ │ │ ├── core.py │ │ │ └── extra.py │ │ ├── register.py │ │ └── utils.py │ ├── hub/ │ │ ├── __init__.py │ │ ├── constant.py │ │ └── hub.py │ ├── infer_engine/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── grpo_vllm_engine.py │ │ ├── infer_client.py │ │ ├── infer_engine.py │ │ ├── lmdeploy_engine.py │ │ ├── patch.py │ │ ├── protocol.py │ │ ├── sglang_engine.py │ │ ├── transformers_engine.py │ │ ├── utils.py │ │ └── vllm_engine.py │ ├── loss/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── causal_lm.py │ │ ├── embedding.py │ │ ├── mapping.py │ │ └── reranker.py │ ├── loss_scale/ │ │ ├── __init__.py │ │ ├── agent.py │ │ ├── base.py │ │ ├── config/ │ │ │ ├── agentflan.json │ │ │ ├── alpha_umi.json │ │ │ ├── hermes.json │ │ │ ├── ignore_empty_think.json │ │ │ ├── qwen.json │ │ │ └── react.json │ │ ├── mapping.py │ │ ├── other.py │ │ └── utils.py │ ├── megatron/ │ │ ├── __init__.py │ │ ├── arguments/ │ │ │ ├── __init__.py │ │ │ ├── export_args.py │ │ │ ├── megatron_args.py │ │ │ ├── megatron_base_args.py │ │ │ ├── pretrain_args.py │ │ │ ├── rlhf_args.py │ │ │ └── sft_args.py │ │ ├── callbacks/ │ │ │ ├── __init__.py │ │ │ ├── base.py │ │ │ ├── default_flow.py │ │ │ ├── mapping.py │ │ │ ├── print.py │ │ │ ├── swanlab.py │ │ │ ├── tensorboard.py │ │ │ ├── utils.py │ │ │ └── wandb.py │ │ ├── convert.py │ │ ├── init.py │ │ ├── model/ │ │ │ ├── __init__.py │ │ │ ├── constant.py │ │ │ ├── gpt_bridge.py │ │ │ ├── gpt_model.py │ │ │ ├── gpts/ │ │ │ │ ├── __init__.py │ │ │ │ ├── glm4.py │ │ │ │ ├── minimax_m2.py │ │ │ │ ├── olmoe.py │ │ │ │ ├── qwen3_emb.py │ │ │ │ └── qwen3_next.py │ │ │ ├── 
mm_gpt_model.py │ │ │ ├── mm_gpts/ │ │ │ │ ├── __init__.py │ │ │ │ ├── glm.py │ │ │ │ ├── internvl.py │ │ │ │ ├── kimi_vl.py │ │ │ │ ├── llama4.py │ │ │ │ ├── qwen.py │ │ │ │ ├── qwen3_5.py │ │ │ │ ├── qwen3_5_gdn.py │ │ │ │ ├── qwen3_vl.py │ │ │ │ └── utils.py │ │ │ ├── model_config.py │ │ │ ├── modules/ │ │ │ │ ├── __init__.py │ │ │ │ ├── gated_delta_net.py │ │ │ │ └── gated_self_attention.py │ │ │ ├── register.py │ │ │ └── rope.py │ │ ├── pipelines/ │ │ │ ├── __init__.py │ │ │ ├── export/ │ │ │ │ ├── __init__.py │ │ │ │ └── export.py │ │ │ └── train/ │ │ │ ├── __init__.py │ │ │ ├── pretrain.py │ │ │ ├── rlhf.py │ │ │ └── sft.py │ │ ├── trainers/ │ │ │ ├── __init__.py │ │ │ ├── base.py │ │ │ ├── batch_sampler.py │ │ │ ├── dpo_trainer.py │ │ │ ├── embedding_trainer.py │ │ │ ├── gkd_trainer.py │ │ │ ├── grpo_trainer.py │ │ │ ├── kto_trainer.py │ │ │ ├── reranker_trainer.py │ │ │ ├── reward_trainer.py │ │ │ ├── rlhf_mixin.py │ │ │ ├── rollout_mixin.py │ │ │ ├── trainer.py │ │ │ ├── utils.py │ │ │ └── vocab_parallel_utils.py │ │ ├── tuners/ │ │ │ ├── __init__.py │ │ │ └── lora.py │ │ └── utils/ │ │ ├── __init__.py │ │ ├── convert_utils.py │ │ ├── megatron_lm_utils.py │ │ ├── parallel_utils.py │ │ ├── patcher.py │ │ └── utils.py │ ├── metrics/ │ │ ├── __init__.py │ │ ├── acc.py │ │ ├── base.py │ │ ├── embedding.py │ │ ├── mapping.py │ │ ├── nlg.py │ │ ├── reranker.py │ │ └── utils.py │ ├── model/ │ │ ├── __init__.py │ │ ├── constant.py │ │ ├── model_arch.py │ │ ├── model_meta.py │ │ ├── models/ │ │ │ ├── __init__.py │ │ │ ├── baai.py │ │ │ ├── baichuan.py │ │ │ ├── baidu.py │ │ │ ├── bert.py │ │ │ ├── codefuse.py │ │ │ ├── deepseek.py │ │ │ ├── gemma.py │ │ │ ├── glm.py │ │ │ ├── internlm.py │ │ │ ├── llama.py │ │ │ ├── llava.py │ │ │ ├── llm.py │ │ │ ├── mamba.py │ │ │ ├── microsoft.py │ │ │ ├── minicpm.py │ │ │ ├── minimax.py │ │ │ ├── mistral.py │ │ │ ├── mllm.py │ │ │ ├── moonshot.py │ │ │ ├── mplug.py │ │ │ ├── openbuddy.py │ │ │ ├── qwen.py │ │ │ ├── seed.py │ │ 
│ ├── skywork.py │ │ │ ├── stepfun.py │ │ │ ├── telechat.py │ │ │ ├── tencent.py │ │ │ ├── valley.py │ │ │ └── yi.py │ │ ├── npu_patcher.py │ │ ├── patcher.py │ │ ├── register.py │ │ └── utils.py │ ├── optimizers/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── galore/ │ │ │ ├── __init__.py │ │ │ ├── adafactor.py │ │ │ ├── adamw.py │ │ │ ├── adamw8bit.py │ │ │ ├── galore_projector.py │ │ │ └── utils.py │ │ ├── lorap.py │ │ ├── mapping.py │ │ ├── multimodal.py │ │ ├── muon.py │ │ └── muonclip.py │ ├── pipelines/ │ │ ├── __init__.py │ │ ├── app/ │ │ │ ├── __init__.py │ │ │ ├── app.py │ │ │ ├── build_ui.py │ │ │ └── locale.py │ │ ├── base.py │ │ ├── eval/ │ │ │ ├── __init__.py │ │ │ ├── eval.py │ │ │ └── utils.py │ │ ├── export/ │ │ │ ├── __init__.py │ │ │ ├── cached_dataset.py │ │ │ ├── export.py │ │ │ ├── merge_lora.py │ │ │ ├── ollama.py │ │ │ └── quant.py │ │ ├── infer/ │ │ │ ├── __init__.py │ │ │ ├── deploy.py │ │ │ ├── infer.py │ │ │ ├── rollout.py │ │ │ └── utils.py │ │ ├── sampling/ │ │ │ ├── __init__.py │ │ │ ├── base.py │ │ │ ├── distill_sampler.py │ │ │ ├── sampling.py │ │ │ ├── utils.py │ │ │ └── vanilla_sampler.py │ │ ├── train/ │ │ │ ├── __init__.py │ │ │ ├── kto.py │ │ │ ├── pretrain.py │ │ │ ├── rlhf.py │ │ │ ├── sft.py │ │ │ └── tuner.py │ │ └── utils.py │ ├── ray/ │ │ ├── __init__.py │ │ ├── arguments.py │ │ ├── base.py │ │ └── resource_manager.py │ ├── rewards/ │ │ ├── __init__.py │ │ ├── orm.py │ │ ├── prm.py │ │ └── rm_plugin.py │ ├── rlhf_trainers/ │ │ ├── __init__.py │ │ ├── args_mixin.py │ │ ├── arguments.py │ │ ├── cpo_trainer.py │ │ ├── dpo_trainer.py │ │ ├── gkd_trainer.py │ │ ├── grpo_trainer.py │ │ ├── kto_trainer.py │ │ ├── orpo_trainer.py │ │ ├── ppo_trainer.py │ │ ├── reward_trainer.py │ │ ├── rlhf_mixin.py │ │ ├── rollout_mixin.py │ │ ├── utils.py │ │ └── vllm_client.py │ ├── rollout/ │ │ ├── __init__.py │ │ ├── gym_env.py │ │ └── multi_turn.py │ ├── sequence_parallel/ │ │ ├── __init__.py │ │ ├── ulysses.py │ │ ├── utils.py │ │ └── 
zigzag_ring_attn.py │ ├── template/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── constant.py │ │ ├── grounding.py │ │ ├── register.py │ │ ├── template_inputs.py │ │ ├── template_meta.py │ │ ├── templates/ │ │ │ ├── __init__.py │ │ │ ├── baai.py │ │ │ ├── baidu.py │ │ │ ├── bert.py │ │ │ ├── deepseek.py │ │ │ ├── dots.py │ │ │ ├── gemma.py │ │ │ ├── glm.py │ │ │ ├── idefics3.py │ │ │ ├── internlm.py │ │ │ ├── internvl.py │ │ │ ├── kwai.py │ │ │ ├── llama.py │ │ │ ├── llava.py │ │ │ ├── llm.py │ │ │ ├── megrez.py │ │ │ ├── microsoft.py │ │ │ ├── midashenglm.py │ │ │ ├── minicpm.py │ │ │ ├── minimax.py │ │ │ ├── minimind.py │ │ │ ├── mistral.py │ │ │ ├── molmo.py │ │ │ ├── moonshot.py │ │ │ ├── mplug.py │ │ │ ├── openbuddy.py │ │ │ ├── pixtral.py │ │ │ ├── qwen.py │ │ │ ├── seed.py │ │ │ ├── stepfun.py │ │ │ ├── tencent.py │ │ │ ├── utils.py │ │ │ ├── valley.py │ │ │ └── yi.py │ │ ├── utils.py │ │ └── vision_utils.py │ ├── trainers/ │ │ ├── __init__.py │ │ ├── arguments.py │ │ ├── embedding_trainer.py │ │ ├── mixin.py │ │ ├── patcher.py │ │ ├── reranker_trainer.py │ │ ├── seq2seq_trainer.py │ │ ├── trainer.py │ │ ├── trainer_factory.py │ │ └── utils.py │ ├── tuner_plugin/ │ │ ├── __init__.py │ │ ├── base.py │ │ ├── dummy.py │ │ ├── ia3.py │ │ ├── lora_llm.py │ │ └── mapping.py │ ├── tuners/ │ │ ├── __init__.py │ │ ├── adapter.py │ │ ├── base.py │ │ ├── llamapro.py │ │ ├── longlora/ │ │ │ ├── __init__.py │ │ │ ├── llama.py │ │ │ └── longlora.py │ │ ├── lora.py │ │ ├── lora_layers.py │ │ ├── mapping.py │ │ ├── neftune.py │ │ ├── part.py │ │ ├── peft.py │ │ ├── prompt.py │ │ ├── reft.py │ │ ├── restuning.py │ │ ├── restuning_components.py │ │ ├── scetuning/ │ │ │ ├── __init__.py │ │ │ ├── scetuning.py │ │ │ └── scetuning_components.py │ │ ├── side.py │ │ └── utils.py │ ├── ui/ │ │ ├── __init__.py │ │ ├── app.py │ │ ├── base.py │ │ ├── llm_eval/ │ │ │ ├── __init__.py │ │ │ ├── eval.py │ │ │ ├── llm_eval.py │ │ │ ├── model.py │ │ │ └── runtime.py │ │ ├── llm_export/ │ │ │ 
├── __init__.py │ │ │ ├── export.py │ │ │ ├── llm_export.py │ │ │ ├── model.py │ │ │ └── runtime.py │ │ ├── llm_grpo/ │ │ │ ├── __init__.py │ │ │ ├── advanced.py │ │ │ ├── dataset.py │ │ │ ├── external_rollout.py │ │ │ ├── external_runtime.py │ │ │ ├── grpo_advanced.py │ │ │ ├── hyper.py │ │ │ ├── llm_grpo.py │ │ │ ├── lora.py │ │ │ ├── model.py │ │ │ ├── optimizer.py │ │ │ ├── quantization.py │ │ │ ├── report_to.py │ │ │ ├── reward.py │ │ │ ├── rollout.py │ │ │ ├── runtime.py │ │ │ ├── save.py │ │ │ ├── target.py │ │ │ └── tuner.py │ │ ├── llm_infer/ │ │ │ ├── __init__.py │ │ │ ├── generate.py │ │ │ ├── llm_infer.py │ │ │ ├── model.py │ │ │ └── runtime.py │ │ ├── llm_rlhf/ │ │ │ ├── __init__.py │ │ │ ├── advanced.py │ │ │ ├── dataset.py │ │ │ ├── hyper.py │ │ │ ├── llm_rlhf.py │ │ │ ├── lora.py │ │ │ ├── model.py │ │ │ ├── optimizer.py │ │ │ ├── quantization.py │ │ │ ├── report_to.py │ │ │ ├── rlhf.py │ │ │ ├── runtime.py │ │ │ ├── save.py │ │ │ ├── target.py │ │ │ └── tuner.py │ │ ├── llm_sample/ │ │ │ ├── __init__.py │ │ │ ├── llm_sample.py │ │ │ ├── model.py │ │ │ ├── runtime.py │ │ │ └── sample.py │ │ └── llm_train/ │ │ ├── __init__.py │ │ ├── advanced.py │ │ ├── dataset.py │ │ ├── hyper.py │ │ ├── llm_train.py │ │ ├── lora.py │ │ ├── model.py │ │ ├── optimizer.py │ │ ├── quantization.py │ │ ├── report_to.py │ │ ├── runtime.py │ │ ├── save.py │ │ ├── self_cog.py │ │ ├── target.py │ │ ├── task.py │ │ ├── tuner.py │ │ └── utils.py │ ├── utils/ │ │ ├── __init__.py │ │ ├── constants.py │ │ ├── dequantizer.py │ │ ├── env.py │ │ ├── hf_config.py │ │ ├── hub_utils.py │ │ ├── import_utils.py │ │ ├── io_utils.py │ │ ├── logger.py │ │ ├── np_utils.py │ │ ├── processor_utils.py │ │ ├── safetensors.py │ │ ├── shutdown_manager.py │ │ ├── tb_utils.py │ │ ├── torch_utils.py │ │ ├── transformers_utils.py │ │ └── utils.py │ └── version.py └── tests/ ├── __init__.py ├── app/ │ └── test_app.py ├── deploy/ │ ├── test_dataset.py │ └── test_logprobs.py ├── eval/ │ └── test_eval.py 
├── export/ │ └── test_quant.py ├── general/ │ ├── test_arch.py │ ├── test_dataset.py │ ├── test_model.py │ ├── test_stream.py │ └── test_template.py ├── hub/ │ ├── __init__.py │ └── test_check_model.py ├── infer/ │ ├── test_agent.py │ ├── test_infer.py │ ├── test_logprobs.py │ ├── test_main.py │ ├── test_max_memory.py │ ├── test_mllm.py │ └── test_sglang.py ├── llm/ │ ├── __init__.py │ ├── config/ │ │ ├── infer.json │ │ └── sft.json │ ├── data/ │ │ ├── alpaca.csv │ │ ├── alpaca.jsonl │ │ ├── alpaca2.csv │ │ ├── chatml.jsonl │ │ ├── conversations.jsonl │ │ ├── multi_modal_1.jsonl │ │ ├── multi_modal_2.jsonl │ │ ├── multi_modal_3.jsonl │ │ ├── sharegpt.jsonl │ │ ├── swift_multi.json │ │ ├── swift_multi.jsonl │ │ ├── swift_pre.csv │ │ ├── swift_pre.jsonl │ │ ├── swift_single.csv │ │ └── swift_single.jsonl │ ├── test_custom.py │ ├── test_dataset.py │ ├── test_ollama_export.py │ ├── test_run.py │ ├── test_template.py │ ├── test_utils.py │ └── test_web_ui.py ├── megatron/ │ ├── export/ │ │ └── test_export.py │ ├── test_align/ │ │ ├── test_llm.py │ │ └── test_mllm.py │ ├── test_embedding.py │ ├── test_export.py │ ├── test_gkd.py │ ├── test_grpo.py │ ├── test_kto.py │ ├── test_lora.py │ ├── test_rlhf.py │ └── test_train.py ├── model_tag.py ├── models/ │ ├── test_flash_attn.py │ ├── test_llm.py │ └── test_mllm.py ├── run.py ├── run_config.yaml ├── sample/ │ └── test_client.py ├── test_align/ │ ├── test_cls.py │ ├── test_lmdeploy_vlm.py │ ├── test_padding_side.py │ ├── test_rlhf_loss.py │ ├── test_template/ │ │ ├── test_agent.py │ │ ├── test_audio.py │ │ ├── test_gene.py │ │ ├── test_llm.py │ │ ├── test_template.py │ │ ├── test_tool.py │ │ ├── test_video.py │ │ └── test_vision.py │ └── test_vllm_vlm.py ├── test_utils.py ├── train/ │ ├── test_channel.py │ ├── test_cls.py │ ├── test_embedding.py │ ├── test_export_cached_dataset.py │ ├── test_freeze.py │ ├── test_gkd.py │ ├── test_grounding.py │ ├── test_grpo.py │ ├── test_kto.py │ ├── test_liger.py │ ├── test_multilabel.py │ 
├── test_packing.py │ ├── test_ppo.py │ ├── test_pt.py │ ├── test_resume_from_checkpoint.py │ ├── test_rlhf.py │ ├── test_sample.py │ ├── test_sft.py │ ├── test_train_eval.py │ ├── test_vit_lr.py │ └── test_vllm_importance_sampling_basic.py ├── tuners/ │ ├── __init__.py │ ├── test_extra_state_dict.py │ ├── test_merged_linear.py │ ├── test_neft.py │ ├── test_peft.py │ ├── test_scetuning.py │ ├── test_swift_base.py │ ├── test_swift_device_map.py │ └── test_swift_restuning.py └── utils/ ├── __init__.py ├── test_async_rewards.py ├── test_file_utils.py ├── test_io_utils.py ├── test_rewards.py ├── test_split_str_parts_by.py └── test_torch_utils.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .dev_scripts/build_docs.sh
================================================
pip install -r requirements/docs.txt
cd docs
rm -rf build

# update api rst
#rm -rf source/api/
#sphinx-apidoc --module-first -o source/api/ ../modelscope/
make html

================================================
FILE: .dev_scripts/ci_container_test.sh
================================================
if [ "$MODELSCOPE_SDK_DEBUG" == "True" ]; then
    # pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
    pip install -r requirements/tests.txt -i https://mirrors.aliyun.com/pypi/simple/
    git config --global --add safe.directory /ms-swift
    git config --global user.email tmp
    git config --global user.name tmp.com

    # linter test
    # use internal project for pre-commit due to the network problem
    if [ `git remote -v | grep alibaba | wc -l` -gt 1 ]; then
        pre-commit run -c .pre-commit-config_local.yaml --all-files
        if [ $? -ne 0 ]; then
            echo "linter test failed, please run 'pre-commit run --all-files' to check"
            echo "From the repository folder"
            echo "Run 'pip install -r requirements/tests.txt' install test dependencies."
            echo "Run 'pre-commit install' install pre-commit hooks."
            echo "Finally run linter with command: 'pre-commit run --all-files' to check."
            echo "Ensure there is no failure!!!!!!!!"
            exit -1
        fi
    fi

    pip install -r requirements/framework.txt -U -i https://mirrors.aliyun.com/pypi/simple/
    pip install decord einops -U -i https://mirrors.aliyun.com/pypi/simple/
    pip uninstall autoawq -y
    pip install optimum
    pip install diffusers
    pip install "transformers<5.0"
    # pip install autoawq -U --no-deps

    # test with install
    pip install .
    pip install auto_gptq bitsandbytes deepspeed -U -i https://mirrors.aliyun.com/pypi/simple/
else
    echo "Running case in release image, run case directly!"
fi
# remove torch_extensions folder to avoid ci hang.
rm -rf ~/.cache/torch_extensions
if [ $# -eq 0 ]; then
    ci_command="python tests/run.py --subprocess"
else
    ci_command="$@"
fi
echo "Running case with command: $ci_command"
$ci_command

================================================
FILE: .dev_scripts/dockerci.sh
================================================
#!/bin/bash
MODELSCOPE_CACHE_DIR_IN_CONTAINER=/modelscope_cache
CODE_DIR=$PWD
CODE_DIR_IN_CONTAINER=/ms-swift
MODELSCOPE_SDK_DEBUG=True
echo "$USER"
gpus='0,1 2,3'
cpu_sets='0-15 16-31'
cpu_sets_arr=($cpu_sets)
is_get_file_lock=false
CI_COMMAND=${CI_COMMAND:-bash .dev_scripts/ci_container_test.sh python tests/run.py --parallel 2 --run_config tests/run_config.yaml}
echo "ci command: $CI_COMMAND"
PR_CHANGED_FILES="${PR_CHANGED_FILES:-}"
echo "PR modified files: $PR_CHANGED_FILES"
PR_CHANGED_FILES=${PR_CHANGED_FILES//[ ]/#}
echo "PR_CHANGED_FILES: $PR_CHANGED_FILES"
idx=0
for gpu in $gpus
do
  exec {lock_fd}>"/tmp/gpu$gpu" || exit 1
  flock -n "$lock_fd" || { echo "WARN: gpu $gpu is in use!" >&2; idx=$((idx+1)); continue; }
  echo "get gpu lock $gpu"

  CONTAINER_NAME="swift-ci-$idx"
  let is_get_file_lock=true

  # pull image if there are update
  docker pull ${IMAGE_NAME}:${IMAGE_VERSION}
  if [ "$MODELSCOPE_SDK_DEBUG" == "True" ]; then
    echo 'debugging'
    docker run --rm --name $CONTAINER_NAME --shm-size=16gb \
               --cpuset-cpus=${cpu_sets_arr[$idx]} \
               --gpus='"'"device=$gpu"'"' \
               -v $CODE_DIR:$CODE_DIR_IN_CONTAINER \
               -v $MODELSCOPE_CACHE:$MODELSCOPE_CACHE_DIR_IN_CONTAINER \
               -v $MODELSCOPE_HOME_CACHE/$idx:/root \
               -v /home/admin/pre-commit:/home/admin/pre-commit \
               -e CI_TEST=True \
               -e TEST_LEVEL=$TEST_LEVEL \
               -e MODELSCOPE_CACHE=$MODELSCOPE_CACHE_DIR_IN_CONTAINER \
               -e MODELSCOPE_DOMAIN=$MODELSCOPE_DOMAIN \
               -e MODELSCOPE_SDK_DEBUG=True \
               -e HUB_DATASET_ENDPOINT=$HUB_DATASET_ENDPOINT \
               -e TEST_ACCESS_TOKEN_CITEST=$TEST_ACCESS_TOKEN_CITEST \
               -e TEST_ACCESS_TOKEN_SDKDEV=$TEST_ACCESS_TOKEN_SDKDEV \
               -e TEST_LEVEL=$TEST_LEVEL \
               -e MODELSCOPE_ENVIRONMENT='ci' \
               -e TEST_UPLOAD_MS_TOKEN=$TEST_UPLOAD_MS_TOKEN \
               -e MODEL_TAG_URL=$MODEL_TAG_URL \
               -e MODELSCOPE_API_TOKEN=$MODELSCOPE_API_TOKEN \
               -e PR_CHANGED_FILES=$PR_CHANGED_FILES \
               --workdir=$CODE_DIR_IN_CONTAINER \
               ${IMAGE_NAME}:${IMAGE_VERSION} \
               $CI_COMMAND
  else
    docker run --rm --name $CONTAINER_NAME --shm-size=16gb \
               --cpuset-cpus=${cpu_sets_arr[$idx]} \
               --gpus='"'"device=$gpu"'"' \
               -v $CODE_DIR:$CODE_DIR_IN_CONTAINER \
               -v $MODELSCOPE_CACHE:$MODELSCOPE_CACHE_DIR_IN_CONTAINER \
               -v $MODELSCOPE_HOME_CACHE/$idx:/root \
               -v /home/admin/pre-commit:/home/admin/pre-commit \
               -e CI_TEST=True \
               -e TEST_LEVEL=$TEST_LEVEL \
               -e MODELSCOPE_CACHE=$MODELSCOPE_CACHE_DIR_IN_CONTAINER \
               -e MODELSCOPE_DOMAIN=$MODELSCOPE_DOMAIN \
               -e HUB_DATASET_ENDPOINT=$HUB_DATASET_ENDPOINT \
               -e TEST_ACCESS_TOKEN_CITEST=$TEST_ACCESS_TOKEN_CITEST \
               -e TEST_ACCESS_TOKEN_SDKDEV=$TEST_ACCESS_TOKEN_SDKDEV \
               -e TEST_LEVEL=$TEST_LEVEL \
               -e MODELSCOPE_ENVIRONMENT='ci' \
               -e TEST_UPLOAD_MS_TOKEN=$TEST_UPLOAD_MS_TOKEN \
               -e MODEL_TAG_URL=$MODEL_TAG_URL \
               -e
MODELSCOPE_API_TOKEN=$MODELSCOPE_API_TOKEN \ -e PR_CHANGED_FILES=$PR_CHANGED_FILES \ --workdir=$CODE_DIR_IN_CONTAINER \ ${IMAGE_NAME}:${IMAGE_VERSION} \ $CI_COMMAND fi if [ $? -ne 0 ]; then echo "Running test case failed, please check the log!" exit -1 fi break done if [ "$is_get_file_lock" = false ] ; then echo 'No free GPU!' exit 1 fi ================================================ FILE: .dev_scripts/dockerci_npu.sh ================================================ #!/bin/bash MODELSCOPE_CACHE_DIR=/modelscope_cache CODE_DIR=$PWD MODELSCOPE_SDK_DEBUG=True echo "$USER" gpus='0,1 2,3' is_get_file_lock=false CI_COMMAND=${CI_COMMAND:-bash .dev_scripts/ci_container_test.sh python tests/run.py --parallel 2 --run_config tests/run_config.yaml} echo "ci command: $CI_COMMAND" PR_CHANGED_FILES="${PR_CHANGED_FILES:-}" echo "PR modified files: $PR_CHANGED_FILES" PR_CHANGED_FILES=${PR_CHANGED_FILES//[ ]/#} echo "PR_CHANGED_FILES: $PR_CHANGED_FILES" idx=0 for gpu in $gpus do exec {lock_fd}>"/tmp/gpu$gpu" || exit 1 flock -n "$lock_fd" || { echo "WARN: gpu $gpu is in use!" >&2; idx=$((idx+1)); continue; } echo "get gpu lock $gpu" let is_get_file_lock=true # 设置环境变量 export CI_TEST=True export TEST_LEVEL=$TEST_LEVEL export MODELSCOPE_CACHE=${MODELSCOPE_CACHE:-$MODELSCOPE_CACHE_DIR} export MODELSCOPE_DOMAIN=$MODELSCOPE_DOMAIN export HUB_DATASET_ENDPOINT=$HUB_DATASET_ENDPOINT export TEST_ACCESS_TOKEN_CITEST=$TEST_ACCESS_TOKEN_CITEST export TEST_ACCESS_TOKEN_SDKDEV=$TEST_ACCESS_TOKEN_SDKDEV export MODELSCOPE_ENVIRONMENT='ci' export TEST_UPLOAD_MS_TOKEN=$TEST_UPLOAD_MS_TOKEN export MODEL_TAG_URL=$MODEL_TAG_URL export MODELSCOPE_API_TOKEN=$MODELSCOPE_API_TOKEN export PR_CHANGED_FILES=$PR_CHANGED_FILES export CUDA_VISIBLE_DEVICES=$gpu if [ "$MODELSCOPE_SDK_DEBUG" == "True" ]; then export MODELSCOPE_SDK_DEBUG=True echo 'debugging' fi # 切换到代码目录并执行命令 cd $CODE_DIR eval $CI_COMMAND if [ $? -ne 0 ]; then echo "Running test case failed, please check the log!" 
exit -1 fi break done if [ "$is_get_file_lock" = false ] ; then echo 'No free GPU!' exit 1 fi ================================================ FILE: .github/ISSUE_TEMPLATE/1-bug-report.yml ================================================ name: "🐛 Bug Report" description: Create a bug report to help us improve ms-swift labels: ["bug"] body: - type: markdown attributes: value: | Thank you for supporting ms-swift and taking the time to submit this issue. 感谢你对 ms-swift 的支持和抽出时间提交相关 issue。 - type: checkboxes id: checklist attributes: label: Checklist / 检查清单 options: - label: I have searched existing issues, and this is a new bug report. / 我已经搜索过现有的 issues,确认这是一个新的 bug report。 required: true - type: textarea id: bug-description validations: required: true attributes: label: Bug Description / Bug 描述 description: | Please describe the issue you encountered. It's better to include error screenshots or stack trace information. 请详细描述你遇到的问题,最好包含报错截图或报错栈信息。 - type: textarea id: reproduction-steps validations: required: true attributes: label: How to Reproduce / 如何复现 description: | Please provide steps to reproduce the issue, including ms-swift version, runtime environment, and detailed reproduction steps. 请提供复现问题的步骤,包括 ms-swift 的版本、运行环境、详细的复现步骤等。 - type: textarea id: additional-information attributes: label: Additional Information / 补充信息 description: | Please provide any additional information here. 在这里补充其他相关信息。 ================================================ FILE: .github/ISSUE_TEMPLATE/2-feature-request.yml ================================================ name: "🚀 Feature Request" description: Submit a request for a new feature labels: ["enhancement"] body: - type: markdown attributes: value: | Thank you for supporting ms-swift and taking the time to submit this issue. 感谢你对 ms-swift 的支持和抽出时间提交相关 issue。 - type: checkboxes id: checklist attributes: label: Checklist / 检查清单 options: - label: I have searched existing issues, and this is a new feature request. 
/ 我已经搜索过现有的 issues,确认这是一个新的 Feature Request。 required: true - type: textarea id: feature-request-description validations: required: true attributes: label: Feature Request Description / Feature Request 描述 description: | Please provide a detailed description of the new feature you would like to see added. 请详细描述您希望添加的新功能特性。 - type: textarea id: pull-request attributes: label: Pull Request / Pull Request 信息 description: | Have you already submitted or plan to submit a Pull Request? Please share your plans. 你是否已经提交或即将提交 Pull Request?请说明你的计划。 ================================================ FILE: .github/ISSUE_TEMPLATE/3-question-discussion.yml ================================================ name: "🤔 Question & Discussion" description: Create an issue for questions and discussions labels: ["question"] body: - type: markdown attributes: value: | Thank you for supporting ms-swift and taking the time to submit this issue. 感谢你对 ms-swift 的支持和抽出时间提交相关 issue。 - type: checkboxes id: checklist attributes: label: Checklist / 检查清单 options: - label: I have searched existing issues, and this is a new question or discussion topic. / 我已经搜索过现有的 issues,确认这是一个新的问题与讨论。 required: true - type: textarea id: question-description validations: required: true attributes: label: Question Description / 问题描述 description: | Please describe the question or topic you would like to discuss. 请描述你想要讨论的问题或话题。 ================================================ FILE: .github/ISSUE_TEMPLATE/config.yml ================================================ blank_issues_enabled: false ================================================ FILE: .github/PULL_REQUEST_TEMPLATE.md ================================================ # PR type - [ ] Bug Fix - [ ] New Feature - [ ] Document Updates - [ ] More Models or Datasets Support # PR information Write the detail information belongs to this PR. ## Experiment results Paste your experiment result here(if needed). 
================================================ FILE: .github/SECURITY.md ================================================ # Reporting Security Issues Usually security issues of a deep learning project come from non-standard 3rd packages or continuous running services. If you are suffering from security issues from our project, please consider reporting to us. We appreciate your efforts to responsibly disclose your findings, and will make every effort to acknowledge your contributions. ================================================ FILE: .github/workflows/citest.yaml ================================================ name: citest on: push: branches: - master - "release/**" paths-ignore: - "setup.*" - "requirements.txt" - "requirements/**" - "docs/**" - "tools/**" - ".dev_scripts/**" - "README.md" - "README_*.md" - "NOTICE" - ".github/workflows/lint.yaml" - ".github/workflows/publish.yaml" pull_request: paths-ignore: - "setup.*" - "requirements.txt" - "requirements/**" - "docs/**" - "tools/**" - ".dev_scripts/**" - "README.md" - "README_*.md" - "NOTICE" - ".github/workflows/lint.yaml" - ".github/workflows/publish.yaml" concurrency: group: ${{ github.workflow }}-${{ github.ref }} cancel-in-progress: true jobs: unittest: # The type of runner that the job will run on runs-on: [self-hosted] timeout-minutes: 240 steps: - name: ResetFileMode shell: bash run: | # reset filemode to allow action runner to delete files # generated by root in docker set -e source ~/.bashrc sudo chown -R $USER:$USER $GITHUB_WORKSPACE - name: Checkout uses: actions/checkout@v3 env: GIT_CONFIG_PARAMETERS: "'core.hooksPath='" with: lfs: 'true' submodules: 'false' fetch-depth: ${{ github.event_name == 'pull_request' && 2 || 0 }} - name: Get changed files id: changed-files run: | if ${{ github.event_name == 'pull_request' }}; then echo "PR_CHANGED_FILES=$(git diff --name-only -r HEAD^1 HEAD | xargs)" >> $GITHUB_ENV else echo "PR_CHANGED_FILES=$(git diff --name-only ${{ github.event.before }} ${{ 
github.event.after }} | xargs)" >> $GITHUB_ENV fi - name: Checkout LFS objects run: git lfs checkout - name: Run unittest shell: bash run: | set -e source /mnt/modelscope/ci_env.sh bash .dev_scripts/dockerci.sh ================================================ FILE: .github/workflows/citest_npu.yaml ================================================ name: citest-npu on: push: branches: - master - "release/**" paths-ignore: - "setup.*" - "requirements.txt" - "requirements/**" - "docs/**" - "tools/**" - ".dev_scripts/**" - "README.md" - "README_*.md" - "NOTICE" - ".github/workflows/lint.yaml" - ".github/workflows/publish.yaml" pull_request: paths-ignore: - "setup.*" - "requirements.txt" - "requirements/**" - "docs/**" - "tools/**" - ".dev_scripts/**" - "README.md" - "README_*.md" - "NOTICE" - ".github/workflows/lint.yaml" - ".github/workflows/publish.yaml" concurrency: group: ${{ github.workflow }}-${{ github.ref }} cancel-in-progress: true jobs: unittest: # The type of runner that the job will run on runs-on: [linux-aarch64-a2-1] timeout-minutes: 240 container: image: 'ascendai/cann:8.3.rc2-910b-ubuntu22.04-py3.11' steps: - name: Config mirrors run: | sed -Ei 's@(ports|archive).ubuntu.com@cache-service.nginx-pypi-cache.svc.cluster.local:8081@g' /etc/apt/sources.list pip config set global.index-url http://cache-service.nginx-pypi-cache.svc.cluster.local/pypi/simple pip config set global.trusted-host cache-service.nginx-pypi-cache.svc.cluster.local - name: Checkout uses: actions/checkout@v3 with: fetch-depth: ${{ github.event_name == 'pull_request' && 2 || 0 }} - name: Get changed files id: changed-files run: | if ${{ github.event_name == 'pull_request' }}; then echo "PR_CHANGED_FILES=$(git diff --name-only -r HEAD^1 HEAD | xargs)" >> $GITHUB_ENV else echo "PR_CHANGED_FILES=$(git diff --name-only ${{ github.event.before }} ${{ github.event.after }} | xargs)" >> $GITHUB_ENV fi - name: Run unittest shell: bash run: | set -e export IMAGE_NAME=ascendai/cann export 
IMAGE_VERSION=8.3.rc2-910b-ubuntu22.04-py3.11 export TEST_LEVEL=0 mkdir -p ~/.cache export MODELSCOPE_CACHE=~/.cache export CI_COMMAND='bash .dev_scripts/ci_container_test.sh python tests/run.py --parallel 2 --subprocess --run_config tests/run_config.yaml' bash .dev_scripts/dockerci_npu.sh ================================================ FILE: .github/workflows/close_tale_issue.yaml ================================================ name: Close Stale Issues on: schedule: - cron: '0 0 * * *' workflow_dispatch: jobs: close-stale: runs-on: ubuntu-latest steps: - name: Close stale issues uses: actions/stale@v8 with: repo-token: ${{ secrets.GITHUB_TOKEN }} days-before-stale: 90 days-before-close: 7 stale-issue-message: 'This issue has been inactive for over 3 months and will be automatically closed in 7 days. If this issue is still relevant, please reply to this message.' close-issue-message: 'This issue has been automatically closed due to inactivity. If needed, it can be reopened.' stale-issue-label: 'stale' exempt-all-issue-labels: true ================================================ FILE: .github/workflows/lint.yaml ================================================ name: Lint test on: [push, pull_request] concurrency: group: ${{ github.workflow }}-${{ github.ref }} cancel-in-progress: true jobs: lint: runs-on: ubuntu-latest steps: - uses: actions/checkout@v2 - name: Set up Python 3.10 uses: actions/setup-python@v2 with: python-version: '3.10' - name: Install pre-commit hook run: | pip install pre-commit - name: Linting run: pre-commit run --all-files ================================================ FILE: .github/workflows/publish.yaml ================================================ name: release on: push: tags: - 'v**' concurrency: group: ${{ github.workflow }}-${{ github.ref }}-publish cancel-in-progress: true jobs: build-n-publish: runs-on: ubuntu-22.04 #if: startsWith(github.event.ref, 'refs/tags') steps: - uses: actions/checkout@v2 - name: Set up Python 3.10 
uses: actions/setup-python@v2 with: python-version: '3.10' - name: Install wheel run: pip install wheel packaging setuptools==69.5.1 - name: Build ModelScope Swift run: python setup.py sdist bdist_wheel - name: Publish package to PyPI run: | pip install twine twine upload dist/* --skip-existing -u __token__ -p ${{ secrets.PYPI_API_TOKEN }} ================================================ FILE: .gitignore ================================================ # Byte-compiled / optimized / DLL files tmp *.ttf __pycache__/ *.py[cod] *$py.class test.py # C extensions *.so # Distribution / packaging .Python build/ develop-eggs/ dist/ downloads/ eggs/ .eggs/ lib/ lib64/ parts/ sdist/ var/ wheels/ *.egg-info/ .installed.cfg *.egg /package /temp MANIFEST # PyInstaller # Usually these files are written by a python script from a template # before PyInstaller builds the exe, so as to inject date/other infos into it. *.manifest *.spec # Installer logs pip-log.txt pip-delete-this-directory.txt # Unit test / coverage reports htmlcov/ .tox/ .coverage .coverage.* .cache nosetests.xml coverage.xml *.cover .hypothesis/ .pytest_cache/ # Translations *.mo *.pot # Django stuff: *.log local_settings.py db.sqlite3 # Flask stuff: instance/ .webassets-cache # Scrapy stuff: .scrapy # Sphinx documentation docs/_build/ # PyBuilder target/ # Jupyter Notebook .ipynb_checkpoints # pyenv .python-version # celery beat schedule file celerybeat-schedule # SageMath parsed files *.sage.py # Environments .env .venv env/ venv/ ENV/ env.bak/ venv.bak/ # Spyder project settings .spyderproject .spyproject # Rope project settings .ropeproject # mkdocs documentation /site # mypy .mypy_cache/ .vscode .idea .run # custom *.pkl *.pkl.json *.log.json *.whl *.tar.gz *.swp *.log *.tar.gz source.sh tensorboard.sh .DS_Store replace.sh result.png result.jpg result.mp4 output/ outputs/ wandb/ swanlog/ *.out benchmarks/ eval_output/ eval_outputs/ vlmeval/ my_model/ /data result/ images /custom/ megatron_output/ /*-mcore/ 
/*-hf/ /*_cached_dataset/ /sample_output/ # Pytorch *.pth *.pt # ast template ast_index_file.py ================================================ FILE: .pre-commit-config.yaml ================================================ repos: - repo: https://github.com/pycqa/flake8.git rev: 7.3.0 hooks: - id: flake8 - repo: https://github.com/PyCQA/isort.git rev: 8.0.0 hooks: - id: isort - repo: https://github.com/pre-commit/mirrors-yapf.git rev: v0.32.0 hooks: - id: yapf - repo: https://github.com/pre-commit/pre-commit-hooks.git rev: v6.0.0 hooks: - id: trailing-whitespace - id: check-yaml - id: end-of-file-fixer - id: requirements-txt-fixer - id: double-quote-string-fixer - id: check-merge-conflict - id: mixed-line-ending args: ["--fix=lf"] ================================================ FILE: .pre-commit-config_local.yaml ================================================ repos: - repo: /home/admin/pre-commit/flake8 rev: 7.3.0 hooks: - id: flake8 - repo: /home/admin/pre-commit/isort rev: 8.0.0 hooks: - id: isort - repo: /home/admin/pre-commit/mirrors-yapf rev: v0.32.0 hooks: - id: yapf - repo: /home/admin/pre-commit/pre-commit-hooks rev: v6.0.0 hooks: - id: trailing-whitespace - id: check-yaml - id: end-of-file-fixer - id: requirements-txt-fixer - id: double-quote-string-fixer - id: check-merge-conflict - id: mixed-line-ending args: ["--fix=lf"] ================================================ FILE: CODE_OF_CONDUCT.md ================================================ # Contributor Covenant Code of Conduct ## Our Pledge We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation. 
We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. ## Our Standards Examples of behavior that contributes to a positive environment for our community include: * Demonstrating empathy and kindness toward other people * Being respectful of differing opinions, viewpoints, and experiences * Giving and gracefully accepting constructive feedback * Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience * Focusing on what is best not just for us as individuals, but for the overall community Examples of unacceptable behavior include: * The use of sexualized language or imagery, and sexual attention or advances of any kind * Trolling, insulting or derogatory comments, and personal or political attacks * Public or private harassment * Publishing others' private information, such as a physical or email address, without their explicit permission * Other conduct which could reasonably be considered inappropriate in a professional setting ## Enforcement Responsibilities Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. ## Scope This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. 
## Enforcement Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at contact@modelscope.cn. All complaints will be reviewed and investigated promptly and fairly. All community leaders are obligated to respect the privacy and security of the reporter of any incident. ## Enforcement Guidelines Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: ### 1. Correction **Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. **Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. ### 2. Warning **Community Impact**: A violation through a single incident or series of actions. **Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. ### 3. Temporary Ban **Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. **Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. ### 4. 
Permanent Ban **Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. **Consequence**: A permanent ban from any sort of public interaction within the community. ## Attribution This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 2.1, available at [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1]. Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder][Mozilla CoC]. For answers to common questions about this code of conduct, see the FAQ at [https://www.contributor-covenant.org/faq][FAQ]. Translations are available at [https://www.contributor-covenant.org/translations][translations]. [homepage]: https://www.contributor-covenant.org [v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html [Mozilla CoC]: https://github.com/mozilla/diversity [FAQ]: https://www.contributor-covenant.org/faq [translations]: https://www.contributor-covenant.org/translations ================================================ FILE: CONTRIBUTING.md ================================================ # Contributor Guide _Welcome to offer PRs, bug reports, documentation supplements or other types of contributions to SWIFT!_ ## Table of Contents - [Code of Conduct](#-code-of-conduct) - [Contribution Process](#-contribution-process) - [Hardware support](#-Hardware-support) ## 📖 Code of Conduct Please refer to our [Code of Conduct documentation](./CODE_OF_CONDUCT.md). ## 🔁 Contribution Process ### What We Need - New Technologies and New Models: SWIFT needs to support more open-source models and datasets, or new technologies that we have not paid attention to. If you are interested please submit a PR to us. 
- Technical Evangelism: If you are interested in technical evangelism, you are welcome to help us write tutorials, documents, or videos on any website and send us the link.
- Community Contribution: You can write technical articles related to SWIFT and submit them to us. After review and approval, we will publish them on the official ModelScope accounts (Zhihu, WeChat, etc.), with your name credited.

### Incentives

- We will issue electronic certificates to contributors on behalf of the ModelScope community to recognize your selfless contributions.
- We will offer small souvenirs related to the ModelScope community.
- We will provide free A10 computing power during the development period. For more details, please refer to the [Hardware support](#-Hardware-support) section.

### Submitting PR (Pull Requests)

All feature development follows the fork-then-PR workflow on GitHub.

1. Fork: Go to the [ms-swift](https://github.com/modelscope/ms-swift) page and click the **Fork button**. This creates a copy of the SWIFT code repository under your personal account.
2. Clone: Clone the repository created in the first step to your local machine and **create a new branch** for development. During development, click the **Sync Fork button** regularly to stay in sync with the `main` branch and avoid stale code and conflicts.
3. Submit PR: After development and testing, push the code to the remote branch. On GitHub, go to the **Pull Requests page**, create a new PR, and select your code branch as the source branch and the `modelscope/ms-swift:main` branch as the target branch.
4. Write Description: Provide a clear feature description in the PR so that reviewers understand what you changed.
5. Review: We want merged code to be concise and efficient, so we may raise questions and discuss them with you. Please note that any issues raised in the review are aimed at the code itself, not at you personally.
Once all issues are discussed and resolved, your code will be approved.

### Code Standards and Development Approach

SWIFT follows established naming conventions and development practices. Please follow them as much as possible during development.

1. Variable names use underscores to separate words, and class names capitalize the first letter of each word.
2. All Python indentation uses four spaces instead of tabs.
3. Choose well-known open-source libraries, avoid closed-source or unstable open-source libraries, and avoid duplicating existing code.

After the PR is submitted, SWIFT will run two types of tests:

- Code Lint Test: A static code compliance check. Please make sure that you have run the lint checks locally in advance.

```shell
pip install pre-commit
# In the swift folder
pre-commit run --all-files
# Fix the errors reported by pre-commit until all checks are successful
```

- CI Tests: Smoke tests and unit tests; please refer to the next section.

### Running CI Tests

Before submitting the PR, please ensure that your code is covered by test cases, such as smoke tests for new features or unit tests for edge cases. Reviewers will also pay attention to this during code review. In addition, a dedicated service runs the CI tests, executing all test cases, and the code can only be merged after they pass.

## ✅ Hardware support

SWIFT provides hardware support for developers, including free GPUs. If needed, please email us ([contact@modelscope.cn](mailto:contact@modelscope.cn)) or join our WeChat group:

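The "Running CI Tests" section above asks that new code be covered by test cases. As a minimal sketch of what such a unit test could look like, following the naming and indentation rules listed earlier, consider the example below. The helper `split_changed_files` is hypothetical and not part of ms-swift; it only assumes a `#`-separated list of paths, similar to how `.dev_scripts/dockerci.sh` appears to encode `PR_CHANGED_FILES` by replacing spaces with `#`.

```python
import unittest


def split_changed_files(encoded):
    """Split a '#'-joined string of paths into a list, dropping empty entries.

    Hypothetical helper used only to illustrate a small, testable unit.
    """
    return [path for path in encoded.split('#') if path]


class TestSplitChangedFiles(unittest.TestCase):
    """Unit tests covering the normal case and two edge cases."""

    def test_multiple_paths(self):
        self.assertEqual(
            split_changed_files('swift/a.py#tests/test_a.py'),
            ['swift/a.py', 'tests/test_a.py'])

    def test_empty_input(self):
        # An empty variable should yield no paths rather than [''].
        self.assertEqual(split_changed_files(''), [])

    def test_repeated_separators(self):
        # Consecutive separators must not produce empty path entries.
        self.assertEqual(split_changed_files('a.py##b.py#'), ['a.py', 'b.py'])
```

A file like this can be run locally with `python -m unittest <path-to-test-file>` before opening the PR. Keeping the helper free of side effects is one way to make edge cases cheap to cover.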
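================================================
FILE: CONTRIBUTING_CN.md
================================================
# Contributor Guide

*You are welcome to contribute feature PRs, bug reports, documentation supplements, or other types of contributions to SWIFT!*

## Table of Contents

- [Code of Conduct](#-code-of-conduct)
- [Contribution Process](#-contribution-process)
- [Resource Support](#-resource-support)

## 📖 Code of Conduct

Please refer to our [Code of Conduct documentation](./CODE_OF_CONDUCT.md).

## 🔁 Contribution Process

### What We Need

- New technologies and new models: SWIFT needs to support more open-source models and datasets, or new technologies that we have not yet paid attention to. If you are interested, you can submit a PR to us.
- Technical evangelism: If you are interested in technical evangelism, you are welcome to help us write tutorials, documents, or videos on any website and send us the link.
- Community contribution: You can write technical articles related to SWIFT and submit them to us. After review and approval, we will publish them on the official ModelScope accounts (Zhihu, WeChat official account, etc.), credited with your name.

### Incentives

- We will issue electronic certificates to contributors on behalf of the ModelScope community to encourage your selfless contributions.
- We will give away small ModelScope community souvenirs.
- We will provide free A10 computing power during the development period. For details, see the [Resource Support](#-resource-support) section.

### Submitting PR (Pull Requests)

All feature development is done on GitHub by forking first and then opening a PR.

1. Fork: Go to the [ms-swift](https://github.com/modelscope/ms-swift) page and click the **Fork button**. When it completes, a copy of the SWIFT code repository is created under your personal account.
2. Clone: Clone the repository created in the first step to your local machine and **create a new branch** for development. During development, click the **Sync Fork button** regularly to sync with the `main` branch and avoid stale code and conflicts.
3. Submit PR: After development and testing, push the code to the remote branch. On GitHub, open the **Pull Requests page**, create a new PR, select your code branch as the source branch, and select the `modelscope/ms-swift:main` branch as the target branch.
4. Write description: A good feature description in the PR is necessary so that reviewers know what you changed.
5. Review: We want merged code to be concise and efficient, so we may raise some questions and discuss them. Please note that any issues raised in the review are aimed at the code itself, not at you personally. Once all issues have been discussed and resolved, your code will be approved.

### Code Standards and Development Approach

SWIFT follows established naming conventions and development practices. Please follow them as much as possible during development.

1. Variable names use underscores to separate words, and class names capitalize the first letter of each word.
2. All Python indentation uses four spaces instead of a tab.
3. Choose well-known open-source libraries, avoid closed-source or unstable open-source libraries, and avoid reinventing the wheel.

After the PR is submitted, SWIFT will run two types of tests:

- Code Lint test: A static code compliance check. To make sure this test passes, please run code lint locally in advance, as follows:

```shell
pip install pre-commit
# In the swift folder
pre-commit run --all-files
# Fix the errors reported by pre-commit until all checks are successful
```

- CI Tests: Smoke tests and unit tests; please refer to the next section.

### Running CI Tests

Before submitting the PR, please ensure that your code is covered by test cases, such as smoke tests for new features or unit tests for edge cases. Reviewers will also pay attention to this during code review. In addition, a dedicated service runs the CI tests, executing all test cases, and the code can only be merged after they pass.

## ✅ Resource Support

SWIFT provides resource support for developers, including free GPU computing power. If needed, please email us ([contact@modelscope.cn](mailto:contact@modelscope.cn)) or join our WeChat group: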

================================================ FILE: LICENSE ================================================ Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. 
For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the 
Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. 
Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. 
To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright [yyyy] [name of copyright owner] Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. ================================================ FILE: MANIFEST.in ================================================ recursive-include requirements *.txt ================================================ FILE: Makefile ================================================ WHL_BUILD_DIR :=package DOC_BUILD_DIR :=docs/build/ # default rule default: whl docs .PHONY: docs docs: bash .dev_scripts/build_docs.sh .PHONY: linter linter: bash .dev_scripts/linter.sh .PHONY: test test: bash .dev_scripts/citest.sh .PHONY: whl whl: python setup.py sdist bdist_wheel .PHONY: clean clean: rm -rf $(WHL_BUILD_DIR) $(DOC_BUILD_DIR) ================================================ FILE: README.md ================================================ # SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning)



ModelScope Community Website
中文   |   English  


Paper   | English Documentation   |   中文文档  

## 📖 Table of Contents
- [Groups](#-Groups)
- [Introduction](#-introduction)
- [News](#-news)
- [Installation](#%EF%B8%8F-installation)
- [Quick Start](#-quick-Start)
- [Usage](#-Usage)
- [License](#-License)
- [Citation](#-citation)

## ☎ Groups

You can reach us by joining one of our groups:

[Discord Group](https://discord.com/invite/D27yfEFVz5) | WeChat Group
:-------------------------:|:-------------------------:

## 📝 Introduction
🍲 **ms-swift** is a framework provided by the ModelScope community for fine-tuning and deploying large models and multimodal large models. It now supports training (pre-training, fine-tuning, human alignment), inference, evaluation, quantization, and deployment for 600+ text-only large models and 400+ multimodal large models. Large models include: Qwen3, Qwen3.5, InternLM3, GLM4.5, Mistral, DeepSeek-R1, Llama4, etc. Multimodal large models include: Qwen3-VL, Qwen3-Omni, Llava, InternVL3.5, MiniCPM-V-4, Ovis2.5, GLM4.5-V, DeepSeek-VL2, etc.

🍔 In addition, ms-swift integrates the latest training technologies, including Megatron parallelism techniques such as TP, PP, CP, and EP to accelerate training, and many reinforcement learning algorithms from the GRPO family, including GRPO, DAPO, GSPO, SAPO, CISPO, RLOO, and Reinforce++, to enhance model intelligence. ms-swift supports a wide range of training tasks, including preference learning algorithms such as DPO, KTO, RM, CPO, SimPO, and ORPO, as well as Embedding, Reranker, and sequence classification tasks. ms-swift provides full-pipeline support for large model training, including acceleration of the inference, evaluation, and deployment modules with vLLM, SGLang, and LMDeploy, as well as model quantization with GPTQ, AWQ, BNB, and FP8.
**Why Choose ms-swift?**
- 🍎 **Model Types**: Supports **600+ text-only large models**, **400+ multimodal large models**, and All-to-All full-modality models across the full pipeline from training to deployment, with Day-0 support for popular models.
- **Dataset Types**: Includes 150+ built-in datasets for pre-training, fine-tuning, human alignment, multimodal, and various other tasks, and supports custom datasets. Users only need to prepare a dataset to start one-click training.
- **Hardware Support**: Supports A10/A100/H100, RTX series, T4/V100, CPU, MPS, and Chinese domestic hardware such as Ascend NPU.
- **Lightweight Training**: Supports lightweight fine-tuning methods such as LoRA, QLoRA, DoRA, LoRA+, LLaMAPro, LongLoRA, LoRA-GA, ReFT, RS-LoRA, Adapter, LISA, etc.
- **Quantized Training**: Supports training on BNB, AWQ, GPTQ, AQLM, HQQ, and EETQ quantized models; a 7B model requires only 9GB of training resources.
- **Memory Optimization**: Supports GaLore, Q-GaLore, Unsloth, Liger-Kernel, Flash-Attention 2/3, and **Ulysses and Ring-Attention sequence parallelism techniques**, reducing memory consumption for long-text training.
- **Distributed Training**: Supports distributed data parallelism (DDP), simple model parallelism via device_map, DeepSpeed ZeRO2/ZeRO3, FSDP/FSDP2, and Megatron distributed training technologies.
- 🍓 **Multimodal Training**: Supports multimodal packing to improve training speed by 100%+, mixed-modality training with text, images, video, and audio, and independent control of vit/aligner/llm.
- **Agent Training**: Supports Agent templates, allowing one dataset to be used for training different models.
- 🍊 **Training Tasks**: Supports pre-training and instruction fine-tuning, training tasks such as DPO, GKD, KTO, RM, CPO, SimPO, and ORPO, and **Embedding/Reranker** and sequence classification tasks.
- 🥥 **Megatron Parallelism**: Provides TP/PP/SP/CP/ETP/EP/VPP parallel strategies to significantly boost **MoE model training speed**. Supports full-parameter and LoRA training methods for 300+ pure text large models and 100+ multimodal large models. Supports CPT/SFT/GRPO/DPO/KTO/RM training tasks. - 🍉 **Reinforcement Learning**: Built-in **rich GRPO family algorithms**, including GRPO, DAPO, GSPO, SAPO, CISPO, CHORD, RLOO, Reinforce++, etc. Supports synchronous and asynchronous vLLM engine inference acceleration, with extensible reward functions, multi-turn inference Schedulers, and environments through plugins. - **Full-Pipeline Capabilities**: Covers the entire workflow of training, inference, evaluation, quantization, and deployment. - **UI Training**: Provides Web-UI interface for training, inference, evaluation, and quantization, completing the full pipeline for large models. - **Inference Acceleration**: Supports Transformers, vLLM, SGLang, and LmDeploy inference acceleration engines, providing OpenAI interfaces for accelerating inference, deployment, and evaluation modules. - **Model Evaluation**: Uses EvalScope as the evaluation backend, supporting 100+ evaluation datasets for evaluating text-only and multimodal models. - **Model Quantization**: Supports quantization export for AWQ, GPTQ, FP8, and BNB. Exported models support inference acceleration using vLLM/SGLang/LmDeploy. ## 🎉 News - 🎁 2026.03.03: **ms-swift v4.0** major version is officially released. For release notes, please refer to [here](https://github.com/modelscope/ms-swift/releases/tag/v4.0.0). You can provide your suggestions to us in [this issue](https://github.com/modelscope/ms-swift/issues/7250). Thank you for your support. - 🎁 2025.11.14: Megatron GRPO is now available! Check out the [docs](./docs/source_en/Megatron-SWIFT/GRPO.md) and [examples](examples/megatron/grpo). 
- 🎁 2025.11.04: Support for [Mcore-Bridge](docs/source_en/Megatron-SWIFT/Mcore-Bridge.md), making Megatron training as simple and easy to use as transformers.
- 🎁 2025.10.28: Ray is supported. See the documentation [here](docs/source_en/Instruction/Ray.md).
- 🎁 2025.09.07: Added support for the CHORD training algorithm. See the [documentation](./docs/source_en/Instruction/GRPO/AdvancedResearch/CHORD.md).
- 🎁 2025.09.06: Ulysses can now be used with ring-attention, allowing sequences to be sharded into any number of chunks (no longer limited by the number of heads). The argument remains `--sequence_parallel_size N`.
- 🎁 2025.09.02: Megatron-SWIFT now supports multimodal model training. Documentation can be found [here](./docs/source_en/Megatron-SWIFT/Multimodal-Model.md).
- 🎁 2025.08.12: Support for [Dynamic Fine-Tuning](https://arxiv.org/abs/2508.05629) (DFT) in SFT training; use the parameter `--enable_dft_loss true`. Training scripts can be found [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/full/dft.sh).
- 🎁 2025.07.09: Megatron-SWIFT supports LoRA training. Compared to ms-swift, it achieves significant speedup on MoE models. Training scripts can be found [here](https://github.com/modelscope/ms-swift/blob/main/examples/megatron/lora).
- 🎁 2025.06.23: Fine-tuning of reranker models is supported. Training scripts can be found here: [Reranker](https://github.com/modelscope/ms-swift/blob/main/examples/train/reranker/train_reranker.sh).
- 🎁 2025.06.15: Support for GKD training on both pure text large models and multimodal models. Training scripts can be found here: [Pure Text](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd), [Multimodal](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd).
More
- 🎁 2025.06.11: Support for using Megatron parallelism techniques for RLHF training. The training script can be found [here](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf).
- 🎁 2025.05.29: Support for sequence parallelism in pretrain, sft, dpo and grpo; see the scripts [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/sequence_parallel).
- 🎁 2025.05.11: GRPO now supports custom processing logic for reward models. See the GenRM example [here](./docs/source_en/Instruction/GRPO/DeveloperGuide/reward_model.md).
- 🎁 2025.04.15: The ms-swift paper has been accepted by AAAI 2025. You can find the paper at [this link](https://ojs.aaai.org/index.php/AAAI/article/view/35383).
- 🎁 2025.03.23: Multi-round GRPO is now supported for training multi-turn dialogue scenarios (e.g., agent tool calling). Please refer to the [doc](./docs/source_en/Instruction/GRPO/DeveloperGuide/multi_turn.md).
- 🎁 2025.03.16: Support for Megatron's parallel training techniques is now available. Please see the [Megatron-SWIFT training documentation](https://swift.readthedocs.io/en/latest/Megatron-SWIFT/Quick-start.html).
- 🎁 2025.03.15: Fine-tuning of embedding models for both pure text and multimodal models is supported. Please check the [training script](examples/train/embedding).
- 🎁 2025.03.05: The hybrid mode for GRPO is supported, with a script for training a 72B model on 4 GPUs (4*80G) available [here](examples/train/grpo/internal/vllm_72b_4gpu.sh). Tensor parallelism with vllm is also supported, with the training script available [here](examples/train/grpo/internal).
- 🎁 2025.02.21: The GRPO algorithm now supports LMDeploy, with the training script available [here](examples/train/grpo/internal/full_lmdeploy.sh). Additionally, the performance of the GRPO algorithm has been tested, achieving a training speed increase of up to 300% using various tricks. Please check the WandB table [here](https://wandb.ai/tastelikefeet/grpo_perf_test?nw=nwuseryuzezyz).
- 🎁 2025.02.21: The `swift sample` command is now supported. The reinforcement fine-tuning script can be found [here](docs/source_en/Instruction/Reinforced-Fine-tuning.md), and the large model API distillation sampling script is available [here](examples/sampler/distill/distill.sh). - 🔥 2025.02.12: Support for the GRPO (Group Relative Policy Optimization) training algorithm has been added. Documentation is available [here](docs/source_en/Instruction/GRPO/GetStarted/GRPO.md). - 🎁 2024.12.04: Major update to **ms-swift 3.0**. Please refer to the [release notes and changes](docs/source_en/Instruction/ReleaseNote3.0.md). - 🎉 2024.08.12: The ms-swift paper has been published on arXiv and can be read [here](https://arxiv.org/abs/2408.05517). - 🔥 2024.08.05: Support for using [evalscope](https://github.com/modelscope/evalscope/) as a backend for evaluating large models and multimodal models. - 🔥 2024.07.29: Support for using [vllm](https://github.com/vllm-project/vllm) and [lmdeploy](https://github.com/InternLM/lmdeploy) to accelerate inference for large models and multimodal models. When performing infer/deploy/eval, you can specify `--infer_backend vllm/lmdeploy`. - 🔥 2024.07.24: Support for human preference alignment training for multimodal large models, including DPO/ORPO/SimPO/CPO/KTO/RM/PPO. - 🔥 2024.02.01: Support for Agent training! The training algorithm is derived from [this paper](https://arxiv.org/pdf/2309.00986.pdf).
## 🛠️ Installation
To install using pip:
```shell
pip install ms-swift -U

# Using uv
pip install uv
uv pip install ms-swift -U --torch-backend=auto
```

To install from source:
```shell
# pip install git+https://github.com/modelscope/ms-swift.git

git clone https://github.com/modelscope/ms-swift.git
cd ms-swift

# The main branch is for swift 4.x. To install swift 3.x, please run the following command:
# git checkout release/3.12
pip install -e .

# Using uv
uv pip install -e . --torch-backend=auto
```

Running Environment:

| Dependency   | Range        | Recommended         | Notes                                     |
|--------------|--------------|---------------------|-------------------------------------------|
| python       | >=3.9        | 3.11/3.12           |                                           |
| cuda         |              | cuda12              | No need to install if using CPU, NPU, MPS |
| torch        | >=2.0        | 2.8.0/2.10.0        |                                           |
| transformers | >=4.33       | 4.57.6/5.2.0        |                                           |
| modelscope   | >=1.23       |                     |                                           |
| peft         | >=0.11,<0.19 |                     |                                           |
| flash_attn   |              | 2.8.3/3.0.0b1       |                                           |
| trl          | >=0.15,<0.29 | 0.28.0              | RLHF                                      |
| deepspeed    | >=0.14       | 0.18.8              | Training                                  |
| vllm         | >=0.5.1      | 0.11.0/0.17.1       | Inference/Deployment                      |
| sglang       | >=0.4.6      |                     | Inference/Deployment                      |
| lmdeploy     | >=0.5        | 0.10.1              | Inference/Deployment                      |
| evalscope    | >=1.0        |                     | Evaluation                                |
| gradio       |              | 5.32.1              | Web-UI/App                                |

For more optional dependencies, you can refer to [here](https://github.com/modelscope/ms-swift/blob/main/requirements/install_all.sh).
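The version ranges in the Running Environment table can be checked against the current environment before training. The snippet below is a minimal sketch, not part of ms-swift: the helper names (`parse_version`, `in_range`, `check_environment`) are hypothetical, the ranges are copied from the table, and the dictionary keys are assumed to match the pip distribution names. It compares only the leading numeric part of each version, so pre-release suffixes such as `rc1` are ignored.

```python
import re
from importlib import metadata

# (minimum inclusive, maximum exclusive or None), copied from the table above.
# Assumption: each key is also the pip distribution name.
REQUIREMENTS = {
    "torch": ("2.0", None),
    "transformers": ("4.33", None),
    "modelscope": ("1.23", None),
    "peft": ("0.11", "0.19"),
    "trl": ("0.15", "0.29"),
    "deepspeed": ("0.14", None),
    "vllm": ("0.5.1", None),
    "sglang": ("0.4.6", None),
    "lmdeploy": ("0.5", None),
    "evalscope": ("1.0", None),
}


def parse_version(text):
    """Leading numeric components, zero-padded to four: '2.8.0+cu128' -> (2, 8, 0, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    parts = tuple(int(p) for p in match.group().split(".")) if match else ()
    return parts + (0,) * max(0, 4 - len(parts))


def in_range(installed, minimum, maximum=None):
    """True if minimum <= installed < maximum (no upper bound when maximum is None)."""
    version = parse_version(installed)
    if version < parse_version(minimum):
        return False
    return maximum is None or version < parse_version(maximum)


def check_environment(requirements=REQUIREMENTS):
    """Map each dependency to a short status string for the current environment."""
    report = {}
    for name, (low, high) in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "ok" if in_range(installed, low, high) else "outside supported range"
        report[name] = f"{installed} ({status})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name:<14}{status}")
```

Most of these packages are optional (for example, vllm, sglang, and lmdeploy are only needed for inference and deployment), so a "not installed" entry is not necessarily a problem.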
## 🚀 Quick Start 10 minutes of self-cognition fine-tuning of Qwen3-4B-Instruct-2507 on a single 3090 GPU: ### Command Line Interface (Recommended) ```shell # 13GB CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model Qwen/Qwen3-4B-Instruct-2507 \ --tuner_type lora \ --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \ 'AI-ModelScope/alpaca-gpt4-data-en#500' \ 'swift/self-cognition#500' \ --torch_dtype bfloat16 \ --num_train_epochs 1 \ --per_device_train_batch_size 1 \ --per_device_eval_batch_size 1 \ --learning_rate 1e-4 \ --lora_rank 8 \ --lora_alpha 32 \ --target_modules all-linear \ --gradient_accumulation_steps 16 \ --eval_steps 50 \ --save_steps 50 \ --save_total_limit 2 \ --logging_steps 5 \ --max_length 2048 \ --output_dir output \ --warmup_ratio 0.05 \ --dataloader_num_workers 4 \ --model_author swift \ --model_name swift-robot ``` Tips: - If you want to train with a custom dataset, you can refer to [this guide](https://swift.readthedocs.io/en/latest/Customization/Custom-dataset.html) to organize your dataset format and specify `--dataset `. - The `--model_author` and `--model_name` parameters are only effective when the dataset includes `swift/self-cognition`. - To train with a different model, simply modify `--model `. - By default, **ModelScope** is used for downloading models and datasets. If you want to use HuggingFace, simply specify `--use_hf true`. After training is complete, use the following command to infer with the trained weights: - Here, `--adapters` should be replaced with the last checkpoint folder generated during training. Since the adapters folder contains the training parameter file `args.json`, there is no need to specify `--model`, `--system` separately; Swift will automatically read these parameters. To disable this behavior, you can set `--load_args false`. ```shell # Using an interactive command line for inference. 
CUDA_VISIBLE_DEVICES=0 \ swift infer \ --adapters output/vx-xxx/checkpoint-xxx \ --stream true \ --temperature 0 \ --max_new_tokens 2048 # merge-lora and use vLLM for inference acceleration CUDA_VISIBLE_DEVICES=0 \ swift infer \ --adapters output/vx-xxx/checkpoint-xxx \ --stream true \ --merge_lora true \ --infer_backend vllm \ --vllm_max_model_len 8192 \ --temperature 0 \ --max_new_tokens 2048 ``` Finally, use the following command to push the model to ModelScope: ```shell CUDA_VISIBLE_DEVICES=0 \ swift export \ --adapters output/vx-xxx/checkpoint-xxx \ --push_to_hub true \ --hub_model_id '' \ --hub_token '' \ --use_hf false ``` ### Web-UI The Web-UI is a **zero-threshold** training and deployment interface solution based on Gradio interface technology. For more details, you can check [here](https://swift.readthedocs.io/en/latest/GetStarted/Web-UI.html). ```shell SWIFT_UI_LANG=en swift web-ui ``` ![image.png](./docs/resources/web-ui-en.jpg) ### Using Python ms-swift also supports training and inference using Python. Below is pseudocode for training and inference. For more details, you can refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/notebook/qwen2_5-self-cognition/self-cognition-sft.ipynb). Training: ```python from peft import LoraConfig, get_peft_model from swift import get_model_processor, get_template, load_dataset, EncodePreprocessor from swift.trainers import Seq2SeqTrainer, Seq2SeqTrainingArguments # Retrieve the model and template, and add a trainable LoRA module model, tokenizer = get_model_processor(model_id_or_path, ...) template = get_template(tokenizer, ...) lora_config = LoraConfig(...) model = get_peft_model(model, lora_config) # Download and load the dataset, and encode the text into tokens train_dataset, val_dataset = load_dataset(dataset_id_or_path, ...) 
train_dataset = EncodePreprocessor(template=template)(train_dataset, num_proc=num_proc) val_dataset = EncodePreprocessor(template=template)(val_dataset, num_proc=num_proc) # Train the model training_args = Seq2SeqTrainingArguments(...) trainer = Seq2SeqTrainer( model=model, args=training_args, template=template, train_dataset=train_dataset, eval_dataset=val_dataset, ) trainer.train() ``` Inference: ```python from swift import TransformersEngine, InferRequest, RequestConfig # Perform inference using the native Transformers engine engine = TransformersEngine(model_id_or_path, adapters=[lora_checkpoint]) infer_request = InferRequest(messages=[{'role': 'user', 'content': 'who are you?'}]) request_config = RequestConfig(max_tokens=max_new_tokens, temperature=temperature) resp_list = engine.infer([infer_request], request_config) print(f'response: {resp_list[0].choices[0].message.content}') ``` ## ✨ Usage Here is a minimal example of training to deployment using ms-swift. For more details, you can check the [examples](https://github.com/modelscope/ms-swift/tree/main/examples). - If you want to use other models or datasets (including multimodal models and datasets), you only need to modify `--model` to specify the corresponding model's ID or path, and modify `--dataset` to specify the corresponding dataset's ID or path. - By default, ModelScope is used for downloading models and datasets. If you want to use HuggingFace, simply specify `--use_hf true`. 
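The two notes above amount to swapping a few command-line values. As an illustrative sketch only, the hypothetical helper `build_sft_command` below (not part of ms-swift) assembles a `swift sft` invocation from a model ID, a list of dataset IDs, and an optional hub switch. It uses only flags that appear in this README (`--model`, `--dataset`, `--tuner_type`, `--output_dir`, `--use_hf`) and prints the command instead of running it.

```python
import shlex


def build_sft_command(model, datasets, use_hf=False, tuner_type="lora", output_dir="output"):
    """Assemble a `swift sft` argument list; hypothetical helper for illustration."""
    command = [
        "swift", "sft",
        "--model", model,
        "--dataset", *datasets,
        "--tuner_type", tuner_type,
        "--output_dir", output_dir,
    ]
    if use_hf:
        # Download from HuggingFace instead of the default ModelScope hub.
        command += ["--use_hf", "true"]
    return command


cmd = build_sft_command(
    "Qwen/Qwen2.5-7B-Instruct",
    ["AI-ModelScope/alpaca-gpt4-data-en#500"],
)
# Print a shell-safe version of the command for inspection.
print(shlex.join(cmd))
```

To launch the job, the list can be passed to `subprocess.run(cmd, check=True)`. When `use_hf=True`, the model and dataset IDs need to exist on HuggingFace rather than ModelScope.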
| Useful Links | | ------ | | [🔥Command Line Parameters](https://swift.readthedocs.io/en/latest/Instruction/Command-line-parameters.html) | | [Megatron-SWIFT](https://swift.readthedocs.io/en/latest/Megatron-SWIFT/Quick-start.html) | | [GRPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/GetStarted/GRPO.html) | | [Supported Models and Datasets](https://swift.readthedocs.io/en/latest/Instruction/Supported-models-and-datasets.html) | | [Custom Models](https://swift.readthedocs.io/en/latest/Customization/Custom-model.html), [🔥Custom Datasets](https://swift.readthedocs.io/en/latest/Customization/Custom-dataset.html) | | [LLM Tutorial](https://github.com/modelscope/modelscope-classroom/tree/main/LLM-tutorial) | ### Training Supported Training Methods: | Method | Full-Parameter | LoRA | QLoRA | Deepspeed | Multi-Machine | Multimodal | | ------------------------------------------------------------ | ------------------------------------------------------------ | ---- | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | ------------------------------------------------------------ | | [Pre-training](https://github.com/modelscope/ms-swift/blob/main/examples/train/pretrain) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Supervised Fine-Tuning](https://github.com/modelscope/ms-swift/blob/main/examples/train/lora_sft.sh) | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/full/train.sh) | ✅ | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/qlora) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multi-gpu/deepspeed) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multi-node) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal) | | [GRPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/grpo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | 
[GKD](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd) | | [PPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/ppo) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | | [DPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/dpo) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/dpo) | | [KTO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/kto.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/kto.sh) | | [Reward Model](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/rm.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [CPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/cpo.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [SimPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/simpo.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [ORPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Embedding](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Reranker](https://github.com/modelscope/ms-swift/tree/main/examples/train/reranker) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Sequence Classification](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | Pre-training: ```shell # 8*A100 NPROC_PER_NODE=8 \ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \ swift pt \ --model Qwen/Qwen2.5-7B \ --dataset swift/chinese-c4 \ --streaming true \ --tuner_type full \ --deepspeed zero2 \ --output_dir output \ --max_steps 10000 \ ... ``` Fine-tuning: ```shell CUDA_VISIBLE_DEVICES=0 swift sft \ --model Qwen/Qwen2.5-7B-Instruct \ --dataset AI-ModelScope/alpaca-gpt4-data-en \ --tuner_type lora \ --output_dir output \ ... 
``` RLHF: ```shell CUDA_VISIBLE_DEVICES=0 swift rlhf \ --rlhf_type dpo \ --model Qwen/Qwen2.5-7B-Instruct \ --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji \ --tuner_type lora \ --output_dir output \ ... ``` ### Megatron-SWIFT ms-swift supports using Megatron parallelism techniques to accelerate training, including large-scale cluster training and MoE model training. The following training methods are supported: | Method | Full-Parameter | LoRA | MoE | Multimodal | FP8 | | ---------------------- | -------------- | ---- | ---- | ---------- | ---- | | Pre-training | ✅ | ✅ | ✅ | ✅ | ✅ | | [Supervised Fine-Tuning](https://github.com/modelscope/ms-swift/tree/main/examples/megatron) | ✅ | ✅ | ✅ | ✅ | ✅ | | [GRPO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/grpo) | ✅ | ✅ | ✅ | ✅ | ✅ | | [GKD](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/gkd) | ✅ | ✅ | ✅ | ✅ | ✅ | | [DPO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/dpo) | ✅ | ✅ | ✅ | ✅ | ✅ | | [KTO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/kto) | ✅ | ✅ | ✅ | ✅ | ✅ | | [RM](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/rm) | ✅ | ✅ | ✅ | ✅ | ✅ | | [Embedding](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/embedding) | ✅ | ✅| ✅ | ✅ | ✅ | | [Reranker](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/reranker) | ✅ | ✅| ✅ | ✅ | ✅ | | [Sequence Classification](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/seq_cls) | ✅ | ✅ | ✅ | ✅ | ✅ | ```shell NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 megatron sft \ --model Qwen/Qwen2.5-7B-Instruct \ --save_safetensors true \ --dataset AI-ModelScope/alpaca-gpt4-data-zh \ --tuner_type lora \ --output_dir output \ ... 
``` ### Reinforcement Learning ms-swift supports a rich set of GRPO family algorithms: | Method | Full-Parameter | LoRA | Multimodal | Multi-Machine | | ------------------------------------------------------------ | -------------- | ---- | ---------- | ------------- | | [GRPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/GetStarted/GRPO.html) | ✅ | ✅ | ✅ | ✅ | | [DAPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/DAPO.html) | ✅ | ✅ | ✅ | ✅ | | [GSPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/GSPO.html) | ✅ | ✅ | ✅ | ✅ | | [SAPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/SAPO.html) | ✅ | ✅ | ✅ | ✅ | | [CISPO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/CISPO.html) | ✅ | ✅ | ✅ | ✅ | | [CHORD](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/CHORD.html) | ✅ | ✅ | ✅ | ✅ | | [RLOO](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/RLOO.html) | ✅ | ✅ | ✅ | ✅ | | [Reinforce++](https://swift.readthedocs.io/en/latest/Instruction/GRPO/AdvancedResearch/REINFORCEPP.html) | ✅ | ✅ | ✅ | ✅ | ```shell CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 \ swift rlhf \ --rlhf_type grpo \ --model Qwen/Qwen2.5-7B-Instruct \ --tuner_type lora \ --use_vllm true \ --vllm_mode colocate \ --dataset AI-MO/NuminaMath-TIR#10000 \ --output_dir output \ ... 
```

### Inference
```shell
CUDA_VISIBLE_DEVICES=0 swift infer \
    --model Qwen/Qwen2.5-7B-Instruct \
    --stream true \
    --infer_backend transformers \
    --max_new_tokens 2048

# LoRA
CUDA_VISIBLE_DEVICES=0 swift infer \
    --model Qwen/Qwen2.5-7B-Instruct \
    --adapters swift/test_lora \
    --stream true \
    --infer_backend transformers \
    --temperature 0 \
    --max_new_tokens 2048
```

### Interface Inference
```shell
CUDA_VISIBLE_DEVICES=0 swift app \
    --model Qwen/Qwen2.5-7B-Instruct \
    --stream true \
    --infer_backend transformers \
    --max_new_tokens 2048
```

### Deployment
```shell
CUDA_VISIBLE_DEVICES=0 swift deploy \
    --model Qwen/Qwen2.5-7B-Instruct \
    --infer_backend vllm
```

### Sampling
```shell
CUDA_VISIBLE_DEVICES=0 swift sample \
    --model LLM-Research/Meta-Llama-3.1-8B-Instruct \
    --sampler_engine transformers \
    --num_return_sequences 5 \
    --dataset AI-ModelScope/alpaca-gpt4-data-zh#5
```

### Evaluation
```shell
CUDA_VISIBLE_DEVICES=0 swift eval \
    --model Qwen/Qwen2.5-7B-Instruct \
    --infer_backend lmdeploy \
    --eval_backend OpenCompass \
    --eval_dataset ARC_c
```

### Quantization
```shell
CUDA_VISIBLE_DEVICES=0 swift export \
    --model Qwen/Qwen2.5-7B-Instruct \
    --quant_bits 4 --quant_method awq \
    --dataset AI-ModelScope/alpaca-gpt4-data-zh \
    --output_dir Qwen2.5-7B-Instruct-AWQ
```

### Push Model
```shell
swift export \
    --model \
    --push_to_hub true \
    --hub_model_id '' \
    --hub_token ''
```

## 🏛 License

This framework is licensed under the [Apache License (Version 2.0)](https://github.com/modelscope/ms-swift/blob/master/LICENSE). For models and datasets, please refer to the original resource page and follow the corresponding License.
## 📎 Citation ```bibtex @misc{zhao2024swiftascalablelightweightinfrastructure, title={SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning}, author={Yuze Zhao and Jintao Huang and Jinghan Hu and Xingjun Wang and Yunlin Mao and Daoze Zhang and Zeyinzi Jiang and Zhikai Wu and Baole Ai and Ang Wang and Wenmeng Zhou and Yingda Chen}, year={2024}, eprint={2408.05517}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2408.05517}, } ``` ## Star History [![Star History Chart](https://api.star-history.com/svg?repos=modelscope/ms-swift&type=Date)](https://star-history.com/#modelscope/ms-swift&Date) ================================================ FILE: README_CN.md ================================================ # SWIFT (Scalable lightWeight Infrastructure for Fine-Tuning)



ModelScope Community Website
Chinese | English


Paper | English Documentation | Chinese Documentation

## 📖 目录 - [用户群](#-用户群) - [简介](#-简介) - [新闻](#-新闻) - [安装](#%EF%B8%8F-安装) - [快速开始](#-快速开始) - [如何使用](#-如何使用) - [License](#-license) - [引用](#-引用) ## ☎ 用户群 请扫描下面的二维码来加入我们的交流群: [Discord Group](https://discord.com/invite/D27yfEFVz5) | 微信群 :-------------------------:|:-------------------------: | ## 📝 简介 🍲 **ms-swift**是魔搭社区提供的大模型与多模态大模型微调部署框架,现已支持600+纯文本大模型与400+多模态大模型的训练(预训练、微调、人类对齐)、推理、评测、量化与部署。其中大模型包括:Qwen3、Qwen3.5、InternLM3、GLM4.5、Mistral、DeepSeek-R1、Llama4等模型,多模态大模型包括:Qwen3-VL、Qwen3-Omni、Llava、InternVL3.5、MiniCPM-V-4、Ovis2.5、GLM4.5-V、DeepSeek-VL2等模型。 🍔 除此之外,ms-swift汇集了最新的训练技术,包括集成Megatron并行技术,包括TP、PP、CP、EP等为训练提供加速,以及众多GRPO算法族强化学习的算法,包括:GRPO、DAPO、GSPO、SAPO、CISPO、RLOO、Reinforce++等提升模型智能。ms-swift支持广泛的训练任务,包括DPO、KTO、RM、CPO、SimPO、ORPO等偏好学习算法,以及Embedding、Reranker、序列分类任务。ms-swift提供了大模型训练全链路的支持,包括使用vLLM、SGLang和LMDeploy对推理、评测、部署模块提供加速,以及使用GPTQ、AWQ、BNB、FP8技术对大模型进行量化。 **为什么选择ms-swift?** - 🍎 **模型类型**:支持**600+纯文本大模型**、**400+多模态大模型**以及All-to-All全模态模型训练到部署全流程,热门模型Day0支持。 - **数据集类型**:内置150+预训练、微调、人类对齐、多模态等各种任务数据集,并支持自定义数据集,用户只需准备数据集即可一键训练。 - **硬件支持**:支持A10/A100/H100、RTX系列、T4/V100、CPU、MPS以及国产硬件Ascend NPU等。 - **轻量训练**:支持了LoRA、QLoRA、DoRA、LoRA+、LLaMAPro、LongLoRA、LoRA-GA、ReFT、RS-LoRA、Adapter、LISA等轻量微调方式。 - **量化训练**:支持对BNB、AWQ、GPTQ、AQLM、HQQ、EETQ量化模型进行训练,7B模型训练只需9GB训练资源。 - **显存优化**: GaLore、Q-Galore、UnSloth、Liger-Kernel、Flash-Attention 2/3 以及 **Ulysses和Ring-Attention序列并行技术**支持,降低长文本训练显存占用。 - **分布式训练**:支持分布式数据并行(DDP)、device_map简易模型并行、DeepSpeed ZeRO2 ZeRO3、FSDP/FSDP2以及Megatron等分布式训练技术。 - 🍓 **多模态训练**:支持多模态packing技术提升训练速度100%+,支持文本、图像、视频和语音混合模态数据训练,支持vit/aligner/llm单独控制。 - **Agent训练**:支持Agent template,准备一套数据集可用于不同模型的训练。 - 🍊 **训练任务**:支持预训练和指令微调,以及DPO、GKD、KTO、RM、CPO、SimPO、ORPO等训练任务,支持**Embedding/Reranker**和序列分类任务。 - 🥥 **Megatron并行技术**:提供TP/PP/SP/CP/ETP/EP/VPP并行策略,显著提升**MoE模型训练速度**。支持300+纯文本大模型和100+多模态大模型的全参数和LoRA训练方法。支持CPT/SFT/GRPO/DPO/KTO/RM训练任务。 - 🍉 **强化学习**:内置**丰富GRPO族算法**,包括GRPO、DAPO、GSPO、SAPO、CISPO、CHORD、RLOO、Reinforce++等,支持同步和异步vLLM引擎推理加速,可使用插件拓展奖励函数、多轮推理调度器以及环境等。 - 
**全链路能力**:覆盖训练、推理、评测、量化和部署全流程。 - **界面训练**:提供使用Web-UI界面的方式进行训练、推理、评测、量化,完成大模型的全链路。 - **推理加速**:支持Transformers、vLLM、SGLang和LmDeploy推理加速引擎,并提供OpenAI接口,为推理、部署和评测模块提供加速。 - **模型评测**:以EvalScope作为评测后端,支持100+评测数据集对纯文本和多模态模型进行评测。 - **模型量化**:支持AWQ、GPTQ、FP8和BNB的量化导出,导出的模型支持使用vLLM/SGLang/LmDeploy推理加速。 ## 🎉 新闻 - 🎁 2026.03.03: **ms-swift v4.0**大版本正式发布,release note参考[这里](https://github.com/modelscope/ms-swift/releases/tag/v4.0.0),您的建议可以在[这个issue](https://github.com/modelscope/ms-swift/issues/7250)中反馈给我们,感谢您的支持。 - 🎁 2025.11.14: Megatron GRPO现已支持!查看[文档](./docs/source/Megatron-SWIFT/GRPO.md)和[示例](examples/megatron/grpo)。 - 🎁 2025.11.04: 支持[Mcore-Bridge](docs/source/Megatron-SWIFT/Mcore-Bridge.md),使Megatron训练像transformers一样简单易用。 - 🎁 2025.10.28: Ray [已支持](docs/source/Instruction/Ray.md)。 - 🎁 2025.09.07: 支持CHORD训练算法,请查看[文档](docs/source/Instruction/GRPO/AdvancedResearch/CHORD.md)。 - 🎁 2025.09.06: Ulysses现已支持与ring-attention结合使用,使得输入序列可以被切分成任意数量的块(不再受限于num_heads),命令参数仍然是`--sequence_parallel_size N`。 - 🎁 2025.09.02: Megatron-SWIFT支持多模态模型训练。文档参考[这里](./docs/source/Megatron-SWIFT/Mcore-Bridge.md)。 - 🎁 2025.08.12: 支持在SFT训练中使用[Dynamic Fine-Tuning](https://arxiv.org/abs/2508.05629)(DFT),使用参数 `--enable_dft_loss true`。训练脚本参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/full/dft.sh) - 🎁 2025.07.09: Megatron-SWIFT支持LoRA训练。相比ms-swift,在MoE模型提速显著。训练脚本参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/megatron/lora)。 - 🎁 2025.06.23: 支持Reranker模型训练,训练脚本参考[这里](https://github.com/modelscope/ms-swift/blob/main/examples/train/reranker/train_reranker.sh)。 - 🎁 2025.06.15: 支持对纯文本大模型和多模态模型进行GKD训练。训练脚本参考这里:[纯文本](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd), [多模态](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd)。
更多 - 🎁 2025.06.11: 支持使用Megatron并行技术进行RLHF训练,训练脚本参考[这里](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf)。 - 🎁 2025.05.29: 支持pt、sft、dpo、grpo的序列并行,具体请查看[脚本](https://github.com/modelscope/ms-swift/tree/main/examples/train/sequence_parallel)。 - 🎁 2025.05.11: GRPO中的奖励模型支持自定义处理逻辑,GenRM的例子参考[这里](./docs/source/Instruction/GRPO/DeveloperGuide/reward_model.md)。 - 🎁 2025.04.15: ms-swift论文已经被AAAI 2025接收,论文地址在[这里](https://ojs.aaai.org/index.php/AAAI/article/view/35383)。 - 🎁 2025.03.23: 支持了多轮GRPO,用于构建多轮对话场景的训练(例如agent tool calling),请查看[文档](docs/source/Instruction/GRPO/DeveloperGuide/multi_turn.md)。 - 🎁 2025.03.16: 支持了Megatron的并行技术进行训练,请查看[Megatron-SWIFT训练文档](https://swift.readthedocs.io/zh-cn/latest/Megatron-SWIFT/Quick-start.html)。 - 🎁 2025.03.15: 支持纯文本和多模态模型的embedding模型的微调,请查看[训练脚本](examples/train/embedding)。 - 🎁 2025.03.05: 支持GRPO的hybrid模式,4GPU(4*80G)训练72B模型的脚本参考[这里](examples/train/grpo/internal/vllm_72b_4gpu.sh)。同时支持vllm的tensor并行,训练脚本参考[这里](examples/train/grpo/internal)。 - 🎁 2025.02.21: GRPO算法支持使用LMDeploy,训练脚本参考[这里](examples/train/grpo/internal/full_lmdeploy.sh)。此外测试了GRPO算法的性能,使用一些tricks使训练速度提高到300%。WanDB表格请查看[这里](https://wandb.ai/tastelikefeet/grpo_perf_test?nw=nwuseryuzezyz)。 - 🎁 2025.02.21: 支持`swift sample`命令。强化微调脚本参考[这里](docs/source/Instruction/Reinforced-Fine-tuning.md),大模型API蒸馏采样脚本参考[这里](examples/sampler/distill/distill.sh)。 - 🔥 2025.02.12: 支持GRPO (Group Relative Policy Optimization) 训练算法,文档参考[这里](docs/source/Instruction/GRPO/GetStarted/GRPO.md)。 - 🎁 2024.12.04: **ms-swift3.0**大版本更新。请查看[发布说明和更改](docs/source/Instruction/ReleaseNote3.0.md)。 - 🎉 2024.08.12: ms-swift论文已经发布到arXiv上,可以点击[这里](https://arxiv.org/abs/2408.05517)阅读。 - 🔥 2024.08.05: 支持使用[evalscope](https://github.com/modelscope/evalscope/)作为后端进行大模型和多模态模型的评测。 - 🔥 2024.07.29: 支持使用[vllm](https://github.com/vllm-project/vllm), [lmdeploy](https://github.com/InternLM/lmdeploy)对大模型和多模态大模型进行推理加速,在infer/deploy/eval时额外指定`--infer_backend vllm/lmdeploy`即可。 - 🔥 2024.07.24: 
支持对多模态大模型进行人类偏好对齐训练,包括DPO/ORPO/SimPO/CPO/KTO/RM/PPO。 - 🔥 2024.02.01: 支持Agent训练!训练算法源自这篇[论文](https://arxiv.org/pdf/2309.00986.pdf)。
## 🛠️ 安装 使用pip进行安装: ```shell pip install ms-swift -U # 使用uv pip install uv uv pip install ms-swift -U --torch-backend=auto ``` 从源代码安装: ```shell # pip install git+https://github.com/modelscope/ms-swift.git git clone https://github.com/modelscope/ms-swift.git cd ms-swift # main分支为swift4.x。若安装swift3.x,请运行以下命令 # git checkout release/3.12 pip install -e . # 使用uv uv pip install -e . --torch-backend=auto ``` 运行环境: | | 范围 | 推荐 | 备注 | |--------------|--------------|---------------------|--------------------| | python | >=3.9 | 3.11/3.12 | | | cuda | | cuda12 | 使用cpu、npu、mps则无需安装 | | torch | >=2.0 | 2.8.0/2.10.0 | | | transformers | >=4.33 | 4.57.6/5.2.0 | | | modelscope | >=1.23 | | | | peft | >=0.11,<0.19 | | | | flash_attn | | 2.8.3/3.0.0b1 | | | trl | >=0.15,<0.29 | 0.28.0 | RLHF | | deepspeed | >=0.14 | 0.18.8 | 训练 | | vllm | >=0.5.1 | 0.11.0/0.17.1 | 推理/部署 | | sglang | >=0.4.6 | | 推理/部署 | | lmdeploy | >=0.5 | 0.10.1 | 推理/部署 | | evalscope | >=1.0 | | 评测 | | gradio | | 5.32.1 | Web-UI/App | 更多可选依赖可以参考[这里](https://github.com/modelscope/ms-swift/blob/main/requirements/install_all.sh)。 ## 🚀 快速开始 **10分钟**在单卡3090上对Qwen3-4B-Instruct-2507进行自我认知微调: ### 命令行(推荐) ```shell # 13GB CUDA_VISIBLE_DEVICES=0 \ swift sft \ --model Qwen/Qwen3-4B-Instruct-2507 \ --tuner_type lora \ --dataset 'AI-ModelScope/alpaca-gpt4-data-zh#500' \ 'AI-ModelScope/alpaca-gpt4-data-en#500' \ 'swift/self-cognition#500' \ --torch_dtype bfloat16 \ --num_train_epochs 1 \ --per_device_train_batch_size 1 \ --per_device_eval_batch_size 1 \ --learning_rate 1e-4 \ --lora_rank 8 \ --lora_alpha 32 \ --target_modules all-linear \ --gradient_accumulation_steps 16 \ --eval_steps 50 \ --save_steps 50 \ --save_total_limit 2 \ --logging_steps 5 \ --max_length 2048 \ --output_dir output \ --warmup_ratio 0.05 \ --dataloader_num_workers 4 \ --model_author swift \ --model_name swift-robot ``` 小贴士: - 如果要使用自定义数据集进行训练,你可以参考[这里](https://swift.readthedocs.io/zh-cn/latest/Customization/Custom-dataset.html)组织数据集格式,并指定`--dataset `。 - 
`--model_author`和`--model_name`参数只有当数据集中包含`swift/self-cognition`时才生效。 - 如果要使用其他模型进行训练,你只需要修改`--model `即可。 - 默认使用**ModelScope**进行模型和数据集的下载。如果要使用HuggingFace,指定`--use_hf true`即可。 训练完成后,使用以下命令对训练后的权重进行推理: - 这里的`--adapters`需要替换成训练生成的last checkpoint文件夹。由于adapters文件夹中包含了训练的参数文件`args.json`,因此不需要额外指定`--model`,`--system`,swift会自动读取这些参数。如果要关闭此行为,可以设置`--load_args false`。 ```shell # 使用交互式命令行进行推理 CUDA_VISIBLE_DEVICES=0 \ swift infer \ --adapters output/vx-xxx/checkpoint-xxx \ --stream true \ --temperature 0 \ --max_new_tokens 2048 # merge-lora并使用vLLM进行推理加速 CUDA_VISIBLE_DEVICES=0 \ swift infer \ --adapters output/vx-xxx/checkpoint-xxx \ --stream true \ --merge_lora true \ --infer_backend vllm \ --vllm_max_model_len 8192 \ --temperature 0 \ --max_new_tokens 2048 ``` 最后,使用以下命令将模型推送到ModelScope: ```shell CUDA_VISIBLE_DEVICES=0 \ swift export \ --adapters output/vx-xxx/checkpoint-xxx \ --push_to_hub true \ --hub_model_id '' \ --hub_token '' \ --use_hf false ``` ### Web-UI Web-UI是基于gradio界面技术的**零门槛**训练、部署界面方案,具体可以查看[这里](https://swift.readthedocs.io/zh-cn/latest/GetStarted/Web-UI.html)。 ```shell swift web-ui ``` ![image.png](./docs/resources/web-ui.jpg) ### 使用Python ms-swift也支持使用python的方式进行训练和推理。下面给出训练和推理的**伪代码**,具体可以查看[这里](https://github.com/modelscope/ms-swift/blob/main/examples/notebook/qwen2_5-self-cognition/self-cognition-sft.ipynb)。 训练: ```python from peft import LoraConfig, get_peft_model from swift import get_model_processor, get_template, load_dataset, EncodePreprocessor from swift.trainers import Seq2SeqTrainer, Seq2SeqTrainingArguments # 获取模型和template,并加入可训练的LoRA模块 model, tokenizer = get_model_processor(model_id_or_path, ...) template = get_template(tokenizer, ...) lora_config = LoraConfig(...) model = get_peft_model(model, lora_config) # 下载并载入数据集,并将文本encode成tokens train_dataset, val_dataset = load_dataset(dataset_id_or_path, ...) 
train_dataset = EncodePreprocessor(template=template)(train_dataset, num_proc=num_proc) val_dataset = EncodePreprocessor(template=template)(val_dataset, num_proc=num_proc) # 进行训练 training_args = Seq2SeqTrainingArguments(...) trainer = Seq2SeqTrainer( model=model, args=training_args, template=template, train_dataset=train_dataset, eval_dataset=val_dataset, ) trainer.train() ``` 推理: ```python from swift import TransformersEngine, InferRequest, RequestConfig # 使用原生 transformers 引擎进行推理 engine = TransformersEngine(model_id_or_path, adapters=[lora_checkpoint]) infer_request = InferRequest(messages=[{'role': 'user', 'content': 'who are you?'}]) request_config = RequestConfig(max_tokens=max_new_tokens, temperature=temperature) resp_list = engine.infer([infer_request], request_config) print(f'response: {resp_list[0].choices[0].message.content}') ``` ## ✨ 如何使用 这里给出使用ms-swift进行训练到部署的最简示例,具体可以查看[examples](https://github.com/modelscope/ms-swift/tree/main/examples)。 - 若想使用其他模型或者数据集(含多模态模型和数据集),你只需要修改`--model`指定对应模型的id或者path,修改`--dataset`指定对应数据集的id或者path即可。 - 默认使用ModelScope进行模型和数据集的下载。如果要使用HuggingFace,指定`--use_hf true`即可。 | 常用链接 | | ------ | | [🔥命令行参数](https://swift.readthedocs.io/zh-cn/latest/Instruction/Command-line-parameters.html) | | [Megatron-SWIFT](https://swift.readthedocs.io/zh-cn/latest/Megatron-SWIFT/Quick-start.html) | | [GRPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/GetStarted/GRPO.html) | | [支持的模型和数据集](https://swift.readthedocs.io/zh-cn/latest/Instruction/Supported-models-and-datasets.html) | | [自定义模型](https://swift.readthedocs.io/zh-cn/latest/Customization/Custom-model.html), [🔥自定义数据集](https://swift.readthedocs.io/zh-cn/latest/Customization/Custom-dataset.html) | | [大模型教程](https://github.com/modelscope/modelscope-classroom/tree/main/LLM-tutorial) | ### 训练 支持的训练方法: | 方法 | 全参数 | LoRA | QLoRA | Deepspeed | 多机 | 多模态 | | ------ | ------ |---------------------------------------------------------------------------------------------| ----- | ------ | 
------ |----------------------------------------------------------------------------------------------| | [预训练](https://github.com/modelscope/ms-swift/blob/main/examples/train/pretrain) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [指令监督微调](https://github.com/modelscope/ms-swift/blob/main/examples/train/lora_sft.sh) | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/full/train.sh) | ✅ | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/qlora) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multi-gpu/deepspeed) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multi-node) | [✅](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal) | | [GRPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/grpo) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [GKD](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/gkd) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/gkd) | | [PPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/ppo) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | | [DPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/dpo) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/dpo) | | [KTO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/kto.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | [✅](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/rlhf/kto.sh) | | [奖励模型](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/rm.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [CPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/cpo.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [SimPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/simpo.sh) | ✅ | ✅ | ✅ | ✅| ✅ | ✅ | | [ORPO](https://github.com/modelscope/ms-swift/blob/main/examples/train/rlhf/orpo.sh) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | 
[Embedding](https://github.com/modelscope/ms-swift/blob/main/examples/train/embedding) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [Reranker](https://github.com/modelscope/ms-swift/tree/main/examples/train/reranker) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | [序列分类](https://github.com/modelscope/ms-swift/blob/main/examples/train/seq_cls) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | 预训练: ```shell # 8*A100 NPROC_PER_NODE=8 \ CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \ swift pt \ --model Qwen/Qwen2.5-7B \ --dataset swift/chinese-c4 \ --streaming true \ --tuner_type full \ --deepspeed zero2 \ --output_dir output \ --max_steps 10000 \ ... ``` 微调: ```shell CUDA_VISIBLE_DEVICES=0 swift sft \ --model Qwen/Qwen2.5-7B-Instruct \ --dataset AI-ModelScope/alpaca-gpt4-data-zh \ --tuner_type lora \ --output_dir output \ ... ``` RLHF: ```shell CUDA_VISIBLE_DEVICES=0 swift rlhf \ --rlhf_type dpo \ --model Qwen/Qwen2.5-7B-Instruct \ --dataset hjh0119/shareAI-Llama3-DPO-zh-en-emoji \ --tuner_type lora \ --output_dir output \ ... ``` ### Megatron-SWIFT ms-swift支持使用Megatron并行技术加速训练,包括大规模集群训练和MoE模型训练。以下为支持的训练方法: | 方法 | 全参数 | LoRA | MoE | 多模态 | FP8 | | ------ | ------ | ---- | ----- | ----- | ----- | | 预训练 | ✅ | ✅| ✅ | ✅ | ✅ | | [指令监督微调](https://github.com/modelscope/ms-swift/tree/main/examples/megatron) | ✅ | ✅| ✅ | ✅ | ✅ | | [GRPO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/grpo) | ✅ | ✅| ✅ | ✅ | ✅ | | [GKD](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/gkd) | ✅ | ✅| ✅ | ✅ | ✅ | | [DPO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/dpo) | ✅ | ✅| ✅ | ✅ | ✅ | | [KTO](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/kto) | ✅ | ✅| ✅ | ✅ | ✅ | | [RM](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/rlhf/rm) | ✅ | ✅| ✅ | ✅ | ✅ | | [Embedding](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/embedding) | ✅ | ✅| ✅ | ✅ | ✅ | | [Reranker](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/reranker) | ✅ 
| ✅| ✅ | ✅ | ✅ | | [序列分类](https://github.com/modelscope/ms-swift/tree/main/examples/megatron/seq_cls) | ✅ | ✅| ✅ | ✅ | ✅ | ```shell NPROC_PER_NODE=2 CUDA_VISIBLE_DEVICES=0,1 megatron sft \ --model Qwen/Qwen2.5-7B-Instruct \ --save_safetensors true \ --dataset AI-ModelScope/alpaca-gpt4-data-zh \ --tuner_type lora \ --output_dir output \ ... ``` ### 强化学习 ms-swift支持丰富GRPO族算法: | 方法 | 全参数 | LoRA | 多模态 | 多机 | | ------ | ------ | ---- | ----- | ----- | | [GRPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/GetStarted/GRPO.html) | ✅ | ✅| ✅ | ✅ | | [DAPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/DAPO.html) | ✅ | ✅| ✅ | ✅ | | [GSPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/GSPO.html) | ✅ | ✅| ✅ | ✅ | | [SAPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/SAPO.html) | ✅ | ✅| ✅ | ✅ | | [CISPO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/CISPO.html) | ✅ | ✅| ✅ | ✅ | | [CHORD](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/CHORD.html) | ✅ | ✅| ✅ | ✅ | | [RLOO](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/RLOO.html) | ✅ | ✅| ✅ | ✅ | | [Reinforce++](https://swift.readthedocs.io/zh-cn/latest/Instruction/GRPO/AdvancedResearch/REINFORCEPP.html) | ✅ | ✅| ✅ | ✅ | ```shell CUDA_VISIBLE_DEVICES=0,1,2,3 NPROC_PER_NODE=4 \ swift rlhf \ --rlhf_type grpo \ --model Qwen/Qwen2.5-7B-Instruct \ --tuner_type lora \ --use_vllm true \ --vllm_mode colocate \ --dataset AI-MO/NuminaMath-TIR#10000 \ --output_dir output \ ... 
``` ### 推理 ```shell CUDA_VISIBLE_DEVICES=0 swift infer \ --model Qwen/Qwen2.5-7B-Instruct \ --stream true \ --infer_backend transformers \ --max_new_tokens 2048 # LoRA CUDA_VISIBLE_DEVICES=0 swift infer \ --model Qwen/Qwen2.5-7B-Instruct \ --adapters swift/test_lora \ --stream true \ --infer_backend transformers \ --temperature 0 \ --max_new_tokens 2048 ``` ### 界面推理 ```shell CUDA_VISIBLE_DEVICES=0 swift app \ --model Qwen/Qwen2.5-7B-Instruct \ --stream true \ --infer_backend transformers \ --max_new_tokens 2048 \ --lang zh ``` ### 部署 ```shell CUDA_VISIBLE_DEVICES=0 swift deploy \ --model Qwen/Qwen2.5-7B-Instruct \ --infer_backend vllm ``` ### 采样 ```shell CUDA_VISIBLE_DEVICES=0 swift sample \ --model LLM-Research/Meta-Llama-3.1-8B-Instruct \ --sampler_engine transformers \ --num_return_sequences 5 \ --dataset AI-ModelScope/alpaca-gpt4-data-zh#5 ``` ### 评测 ```shell CUDA_VISIBLE_DEVICES=0 swift eval \ --model Qwen/Qwen2.5-7B-Instruct \ --infer_backend lmdeploy \ --eval_backend OpenCompass \ --eval_dataset ARC_c ``` ### 量化 ```shell CUDA_VISIBLE_DEVICES=0 swift export \ --model Qwen/Qwen2.5-7B-Instruct \ --quant_bits 4 --quant_method awq \ --dataset AI-ModelScope/alpaca-gpt4-data-zh \ --output_dir Qwen2.5-7B-Instruct-AWQ ``` ### 推送模型 ```shell swift export \ --model \ --push_to_hub true \ --hub_model_id '' \ --hub_token '' ``` ## 🏛 License 本框架使用[Apache License (Version 2.0)](https://github.com/modelscope/ms-swift/blob/master/LICENSE)进行许可。模型和数据集请查看原资源页面并遵守对应License。 ## 📎 引用 ```bibtex @misc{zhao2024swiftascalablelightweightinfrastructure, title={SWIFT:A Scalable lightWeight Infrastructure for Fine-Tuning}, author={Yuze Zhao and Jintao Huang and Jinghan Hu and Xingjun Wang and Yunlin Mao and Daoze Zhang and Zeyinzi Jiang and Zhikai Wu and Baole Ai and Ang Wang and Wenmeng Zhou and Yingda Chen}, year={2024}, eprint={2408.05517}, archivePrefix={arXiv}, primaryClass={cs.CL}, url={https://arxiv.org/abs/2408.05517}, } ``` ## Star History [![Star History 
Chart](https://api.star-history.com/svg?repos=modelscope/ms-swift&type=Date)](https://star-history.com/#modelscope/ms-swift&Date) ================================================ FILE: docs/Makefile ================================================ # Minimal makefile for Sphinx documentation # # You can set these variables from the command line, and also # from the environment for the first two. SPHINXOPTS ?= SPHINXBUILD ?= sphinx-build SOURCEDIR = source BUILDDIR = build # Put it first so that "make" without argument is like "make help". help: @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) .PHONY: help Makefile # Catch-all target: route all unknown targets to Sphinx using the new # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS). %: Makefile @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O) ================================================ FILE: docs/README.md ================================================ ## maintain docs 1. build docs ```shell # in root directory: make docs ``` 2. doc string format We adopt the google style docstring format as the standard, please refer to the following documents. 1. Google Python style guide docstring [link](http://google.github.io/styleguide/pyguide.html#381-docstrings) 2. Google docstring example [link](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) 3. sample:torch.nn.modules.conv [link](https://pytorch.org/docs/stable/_modules/torch/nn/modules/conv.html#Conv1d) 4. load function as an example: ```python def load(file, file_format=None, **kwargs): """Load data from json/yaml/pickle files. This method provides a unified api for loading data from serialized files. Args: file (str or :obj:`Path` or file-like object): Filename or a file-like object. file_format (str, optional): If not specified, the file format will be inferred from the file extension, otherwise use the specified one. Currently supported formats include "json", "yaml/yml". 
Examples: >>> load('/path/of/your/file') # file is stored in disk >>> load('https://path/of/your/file') # file is stored on internet >>> load('oss://path/of/your/file') # file is stored in petrel Returns: The content from the file. """ ``` ================================================ FILE: docs/make.bat ================================================ @ECHO OFF pushd %~dp0 REM Command file for Sphinx documentation if "%SPHINXBUILD%" == "" ( set SPHINXBUILD=sphinx-build ) set SOURCEDIR=source set BUILDDIR=build if "%1" == "" goto help %SPHINXBUILD% >NUL 2>NUL if errorlevel 9009 ( echo. echo.The 'sphinx-build' command was not found. Make sure you have Sphinx echo.installed, then set the SPHINXBUILD environment variable to point echo.to the full path of the 'sphinx-build' executable. Alternatively you echo.may add the Sphinx directory to PATH. echo. echo.If you don't have Sphinx installed, grab it from echo.http://sphinx-doc.org/ exit /b 1 ) %SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% goto end :help %SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O% :end popd ================================================ FILE: docs/source/.readthedocs.yaml ================================================ # .readthedocs.yaml # Read the Docs configuration file # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details # Required version: 2 # Set the OS, Python version and other tools you might need build: os: ubuntu-22.04 tools: python: "3.10" # Build documentation in the "docs/" directory with Sphinx sphinx: configuration: docs/source/conf.py # Optionally build your docs in additional formats such as PDF and ePub # formats: # - pdf # - epub # Optional but recommended, declare the Python requirements required # to build your documentation # See https://docs.readthedocs.io/en/stable/guides/reproducible-builds.html python: install: - requirements: requirements/docs.txt - requirements: requirements/framework.txt 
================================================
FILE: docs/source/BestPractices/Elastic.md
================================================
# Elastic

## Installing Dependencies

Deploy Kubernetes (K8S) on the cluster and deploy [DLRover](https://github.com/intelligent-machine-learning/dlrover) in it, then install the Python dependencies:
`pip install dlrover && pip install tornado && pip install kubernetes && pip install ms-swift`

Other dependencies and versions in the training image that have been repeatedly tested and verified:
- deepspeed 0.16.5 (apply the changes in https://github.com/deepspeedai/DeepSpeed/pull/7585/files to fix universal checkpoint issues)
- pytorch 2.6.0

## How to Launch

Enable elastic training by adding `deepspeed_elastic` (and optionally `graceful_exit`) to `--callbacks`, and configure the DeepSpeed elasticity parameters.

The command is composed of: dlrover-run + dlrover arguments + swift launch command + swift arguments. Apart from its own custom arguments, dlrover-run accepts the same arguments as torchrun.

The dlrover-run arguments are as follows:
```
usage: dlrover-run [-h] [--nnodes NNODES] [--nproc-per-node NPROC_PER_NODE]
                   [--rdzv-backend RDZV_BACKEND] [--rdzv-endpoint RDZV_ENDPOINT] [--rdzv-id RDZV_ID]
                   [--rdzv-conf RDZV_CONF] [--standalone] [--max-restarts MAX_RESTARTS]
                   [--monitor-interval MONITOR_INTERVAL] [--start-method {spawn,fork,forkserver}]
                   [--role ROLE] [-m] [--no-python] [--run-path] [--log-dir LOG_DIR] [-r REDIRECTS]
                   [-t TEE] [--local-ranks-filter LOCAL_RANKS_FILTER] [--node-rank NODE_RANK]
                   [--master-addr MASTER_ADDR] [--master-port MASTER_PORT] [--local-addr LOCAL_ADDR]
                   [--logs-specs LOGS_SPECS] [--precheck {0,1,2}] [--node_unit NODE_UNIT]
                   [--auto_config] [--auto_tunning] [--exclude-straggler] [--save_at_breakpoint]
                   [--accelerator {nvidia.com/gpu,ascend-npu}] [--training_port TRAINING_PORT]
                   [--switchbox-check] [--box-pairs PAIR [PAIR ...]] [--min-bandwidth MIN_BANDWIDTH]
                   [--min-channels MIN_CHANNELS] [--numa-affinity] [--network-check]
                   [--comm-perf-test] [--ucp_device_type UCP_DEVICE_TYPE]
                   training_script
```

The arguments that matter most for elastic training are:

- `--nnodes NNODES`: Number of nodes, or the range of nodes in the form `<minimum_nodes>:<maximum_nodes>`.
- `--nproc-per-node NPROC_PER_NODE`: Number of processes per node.
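
As a rough illustration of how these two flags bound the size of an elastic job, the sketch below parses a `--nnodes` value (a single count such as `2`, or a range such as `1:4`) and derives the smallest and largest world size. The helper names `parse_nnodes` and `world_size_range` are hypothetical and are not part of dlrover or ms-swift. Reading the range as `min:max` is an assumption based on the torchrun-style help text.

```python
from typing import Tuple


def parse_nnodes(spec: str) -> Tuple[int, int]:
    """Parse a node spec such as "2" or "1:4" into (min_nodes, max_nodes).

    Hypothetical helper: assumes the "min:max" convention for ranges.
    """
    parts = spec.split(":")
    if len(parts) == 1:
        # A single number means a fixed-size job.
        min_nodes = max_nodes = int(parts[0])
    elif len(parts) == 2:
        min_nodes, max_nodes = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"invalid --nnodes value: {spec!r}")
    if not 0 < min_nodes <= max_nodes:
        raise ValueError(f"invalid node range: {spec!r}")
    return min_nodes, max_nodes


def world_size_range(nnodes: str, nproc_per_node: int) -> Tuple[int, int]:
    """Return the (smallest, largest) number of training processes."""
    min_nodes, max_nodes = parse_nnodes(nnodes)
    return min_nodes * nproc_per_node, max_nodes * nproc_per_node


if __name__ == "__main__":
    # One process per node, between 1 and 4 nodes.
    print(world_size_range("1:4", nproc_per_node=1))  # (1, 4)
```

The resulting range is worth comparing against `min_gpus` and `max_gpus` in the DeepSpeed elasticity configuration, so that every world size the job can reach is one DeepSpeed is allowed to run with.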
Example:
```bash
model=your_model_path        # replace with your model path
dataset=your_dataset         # replace with your dataset
output=your_output_dir       # replace with your output dir
export CUDA_VISIBLE_DEVICES=0  # set according to the GPUs actually in use
# DeepSpeed type or path to a config file, e.g. zero1 or /xxx/ms-swift/swift/llm/ds_config/zero1.json
deepspeed_config_or_type=zero1

dlrover-run --nnodes 1:$NODE_NUM --nproc_per_node=1 \
/opt/conda/lib/python3.10/site-packages/swift/cli/sft.py --model $model \
--model_type qwen3 \
--tuner_type lora \
--torch_dtype bfloat16 \
--dataset $dataset \
--num_train_epochs 4 \
--per_device_train_batch_size 1 \
--per_device_eval_batch_size 1 \
--learning_rate 5e-7 \
--gradient_accumulation_steps 8 \
--eval_steps 500 \
--save_steps 10 \
--save_total_limit 20 \
--logging_steps 1 \
--output_dir $output \
--warmup_ratio 0.01 \
--dataloader_num_workers 4 \
--temperature 1.0 \
--system 'You are a helpful assistant.' \
--lora_rank 8 \
--lora_alpha 32 \
--target_modules all-linear \
--dataset_num_proc 1 \
--use_flash_ckpt true \
--callbacks deepspeed_elastic graceful_exit \
--deepspeed $deepspeed_config_or_type
```

## Configuration File Example

By default, zero1 corresponds to the following example configuration:

```json
{
  "fp16": {
    "enabled": "auto",
    "loss_scale": 0,
    "loss_scale_window": 1000,
    "initial_scale_power": 16,
    "hysteresis": 2,
    "min_loss_scale": 1
  },
  "bf16": {
    "enabled": "auto"
  },
  "zero_optimization": {
    "stage": 1,
    "offload_optimizer": {
      "device": "none",
      "pin_memory": true
    },
    "allgather_partitions": true,
    "allgather_bucket_size": 2e8,
    "overlap_comm": false,
    "reduce_scatter": true,
    "reduce_bucket_size": 2e8,
    "contiguous_gradients": true
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "steps_per_print": 2000,
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto",
  "wall_clock_breakdown": false,
  "elasticity": {
    "ignore_non_elastic_batch_info": true,
    "enabled": true,
    "max_train_batch_size": 8,
    "micro_batch_sizes": [
      4,
      2
    ],
    "min_gpus": 1,
    "max_gpus": 4,
    "min_time": 20,
    "version": 0.1
  }
}
```

To customize it, set `deepspeed_config_or_type` in the launch command to the path of your own zero1.json. The elasticity-related configuration is:
```json
...
  "elasticity": {
    "ignore_non_elastic_batch_info": true,
    "enabled": true,
    "max_train_batch_size": 8,
    "micro_batch_sizes": [
      4,
      2
    ],
    "min_gpus": 1,
    "max_gpus": 4,
    "min_time": 20,
    "version": 0.1
  }
```

- ignore_non_elastic_batch_info: the settings under `elasticity` take precedence over the outer batch-size-related settings. During training, the batch size and related parameters are adjusted in real time according to the actual number of training processes.
  The calculation rule is:
  global-training-batch-size = micro-batch-size * gradient-accumulation-steps * world-size
- max_train_batch_size: the maximum global training batch size
- micro_batch_sizes: the list of per-GPU micro-batch sizes allowed under elasticity, i.e. the candidate values for train_micro_batch_size_per_gpu
- min_gpus: the minimum number of GPUs
- max_gpus: the maximum number of GPUs

For more details, see [DeepSpeed](https://www.deepspeed.ai/docs/config-json/#elastic-training-config-v01-and-v02).

## Starting Training

```yaml
---
apiVersion: elastic.iml.github.io/v1alpha1
kind: ElasticJob
metadata:
  name: deepspeed-elastic-swift
  namespace: dlrover
spec:
  distributionStrategy: AllreduceStrategy
  optimizeMode: single-job
  replicaSpecs:
    worker:
      replicas: 1  # must match the maximum value of --nnodes NNODES in the launch command
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: main
              image:  # training image; deepspeed, dlrover and swift must be installed
              imagePullPolicy: IfNotPresent
              command:
                - /bin/bash
                - -c
                - sh start.sh  # launch script
              resources:
                limits:
                  cpu: '8'
                  memory: 16Gi
                  nvidia.com/gpu: '1'
              volumeMounts:
                - mountPath: /model
                  name: volume-model
                - mountPath: /dev/shm
                  name: volume-shm
          volumes:
            - hostPath:
                path: /model
                type: Directory
              name: volume-model
            - emptyDir:
                medium: Memory
                sizeLimit: 200Gi
              name: volume-shm
```

================================================
FILE: docs/source/BestPractices/Embedding.md
================================================
# Embedding Training

SWIFT supports training embedding models, both text-only and multimodal. The currently supported models are:

1. modernbert embedding model
   - [ModelScope](https://modelscope.cn/models/iic/gte-modernbert-base) [Hugging Face](https://huggingface.co/Alibaba-NLP/gte-modernbert-base)
2. gte embedding models
   - 1.5B: [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-1.5B-instruct) [Hugging Face](https://huggingface.co/Alibaba-NLP/gte-Qwen2-1.5B-instruct)
   - 7B: [ModelScope](https://www.modelscope.cn/models/iic/gte_Qwen2-7B-instruct) [Hugging Face](https://huggingface.co/Alibaba-NLP/gte-Qwen2-7B-instruct)
3. gme embedding models
   - 2B: [ModelScope](https://www.modelscope.cn/models/iic/gme-Qwen2-VL-2B-Instruct) [Hugging Face](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-2B-Instruct)
   - 7B: [ModelScope](https://www.modelscope.cn/models/iic/gme-Qwen2-VL-7B-Instruct) [Hugging Face](https://huggingface.co/Alibaba-NLP/gme-Qwen2-VL-7B-Instruct)
4. qwen3-embedding models
   - 0.6B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Embedding-0.6B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B)
   - 4B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Embedding-4B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Embedding-4B)
   - 8B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-Embedding-8B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-Embedding-8B)
5. qwen3-vl-embedding models
   - 2B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Embedding-2B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-VL-Embedding-2B)
   - 8B: [ModelScope](https://www.modelscope.cn/models/Qwen/Qwen3-VL-Embedding-8B) [Hugging Face](https://huggingface.co/Qwen/Qwen3-VL-Embedding-8B)

Developers can integrate their own models. The output of the model's forward pass must satisfy:

```text
{"last_hidden_state": some-embedding-tensor}
```

The return value is a JSON-like dict with a `last_hidden_state` key whose value is the embedding tensor. For the input side, you can use any template that is already supported. You can also specify the

```shell
--task_type embedding
```

argument to convert any other model into an embedding model for training.

Note that the embedding models currently supported by SWIFT are all text-only or multimodal LLMs; training CLIP-type models is not currently supported.

In addition, every embedding model supported by SWIFT applies normalization at the end of the forward pass. If you add a new model yourself, remember to add a normalize layer.

## Loss

The losses currently available for embedding models in SWIFT are:

- cosine_similarity: cosine-similarity loss. It computes the similarity between two embeddings and fits it to the label value; in practice this is an MSE loss.
- contrastive: contrastive loss with an adjustable margin; the label only supports the values 0 and 1.
- online_contrastive: contrastive loss that takes hard negatives and hard positives into account; the label only supports the values 0 and 1.
- infonce: computes pairwise cosine similarity between different rows in the same batch, maximizing similarity within a row and minimizing similarity across rows; no label is required.

The source code of the losses can be found [here](https://github.com/modelscope/ms-swift/blob/main/swift/loss/mapping.py).

## Dataset Format

> Note:
> 1. The `<image>` tag can appear anywhere in `messages`/`positive_messages`/`negative_messages`; each of them has its own `images`/`positive_images`/`negative_images` field for providing image paths or URLs.
> 2. A cross-field "corresponding order" is no longer required. The alignment rules are: the length of `images` equals the number of `<image>` tags in `messages`; `positive_images` and `negative_images` are both lists of lists whose outer lengths equal the lengths of `positive_messages` and `negative_messages` respectively; and the inner list of each outer item has as many entries as there are `<image>` tags in the corresponding message sequence.
> 3. `messages` represents the anchor sample; `positive_messages`/`negative_messages` are lists of messages (hence the extra layer of `[]`); correspondingly, `positive_images`/`negative_images` also carry an extra layer of `[]` and are aligned with them item by item.
> 4. Also supports `