Following the Day 12 lesson "Distributed Fine-Tuning of Large Models with XTuner", I used XTuner to fine-tune Qwen models.
With Qwen1.5-1.8B-Chat, Qwen2.5-1.5B-Instruct, and DeepSeek-R1-Distill-Qwen-1.5B, training runs normally.
But when I select Qwen3-1.7B:
pretrained_model_name_or_path = '/mnt/workspace/models/Qwen/Qwen3-1.7B'
training fails with KeyError: 'qwen3'. The full error output is:
work_dir = './work_dirs/qwen_chat_qlora_json_qwen3'
Traceback (most recent call last):
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1071, in from_pretrained
    config_class = CONFIG_MAPPING[config_dict["model_type"]]
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 773, in __getitem__
    raise KeyError(key)
KeyError: 'qwen3'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/workspace/xtuner/xtuner/tools/train.py", line 392, in <module>
    main()
  File "/mnt/workspace/xtuner/xtuner/tools/train.py", line 381, in main
    runner = Runner.from_cfg(cfg)
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/runner.py", line 462, in from_cfg
    runner = cls(
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/runner.py", line 429, in __init__
    self.model = self.build_model(model)
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/runner/runner.py", line 836, in build_model
    model = MODELS.build(model)
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/registry/registry.py", line 570, in build
    return self.build_func(cfg, *args, **kwargs, registry=self)
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 234, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/mmengine/registry/build_functions.py", line 123, in build_from_cfg
    obj = obj_cls(**args)  # type: ignore
  File "/mnt/workspace/xtuner/xtuner/model/sft.py", line 97, in __init__
    self.llm = self.build_llm_from_cfg(
  File "/mnt/workspace/xtuner/xtuner/model/sft.py", line 142, in build_llm_from_cfg
    llm = self._dispatch_lm_model_cfg(llm_cfg, max_position_embeddings)
  File "/mnt/workspace/xtuner/xtuner/model/sft.py", line 281, in _dispatch_lm_model_cfg
    llm_cfg = AutoConfig.from_pretrained(
  File "/mnt/workspace/miniconda3/envs/xtuner-env/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py", line 1073, in from_pretrained
    raise ValueError(
ValueError: The checkpoint you are trying to load has model type `qwen3` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
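As the final ValueError says, the installed Transformers release predates Qwen3, so `CONFIG_MAPPING` has no `'qwen3'` entry; upgrading Transformers in the `xtuner-env` environment is the fix (the older Qwen1.5/Qwen2.5 models load fine because their `model_type` values are already registered). Below is a minimal sketch to check whether the installed Transformers is new enough, under the assumption that Qwen3 support landed around Transformers v4.51 (verify the exact minimum against the official release notes):

```python
from importlib.metadata import PackageNotFoundError, version as pkg_version

# Assumption: Qwen3 configs were added to Transformers around v4.51;
# confirm the exact minimum version in the Transformers release notes.
QWEN3_MIN_TRANSFORMERS = "4.51.0"

def _ver_tuple(v: str) -> tuple:
    """Parse the leading numeric components of a version string,
    padded to three parts so "4.51" compares equal to "4.51.0"."""
    parts = []
    for piece in v.split(".")[:3]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)

def qwen3_supported(installed: str, minimum: str = QWEN3_MIN_TRANSFORMERS) -> bool:
    """Rough check: is the installed Transformers new enough for `qwen3`?"""
    return _ver_tuple(installed) >= _ver_tuple(minimum)

try:
    v = pkg_version("transformers")
    status = "OK" if qwen3_supported(v) else "too old: pip install -U transformers"
    print(f"transformers {v}: {status}")
except PackageNotFoundError:
    print("transformers is not installed in this environment")
```

After upgrading Transformers, it is worth re-running the same XTuner config; if XTuner itself pins an older Transformers version, a newer XTuner release (or an install from source) may also be needed.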