Hi, does evaluating a 7B or 8B model with OpenCompass really require high-end GPUs? I can't get an evaluation to run even with 2x RTX 3090s: in debug mode it just hangs at "Starting inference process...". If I switch to a 1.8B model, inference starts almost immediately. If this really needs A100-class cards, how am I supposed to evaluate 70B+ models later on?
python run.py --models hf_deepseek_r1_distill_qwen_7b \
--custom-dataset-path /root/autodl-fs/models/FinCorpus/opencompass_eval.jsonl \
--custom-dataset-data-type mcq \
--custom-dataset-infer-method gen \
--max-out-len 16 \
--hf-num-gpus 2 \
--generation-kwargs do_sample=True temperature=0.6 \
--debug
07/18 17:39:09 - OpenCompass - INFO - Loading hf_deepseek_r1_distill_qwen_7b: /root/autodl-tmp/opencompass/opencompass/configs/./models/deepseek/hf_deepseek_r1_distill_qwen_7b.py
07/18 17:39:09 - OpenCompass - INFO - Loading example: /root/autodl-tmp/opencompass/opencompass/configs/./summarizers/example.py
07/18 17:39:09 - OpenCompass - INFO - Current exp folder: outputs/default/20250718_173909
07/18 17:39:09 - OpenCompass - WARNING - SlurmRunner is not used, so the partition argument is ignored.
07/18 17:39:09 - OpenCompass - INFO - Partitioned into 1 tasks.
07/18 17:39:11 - OpenCompass - WARNING - Only use 1 GPUs for total 2 available GPUs in debug mode.
07/18 17:39:11 - OpenCompass - INFO - Task [deepseek-r1-distill-qwen-7b-hf/opencompass_eval]
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Loading checkpoint shards: 100%|██████████| 2/2 [03:55<00:00, 117.90s/it]
07/18 17:43:09 - OpenCompass - INFO - using stop words: ['<|end▁of▁sentence|>']
Map: 100%|██████████| 1743/1743 [00:00<00:00, 14292.70 examples/s]
07/18 17:43:09 - OpenCompass - INFO - Start inferencing [deepseek-r1-distill-qwen-7b-hf/opencompass_eval]
[2025-07-18 17:43:09,617] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting build dataloader
[2025-07-18 17:43:09,618] [opencompass.openicl.icl_inferencer.icl_gen_inferencer] [INFO] Starting inference process...
0%| | 0/218 [00:00<?, ?it/s]07/18 17:43:09 - OpenCompass - INFO - Generation Args of Huggingface:
07/18 17:43:09 - OpenCompass - INFO - {'stopping_criteria': [<opencompass.models.huggingface_above_v4_33._get_stopping_criteria.<locals>.MultiTokenEOSCriteria object at 0x7fd50750b0d0>], 'max_new_tokens': 16384, 'pad_token_id': 151643}
/root/autodl-tmp/conda/envs/opencompass/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:631: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.6` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
warnings.warn(
/root/autodl-tmp/conda/envs/opencompass/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:636: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.95` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
warnings.warn(
0%|
At that point I killed it with Ctrl+C. I've tried two models, DeepSeek-R1-Distill-Qwen-7B and DeepSeek-R1-Distill-Llama-8B, and neither ever got past this point. One thing I noticed in the log: even though I passed --max-out-len 16 and --generation-kwargs do_sample=True, the printed generation args show max_new_tokens: 16384 and the warnings say do_sample is False, so my flags don't seem to have taken effect.
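For context on why I expected 2x 24 GB to be enough, here is my back-of-the-envelope estimate (my own rough numbers: fp16/bf16 weights only, ignoring KV cache and activations):

```python
def weights_gib(n_params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for model weights in GiB (fp16/bf16 = 2 bytes/param)."""
    return n_params_billion * 1e9 * bytes_per_param / 1024**3

# Rough weight footprint vs. the 2x24 GiB available on two 3090s:
for name, n in [("1.8B", 1.8), ("7B", 7), ("8B", 8), ("70B", 70)]:
    print(f"{name}: ~{weights_gib(n):.1f} GiB of weights")
```

By this estimate a 7B model's weights (~13 GiB) should fit on even a single 3090, so I assumed memory wasn't the bottleneck, but maybe the KV cache for 16384 new tokens changes that?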
If you need the dataset to reproduce this, you can grab opencompass_eval.jsonl from here: https://pan.baidu.com/s/1-rc8N-ZzkyjIzHPFqs6DkA?pwd=578v — the file has already been preprocessed to match the framework's required format.