
Qwen2-7B inference error! #6

@zzhdbw

Description


This project can successfully run the Qwen 1.5B model.
However, when running the Qwen2-7B model, inference intermittently fails with a softmax overflow error, which aborts the run.
The model was exported with this script:
python3 convert.py --model_id qwen/Qwen2-7B-Instruct --precision int4 --output {your_path}/Qwen2-7B-Instruct-ov --modelscope

Device: Intel Core Ultra 7 (GPU)
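For reference, a minimal sketch of the failing inference step. This assumes the exported model is loaded through optimum-intel's OVModelForCausalLM on the GPU device; the project's own inference script may differ.

from optimum.intel import OVModelForCausalLM
from transformers import AutoTokenizer

# Directory produced by the convert.py command above
model_dir = "{your_path}/Qwen2-7B-Instruct-ov"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
# Assumption: the int4 IR loads via optimum-intel and runs on the iGPU
model = OVModelForCausalLM.from_pretrained(model_dir, device="GPU")

inputs = tokenizer("Hello, who are you?", return_tensors="pt")
# The softmax overflow surfaces intermittently inside this generate() call
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))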
