This project successfully runs the Qwen 1.5B model.
However, when running the Qwen 7B model, inference sometimes fails at runtime with softmax overflow errors.
The model was converted with the following script:
python3 convert.py --model_id qwen/Qwen2-7B-Instruct --precision int4 --output {your_path}/Qwen2-7B-Instruct-ov --modelscope
Device: Intel Core Ultra 7 GPU
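For context on the error class: softmax overflows when a logit is large enough that exp() exceeds the representable range, which is especially easy to hit in reduced precision (e.g. fp16 on GPU). Whether that is the root cause here is an assumption; the sketch below only illustrates the arithmetic issue and the standard max-subtraction fix, not anything about the OpenVINO internals:

```python
import math

def stable_softmax(logits):
    # Naive softmax computes exp(x) directly, which overflows
    # for large x (math.exp(1000) raises OverflowError in
    # float64; fp16 overflows past ~88... even sooner, at ~11).
    # Subtracting the max keeps every exponent <= 0, so exp()
    # never overflows, and the result is mathematically identical.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# These logits overflow a naive exp() but are fine after the shift.
logits = [1000.0, 1001.0, 1002.0]
probs = stable_softmax(logits)
```

The function name `stable_softmax` is mine for illustration; inference runtimes implement the same shift internally, and a kernel that skips it (or accumulates in too narrow a dtype) shows exactly this failure mode.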