Open
Description
Hardware model
11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz 2.42 GHz, integrated Iris graphics
Symptom
GPU inference produces the following result:
CPU inference produces normal results.
Conversion command
optimum-cli export openvino --model './local_dir' --task text-generation-with-past --weight-format int4 --group-size 128 --ratio 0.8 Qwen2.5-7B-Instruct-int4-ov
Inference code
import openvino_genai as ov_genai

# Load the exported INT4 model on the GPU device
pipe = ov_genai.LLMPipeline("Qwen2.5-7B-Instruct-int4-ov", "GPU")

def streamer(subword):
    # Print each decoded piece as soon as it arrives
    print(subword, end='', flush=True)
    # Returning False tells the pipeline to keep generating
    return False

pipe.start_chat()
while True:
    try:
        prompt = input('question:\n')
    except EOFError:
        break
    pipe.generate(prompt, eos_token_id=151645, max_length=500, streamer=streamer)
    print('\n----------')
pipe.finish_chat()
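To make the CPU/GPU difference easier to reproduce, the same prompt can be run on both devices and the outputs compared side by side. The following is only a minimal sketch and not part of the original report: `compare_devices`, `outputs_match`, and the `make_pipeline` callback are hypothetical helper names, and the example prompt is arbitrary. It assumes that `LLMPipeline.generate` accepts `max_new_tokens` and `do_sample` as keyword arguments.

```python
from typing import Callable, Dict, Iterable


def compare_devices(
    make_pipeline: Callable[[str], object],
    prompt: str,
    devices: Iterable[str] = ("CPU", "GPU"),
    max_new_tokens: int = 64,
) -> Dict[str, str]:
    """Run the same prompt on each device and return {device: generated text}."""
    outputs = {}
    for device in devices:
        # make_pipeline is a hypothetical factory: device name -> pipeline object
        pipe = make_pipeline(device)
        # do_sample=False requests greedy decoding so the runs are comparable
        result = pipe.generate(prompt, max_new_tokens=max_new_tokens, do_sample=False)
        outputs[device] = str(result)
    return outputs


def outputs_match(outputs: Dict[str, str]) -> bool:
    """Return True when every device produced exactly the same text."""
    return len(set(outputs.values())) <= 1


# Usage with the real library (assumes openvino_genai is installed and the
# exported model directory from this report exists):
#
#   import openvino_genai as ov_genai
#   results = compare_devices(
#       lambda dev: ov_genai.LLMPipeline("Qwen2.5-7B-Instruct-int4-ov", dev),
#       "What is OpenVINO?",
#   )
#   for dev, text in results.items():
#       print(f"[{dev}] {text}")
#   print("identical:", outputs_match(results))
```

Small differences between devices can come from floating-point precision alone, so an exact match is not guaranteed even on a healthy setup. The comparison is mainly useful for showing that the GPU output is unusable while the CPU output is coherent.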