Description
My Device:
system: Ubuntu 24.03
ipex-llm: ipex-llm-ollama


Error behavior:
An error occurs when running ollama run qwen2.5:0.5b.
The error appears in the terminal where ollama serve is running. A more complete log follows:
[GIN] 2025/09/15 - 19:29:14 | 500 | 442.334217ms | 127.0.0.1 | POST "/api/generate"
^Croot@dcg:/opt/aog/engine/ollama/ollama# ./ollama serve
time=2025-09-15T19:29:45.094+08:00 level=INFO source=routes.go:1235 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:16677 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/var/lib/aog/engine/ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"
time=2025-09-15T19:29:45.094+08:00 level=INFO source=images.go:476 msg="total blobs: 7"
time=2025-09-15T19:29:45.094+08:00 level=INFO source=images.go:483 msg="total unused blobs removed: 0"
[GIN-debug] [WARNING] Creating an Engine instance with the Logger and Recovery middleware already attached.
[GIN-debug] [WARNING] Running in "debug" mode. Switch to "release" mode in production.
- using env: export GIN_MODE=release
- using code: gin.SetMode(gin.ReleaseMode)
[GIN-debug] HEAD / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func1 (5 handlers)
[GIN-debug] GET / --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func2 (5 handlers)
[GIN-debug] HEAD /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func3 (5 handlers)
[GIN-debug] GET /api/version --> github.com/ollama/ollama/server.(*Server).GenerateRoutes.func4 (5 handlers)
[GIN-debug] POST /api/pull --> github.com/ollama/ollama/server.(*Server).PullHandler-fm (5 handlers)
[GIN-debug] POST /api/push --> github.com/ollama/ollama/server.(*Server).PushHandler-fm (5 handlers)
[GIN-debug] HEAD /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] GET /api/tags --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (5 handlers)
[GIN-debug] POST /api/show --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (5 handlers)
[GIN-debug] DELETE /api/delete --> github.com/ollama/ollama/server.(*Server).DeleteHandler-fm (5 handlers)
[GIN-debug] POST /api/create --> github.com/ollama/ollama/server.(*Server).CreateHandler-fm (5 handlers)
[GIN-debug] POST /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).CreateBlobHandler-fm (5 handlers)
[GIN-debug] HEAD /api/blobs/:digest --> github.com/ollama/ollama/server.(*Server).HeadBlobHandler-fm (5 handlers)
[GIN-debug] POST /api/copy --> github.com/ollama/ollama/server.(*Server).CopyHandler-fm (5 handlers)
[GIN-debug] GET /api/ps --> github.com/ollama/ollama/server.(*Server).PsHandler-fm (5 handlers)
[GIN-debug] POST /api/generate --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (5 handlers)
[GIN-debug] POST /api/chat --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (5 handlers)
[GIN-debug] POST /api/embed --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (5 handlers)
[GIN-debug] POST /api/embeddings --> github.com/ollama/ollama/server.(*Server).EmbeddingsHandler-fm (5 handlers)
[GIN-debug] POST /v1/chat/completions --> github.com/ollama/ollama/server.(*Server).ChatHandler-fm (6 handlers)
[GIN-debug] POST /v1/completions --> github.com/ollama/ollama/server.(*Server).GenerateHandler-fm (6 handlers)
[GIN-debug] POST /v1/embeddings --> github.com/ollama/ollama/server.(*Server).EmbedHandler-fm (6 handlers)
[GIN-debug] GET /v1/models --> github.com/ollama/ollama/server.(*Server).ListHandler-fm (6 handlers)
[GIN-debug] GET /v1/models/:model --> github.com/ollama/ollama/server.(*Server).ShowHandler-fm (6 handlers)
time=2025-09-15T19:29:45.094+08:00 level=INFO source=routes.go:1288 msg="Listening on [::]:16677 (version 0.9.3)"
time=2025-09-15T19:29:45.094+08:00 level=INFO source=gpu.go:217 msg="looking for compatible GPUs"
time=2025-09-15T19:29:45.094+08:00 level=INFO source=gpu.go:218 msg="using Intel GPU"
time=2025-09-15T19:29:45.100+08:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="30.9 GiB" available="25.8 GiB"
[GIN] 2025/09/15 - 19:29:48 | 200 | 27.735µs | 127.0.0.1 | HEAD "/"
[GIN] 2025/09/15 - 19:29:48 | 200 | 28.881689ms | 127.0.0.1 | POST "/api/show"
time=2025-09-15T19:29:48.779+08:00 level=INFO source=server.go:135 msg="system memory" total="30.9 GiB" free="25.8 GiB" free_swap="0 B"
time=2025-09-15T19:29:48.780+08:00 level=INFO source=server.go:187 msg=offload library=cpu layers.requested=-1 layers.model=25 layers.offload=0 layers.split="" memory.available="[25.8 GiB]" memory.gpu_overhead="0 B" memory.required.full="732.6 MiB" memory.required.partial="0 B" memory.required.kv="48.0 MiB" memory.required.allocations="[732.6 MiB]" memory.weights.total="373.7 MiB" memory.weights.repeating="235.8 MiB" memory.weights.nonrepeating="137.9 MiB" memory.graph.full="298.5 MiB" memory.graph.partial="405.0 MiB"
llama_model_loader: loaded meta data with 34 key-value pairs and 290 tensors from /var/lib/aog/engine/ollama/models/blobs/sha256-c5396e06af294bd101b30dce59131a76d2b773e76950acc870eda801d3ab0515 (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen2
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen2.5 0.5B Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Qwen2.5
llama_model_loader: - kv 5: general.size_label str = 0.5B
llama_model_loader: - kv 6: general.license str = apache-2.0
llama_model_loader: - kv 7: general.license.link str = https://huggingface.co/Qwen/Qwen2.5-0...
llama_model_loader: - kv 8: general.base_model.count u32 = 1
llama_model_loader: - kv 9: general.base_model.0.name str = Qwen2.5 0.5B
llama_model_loader: - kv 10: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 11: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen2.5-0.5B
llama_model_loader: - kv 12: general.tags arr[str,2] = ["chat", "text-generation"]
llama_model_loader: - kv 13: general.languages arr[str,1] = ["en"]
llama_model_loader: - kv 14: qwen2.block_count u32 = 24
llama_model_loader: - kv 15: qwen2.context_length u32 = 32768
llama_model_loader: - kv 16: qwen2.embedding_length u32 = 896
llama_model_loader: - kv 17: qwen2.feed_forward_length u32 = 4864
llama_model_loader: - kv 18: qwen2.attention.head_count u32 = 14
llama_model_loader: - kv 19: qwen2.attention.head_count_kv u32 = 2
llama_model_loader: - kv 20: qwen2.rope.freq_base f32 = 1000000.000000
llama_model_loader: - kv 21: qwen2.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 22: general.file_type u32 = 15
llama_model_loader: - kv 23: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 24: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 25: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 26: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 27: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 28: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 29: tokenizer.ggml.padding_token_id u32 = 151643
llama_model_loader: - kv 30: tokenizer.ggml.bos_token_id u32 = 151643
llama_model_loader: - kv 31: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 32: tokenizer.chat_template str = {%- if tools %}\n {{- '<|im_start|>...
llama_model_loader: - kv 33: general.quantization_version u32 = 2
llama_model_loader: - type f32: 121 tensors
llama_model_loader: - type q5_0: 132 tensors
llama_model_loader: - type q8_0: 13 tensors
llama_model_loader: - type q4_K: 12 tensors
llama_model_loader: - type q6_K: 12 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 373.71 MiB (6.35 BPW)
load: special tokens cache size = 22
load: token to piece cache size = 0.9310 MB
print_info: arch = qwen2
print_info: vocab_only = 1
print_info: model type = ?B
print_info: model params = 494.03 M
print_info: general.name = Qwen2.5 0.5B Instruct
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
llama_model_load: vocab only - skipping tensors
time=2025-09-15T19:29:48.917+08:00 level=INFO source=server.go:458 msg="starting llama server" cmd="/opt/aog/engine/ollama/ollama/ollama-bin runner --model /var/lib/aog/engine/ollama/models/blobs/sha256-c5396e06af294bd101b30dce59131a76d2b773e76950acc870eda801d3ab0515 --ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 42509"
time=2025-09-15T19:29:48.917+08:00 level=INFO source=sched.go:483 msg="loaded runners" count=1
time=2025-09-15T19:29:48.917+08:00 level=INFO source=server.go:618 msg="waiting for llama runner to start responding"
time=2025-09-15T19:29:48.917+08:00 level=INFO source=server.go:652 msg="waiting for server to become available" status="llm server not responding"
using override patterns: []
time=2025-09-15T19:29:48.949+08:00 level=INFO source=runner.go:851 msg="starting go runner"
terminate called after throwing an instance of 'sycl::_V1::exception'
what(): No device of requested type available. Please check https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements.html
SIGABRT: abort
PC=0x7f96256a70fc m=0 sigcode=18446744073709551610
signal arrived during cgo execution
goroutine 1 gp=0xc000002380 m=0 mp=0x1ecf7c0 [syscall]:
runtime.cgocall(0x1157b90, 0xc000439538)
/usr/local/go/src/runtime/cgocall.go:167 +0x4b fp=0xc000439510 sp=0xc0004394d8 pc=0x48398b
github.com/ollama/ollama/ml/backend/ggml/ggml/src._Cfunc_ggml_backend_load_all_from_path(0x26b6a910)
_cgo_gotypes.go:195 +0x3a fp=0xc000439538 sp=0xc000439510 pc=0x830dfa
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1.1({0xc000042014, 0x1d})
/home/arda/ruonan/ollama-internal/ml/backend/ggml/ggml/src/ggml.go:97 +0xf5 fp=0xc0004395d0 sp=0xc000439538 pc=0x830895
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.func1()
/home/arda/ruonan/ollama-internal/ml/backend/ggml/ggml/src/ggml.go:98 +0x526 fp=0xc000439860 sp=0xc0004395d0 pc=0x8306e6
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func2()
/usr/local/go/src/sync/oncefunc.go:27 +0x62 fp=0xc0004398a8 sp=0xc000439860 pc=0x8300e2
sync.(*Once).doSlow(0x0?, 0x0?)
/usr/local/go/src/sync/once.go:78 +0xab fp=0xc000439900 sp=0xc0004398a8 pc=0x4991ab
sync.(*Once).Do(0x0?, 0x0?)
/usr/local/go/src/sync/once.go:69 +0x19 fp=0xc000439920 sp=0xc000439900 pc=0x4990d9
github.com/ollama/ollama/ml/backend/ggml/ggml/src.init.OnceFunc.func3()
/usr/local/go/src/sync/oncefunc.go:32 +0x2d fp=0xc000439950 sp=0xc000439920 pc=0x83004d
github.com/ollama/ollama/llama.BackendInit()
/home/arda/ruonan/ollama-internal/llama/llama.go:57 +0x16 fp=0xc000439960 sp=0xc000439950 pc=0x8349f6
github.com/ollama/ollama/runner/llamarunner.Execute({0xc0001aa020, 0xf, 0x10})
/home/arda/ruonan/ollama-internal/runner/llamarunner/runner.go:853 +0x7d4 fp=0xc000439d08 sp=0xc000439960 pc=0x8f3bf4
github.com/ollama/ollama/runner.Execute({0xc0001aa010?, 0x0?, 0x0?})
/home/arda/ruonan/ollama-internal/runner/runner.go:22 +0xd4 fp=0xc000439d30 sp=0xc000439d08 pc=0x979374
github.com/ollama/ollama/cmd.NewCLI.func2(0xc000271400?, {0x140da22?, 0x4?, 0x140da26?})
/home/arda/ruonan/ollama-internal/cmd/cmd.go:1529 +0x45 fp=0xc000439d58 sp=0xc000439d30 pc=0x10d5b45
github.com/spf13/cobra.(*Command).execute(0xc000114f08, {0xc0004b4ff0, 0xf, 0xf})
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:940 +0x894 fp=0xc000439e78 sp=0xc000439d58 pc=0x5ff694
github.com/spf13/cobra.(*Command).ExecuteC(0xc0000e6908)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:1068 +0x3a5 fp=0xc000439f30 sp=0xc000439e78 pc=0x5ffee5
github.com/spf13/cobra.(*Command).Execute(...)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:992
github.com/spf13/cobra.(*Command).ExecuteContext(...)
/home/arda/go/pkg/mod/github.com/spf13/[email protected]/command.go:985
main.main()
/home/arda/ruonan/ollama-internal/main.go:12 +0x4d fp=0xc000439f50 sp=0xc000439f30 pc=0x10d65cd
runtime.main()
/usr/local/go/src/runtime/proc.go:283 +0x28b fp=0xc000439fe0 sp=0xc000439f50 pc=0x45390b
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000439fe8 sp=0xc000439fe0 pc=0x48eca1
goroutine 2 gp=0xc000002e00 m=nil [force gc (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000072fa8 sp=0xc000072f88 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.forcegchelper()
/usr/local/go/src/runtime/proc.go:348 +0xb3 fp=0xc000072fe0 sp=0xc000072fa8 pc=0x453c53
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000072fe8 sp=0xc000072fe0 pc=0x48eca1
created by runtime.init.7 in goroutine 1
/usr/local/go/src/runtime/proc.go:336 +0x1a
goroutine 18 gp=0xc0000aa380 m=nil [GC sweep wait]:
runtime.gopark(0x1?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00006e780 sp=0xc00006e760 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.bgsweep(0xc0000b8000)
/usr/local/go/src/runtime/mgcsweep.go:316 +0xdf fp=0xc00006e7c8 sp=0xc00006e780 pc=0x43e45f
runtime.gcenable.gowrap1()
/usr/local/go/src/runtime/mgc.go:204 +0x25 fp=0xc00006e7e0 sp=0xc00006e7c8 pc=0x4328c5
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00006e7e8 sp=0xc00006e7e0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:204 +0x66
goroutine 19 gp=0xc0000aa540 m=nil [GC scavenge wait]:
runtime.gopark(0x10000?, 0x15d2cc8?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00006ef78 sp=0xc00006ef58 pc=0x486e0e
runtime.goparkunlock(...)
/usr/local/go/src/runtime/proc.go:441
runtime.(*scavengerState).park(0x1ecc9a0)
/usr/local/go/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc00006efa8 sp=0xc00006ef78 pc=0x43bea9
runtime.bgscavenge(0xc0000b8000)
/usr/local/go/src/runtime/mgcscavenge.go:658 +0x59 fp=0xc00006efc8 sp=0xc00006efa8 pc=0x43c439
runtime.gcenable.gowrap2()
/usr/local/go/src/runtime/mgc.go:205 +0x25 fp=0xc00006efe0 sp=0xc00006efc8 pc=0x432865
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00006efe8 sp=0xc00006efe0 pc=0x48eca1
created by runtime.gcenable in goroutine 1
/usr/local/go/src/runtime/mgc.go:205 +0xa5
goroutine 34 gp=0xc000184380 m=nil [finalizer wait]:
runtime.gopark(0x1b8?, 0xc000002380?, 0x1?, 0x23?, 0xc000072688?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000072630 sp=0xc000072610 pc=0x486e0e
runtime.runfinq()
/usr/local/go/src/runtime/mfinal.go:196 +0x107 fp=0xc0000727e0 sp=0xc000072630 pc=0x431887
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0000727e8 sp=0xc0000727e0 pc=0x48eca1
created by runtime.createfing in goroutine 1
/usr/local/go/src/runtime/mfinal.go:166 +0x3d
goroutine 35 gp=0xc000184e00 m=nil [chan receive]:
runtime.gopark(0xc0001f5a40?, 0xc000116018?, 0x60?, 0x47?, 0x56cac8?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000304718 sp=0xc0003046f8 pc=0x486e0e
runtime.chanrecv(0xc000180310, 0x0, 0x1)
/usr/local/go/src/runtime/chan.go:664 +0x445 fp=0xc000304790 sp=0xc000304718 pc=0x4232a5
runtime.chanrecv1(0x0?, 0x0?)
/usr/local/go/src/runtime/chan.go:506 +0x12 fp=0xc0003047b8 sp=0xc000304790 pc=0x422e32
runtime.unique_runtime_registerUniqueMapCleanup.func2(...)
/usr/local/go/src/runtime/mgc.go:1796
runtime.unique_runtime_registerUniqueMapCleanup.gowrap1()
/usr/local/go/src/runtime/mgc.go:1799 +0x2f fp=0xc0003047e0 sp=0xc0003047b8 pc=0x435a0f
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003047e8 sp=0xc0003047e0 pc=0x48eca1
created by unique.runtime_registerUniqueMapCleanup in goroutine 1
/usr/local/go/src/runtime/mgc.go:1794 +0x79
goroutine 36 gp=0xc000185180 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000304f38 sp=0xc000304f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000304fc8 sp=0xc000304f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000304fe0 sp=0xc000304fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000304fe8 sp=0xc000304fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 20 gp=0xc0000aa700 m=nil [GC worker (idle)]:
runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc00006f738 sp=0xc00006f718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc00006f7c8 sp=0xc00006f738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc00006f7e0 sp=0xc00006f7c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc00006f7e8 sp=0xc00006f7e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 37 gp=0xc000185340 m=nil [GC worker (idle)]:
runtime.gopark(0x17e6a908c21f?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000305738 sp=0xc000305718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0003057c8 sp=0xc000305738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0003057e0 sp=0xc0003057c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003057e8 sp=0xc0003057e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 50 gp=0xc000102380 m=nil [GC worker (idle)]:
runtime.gopark(0x17e6a9090238?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000300738 sp=0xc000300718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0003007c8 sp=0xc000300738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0003007e0 sp=0xc0003007c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003007e8 sp=0xc0003007e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 51 gp=0xc000102540 m=nil [GC worker (idle)]:
runtime.gopark(0x17e6a9066e73?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000300f38 sp=0xc000300f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000300fc8 sp=0xc000300f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000300fe0 sp=0xc000300fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000300fe8 sp=0xc000300fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 52 gp=0xc000102700 m=nil [GC worker (idle)]:
runtime.gopark(0x17e6a9088da0?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000301738 sp=0xc000301718 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc0003017c8 sp=0xc000301738 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc0003017e0 sp=0xc0003017c8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc0003017e8 sp=0xc0003017e0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 53 gp=0xc0001028c0 m=nil [GC worker (idle)]:
runtime.gopark(0x17e6a9087c16?, 0x0?, 0x0?, 0x0?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000301f38 sp=0xc000301f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000301fc8 sp=0xc000301f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000301fe0 sp=0xc000301fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000301fe8 sp=0xc000301fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
goroutine 38 gp=0xc000185500 m=nil [GC worker (idle)]:
runtime.gopark(0x1f7a980?, 0x1?, 0x8c?, 0xb6?, 0x0?)
/usr/local/go/src/runtime/proc.go:435 +0xce fp=0xc000305f38 sp=0xc000305f18 pc=0x486e0e
runtime.gcBgMarkWorker(0xc000181730)
/usr/local/go/src/runtime/mgc.go:1423 +0xe9 fp=0xc000305fc8 sp=0xc000305f38 pc=0x434d29
runtime.gcBgMarkStartWorkers.gowrap1()
/usr/local/go/src/runtime/mgc.go:1339 +0x25 fp=0xc000305fe0 sp=0xc000305fc8 pc=0x434c05
runtime.goexit({})
/usr/local/go/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000305fe8 sp=0xc000305fe0 pc=0x48eca1
created by runtime.gcBgMarkStartWorkers in goroutine 1
/usr/local/go/src/runtime/mgc.go:1339 +0x105
rax 0x0
rbx 0xa4c6
rcx 0x7f96256a70fc
rdx 0x6
rdi 0xa4c6
rsi 0xa4c6
rbp 0x7f9626d33480
rsp 0x7ffd1e2d4bf0
r8 0x0
r9 0x7ffd1e2d4780
r10 0x8
r11 0x246
r12 0x26b735b0
r13 0x6
r14 0x0
r15 0x7f9612f38790
rip 0x7f96256a70fc
rflags 0x246
cs 0x33
fs 0x0
gs 0x0
time=2025-09-15T19:29:49.167+08:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
[GIN] 2025/09/15 - 19:29:49 | 500 | 432.59435ms | 127.0.0.1 | POST "/api/generate"
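
For triage, the handful of lines that matter in a log like the one above can be pulled out with a short script. This is only a minimal sketch and not part of ollama or ipex-llm: the function name summarize_log and the pattern labels are hypothetical, the regular expressions assume the log format shown in this report, and SAMPLE holds abridged excerpts of the lines above.

```python
import re

# Hypothetical labels mapped to patterns; each is matched against single log lines.
# The patterns assume the log format shown in this report.
PATTERNS = {
    "intel_gpu_env": re.compile(r"OLLAMA_INTEL_GPU:(\w+)"),
    "compute_library": re.compile(r'msg="inference compute".*?\blibrary=(\S+)'),
    "gpu_layers": re.compile(r"--n-gpu-layers (\d+)"),
    "sycl_error": re.compile(r"what\(\):\s*(.+)"),
    "runner_exit": re.compile(r"llama runner process has terminated: (exit status \d+)"),
}


def summarize_log(text):
    """Return the first captured value for each pattern found in the log text."""
    summary = {}
    for line in text.splitlines():
        for label, pattern in PATTERNS.items():
            if label in summary:
                continue  # keep only the first occurrence of each item
            match = pattern.search(line)
            if match:
                summary[label] = match.group(1).strip()
    return summary


# Abridged excerpts of the log in this report (long lines are cut down).
SAMPLE = '''
OLLAMA_HOST:http://0.0.0.0:16677 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s
time=2025-09-15T19:29:45.100+08:00 level=INFO source=types.go:130 msg="inference compute" id=0 library=cpu variant="" compute="" driver=0.0 name="" total="30.9 GiB" available="25.8 GiB"
--ctx-size 4096 --batch-size 512 --n-gpu-layers 999 --threads 8 --no-mmap --parallel 2 --port 42509
what(): No device of requested type available. Please check https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-dpcpp-system-requirements.html
time=2025-09-15T19:29:49.167+08:00 level=ERROR source=sched.go:489 msg="error loading llama server" error="llama runner process has terminated: exit status 2"
'''

if __name__ == "__main__":
    for label, value in summarize_log(SAMPLE).items():
        print(f"{label}: {value}")
```

Run against the log above, this surfaces what the report already shows: the server detects only a cpu compute library and has OLLAMA_INTEL_GPU set to false, the runner is still started with --n-gpu-layers 999, and it then aborts with the SYCL "No device of requested type available" exception. That is an observation from the log, not a confirmed diagnosis.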