🙏 Please provide instructions for building llama.cpp and ollama #13309

@savvadesogle

Dear ipex-llm development team, please provide instructions for compiling llama.cpp and ollama from source.

We really need your help with this.

Please share the build flags or documentation needed to reach the same performance as the ipex-llm portable build.

Xeon 2699 v3
A770 (16 GB)
Windows 11 (Version 10.0.22631.5189)
driver 7029

default 190W

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 0 | pp512 | 903.49 ± 3.16 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 0 | tg128 | 50.90 ± 0.06 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 1 | pp512 | 363.82 ± 0.95 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 1 | tg128 | 48.30 ± 0.09 |

updated to 228W

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 0 | pp512 | 986.03 ± 6.03 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 0 | tg128 | 52.04 ± 0.20 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 1 | pp512 | 372.22 ± 0.79 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | RPC,Vulkan | 100 | 1 | tg128 | 49.66 ± 0.17 |

build: 5bb4a3ed (6528)
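
To put the two Vulkan runs side by side, here is a minimal sketch that computes the percentage gain per test. The throughput numbers are copied from the two tables above; the dictionary and function names are only illustrative, and it assumes the 190W and 228W labels refer to the GPU power limit.

```python
# Vulkan throughput in tokens/s, copied from the two tables above,
# keyed by (fa, test). Names here are illustrative, not from llama-bench.
vulkan_190w = {
    (0, "pp512"): 903.49,
    (0, "tg128"): 50.90,
    (1, "pp512"): 363.82,
    (1, "tg128"): 48.30,
}
vulkan_228w = {
    (0, "pp512"): 986.03,
    (0, "tg128"): 52.04,
    (1, "pp512"): 372.22,
    (1, "tg128"): 49.66,
}


def percent_gain(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Gain per (fa, test) pair when moving from the 190W run to the 228W run.
gains = {key: percent_gain(vulkan_190w[key], vulkan_228w[key]) for key in vulkan_190w}

# Assumption: the labels are power limits, so this is the relative power increase.
power_increase = percent_gain(190.0, 228.0)

for (fa, test), gain in gains.items():
    print(f"fa={fa} {test}: {gain:+.1f}%")
print(f"power limit: {power_increase:+.1f}%")
```

By this measure the larger setting helps pp512 without flash attention by roughly 9%, and the other three tests by about 2 to 3%.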

SYCL (llama.cpp)

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 0 | pp512 | 1866.14 ± 2.85 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 0 | tg128 | 52.62 ± 0.30 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 1 | pp512 | 480.14 ± 4.71 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 1 | tg128 | 13.23 ± 0.02 |

build: 1eeb523c (6529)

SYCL (ipex-llm)

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 0 | pp512 | 2564.25 ± 3.71 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 0 | tg128 | 72.54 ± 0.12 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 1 | pp512 | 2545.16 ± 40.50 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | SYCL | 100 | 1 | tg128 | 71.32 ± 0.16 |

build: d2c8ed1 (1)
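
To make the gap between the two SYCL builds concrete, the sketch below divides the ipex-llm throughput by the upstream llama.cpp SYCL throughput for each test. The numbers are copied from the SYCL tables above; the variable names are only illustrative.

```python
# SYCL throughput in tokens/s, copied from the tables above,
# keyed by (fa, test). Names here are illustrative only.
upstream_sycl = {
    (0, "pp512"): 1866.14,
    (0, "tg128"): 52.62,
    (1, "pp512"): 480.14,
    (1, "tg128"): 13.23,
}
ipex_llm_sycl = {
    (0, "pp512"): 2564.25,
    (0, "tg128"): 72.54,
    (1, "pp512"): 2545.16,
    (1, "tg128"): 71.32,
}

# Speedup of the ipex-llm build over the upstream SYCL build per test.
speedup = {key: ipex_llm_sycl[key] / upstream_sycl[key] for key in upstream_sycl}

for (fa, test), ratio in speedup.items():
    print(f"fa={fa} {test}: {ratio:.2f}x")
```

With fa=0 the ipex-llm build is roughly 1.37x to 1.38x faster on both tests. With fa=1 the ratio grows to about 5.3x to 5.4x, because the upstream SYCL build slows down sharply with flash attention enabled while the ipex-llm build stays almost unchanged.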

@cyita @rnwang04 @liu-shaojun
