| 1 | +> [!IMPORTANT] |
| 2 | +> This build documentation applies only to RISC-V SpacemiT SoCs. |
| 3 | + |
| 4 | +## Build llama.cpp locally (for riscv64) |
| 5 | + |
| 6 | +1. Prepare Toolchain For RISCV |
| 7 | +~~~ |
| 8 | +wget https://archive.spacemit.com/toolchain/spacemit-toolchain-linux-glibc-x86_64-v1.1.2.tar.xz |
| 9 | +~~~ |
| 10 | + |
| 11 | +2. Build |
| 12 | +The build script below uses RISC-V vector instructions for acceleration, so the `GGML_CPU_RISCV64_SPACEMIT` compilation option must be enabled. The only optimization version currently supported is `RISCV64_SPACEMIT_IME1`, which is selected through the `RISCV64_SPACEMIT_IME_SPEC` compilation option. Compiler settings are defined in the `riscv64-spacemit-linux-gnu-gcc.cmake` file. Before building, install the RISC-V compiler and set the environment variable via `export RISCV_ROOT_PATH={your_compiler_path}`. |
| 13 | +```bash |
| 14 | + |
| 15 | +cmake -B build \ |
| 16 | + -DCMAKE_BUILD_TYPE=Release \ |
| 17 | + -DGGML_CPU_RISCV64_SPACEMIT=ON \ |
| 18 | + -DLLAMA_CURL=OFF \ |
| 19 | + -DGGML_RVV=ON \ |
| 20 | + -DGGML_RV_ZFH=ON \ |
| 21 | + -DGGML_RV_ZICBOP=ON \ |
| 22 | + -DRISCV64_SPACEMIT_IME_SPEC=RISCV64_SPACEMIT_IME1 \ |
| 23 | + -DCMAKE_TOOLCHAIN_FILE=${PWD}/cmake/riscv64-spacemit-linux-gnu-gcc.cmake \ |
| 24 | + -DCMAKE_INSTALL_PREFIX=build/installed |
| 25 | + |
| 26 | +cmake --build build --parallel $(nproc) --config Release |
| 27 | + |
| 28 | +pushd build |
| 29 | +make install |
| 30 | +popd |
| 31 | +``` |
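The build depends on `RISCV_ROOT_PATH` being exported before `cmake` runs, so it can help to fail early when the variable is missing. The following is a minimal sketch, not part of llama.cpp: `check_riscv_env` is a hypothetical helper, and the check for a `sysroot/` directory is an assumption based on the `-L .../sysroot` argument used in the QEMU command in this document.

```bash
# Preflight sketch: verify the toolchain environment before configuring.
# check_riscv_env is a hypothetical helper, not something llama.cpp provides.
check_riscv_env() {
    if [ -z "${RISCV_ROOT_PATH:-}" ]; then
        echo "RISCV_ROOT_PATH is not set" >&2
        return 1
    fi
    if [ ! -d "${RISCV_ROOT_PATH}" ]; then
        echo "RISCV_ROOT_PATH is not a directory: ${RISCV_ROOT_PATH}" >&2
        return 1
    fi
    # Assumption: the toolchain ships a sysroot/ directory at its root.
    if [ ! -d "${RISCV_ROOT_PATH}/sysroot" ]; then
        echo "no sysroot/ found under ${RISCV_ROOT_PATH}" >&2
        return 1
    fi
    echo "toolchain root looks usable: ${RISCV_ROOT_PATH}"
}
```

A typical use would be `check_riscv_env && cmake -B build ...`, so that the configure step is skipped when the environment is incomplete.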
| 32 | + |
| 33 | +## Simulation |
| 34 | +You can use QEMU to perform emulation on non-RISC-V architectures. |
| 35 | + |
| 36 | +1. Download QEMU |
| 37 | +~~~ |
| 38 | +wget https://archive.spacemit.com/spacemit-ai/qemu/jdsk-qemu-v0.0.14.tar.gz |
| 39 | +~~~ |
| 40 | + |
| 41 | +2. Run Simulation |
| 42 | +After building llama.cpp, you can run the resulting executable under QEMU, for example: |
| 43 | +~~~ |
| 44 | +export QEMU_ROOT_PATH={your QEMU file path} |
| 45 | +export RISCV_ROOT_PATH_IME1={your RISC-V compiler path} |
| 46 | + |
| 47 | +${QEMU_ROOT_PATH}/bin/qemu-riscv64 -L ${RISCV_ROOT_PATH_IME1}/sysroot -cpu max,vlen=256,elen=64,vext_spec=v1.0 ${PWD}/build/bin/llama-cli -m ${PWD}/models/Qwen2.5-0.5B-Instruct-Q4_0.gguf -t 1 |
| 48 | +~~~ |
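Because the QEMU invocation above is long, it may be convenient to wrap it in a shell function when running several binaries. This is only a sketch: `run_under_qemu` is a hypothetical name, and the CPU flags are copied unchanged from the command above.

```bash
# Sketch of a wrapper around the QEMU command shown above.
# run_under_qemu is a hypothetical helper; the -cpu flags are taken verbatim
# from the example command in this document.
run_under_qemu() {
    if [ -z "${QEMU_ROOT_PATH:-}" ] || [ -z "${RISCV_ROOT_PATH_IME1:-}" ]; then
        echo "set QEMU_ROOT_PATH and RISCV_ROOT_PATH_IME1 first" >&2
        return 1
    fi
    if [ "$#" -lt 1 ]; then
        echo "usage: run_under_qemu <riscv64-binary> [args...]" >&2
        return 1
    fi
    # Forward the binary and all of its arguments to qemu-riscv64.
    "${QEMU_ROOT_PATH}/bin/qemu-riscv64" \
        -L "${RISCV_ROOT_PATH_IME1}/sysroot" \
        -cpu max,vlen=256,elen=64,vext_spec=v1.0 \
        "$@"
}
```

With both variables exported, the example above becomes `run_under_qemu "${PWD}/build/bin/llama-cli" -m "${PWD}/models/Qwen2.5-0.5B-Instruct-Q4_0.gguf" -t 1`.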
| 49 | +## Performance |
| 50 | +#### Quantization Support For Matrix |
| 51 | +~~~ |
| 52 | +model name : Spacemit(R) X60 |
| 53 | +isa : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintpause_zihpm_zfh_zfhmin_zca_zcd_zba_zbb_zbc_zbs_zkt_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_zvkt_sscofpmf_sstc_svinval_svnapot_svpbmt |
| 54 | +mmu : sv39 |
| 55 | +uarch : spacemit,x60 |
| 56 | +mvendorid : 0x710 |
| 57 | +marchid : 0x8000000058000001 |
| 58 | +~~~ |
| 59 | + |
| 60 | +Q4_0 |
| 61 | +| Model | Size | Params | backend | threads | test | t/s | |
| 62 | +| -----------| -------- | ------ | ------- | ------- | ---- |------| |
| 63 | +| Qwen2.5 0.5B | 403.20 MiB | 630.17 M | cpu | 4 | pp512 | 64.12 ± 0.26 | |
| 64 | +| Qwen2.5 0.5B | 403.20 MiB | 630.17 M | cpu | 4 | tg128 | 10.03 ± 0.01 | |
| 65 | +| Qwen2.5 1.5B | 1011.16 MiB | 1.78 B | cpu | 4 | pp512 | 24.16 ± 0.02 | |
| 66 | +| Qwen2.5 1.5B | 1011.16 MiB | 1.78 B | cpu | 4 | tg128 | 3.83 ± 0.06 | |
| 67 | +| Qwen2.5 3B | 1.86 GiB | 3.40 B | cpu | 4 | pp512 | 12.08 ± 0.02 | |
| 68 | +| Qwen2.5 3B | 1.86 GiB | 3.40 B | cpu | 4 | tg128 | 2.23 ± 0.02 | |
| 69 | + |
| 70 | +Q4_1 |
| 71 | +| Model | Size | Params | backend | threads | test | t/s | |
| 72 | +| -----------| -------- | ------ | ------- | ------- | ---- |------| |
| 73 | +| Qwen2.5 0.5B | 351.50 MiB | 494.03 M | cpu | 4 | pp512 | 62.07 ± 0.12 | |
| 74 | +| Qwen2.5 0.5B | 351.50 MiB | 494.03 M | cpu | 4 | tg128 | 9.91 ± 0.01 | |
| 75 | +| Qwen2.5 1.5B | 964.06 MiB | 1.54 B | cpu | 4 | pp512 | 22.95 ± 0.25 | |
| 76 | +| Qwen2.5 1.5B | 964.06 MiB | 1.54 B | cpu | 4 | tg128 | 4.01 ± 0.15 | |
| 77 | +| Qwen2.5 3B | 1.85 GiB | 3.09 B | cpu | 4 | pp512 | 11.55 ± 0.16 | |
| 78 | +| Qwen2.5 3B | 1.85 GiB | 3.09 B | cpu | 4 | tg128 | 2.25 ± 0.04 | |
| 79 | + |
| 80 | + |
| 81 | +Q4_K |
| 82 | +| Model | Size | Params | backend | threads | test | t/s | |
| 83 | +| -----------| -------- | ------ | ------- | ------- | ---- |------| |
| 84 | +| Qwen2.5 0.5B | 462.96 MiB | 630.17 M | cpu | 4 | pp512 | 9.29 ± 0.05 | |
| 85 | +| Qwen2.5 0.5B | 462.96 MiB | 630.17 M | cpu | 4 | tg128 | 5.67 ± 0.04 | |
| 86 | +| Qwen2.5 1.5B | 1.04 GiB | 1.78 B | cpu | 4 | pp512 | 10.38 ± 0.10 | |
| 87 | +| Qwen2.5 1.5B | 1.04 GiB | 1.78 B | cpu | 4 | tg128 | 3.17 ± 0.08 | |
| 88 | +| Qwen2.5 3B | 1.95 GiB | 3.40 B | cpu | 4 | pp512 | 4.23 ± 0.04 | |
| 89 | +| Qwen2.5 3B | 1.95 GiB | 3.40 B | cpu | 4 | tg128 | 1.73 ± 0.00 | |