alex-spacemit
Contributor

This pull request adds a SpacemiT backend with optimizations specific to the SpacemiT X60 CPU. The SpacemiT IME extended instructions are used to accelerate matrix calculations for Q4_0/Q4_1/Q4_K, alongside general RVV optimizations.
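
As a rough illustration of the matrix arithmetic the description refers to, here is a scalar reference sketch of a Q4_0 by Q8_0 block dot product. It is not the IME or RVV code from this PR. The struct and function names are hypothetical, and the scales are stored as `float` rather than fp16 to keep the example self-contained. It assumes the usual Q4_0 layout: 32 values per block, 4 bits each, stored with an offset of 8, with low nibbles holding the first 16 elements and high nibbles the last 16.

```cpp
#include <cassert>
#include <cstdint>

constexpr int QK = 32; // values per quantization block

// Simplified Q4_0 block (hypothetical name; float scale is an assumption of this sketch).
struct BlockQ4_0 {
    float   d;          // per-block scale
    uint8_t qs[QK / 2]; // 32 quants of 4 bits each, two per byte
};

// Simplified Q8_0 block used for the activation side.
struct BlockQ8_0 {
    float  d;      // per-block scale
    int8_t qs[QK]; // 32 signed 8-bit quants
};

// Scalar reference: dot product of n quantized values (n must be a multiple of QK).
float dot_q4_0_q8_0_ref(int n, const BlockQ4_0 * x, const BlockQ8_0 * y) {
    assert(n % QK == 0);
    float sum = 0.0f;
    for (int b = 0; b < n / QK; ++b) {
        int32_t acc = 0; // integer accumulation within one block
        for (int j = 0; j < QK / 2; ++j) {
            const int lo = (x[b].qs[j] & 0x0F) - 8; // element j
            const int hi = (x[b].qs[j] >> 4)   - 8; // element j + 16
            acc += lo * y[b].qs[j] + hi * y[b].qs[j + QK / 2];
        }
        // Apply both block scales once per block.
        sum += static_cast<float>(acc) * x[b].d * y[b].d;
    }
    return sum;
}
```

The inner integer multiply-accumulate is the part that a matrix or vector extension would typically take over. With one block where every packed byte is `0x9A`, every Q8_0 quant is 3, and the scales are 0.5 and 0.25, the function returns 18.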

model name      : Spacemit(R) X60
isa             : rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintpause_zihpm_zfh_zfhmin_zca_zcd_zba_zbb_zbc_zbs_zkt_zve32f_zve32x_zve64d_zve64f_zve64x_zvfh_zvfhmin_zvkt_sscofpmf_sstc_svinval_svnapot_svpbmt
mmu             : sv39
uarch           : spacemit,x60
mvendorid       : 0x710
marchid         : 0x8000000058000001
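
The `isa` line above can be checked programmatically. The following is a minimal sketch, with hypothetical helper names, that splits such a string on `_` and tests for a single-letter extension in the base part (for example `v`) or a named extension (for example `zvfh`). It assumes the string starts with a four-character `rv32`/`rv64` prefix. The string quoted above does not appear to contain a separate IME entry, so this only covers the extensions that are listed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Split an ISA string such as "rv64imafdcv_zicbom_zvfh" on '_' (hypothetical helper).
std::vector<std::string> split_isa(std::string_view isa) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start <= isa.size()) {
        std::size_t end = isa.find('_', start);
        if (end == std::string_view::npos) end = isa.size();
        if (end > start) parts.emplace_back(isa.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

// True if the base part (after the "rv64"/"rv32" prefix) contains the single-letter extension.
bool has_base_ext(std::string_view isa, char ext) {
    const auto parts = split_isa(isa);
    if (parts.empty() || parts[0].size() < 4) return false;
    return parts[0].find(ext, 4) != std::string::npos;
}

// True if a multi-letter extension such as "zvfh" appears after the base part.
bool has_named_ext(std::string_view isa, std::string_view name) {
    assert(!name.empty());
    const auto parts = split_isa(isa);
    for (std::size_t i = 1; i < parts.size(); ++i) {
        if (parts[i] == name) return true;
    }
    return false;
}
```

For the X60 string shown above, `has_base_ext(isa, 'v')` and `has_named_ext(isa, "zvfh")` are both true.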

Q4_0

| Model | Size | Params | backend | threads | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| Qwen2.5 0.5B | 403.20 MiB | 630.17 M | cpu | 4 | pp512 | 64.12 ± 0.26 |
| Qwen2.5 0.5B | 403.20 MiB | 630.17 M | cpu | 4 | tg128 | 10.03 ± 0.01 |
| Qwen2.5 1.5B | 1011.16 MiB | 1.78 B | cpu | 4 | pp512 | 24.16 ± 0.02 |
| Qwen2.5 1.5B | 1011.16 MiB | 1.78 B | cpu | 4 | tg128 | 3.83 ± 0.06 |
| Qwen2.5 3B | 1.86 GiB | 3.40 B | cpu | 4 | pp512 | 12.08 ± 0.02 |
| Qwen2.5 3B | 1.86 GiB | 3.40 B | cpu | 4 | tg128 | 2.23 ± 0.02 |

more information at build-riscv64-spacemit.md

Change-Id: I249bdc043485d815a9c351867137bc1e27cc2e23
@github-actions github-actions bot added documentation Improvements or additions to documentation build Compilation issues ggml changes relating to the ggml tensor library for machine learning labels Aug 13, 2025
Change-Id: I889ed1c85fb45e62350ecde0c06f70450cadfbe2
Change-Id: I321eb200f859751727afe5cae13074dfce2bb0ce
@alex-spacemit
Contributor Author

The file format and the limitations of the riscv zba extension have been resolved. @qnixsynapse

…o add-spacemit-backend

Change-Id: I170a4a9b1b9854e164119ee0844c309b29e9632b
@alex-spacemit
Contributor Author

The PR has now been updated to the latest master. Could you please review it? Will it be merged soon? @ggerganov

Change-Id: Ia20b6ec24a36638e62e0fe07cf100916a7cce3ce
…emit-backend

Change-Id: I9b3c1d31cd495371c56a9e2104f9bf66139469a4
@alex-spacemit
Contributor Author

All the comments have been implemented. @ggerganov

@ggerganov
Member

ggerganov commented Sep 22, 2025

@alex-spacemit Thanks for the implementation and the interest.

The backend can be accepted, but it would need a few changes to get into a mergeable state:

  • Adopt the coding and naming style of the project, at the very least in the parts of the code that interface directly with ggml (currently, ime.h and ime.cpp)
  • Provide CI workflows and self-hosted runner(s) to exercise the backend on each commit to master
  • Add codeowner(s) who would be responsible for the backend maintenance in the future

For more information, please check the CONTRIBUTING.md and ci/README.md documents. These were recently updated (#16113, #16116) with additional information about the points above. Let me know if you have any questions.

Change-Id: I5dc33a0412da3d3f2d77075d8939185d3009eca2
Change-Id: I039fb02bb95270e641bc4442204e658735859d43
Change-Id: I711c1033061df1a289ea77891b2997599dfe8279
@github-actions github-actions bot added the devops improvements to build systems and github actions label Sep 23, 2025
…emit-backend

Change-Id: I4c52314a0836a59c85fb5c15afd58110f6dfe2d9
@alex-spacemit
Contributor Author

The revision comments are mostly implemented. Here are some sample images. @ggerganov

llama-cli -m Qwen3-0.6B-Q4_K_M.gguf -t 4

[Screenshots of the llama-cli session with Qwen3-0.6B-Q4_K_M]

@CISC
Collaborator

CISC commented Sep 23, 2025

developed by Google 🙄

Change-Id: Ifb2b891e2fca57b5da604fce2ac255f27731179a
Change-Id: If0dc3ca30a958631ccca0a28b62e0b825f9fb0c3
Change-Id: Ibf2fa74c1064408974cb5b45f044d40987e5fb45
Change-Id: I80d74909941d41cb9cd09e51d8baf01c985cbfc6
@alex-spacemit
Contributor Author

To fix the errors in CI, I added the GCC IME instruction extensions for both native compile and cross compile. @ggerganov

@CISC
Collaborator

CISC commented Sep 26, 2025

I don't think we need cross compile in addition to native?

@alex-spacemit
Contributor Author

Since cross compile passed, I will remove native compile later.
[image]
Cloud-V's RISC-V machine does not seem to use the Bianbu repo for apt. @CISC @ggerganov

Change-Id: I01920afafdc73fa7424014fd648d243f8ec9e25e
@CISC
Collaborator

CISC commented Sep 26, 2025

> Cloud-V's RISC-V machine does not seem to use the Bianbu repo for apt. @CISC @ggerganov

What do you need? Perhaps @alitariq4589 can resolve it?

@alex-spacemit
Contributor Author

> > Cloud-V's RISC-V machine does not seem to use the Bianbu repo for apt. @CISC @ggerganov
>
> What do you need? Perhaps @alitariq4589 can resolve it?

Cloud-V's RISC-V machine:

[image]

Bianbu repo:

[image]

Although Cloud-V's RISC-V machine also runs Bianbu OS and is based on the SpacemiT K1, I think it would be better to use the public repository and public gcc-14. I don't want to make any changes to this setup. @CISC

Another point: the K3 is coming soon, and the native build is not compatible with K1, so cross compile would be better.

@alitariq4589
Contributor

> Since cross compile passed, I will remove native compile later.

Are you referring to removing the native compile CI for RISC-V? Native compile was added so that the CI can run on RVV 1.0 hardware (currently a Banana Pi F3 with 16 GB of memory) for performance evaluation with the RISC-V vector extension. Emulating RVV 1.0 in QEMU and running the CI there is not a suitable substitute, as it gives no insight into execution on real RVV 1.0 hardware.

> Cloud-V's RISC-V machine does not seem to use the Bianbu repo for apt.

Is there a need to fetch apt packages from the Bianbu repositories? Currently, the builds run in a podman container using Debian Trixie, for reusability and for the latest package versions released for RISC-V. Bianbu images follow the Ubuntu LTS releases, so package versions lag a bit (older versions of GCC, for example). I can set up the environment so that builds run on Bianbu, but this will require us to create a podman/docker container for Bianbu and keep it up to date every time upstream is synced with a new version of Bianbu (which adds maintenance overhead 🥲).

> Although Cloud-V's RISC-V machine also runs Bianbu OS and is based on the SpacemiT K1, I think it would be better to use the public repository and public gcc-14.

We are using the public gcc-14, which is upstream. Are you referring to a specific port of GCC?

$ gcc --version
gcc (Debian 14.2.0-19) 14.2.0
Copyright (C) 2024 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I don't understand how using upstream GCC would cause a problem in native hardware CI builds yet work fine under emulation.

> Another point: the K3 is coming soon, and the native build is not compatible with K1, so cross compile would be better.

How are optimizations added for K1 going to affect K3 (or vice versa)? I am not sure I follow.

@alex-spacemit
Contributor Author

Bianbu provides an assembly parser in binutils for the SpacemiT IME instructions used with GCC, such as vmadot, vfwmadot, vmadot1, etc. This binutils is currently only available in Bianbu's repo, so if you wish to use a natively compiled backend with SpacemiT IME instructions, you can currently only use Bianbu's repo.

When K3 shows up, a binutils that can natively compile IME instructions might become available a little later. By the way, K3 will be more suitable as a native compilation machine. ^^

@alitariq4589
Contributor

> Bianbu provides an assembly parser in binutils for the SpacemiT IME instructions used with GCC, such as vmadot, vfwmadot, vmadot1, etc. This binutils is currently only available in Bianbu's repo, so if you wish to use a natively compiled backend with SpacemiT IME instructions, you can currently only use Bianbu's repo.

I see... I will try to figure out a way to fetch the GCC from Bianbu so that we can use it inside the current CI. Until then, I think it is okay to have it tested inside QEMU. My concern is that if Bianbu's GCC changes are not pushed upstream, the toolchain will gradually lag behind upstream and will not be reliable in the long run.

> When K3 shows up, a binutils that can natively compile IME instructions might become available a little later. By the way, K3 will be more suitable as a native compilation machine. ^^

No problem 😄 . We are constantly adding new RISC-V compute machines to Cloud-V as they arrive.

@alex-spacemit
Contributor Author

Is there any other issue with this PR, or is it ready to be merged? @ggerganov

@ggerganov
Member

Let's wait for approval from @CISC and @slaren

Change-Id: Ic54a192019a2fd982bbd58225ce3bbc38f4053de
Change-Id: I28c42e10b6fff053bb6580926ca2353448cb042a
@ggerganov ggerganov merged commit b77e6c1 into ggml-org:master Sep 29, 2025
112 of 122 checks passed