Skip to content

Conversation

CISC
Copy link
Collaborator

@CISC CISC commented Sep 21, 2025

Implements support for I32 index in set_rows, added as many backends as I could.

  • CPU
  • CUDA
  • Metal
  • OpenCL
  • SYCL
  • Vulkan
  • WebGPU - Couldn't quite grasp how to implement it. @reeselevine
  • CANN - Not sure if it already supports it? @noemotiovon

Fixes #16001

@github-actions github-actions bot added testing Everything test related Nvidia GPU Issues specific to Nvidia GPUs ggml changes relating to the ggml tensor library for machine learning SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language Apple Metal https://en.wikipedia.org/wiki/Metal_(API) OpenCL Issues specific to the OpenCL backend labels Sep 21, 2025
@jeffbolznv
Copy link
Collaborator

Vulkan does not currently support this. See this code:

#if defined(SET_ROWS)
#include "generic_binary_head.comp"
layout (binding = 1) readonly buffer C {uvec2 data_i[];};

It's extracting the LSBs of the 64b index using a uvec2. We'd need to add a variant using uint here.

@CISC
Copy link
Collaborator Author

CISC commented Sep 21, 2025

Vulkan does not currently support this. See this code:

Yep, just noticed, can you help me implement it?

@jeffbolznv
Copy link
Collaborator

Sure. Do you want pointers or do you want me to make a PR?

@CISC
Copy link
Collaborator Author

CISC commented Sep 21, 2025

Sure. Do you want pointers or do you want me to make a PR?

If you want to make a PR that would be great.

@jeffbolznv
Copy link
Collaborator

Sure, will do soon.

@noemotiovon
Copy link
Collaborator

CANN doesn’t support this yet, but adding support shouldn’t be too difficult. I’d be happy to work on it.
Thanks a lot for your contribution and for pointing this out regarding the CANN backend!

@noemotiovon
Copy link
Collaborator

I found that our kernel already supports index with I32, so there’s no additional work needed.

Copy link
Collaborator

@qnixsynapse qnixsynapse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for SYCL.

@lhez
Copy link
Collaborator

lhez commented Sep 22, 2025

Other than the compiler warnings (see comments), everything looks good for OpenCL.

@foldl
Copy link
Contributor

foldl commented Sep 22, 2025

Don't let Vulkan down.

@theo77186
Copy link

Don't let Vulkan down.

Vulkan is covered by a separate PR: #16162

Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Metal changes are good.

Copy link
Collaborator

@JohannesGaessler JohannesGaessler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CUDA changes look correct to me but please add a template function to switch either one of the types instead of 2-layered if else statements.

@CISC
Copy link
Collaborator Author

CISC commented Sep 22, 2025

The CUDA changes look correct to me but please add a template function to switch either one of the types instead of 2-layered if else statements.

Great suggestion, I'll do that for SYCL too, thanks!

@CISC
Copy link
Collaborator Author

CISC commented Sep 22, 2025

Hmmm, why doesn't [no-ci] work anymore?

@CISC CISC requested a review from 0cc4m as a code owner September 22, 2025 14:08
@github-actions github-actions bot added the Vulkan Issues specific to the Vulkan backend label Sep 22, 2025
@ggerganov
Copy link
Member

Hmmm, why doesn't [no-ci] work anymore?

I think it is [no ci]

@reeselevine
Copy link
Collaborator

set_rows isn't fully implemented in the WebGPU backend, but I will make sure i32 indexes are supported when it is fully implemented

@CISC CISC requested a review from slaren as a code owner September 22, 2025 17:06
@CISC CISC removed request for slaren and 0cc4m September 22, 2025 17:08
@CISC CISC merged commit 3ecb2f6 into master Sep 22, 2025
1 check passed
@CISC CISC deleted the cisc/set-rows-i32-idx branch September 22, 2025 17:13
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Sep 23, 2025
* origin/master: (39 commits)
ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
ci : enable Vulkan workflow on Mac (ggml-org#16194)
ggml-cpu: Respect cpumask settings (ggml-org#16164)
ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
zdnn: refactor codebase + add docs (ggml-org#16178)
codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
devops: add s390x containers (ggml-org#15915)
ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
clang-tidy : disable warning about performance enum size (ggml-org#16127)
ggml : implement set_rows with i32 index (ggml-org#16159)
codeowners : update + cleanup (ggml-org#16174)
common : enable `--offline` mode without curl support (ggml-org#16137)
webui : fix handling incomplete chunks (ggml-org#16107)
embedding : fix typos in README (ggml-org#16171)
common : remove unused local variables (ggml-org#16140)
ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
ggml : add ggml_op_is_empty (ggml-org#16122)
codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
Vulkan: add conv_transpose_2d operation (ggml-org#16022)
...
struct pushed a commit to struct/llama.cpp that referenced this pull request Sep 26, 2025
* implement set_rows with i32 index

* template fix

* test quantized path

warnings--

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <[email protected]>

* forgotten name change

* deduplicate cuda/sycl and test-fix

* indent++

* vulkan: support set_rows with i32 index type (ggml-org#16162)

* disable i32 index for webgpu for now

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: Jeff Bolz <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Apple Metal https://en.wikipedia.org/wiki/Metal_(API) ggml changes relating to the ggml tensor library for machine learning Nvidia GPU Issues specific to Nvidia GPUs OpenCL Issues specific to the OpenCL backend SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language testing Everything test related Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

result type of argsort: make it I64?
10 participants