ggml : implement set_rows with i32 index #16159
Vulkan does not currently support this. See this code:
It's extracting the LSBs of the 64b index using a uvec2. We'd need to add a variant using uint here.
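To make the comment above concrete, here is a rough host-side C++ sketch of the two ways an index can be read. It is not the Vulkan shader: `UVec2`, `split_i64`, `row_from_i64` and `row_from_i32` are hypothetical names made up for illustration, and the 64-bit path assumes the row index fits in the low 32 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for GLSL's uvec2: two 32-bit words.
struct UVec2 { uint32_t x; uint32_t y; };

// View a 64-bit index as (low word, high word).
UVec2 split_i64(int64_t v) {
    const uint64_t u = static_cast<uint64_t>(v);
    return UVec2{ static_cast<uint32_t>(u & 0xFFFFFFFFu),
                  static_cast<uint32_t>(u >> 32) };
}

// 64-bit index path: keep only the low 32 bits (the .x word).
// Assumption: the index is non-negative and fits in 32 bits.
uint32_t row_from_i64(const std::vector<int64_t>& idx, std::size_t i) {
    assert(i < idx.size());
    return split_i64(idx[i]).x;
}

// 32-bit index path: each element is already a single word,
// so it is read as a plain uint instead of a uvec2.
uint32_t row_from_i32(const std::vector<int32_t>& idx, std::size_t i) {
    assert(i < idx.size());
    return static_cast<uint32_t>(idx[i]);
}
```

The only difference between the two paths is the element stride and which word is read, which is why a separate variant for the narrower index type is needed.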
Yep, just noticed, can you help me implement it?
Sure. Do you want pointers or do you want me to make a PR?
If you want to make a PR that would be great.
Sure, will do soon.
CANN doesn’t support this yet, but adding support shouldn’t be too difficult. I’d be happy to work on it.
I found that our kernel already supports index with I32, so there’s no additional work needed.
LGTM for SYCL.
Other than the compiler warnings (see comments), everything looks good for OpenCL.
warnings--
Co-authored-by: Georgi Gerganov <[email protected]>
Don't let Vulkan down.
Vulkan is covered by a separate PR: #16162
The Metal changes are good.
The CUDA changes look correct to me but please add a template function to switch either one of the types instead of 2-layered if else statements.
Great suggestion, I'll do that for SYCL too, thanks!
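A minimal sketch of what the suggested refactor could look like, written as plain host C++ rather than CUDA. All names here (`DstType`, `IdxType`, `set_rows_kernel`, `set_rows_typed`, `set_rows`) are hypothetical and not taken from the PR, and `float`/`double` stand in for whatever destination types the real backend handles. The idea is that a function templated on the index type switches only on the destination type, so the caller needs a single branch on the index type instead of a nested if/else over both.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class DstType { F32, F64 };  // hypothetical tags, not ggml's type enum
enum class IdxType { I32, I64 };

// Copy source row i into destination row idx[i], converting the element type.
template <typename dst_t, typename idx_t>
void set_rows_kernel(const float* src, const idx_t* idx, dst_t* dst,
                     std::size_t n_rows, std::size_t row_size) {
    for (std::size_t i = 0; i < n_rows; ++i) {
        assert(idx[i] >= 0);
        const std::size_t r = static_cast<std::size_t>(idx[i]);
        for (std::size_t c = 0; c < row_size; ++c) {
            dst[r * row_size + c] = static_cast<dst_t>(src[i * row_size + c]);
        }
    }
}

// Templated on the index type; switches only on the destination type.
template <typename idx_t>
void set_rows_typed(const float* src, const idx_t* idx, void* dst, DstType dt,
                    std::size_t n_rows, std::size_t row_size) {
    switch (dt) {
        case DstType::F32:
            set_rows_kernel(src, idx, static_cast<float*>(dst), n_rows, row_size);
            break;
        case DstType::F64:
            set_rows_kernel(src, idx, static_cast<double*>(dst), n_rows, row_size);
            break;
    }
}

// Entry point: one branch on the index type, no nested if/else.
void set_rows(const float* src, const void* idx, IdxType it, void* dst,
              DstType dt, std::size_t n_rows, std::size_t row_size) {
    if (it == IdxType::I32) {
        set_rows_typed(src, static_cast<const int32_t*>(idx), dst, dt, n_rows, row_size);
    } else {
        set_rows_typed(src, static_cast<const int64_t*>(idx), dst, dt, n_rows, row_size);
    }
}
```

Adding another index or destination type then means adding one case in one place, rather than a new arm in every branch of a nested conditional.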
Hmmm, why doesn't
I think it is
set_rows isn't fully implemented in the WebGPU backend, but I will make sure i32 indexes are supported when it is fully implemented.
* origin/master: (39 commits)
  ci : disable AMD workflows + update NVIDIA workflows (ggml-org#16200)
  ci : enable Vulkan workflow on Mac (ggml-org#16194)
  ggml-cpu: Respect cpumask settings (ggml-org#16164)
  ggml : fix uninitialized is_on_grid in quantize_row_iq3_xxs_impl (ggml-org#15928)
  zdnn: refactor codebase + add docs (ggml-org#16178)
  codeowners : add @danbev to model-conversion example [no ci] (ggml-org#16190)
  devops: add s390x containers (ggml-org#15915)
  ggml-cpu : fix typo in gemm comments [no ci] (ggml-org#16189)
  feat: Add conversion support in GraniteHybrid for non-hybrid (all attn) (ggml-org#16177)
  clang-tidy : disable warning about performance enum size (ggml-org#16127)
  ggml : implement set_rows with i32 index (ggml-org#16159)
  codeowners : update + cleanup (ggml-org#16174)
  common : enable `--offline` mode without curl support (ggml-org#16137)
  webui : fix handling incomplete chunks (ggml-org#16107)
  embedding : fix typos in README (ggml-org#16171)
  common : remove unused local variables (ggml-org#16140)
  ggml : extend ggml_can_fuse to work with non-sequential nodes (ggml-org#16123)
  ggml : add ggml_op_is_empty (ggml-org#16122)
  codeowners : update ownership for @ngxson and @allozuar (ggml-org#16128)
  Vulkan: add conv_transpose_2d operation (ggml-org#16022)
  ...
* implement set_rows with i32 index
* template fix
* test quantized path warnings--
* Apply suggestions from code review Co-authored-by: Georgi Gerganov <[email protected]>
* forgotten name change
* deduplicate cuda/sycl and test-fix
* indent++
* vulkan: support set_rows with i32 index type (ggml-org#16162)
* disable i32 index for webgpu for now
---------
Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: Jeff Bolz <[email protected]>
Implements support for I32 index in `set_rows`, added to as many backends as I could.
Fixes #16001