0cc4m
Collaborator

@0cc4m 0cc4m commented Aug 4, 2024

This fixes a Vulkan quantized matrix-vector multiplication test failure on AMD GPUs (warp size 64) that occurs when there are not enough blocks to fill the warp. The failure was caught by the tests added in #8800, but I noticed that for k-quants those tests run the same case twice, so I added a check for whether the new test is actually required. Let me know if that's okay.
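To illustrate the general hazard described above, here is a minimal Python sketch of a warp-wide reduction in which some lanes have no block to process. It is not the Vulkan shader from this PR and does not claim to reproduce the actual bug or its fix. The function name `reduce_row`, the `idle_value` parameter, and the scratch-array layout are all hypothetical. The only assumption taken from the description is a warp size of 64 and a row with fewer blocks than lanes.

```python
WARP_SIZE = 64  # AMD warp (wavefront) size mentioned in the PR description


def reduce_row(partials, warp_size=WARP_SIZE, idle_value=0.0):
    """Simulate summing per-block partial dot products across one warp.

    partials   -- one partial sum per quantization block in the row
    idle_value -- what a lane with no block leaves in its scratch slot
                  (hypothetical knob: 0.0 models a lane that contributes zero,
                  anything else models a stale or uninitialized slot)
    """
    scratch = [idle_value] * warp_size

    # Each lane accumulates the blocks assigned to it with a warp-sized stride.
    for lane in range(warp_size):
        my_blocks = partials[lane::warp_size]
        if my_blocks:
            scratch[lane] = sum(my_blocks)

    # Tree reduction over all lanes, including the ones that had no work.
    stride = warp_size // 2
    while stride:
        for lane in range(stride):
            scratch[lane] += scratch[lane + stride]
        stride //= 2
    return scratch[0]


# Two blocks only: 62 of the 64 lanes are idle.
row = [1.0, 2.0]
print(reduce_row(row))                  # idle lanes contribute zero
print(reduce_row(row, idle_value=1.0))  # idle lanes leak a stale value
```

With a full warp's worth of blocks, every slot is overwritten and `idle_value` no longer affects the result, which is why a problem of this kind would only show up for small `ncols`.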

@github-actions github-actions bot added the "testing" (Everything test related) label Aug 4, 2024
@0cc4m 0cc4m changed the title from "0cc4m/vulkan fix mmv tests" to "Fix Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" Aug 4, 2024
@JohannesGaessler JohannesGaessler added the "Vulkan" (Issues specific to the Vulkan backend) label Aug 4, 2024
@0cc4m 0cc4m changed the title from "Fix Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" to "Fix Vulkan Quantized Matrix Vector Multiplication on AMD GPUs when ncols < 64" Aug 5, 2024
@ggerganov ggerganov merged commit 064cdc2 into master Aug 5, 2024
@0cc4m 0cc4m deleted the 0cc4m/vulkan-fix-mmv-tests branch August 5, 2024 06:03
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Aug 7, 2024
…g#8855)

* Fix Vulkan mul mat vec invalid results when ncols < warp size

* Only run backend ops mul mat vec block size test if block size not already covered
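The second commit's check can be sketched roughly as follows. This is a Python illustration rather than the project's actual C++ test code. The helper name `mmv_test_ks`, the `DEFAULT_K` constant, and the assumption that the existing test already uses a row length of 256 are hypothetical. The block sizes (32 for `q4_0` and `q8_0`, 256 for k-quants) are the standard ggml values.

```python
# Elements per quantization block for a few types.
BLOCK_SIZES = {"q4_0": 32, "q8_0": 32, "q4_K": 256, "q6_K": 256}

# Assumed row length of the pre-existing mul mat vec test (hypothetical).
DEFAULT_K = 256


def mmv_test_ks(type_name):
    """Return the row lengths (k) to test for one quantized type.

    The extra block-size-sized case is added only when it differs from the
    default, so k-quants do not get the same test twice.
    """
    k_values = [DEFAULT_K]
    block_size = BLOCK_SIZES[type_name]
    if block_size != DEFAULT_K:
        k_values.insert(0, block_size)
    return k_values


for name in BLOCK_SIZES:
    print(name, mmv_test_ks(name))
```

Under these assumptions, types with a block size of 32 get the additional small-row case, while k-quants keep only the single existing case.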