vulkan: Add ACC_TYPE_VEC2 implementation #16203
Signed-off-by: Stefan Savic <[email protected]>
Here are performance results from my devices. It's very good for Nvidia Ampere (which won't be using the code in practice due to coopmat), but neutral or negative on AMD. Not sure why this is.
RTX 3090 without coopmat or integer dot
AMD Radeon Pro VII without integer dot
Intel A770 without integer dot
This PR adds the implementation for ACC_TYPE_VEC2. With non-coopmat shaders, using ACC_TYPE_VEC2 improves caching behavior, as accessing 32-bit values is generally more efficient than accessing 16-bit values.
Performance Comparison (Without coopmat and coopmat2) NVIDIA GeForce RTX 4060 Ti
Performance before (Without coopmat and coopmat2) NVIDIA GeForce RTX 4060 Ti
Performance after (Without coopmat and coopmat2) NVIDIA GeForce RTX 4060 Ti