JIT: Move remaining xarch floating->integral cast implementation to lowering #117571
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
endfunction()

- if (CLR_CMAKE_TARGET_ARCH_AMD64 OR CLR_CMAKE_TARGET_ARCH_ARM64 OR (CLR_CMAKE_TARGET_ARCH_I386 AND NOT CLR_CMAKE_HOST_UNIX))
+ if (CLR_CMAKE_TARGET_ARCH_AMD64 OR CLR_CMAKE_TARGET_ARCH_ARM64 OR CLR_CMAKE_TARGET_ARCH_I386)
Looks like this change was missed in #111861.
Since floating->integral casts rely on at least baseline intrinsics support, `FEATURE_HW_INTRINSICS` is no longer optional on x86. It seems linux-x86 has been building only because the implementation was `#ifdef`ed, but it would have failed at run time.
case NI_AVX512_ConvertToVector128SByteWithSaturation:
case NI_AVX512_ConvertToVector128UInt16:
case NI_AVX512_ConvertToVector128UInt16WithSaturation:
case NI_AVX512_ConvertToVector128UInt32WithSaturation:
This intrinsic doesn't support float args, so it was misplaced in the previous grouping.
This is more work peeled off from #116805, with feedback addressed.
CC. @dotnet/jit-contrib, @EgorBo for secondary sign-off
This has a couple of correctness fixes in addition to the general codegen improvements, so if we're not comfortable taking the whole thing for .NET 10, we likely still need to peel off the fixes that were called out.
This changes the lowering of floating->integral casts to always replace the cast node with HWIntrinsics rather than doing fixups ahead of the cast and leaving the node in place as a self-cast or letting it be handled in codegen. Since the self-cast was not always eliminated in codegen, this results in some size and throughput improvements.
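As a rough illustration of what the replacement HWIntrinsic sequence has to compute, here is a scalar sketch of a float->int32 cast. It assumes saturating conversion semantics (NaN becomes 0, out-of-range values clamp to the integer bounds); that assumption and the helper name `SaturatingFloatToInt32` are not taken from this PR, and the JIT emits intrinsic nodes rather than branches like these.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference helper (not a JIT function): scalar model of a
// saturating float -> int32 cast, under the assumptions stated above.
inline int32_t SaturatingFloatToInt32(float value)
{
    // NaN has no integer equivalent; this model maps it to 0.
    if (std::isnan(value))
    {
        return 0;
    }

    // 2^31 is exactly representable as a float, while INT32_MAX is not,
    // so compare against 2^31 and clamp anything at or above it.
    if (value >= 2147483648.0f)
    {
        return std::numeric_limits<int32_t>::max();
    }

    // -2^31 is exactly INT32_MIN; anything at or below it clamps there.
    if (value <= -2147483648.0f)
    {
        return std::numeric_limits<int32_t>::min();
    }

    // In range: a plain C++ cast truncates toward zero and is well defined.
    return static_cast<int32_t>(value);
}
```

The explicit range checks matter because a raw truncating conversion instruction such as `cvttss2si` yields 0x80000000 for NaN and out-of-range inputs, so some fixup is needed to get the results modeled here.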
Because the cast is always replaced now, `genFloatToIntCast` is no longer necessary on xarch.

This is best viewed with whitespace ignored. Most of the changes are simply an extra level of indentation for the pre-AVX10.2 code in `LowerCast`.

Diffs