saucecontrol
Member

@saucecontrol saucecontrol commented Jul 12, 2025

This changes the lowering of floating->integral casts to always replace the cast node with HWIntrinsics rather than doing fixups ahead of the cast and leaving the node in place as a self-cast or letting it be handled in codegen. Since the self-cast was not always eliminated in codegen, this results in some size and throughput improvements.
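For context on why fixups are needed around these casts at all, here is a minimal scalar sketch of the result the lowered sequence has to produce. It assumes saturating conversion semantics (NaN becomes 0, out-of-range values clamp); the raw pre-AVX10.2 truncating conversion instructions instead return the integer indefinite value (0x80000000 for a 32-bit result) for those inputs. `SaturatingToInt32` is a hypothetical reference helper for illustration, not code from this PR or from the JIT.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference helper (not JIT code): the value a saturating
// double -> int32 cast is assumed to produce for any input.
int32_t SaturatingToInt32(double value)
{
    constexpr int32_t minValue = std::numeric_limits<int32_t>::min();
    constexpr int32_t maxValue = std::numeric_limits<int32_t>::max();

    // NaN has no meaningful integer value; the assumed semantics map it to 0.
    if (std::isnan(value))
    {
        return 0;
    }

    // Clamp anything at or below the minimum, including -infinity.
    if (value <= static_cast<double>(minValue))
    {
        return minValue;
    }

    // Clamp anything at or above the maximum, including +infinity.
    if (value >= static_cast<double>(maxValue))
    {
        return maxValue;
    }

    // In range: a plain conversion truncates toward zero.
    return static_cast<int32_t>(value);
}
```

The NaN check and the two clamps are the scalar equivalent of the fixups; the final conversion corresponds to the truncating convert itself.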

Because the cast is always replaced now, genFloatToIntCast is no longer necessary on xarch.

This is best viewed with whitespace ignored. Most of the changes are simply an extra level of indentation for the pre-AVX10.2 code in LowerCast.

Diffs

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 12, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jul 12, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

endfunction()

- if (CLR_CMAKE_TARGET_ARCH_AMD64 OR CLR_CMAKE_TARGET_ARCH_ARM64 OR (CLR_CMAKE_TARGET_ARCH_I386 AND NOT CLR_CMAKE_HOST_UNIX))
+ if (CLR_CMAKE_TARGET_ARCH_AMD64 OR CLR_CMAKE_TARGET_ARCH_ARM64 OR CLR_CMAKE_TARGET_ARCH_I386)
Member Author

@saucecontrol saucecontrol Jul 13, 2025

Looks like this change was missed in #111861

Since floating->integral casts rely on at least baseline intrinsics support, FEATURE_HW_INTRINSICS is no longer optional on x86. It seems linux-x86 has been building only because the implementation was #ifdef'ed, but it would have failed at run time.
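The build-versus-run-time point can be illustrated with a small sketch. It assumes a hypothetical macro `DEMO_FEATURE_HW_INTRINSICS` and a hypothetical helper `TryLowerFloatToIntCastDemo`, both invented for illustration and not the JIT's actual code: with the macro undefined, everything still compiles, but the only implementation is compiled out, so the gap surfaces only when the cast is actually hit.

```cpp
// Hypothetical stand-in for the real feature macro. It is deliberately left
// undefined to mimic a configuration where the feature was not enabled.
// #define DEMO_FEATURE_HW_INTRINSICS

// Hypothetical helper (not JIT code): returns true if the cast was lowered.
bool TryLowerFloatToIntCastDemo()
{
#ifdef DEMO_FEATURE_HW_INTRINSICS
    // The only implementation lives behind the feature guard.
    return true;
#else
    // Without the feature the function still compiles, but nothing handles
    // the cast, so the problem only shows up at run time.
    return false;
#endif
}
```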

case NI_AVX512_ConvertToVector128SByteWithSaturation:
case NI_AVX512_ConvertToVector128UInt16:
case NI_AVX512_ConvertToVector128UInt16WithSaturation:
case NI_AVX512_ConvertToVector128UInt32WithSaturation:
Member Author

@saucecontrol saucecontrol Jul 13, 2025

This intrinsic doesn't support float args, so it was misplaced in the previous grouping.

@saucecontrol saucecontrol marked this pull request as ready for review July 13, 2025 00:47
@saucecontrol
Member Author

cc @tannergooding

This is another piece peeled off from #116805, with feedback addressed.

Member

@tannergooding tannergooding left a comment

CC. @dotnet/jit-contrib, @EgorBo for secondary sign-off

This has a couple of correctness fixes in addition to the general codegen improvements, so if we're not comfortable taking the whole thing for .NET 10, we likely still need to peel off the fixes that were called out.

@EgorBo EgorBo self-requested a review September 15, 2025 22:16