LLVM and SPIRV-LLVM-Translator pulldown (WW39 2025) #20230

iclsrc · 2025-09-27T02:54:52Z

LLVM: llvm/llvm-project@2f755c5
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@3d58c69

Avoid extending and then adding the high and low halves; use extadd_pairwise instead.
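The difference between the two lowerings can be sketched with a small lane-level model. This is only an illustration, assuming signed 16-bit input lanes and the usual reading of a partial reduction (only the overall sum has to be preserved, not which lanes are grouped together). The helper names `sign_extend`, `extadd_pairwise_s`, and `extend_halves_and_add` are hypothetical and do not come from the patch.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def extadd_pairwise_s(lanes, bits=16):
    """Model of a signed extadd_pairwise: widen each lane, add adjacent pairs."""
    assert len(lanes) % 2 == 0
    ext = [sign_extend(x, bits) for x in lanes]
    return [ext[i] + ext[i + 1] for i in range(0, len(ext), 2)]


def extend_halves_and_add(lanes, bits=16):
    """Model of the older lowering: widen the low and high halves, then add them."""
    ext = [sign_extend(x, bits) for x in lanes]
    half = len(ext) // 2
    return [lo + hi for lo, hi in zip(ext[:half], ext[half:])]


lanes = [1, -2, 3, 4, 0x7FFF, 0x7FFF, -32768, -32768]
pairwise = extadd_pairwise_s(lanes)
halves = extend_halves_and_add(lanes)
```

The two results group lanes differently, but both produce half as many widened lanes with the same total, which is all a partial reduction needs.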

…any order (#157407) I have seen some flakiness in this test where the 2 checked strings appear in a different order. Due to buffering of writes, and that one of these strings is written during the signal handler, I think this is valid. This PR relaxes the test to allow those strings to appear in either order.

Checking `isOperationLegalOrCustom` instead of `isOperationLegal` allows more optimization opportunities. In particular, if a target wants to mark `extract_vector_elt` as `Custom` rather than `Legal` in order to optimize certain cases, this combiner would otherwise miss some improvements. Previously, using `isOperationLegalOrCustom` was avoided due to the risk of getting stuck in infinite loops (as noted in llvm/llvm-project@61ec738). After testing, the issue no longer reproduces, but the coverage is limited to the regression/unit tests and the test-suite.

In simplifyBlends, when normalizing a blend recipe, the first mask that is used only by the blend and is not all-false is chosen, and its corresponding incoming value becomes the initial value, with the others blended into it. At the same time, the mask that is chosen can be eliminated. However, a multi-user mask might be used by a dead recipe, which prevents this optimization. This patch moves removeDeadRecipes before simplifyBlends to eliminate dead recipes, allowing simplifyBlends to remove more dead masks.

Implements the base of the MemoryLegalizer for a roughly correct GFX1250 memory model. Documentation will come later, and some remaining changes still have to be added, but this is the backbone of the model.

… (#157639) This reverts commit be17791. This is not necessary for gfx1250 anymore.

…#156714) Match `(X * Y) + Z` in `combineAdd`. If the target supports it and we don't overflow (i.e. we know the top 12 bits are unset), rewrite using VPMADD52L. Have just done the `L` version for now at least; wanted to get feedback before continuing.
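Why the "top 12 bits are unset" condition matters can be sketched with a scalar model of one 64-bit lane. This is a rough illustration, not the code in `combineAdd`: it assumes the documented behaviour of the low-half 52-bit multiply-add (multiply the low 52 bits of each source, add the low 52 bits of the product to a 64-bit accumulator), and the function names are hypothetical.

```python
MASK52 = (1 << 52) - 1
MASK64 = (1 << 64) - 1


def vpmadd52l_lane(acc, x, y):
    """Scalar model of one lane: acc + low 52 bits of (x[51:0] * y[51:0])."""
    product = (x & MASK52) * (y & MASK52)
    return (acc + (product & MASK52)) & MASK64


def mul_add_u64(x, y, z):
    """Plain 64-bit (x * y) + z with wrap-around."""
    return (x * y + z) & MASK64


# Product fits in 52 bits: both forms agree.
fits = (3_000_000, 1_000_000, 7)
# Product is 2**60, which needs more than 52 bits: the forms diverge.
overflows = (1 << 30, 1 << 30, 7)
```

In the first case the rewrite is value-preserving; in the second the 52-bit truncation drops the whole product, which is why the combine has to prove the upper bits are zero first.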

Apply the following changes:

* Ensure all float types are covered (`f16` and `f128` were often missing)
* Switch to more straightforward test names
* Remove some CHECK directives that are outdated (prefix changed but the directive did not get removed)
* Add common check prefixes to merge similar blocks
* Test a more similar set of platforms
* Add missing `nounwind`
* Test `strictfp` for each libcall where possible

This is a pre-test for [1].

[1]: llvm/llvm-project#152684

Align the syntax used for the optimization level argument of the expand-fp pass in textual descriptions of pass pipelines with the syntax used by other passes taking a similar argument. That is, use e.g. `expand-fp<O1>` instead of `expand-fp<opt-level=1>`.

`f16` is more functional than just a storage type on the platform, though it does have some codegen issues [1]. To prepare for future changes, do the following nonfunctional updates to the existing `half` test:

* Add tests for passing and returning the type directly.
* Add tests showing bitcast behavior, which is currently incorrect but serves as a baseline.
* Add tests for `fabs` and `copysign` (trivial operations that shouldn't require libcalls).
* Add invocations for big-endian and for PPC32.
* Rename the test to `half.ll` to reflect its status, which also matches other backends.

[1]: llvm/llvm-project#97975

[AMDGPU] Treat GEP offsets as signed in AMDGPUPromoteAlloca AMDGPUPromoteAlloca can transform i32 GEP offsets that operate on allocas into i64 extractelement indices. Before this patch, negative GEP offsets would be zero-extended, leading to wrong extractelement indices with values around (2**32-1). This fixes failing LlvmLibcCharacterConverterUTF32To8Test tests for AMDGPU.
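The effect of the wrong extension can be shown with plain integer arithmetic. This is a minimal sketch of i32-to-i64 zero and sign extension, not the pass itself; the function names are made up for the illustration.

```python
def zext_i32_to_i64(value):
    """Zero-extend: keep the low 32 bits and treat them as unsigned."""
    return value & 0xFFFF_FFFF


def sext_i32_to_i64(value):
    """Sign-extend: treat bit 31 of the low 32 bits as the sign bit."""
    value &= 0xFFFF_FFFF
    return value - (1 << 32) if value & 0x8000_0000 else value


offset = -1  # a negative i32 GEP offset
wrong_index = zext_i32_to_i64(offset)  # 4294967295, i.e. 2**32 - 1
right_index = sext_i32_to_i64(offset)  # -1
```

Non-negative offsets are unaffected, since both extensions agree on them; only negative offsets end up near 2**32 - 1 when zero-extended.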

…7702) The default values for DebugLocs in LoopVectorizer/VPlan were recently updated from empty DebugLocs to DebugLoc::getUnknown, as part of the DebugLoc Coverage Tracking work. However, there are some cases where we also pass an explicit empty DebugLoc, in many cases as a filler argument. This patch updates all of these to `getUnknown` for now, until either valid locations or a suitable categorization can be assigned to each instead. This change is NFC outside of DebugLoc coverage tracking builds.

The `Predicates` and `Features` fields serve the same purpose. They should be kept in sync, but not all predicates are based on features. This resulted in dummy features being introduced for that reason alone. This patch removes the `Features` field and changes TableGen emitters to use `Predicates` instead. Historically, predicates were written with the assumption that the checking code will be used in `SelectionDAGISel` subclasses, meaning they will have access to the subclass variables, such as `Subtarget`. There are no such variables in the generated `GenSubtargetInfo::getHwModeSet()`, so we need to provide them. This can be achieved by subclassing `HwModePredicateProlog`; see an example in `Hexagon.td`.

Defaults to "agent" for targets that do not support it.

- Add documentation
- Register it in MachineModuleInfo
- Add MemoryLegalizer support

This test is strange since it's full of decoding failure warnings

…(#157304) * Apply the typo fix as a separate NFC patch from here: https://github.com/llvm/llvm-project/pull/134330/files#r2313015079

Make sure the tested error is the literal error, not for unaligned registers.

This will add label `clang:temporal-safety` to PRs touching the mentioned files.

Removing statefulness also adds the benefit of short-circuiting.

During CSE, we don't have to drop all poison-generating flags on a mismatch; we can keep the ones common to both recipes. PR: llvm/llvm-project#157664
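Keeping only the common flags amounts to a set intersection. The sketch below models that idea with a Python flag enum; the `PoisonFlags` type and `merge_flags_for_cse` helper are hypothetical stand-ins and not VPlan's actual data structures.

```python
from enum import Flag


class PoisonFlags(Flag):
    """Hypothetical subset of poison-generating flags, for illustration only."""
    NONE = 0
    NUW = 1
    NSW = 2
    EXACT = 4


def merge_flags_for_cse(kept, replaced):
    """Flags that remain valid on the surviving recipe: those set on both."""
    return kept & replaced


kept = PoisonFlags.NUW | PoisonFlags.NSW
replaced = PoisonFlags.NSW | PoisonFlags.EXACT
merged = merge_flags_for_cse(kept, replaced)
```

A flag set on only one recipe is a guarantee the other recipe's users never had, so it cannot survive the merge; a flag set on both is safe to keep, which is less conservative than dropping everything.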

The test introduced by PR #157408 requires the amdgpu target. Move it to the subdirectory which only runs if the target is available.

The limit 'dfa-max-num-paths', which is used to control the number of enumerated paths, was not checked inside getPathsFromStateDefMap. This may lead to large memory consumption for sufficiently complex switch statements. Reland llvm/llvm-project#145482

Required for `BUILD_SHARED_LIBS=ON` builds with optimizations disabled for the new FortranUtils library. Also see #150027 #155422

There are no references to it anymore in the codebase.

…(#157716) In PromoteMem2Reg, we perform a DFS over the CFG and track, for each alloca, its incoming value and its associated incoming DebugLoc, both of which are taken from stores to that alloca; these values and DebugLocs are propagated to PHI nodes when new blocks are reached. In the event that for one incoming edge no store instruction has been seen, we propagate an UndefValue and an empty DebugLoc to the PHI. This is a perfectly valid occurrence, and assigning an empty DebugLoc to the PHI is the correct course of action; therefore, we should pass an annotated DebugLoc instead, so that in DebugLoc coverage tracking we correctly do not expect a valid DebugLoc to be present; we generally mark allocas as having CompilerGenerated locations, so I've chosen to use the same annotation to represent the uninitialized value of that alloca. This change is NFC outside of DebugLoc coverage tracking builds.

…0169) Supporting Min/Max Operations: `min`, `max`, `umin`, `umax`

This interface isn't used anywhere anymore.

Despite several hotfixes, things remain broken, in particular:

- installation/distribution (`ninja install / install-distribution`);
- downstream projects with bindings exposed.

See llvm/llvm-project#157583 (comment) for more details.

Reverts #155741, #157583, #157697. Let's make sure things are fixed and re-land as a unit.

jsji · 2025-09-30T22:15:46Z

This is ready for review.

[NFC] XFAIL KernelAndProgram/free_function_kernels.cpp
@intel/llvm-reviewers-runtime
[NFC] Use wildcards instead in file-table-tform-merge.test
@intel/dpcpp-tools-reviewers
[NFC] Update datalayout in nvptx-short-ptr.cpp after e5948b4
@intel/llvm-reviewers-cuda

maarquitos14 · 2025-10-01T09:42:20Z

llvm/test/tools/file-table-tform/file-table-tform-merge.test

@@ -1,5 +1,5 @@
 -- Merge two tables listed in merge-input.txt
-RUN: sh -c "cp %/S/Inputs/a{0,1}.txt ."
+RUN: sh -c "cp %/S/Inputs/a[01].txt ."


Why do we need this change? Is there any case where we expect only one file to exist? It's the only scenario where I would expect this change to be meaningful. Am I missing anything?

We are now using sh instead of the builtin shell after llvm/llvm-project#158465, and the `{0,1}` syntax is bash-only.

The test passes locally when my shell is bash, but fails remotely in CI, where the shell is not bash (I think).

I approved for tools based on the response above. Marcos, if you don't agree with the change, please follow up with Jinsong offline. Thanks.

Works for me. Just wanted to understand the reason behind the change.
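The distinction discussed in this thread can be approximated outside a shell. The sketch below uses Python's `fnmatch` module as a stand-in for shell pattern matching, on the assumption that it behaves like a shell without brace expansion: bracket expressions such as `[01]` are understood, while `{0,1}` is treated as literal text. The file names and the `match` helper are only for the illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing for the illustration.
FILES = ["a0.txt", "a1.txt", "a2.txt"]


def match(pattern, names=FILES):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


bracket_matches = match("a[01].txt")  # bracket expression: matches a0 and a1
brace_matches = match("a{0,1}.txt")   # braces are literal here: matches nothing
```

When both input files exist, the bracket pattern selects the same two files that bash's brace expansion would name, so the test's behaviour is unchanged under bash and becomes portable to other shells.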

sarnex · 2025-10-01T17:50:07Z

/merge

bb-sycl · 2025-10-01T17:50:31Z

Wed 01 Oct 2025 05:50:30 PM UTC --- Merge failed with error: PR is not clean for merge. Please examine ci check status before merge.

Conflicts: .github/workflows/llvm-project-tests.yml

sarnex · 2025-10-01T17:53:04Z

/merge

bb-sycl · 2025-10-01T17:53:49Z

Wed 01 Oct 2025 05:53:48 PM UTC --- Start to merge the commit into sycl branch. It will take several minutes.

bb-sycl · 2025-10-01T18:05:46Z

Wed 01 Oct 2025 06:05:45 PM UTC --- Merge the branch in this PR to base automatically. Will close the PR later.

pfusik and others added 30 commits September 10, 2025 09:09

[RISCV][NFC] Fix a misnamed variable (#157686)

660441a

[WebAssembly] extadd_pairwise for PartialReduce (#157669)

6dacdc3

Avoid extending and then adding the high and low halves; use extadd_pairwise instead.

[AMDGPU][gfx1250] Implement SIMemoryLegalizer (#154726)

bed9be9

Implements the base of the MemoryLegalizer for a roughly correct GFX1250 memory model. Documentation will come later, and some remaining changes still have to be added, but this is the backbone of the model.

Revert "[AMDGPU][gfx1250] Add cu-store subtarget feature (#150588)"…

dcaa29c

… (#157639) This reverts commit be17791. This is not necessary for gfx1250 anymore.

[AMDGPU][gfx1250] Remove SCOPE_SE for scratch stores (#157640)

d6d0f4f

[AMDGPU][gfx1250] Support "cluster" syncscope (#157641)

49a898f

Defaults to "agent" for targets that do not support it.

- Add documentation
- Register it in MachineModuleInfo
- Add MemoryLegalizer support

AMDGPU: Update baseline test checks in disassembler test (#157816)

1f53cc0

This test is strange since it's full of decoding failure warnings

[NFC][libc++] Fix typo in libcxx/include/__memory/pointer_traits.h …

0b28614

…(#157304) * Apply the typo fix as a separate NFC patch from here: https://github.com/llvm/llvm-project/pull/134330/files#r2313015079

AMDGPU: Fix using unaligned VGPR in literal test (#157817)

8f7e8c4

Make sure the tested error is the literal error, not for unaligned registers.

[LifetimeSafety] Add PR labeler automation (#157820)

0b696a8

This will add label `clang:temporal-safety` to PRs touching the mentioned files.

[analyzer][NFC] Modernize LivenessValues::isLive (#157800)

2fb29f8

Removing statefulness also adds the benefit of short-circuiting.

[VPlan] Keep common flags during CSE. (#157664)

c3e76b2

During CSE, we don't have to drop all poison-generating flags on a mismatch; we can keep the ones common to both recipes. PR: llvm/llvm-project#157664

[AMDGPU] Fix PR #157408 test failures (#157823)

b40d233

The test introduced by PR #157408 requires the amdgpu target. Move it to the subdirectory which only runs if the target is available.

[Flang][Utils] Fix BUILD_SHARED_LIBS build (#157828)

731ba68

Required for `BUILD_SHARED_LIBS=ON` builds with optimizations disabled for the new FortranUtils library. Also see #150027 #155422

[libc++] Remove the unused cat_files.py script (#157744)

e03fcce

There are no references to it anymore in the codebase.

[AMDGPU] Extending wave reduction intrinsics for i64 types - 1 (#15…

85fb1f1

…0169) Supporting Min/Max Operations: `min`, `max`, `umin`, `umax`

[MLIR] Remove CopyOpInterface (#157711)

6364707

This interface isn't used anywhere anymore.

jsji force-pushed the llvmspirv_pulldown branch from cfed77f to 97a53f8 Compare September 30, 2025 13:04

jsji temporarily deployed to WindowsCILock September 30, 2025 13:04 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock September 30, 2025 13:05 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock September 30, 2025 13:54 — with GitHub Actions Inactive

jsji had a problem deploying to WindowsCILock September 30, 2025 15:20 — with GitHub Actions Failure

jsji temporarily deployed to WindowsCILock September 30, 2025 15:20 — with GitHub Actions Inactive

jsji had a problem deploying to WindowsCILock September 30, 2025 17:58 — with GitHub Actions Failure

jsji had a problem deploying to WindowsCILock September 30, 2025 18:21 — with GitHub Actions Failure

jsji temporarily deployed to WindowsCILock September 30, 2025 22:16 — with GitHub Actions Inactive

maarquitos14 reviewed Oct 1, 2025

sarnex approved these changes Oct 1, 2025

bb-sycl approved these changes Oct 1, 2025

Merge remote-tracking branch 'origin/sycl' into llvmspirv_pulldown

90983fd

Conflicts: .github/workflows/llvm-project-tests.yml

jsji temporarily deployed to WindowsCILock October 1, 2025 17:52 — with GitHub Actions Inactive

jsji had a problem deploying to WindowsCILock October 1, 2025 17:53 — with GitHub Actions Failure

bb-sycl approved these changes Oct 1, 2025

bb-sycl merged commit 90983fd into sycl Oct 1, 2025
36 of 41 checks passed

bb-sycl temporarily deployed to WindowsCILock October 1, 2025 18:06 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock October 1, 2025 18:33 — with GitHub Actions Inactive

bb-sycl temporarily deployed to WindowsCILock October 1, 2025 19:13 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock October 1, 2025 19:24 — with GitHub Actions Inactive

sarnex added a commit that referenced this pull request Oct 1, 2025

LLVM and SPIRV-LLVM-Translator pulldown (WW39 2025) (#20230)

01d3adc

jsji deleted the llvmspirv_pulldown branch October 1, 2025 20:49
