LLVM and SPIRV-LLVM-Translator pulldown (WW39 2025) #20230
Draft: iclsrc wants to merge 3,392 commits into sycl from llvmspirv_pulldown
These instructions have an eqz/nez operand like Zicond and XVentanaCondOps, so the goal of using bexti seems applicable to them as well.
… (#154770) In relation to the approval and merge of the llvm/llvm-project#76088 specification about multi-image features in Flang. Here is a PR on adding support of the collectives CO_BROADCAST, CO_SUM, CO_MIN and CO_MAX in conformance with the PRIF specification. --------- Co-authored-by: Dan Bonachea <[email protected]>
…ss (#157798) Fixes regression caused by #156817.
This PR deletes the `createLowerGpuOpsToROCDLOpsPass` constructor from the .td file, making the `createConvertGpuOpsToROCDLOps` pass available to users. This has the following effects:
1. `createLowerGpuOpsToROCDLOpsPass` is not available anymore. Instead, `createConvertGpuOpsToROCDLOps` should be used. This makes the interface consistent with ConvertGpuOpsToNVVMOps.
2. To call `createConvertGpuOpsToROCDLOps`, the options must be passed via ConvertGpuOpsToROCDLOpsOptions. This has the side effect of making the `allowed-dialects` option available, which was not accessible via C++ before.
Avoid using extends and adding the high and low halves; use extadd_pairwise instead.
…any order (#157407) I have seen some flakiness in this test where the 2 checked strings appear in a different order. Due to buffering of writes, and that one of these strings is written during the signal handler, I think this is valid. This PR relaxes the test to allow those strings to appear in either order.
Checking `isOperationLegalOrCustom` instead of `isOperationLegal` allows more optimization opportunities. In particular, if a target marks `extract_vector_elt` as `Custom` rather than `Legal` in order to optimize certain cases, this combiner would otherwise miss some improvements. Previously, using `isOperationLegalOrCustom` was avoided due to the risk of getting stuck in infinite loops (as noted in llvm/llvm-project@61ec738). After testing, the issue no longer reproduces, but the coverage is limited to the regression/unit tests and the test-suite.
In simplifyBlends, when normalizing a blend recipe, the first mask that is used only by the blend and is not all-false is chosen, and its corresponding incoming value becomes the initial value, with the others blended into it. At the same time, the mask that is chosen can be eliminated. However, a multi-user mask might be used by a dead recipe, which prevents this optimization. This patch moves removeDeadRecipes before simplifyBlends to eliminate dead recipes, allowing simplifyBlends to remove more dead masks.
Implements the base of the MemoryLegalizer for a roughly correct GFX1250 memory model. Documentation will come later, and some remaining changes still have to be added, but this is the backbone of the model.
… (#157639) This reverts commit be17791. This is not necessary for gfx1250 anymore.
…#156714) Match `(X * Y) + Z` in `combineAdd`. If the target supports it and we don't overflow (i.e. we know the top 12 bits are unset), rewrite using VPMADD52L. Have just done the `L` version for now at least; wanted to get feedback before continuing.
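The no-overflow condition above can be illustrated with a scalar model of a single 64-bit lane. This is only a sketch under stated assumptions: `madd52lo` and `fitsIn52` are hypothetical helper names, not functions from the patch, and the model assumes the instruction multiplies the low 52 bits of each operand, keeps the low 52 bits of the product, and adds that to a 64-bit accumulator.

```cpp
#include <cassert>
#include <cstdint>

// Low 52 bits set; the "top 12 bits" of a 64-bit lane are the ones above this mask.
constexpr std::uint64_t Mask52 = (std::uint64_t{1} << 52) - 1;

// Hypothetical scalar model of one lane: multiply the low 52 bits of X and Y,
// keep the low 52 bits of the product, and add them to the accumulator.
// The 64-bit multiply may wrap, but its low 52 bits are still exact.
constexpr std::uint64_t madd52lo(std::uint64_t Acc, std::uint64_t X,
                                 std::uint64_t Y) {
  return Acc + (((X & Mask52) * (Y & Mask52)) & Mask52);
}

// Hypothetical legality check: both operands have their top 12 bits clear
// and the product itself fits in 52 bits.
constexpr bool fitsIn52(std::uint64_t X, std::uint64_t Y) {
  return (X | Y) <= Mask52 && (X == 0 || Y <= Mask52 / X);
}

// When the check holds, the model agrees with a plain multiply-add.
static_assert(fitsIn52(3, 5) && madd52lo(7, 3, 5) == 3 * 5 + 7,
              "small operands: rewrite is exact");

// When the product needs more than 52 bits, the model drops the high part.
static_assert(!fitsIn52(std::uint64_t{1} << 30, std::uint64_t{1} << 30) &&
                  madd52lo(0, std::uint64_t{1} << 30, std::uint64_t{1} << 30) == 0,
              "large operands: rewrite would be wrong");
```

The second `static_assert` shows why the combine needs known-bits information: without it, the rewritten form silently loses the upper part of the product.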
Apply the following changes:
* Ensure all float types are covered (`f16` and `f128` were often missing)
* Switch to more straightforward test names
* Remove some CHECK directives that are outdated (prefix changed but the directive did not get removed)
* Add common check prefixes to merge similar blocks
* Test a more similar set of platforms
* Add missing `nounwind`
* Test `strictfp` for each libcall where possible

This is a pre-test for [1].

[1]: llvm/llvm-project#152684
Align the syntax used for the optimization level argument of the expand-fp pass in textual descriptions of pass pipelines with the syntax used by other passes taking a similar argument. That is, use e.g. `expand-fp<O1>` instead of `expand-fp<opt-level=1>`.
`f16` is more functional than just a storage type on the platform, though it does have some codegen issues [1]. To prepare for future changes, do the following nonfunctional updates to the existing `half` test:
* Add tests for passing and returning the type directly.
* Add tests showing bitcast behavior, which is currently incorrect but serves as a baseline.
* Add tests for `fabs` and `copysign` (trivial operations that shouldn't require libcalls).
* Add invocations for big-endian and for PPC32.
* Rename the test to `half.ll` to reflect its status, which also matches other backends.

[1]: llvm/llvm-project#97975
[AMDGPU] Treat GEP offsets as signed in AMDGPUPromoteAlloca AMDGPUPromoteAlloca can transform i32 GEP offsets that operate on allocas into i64 extractelement indices. Before this patch, negative GEP offsets would be zero-extended, leading to wrong extractelement indices with values around (2**32-1). This fixes failing LlvmLibcCharacterConverterUTF32To8Test tests for AMDGPU.
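The difference between the two extensions can be seen in plain C++. This is a minimal sketch, not code from the patch; `zextOffset` and `sextOffset` are hypothetical names used only to show how a negative 32-bit offset behaves when widened to 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Zero-extension: reinterpret the 32-bit offset as unsigned, then widen.
// A negative offset becomes a large positive index.
constexpr std::uint64_t zextOffset(std::int32_t Offset) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(Offset));
}

// Sign-extension: widen the offset as a signed value, preserving its sign.
constexpr std::int64_t sextOffset(std::int32_t Offset) {
  return static_cast<std::int64_t>(Offset);
}

// An offset of -1 zero-extends to 2**32 - 1 but sign-extends to -1.
static_assert(zextOffset(-1) == 0xFFFFFFFFull, "zext yields 2**32 - 1");
static_assert(sextOffset(-1) == -1, "sext keeps the offset negative");

// Non-negative offsets are unaffected by the choice of extension.
static_assert(zextOffset(5) == 5 && sextOffset(5) == 5, "same for positives");
```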
…7702) The default values for DebugLocs in LoopVectorizer/VPlan were recently updated from empty DebugLocs to DebugLoc::getUnknown, as part of the DebugLoc Coverage Tracking work. However, there are some cases where we also pass an explicit empty DebugLoc, in many cases as a filler argument. This patch updates all of these to `getUnknown` for now, until either valid locations or a suitable categorization can be assigned to each instead. This change is NFC outside of DebugLoc coverage tracking builds.
`Predicates` and `Features` fields serve the same purpose. They should be kept in sync, but not all predicates are based on features. This resulted in dummy features being introduced for that reason alone. This patch removes the `Features` field and changes TableGen emitters to use `Predicates` instead. Historically, predicates were written with the assumption that the checking code will be used in `SelectionDAGISel` subclasses, meaning they will have access to the subclass variables, such as `Subtarget`. There are no such variables in the generated `GenSubtargetInfo::getHwModeSet()`, so we need to provide them. This can be achieved by subclassing `HwModePredicateProlog`; see an example in `Hexagon.td`.
Defaults to "agent" for targets that do not support it.
- Add documentation
- Register it in MachineModuleInfo
- Add MemoryLegalizer support
This test is strange since it's full of decoding failure warnings
…(#157304) * Apply the typo fix as a separate NFC patch from here: https://github.com/llvm/llvm-project/pull/134330/files#r2313015079
Make sure the tested error is the literal error, not for unaligned registers.
This will add label `clang:temporal-safety` to PRs touching the mentioned files.
Removing statefulness also adds the benefit of short-circuiting.
During CSE, we don't have to drop all poison-generating flags on mis-match, we can keep the ones common on both recipes. PR: llvm/llvm-project#157664
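The idea of keeping only the common flags can be sketched with a bitmask intersection. This is an illustrative model, not the VPlan implementation: the flag constants and the functions `dropAllOnMismatch` and `keepCommon` are hypothetical names introduced here.

```cpp
#include <cassert>

// Hypothetical poison-generating flags, one bit each.
enum : unsigned { NoFlags = 0, NUW = 1, NSW = 2, Exact = 4 };

// Conservative strategy: if the two flag sets differ, drop everything.
constexpr unsigned dropAllOnMismatch(unsigned A, unsigned B) {
  return A == B ? A : NoFlags;
}

// Less conservative strategy: the surviving recipe keeps only the flags
// that both recipes carried, so it promises nothing either one did not.
constexpr unsigned keepCommon(unsigned A, unsigned B) { return A & B; }

// {NUW, NSW} merged with {NSW, Exact}: only NSW is common to both.
static_assert(dropAllOnMismatch(NUW | NSW, NSW | Exact) == NoFlags,
              "mismatch drops all flags");
static_assert(keepCommon(NUW | NSW, NSW | Exact) == NSW,
              "intersection keeps the shared flag");
```

The intersection is never weaker than dropping everything and never claims a flag that one of the merged recipes lacked.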
The test introduced by PR #157408 requires the amdgpu target. Move it to the subdirectory which only runs if the target is available.
The limit 'dfa-max-num-paths', which controls the number of enumerated paths, was not checked inside getPathsFromStateDefMap. This could lead to large memory consumption for sufficiently complex switch statements. Reland llvm/llvm-project#145482
Required for `BUILD_SHARED_LIBS=ON` builds with optimizations disabled for the new FortranUtils library. Also see #150027 #155422
There are no references to it anymore in the codebase.
InvalidAtomicBuiltins.cl requires an update after llvm/llvm-project#157364 Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@28fc4d6306e812c
…pace (#3351) Global constants cannot be processed in SPIRV. This change enables the translation of global constants declared in the private address space into local variables within the functions where they are used. Original commit: KhronosGroup/SPIRV-LLVM-Translator@667e1cb81b51397
…float-point-constants.ll, local-integers-constants.ll, local-null-constants.ll (#3306) --Test for constant propagation, global and local constants, from constant subdirectory in llvm project. Original commit: KhronosGroup/SPIRV-LLVM-Translator@601cdf7397ee135
In case of missing declaration we shouldn't place DebugInfoNone for OpenCL and Shader.DebugInfo.100 instruction sets. Solves issue from KhronosGroup/SPIRV-LLVM-Translator#3275 Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@fa9bc14efcd68e0
The translator follows itanium mangling and per https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling __bf16 mangling should be 'DF16b'. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@0cdcf6db375afb6
Fix code quality issues reported by a static analysis tool. Original commit: KhronosGroup/SPIRV-LLVM-Translator@b0c1b9b278a82eb
d54f77c5 ("[NFC] Split of SPT and SPIR-V in header parsing (#2316)", 2024-03-11) made a copy of the error log, with the presumably unintended consequence that errors are no longer propagated back to the SPIRVModule itself. Original commit: KhronosGroup/SPIRV-LLVM-Translator@3d58c69cf2f3704
Due to SimplifyCFG change in llvm/llvm-project@ea2f539
LLVM: llvm/llvm-project@2f755c5
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@3d58c69