Describe the bug
Hi team,
We received an end-to-end performance issue report from Llama 3.1 users: they observed a performance drop when using the Inductor C++ wrapper (AOTInductor) compared to the Python wrapper.
The root cause is that, in the C++ wrapper, Inductor must launch the kernel directly (Triton is not required in AOTInductor deploy mode). Launching the SPIR-V kernel compiled by Triton requires passing a `build_flag` to the Level Zero API `zeModuleCreate` to indicate whether large GRF mode is enabled. However, Inductor currently has no visibility into this flag, which is determined by Triton. As a result, Inductor does not pass the correct `build_flag` to Level Zero, and the resulting binary kernel differs from the one Triton would build.
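For context, here is a minimal sketch of how the flag is tied to the GRF mode. The flag strings follow Intel's Level Zero/IGC documentation; the `grf_mode` option name is an assumption for illustration, not confirmed Triton API:

```python
# A sketch, not Triton's actual code: map a GRF-mode compile option to the
# Level Zero build flags passed to zeModuleCreate. "grf_mode" is an assumed
# option name; the flag strings follow Intel's Level Zero/IGC documentation.
def level_zero_build_flags(grf_mode: str) -> str:
    if grf_mode == "large":
        # Documented option enabling large GRF (register file) mode.
        return "-ze-opt-large-register-file"
    if grf_mode == "auto":
        # Let the compiler choose large GRF per kernel.
        return "-ze-intel-enable-auto-large-GRF-mode"
    return ""  # default: no extra flag

# Conceptually, the Python-wrapper launch path then does:
#   zeModuleCreate(ctx, dev, desc_with(pBuildFlags=flags), &module, &log)
# The C++ wrapper must pass the same flags, or Level Zero builds a
# different binary kernel from the same SPIR-V.
```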
To address this, I suggest that Triton store the `build_flag` in the `metadata` of the `CompiledKernel` object returned by `triton.compile()`. This way, Inductor can retrieve and propagate the correct flag; a sketch of both sides follows below.
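To make the request concrete, here is a hedged sketch of the two sides; the `build_flags` metadata field is hypothetical and named only for illustration:

```python
# Hypothetical sketch; "build_flags" is an illustrative metadata field name,
# not an existing Triton attribute.

# 1) Triton side: when the XPU backend assembles the dict that backs
#    CompiledKernel.metadata, record the Level Zero flags it compiled with
#    (e.g. the string returned by level_zero_build_flags() above).
def record_build_flags(metadata: dict, flags: str) -> None:
    metadata["build_flags"] = flags

# 2) Inductor side: read the flag from the CompiledKernel returned by
#    triton.compile() and emit it into the generated C++ wrapper, so that
#    the wrapper's zeModuleCreate call matches Triton's own build.
def wrapper_build_flags(compiled) -> str:
    # Default to "" so kernels without the field keep current behavior.
    return getattr(compiled.metadata, "build_flags", "")
```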
This is a critical performance issue for AOTInductor users and others relying on the C++ wrapper to reduce host overhead. We would greatly appreciate it if this feature request could be included in PyTorch 2.10.
Thanks
Environment details
None