spectrometerHBH
Contributor

This pull request enables support for CUtensorMap in both the IR and runtime layers.

@tqchen
Member

tqchen commented Jun 27, 2025

cc @Hzfengsy

Member

@tqchen tqchen left a comment

overall looks good, please fix lint


class TensorMapTypeNode : public TypeNode {
 public:
  void VisitAttrs(AttrVisitor* v) { v->Visit("span", &span); }
Member

does not have to be now, see #18098 to transition to new reflection, i can do that after merge as well

@Hzfengsy
Member

cc @LeiWang1999

@tqchen tqchen merged commit b6db2ec into apache:main Jun 30, 2025
13 checks passed
@w1049
Contributor

w1049 commented Jul 1, 2025

Should we add a version check? CUtensorMap causes compilation failures on CUDA versions below 12.

@tqchen
Member

tqchen commented Jul 1, 2025

@w1049 can you elaborate on the failure cases?

@w1049
Contributor

w1049 commented Jul 1, 2025

> @w1049 can you elaborate on the failure cases?

The CUtensorMap feature was introduced in CUDA 12. When attempting to compile with CUDA 11.8, the build fails because CUtensorMap is not available in cuda.h. The error message appears as:

tvm/src/runtime/cuda/cuda_device_api.cc:525: undefined reference to `cuTensorMapEncodeTiled'

Similar issue: llvm/llvm-project#64529

@tqchen
Member

tqchen commented Jul 1, 2025

i see, seems we should guard the registration of runtime.cuTensorMapEncodeTiled with #if (CUDA_VERSION >= 12000) macro.
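
A minimal sketch of the kind of guard suggested above, not the actual change from the follow-up PR. The registry type, `GlobalRegistry`, `RegisterCudaFunctions`, `TensorMapSupported`, and the `runtime.example_always_available` entry are all hypothetical stand-ins for illustration; only `CUDA_VERSION` (defined by `cuda.h`) and the name `runtime.cuTensorMapEncodeTiled` come from the discussion. The sketch also compiles without a CUDA toolkit, in which case the guarded entry is simply not registered.

```cpp
#include <functional>
#include <map>
#include <string>

// Pull in cuda.h only when it is present; it defines CUDA_VERSION
// (for example 12000 for CUDA 12.0).
#if defined(__has_include)
#if __has_include(<cuda.h>)
#include <cuda.h>
#endif
#endif

// Hypothetical stand-in for a global function registry (assumption, not TVM's API).
using Registry = std::map<std::string, std::function<int()>>;

inline Registry& GlobalRegistry() {
  static Registry registry;
  return registry;
}

// True only when the toolkit headers are new enough to declare CUtensorMap.
inline bool TensorMapSupported() {
#if defined(CUDA_VERSION) && (CUDA_VERSION >= 12000)
  return true;
#else
  return false;
#endif
}

inline void RegisterCudaFunctions() {
  // Hypothetical entry that does not depend on the CUDA version.
  GlobalRegistry()["runtime.example_always_available"] = [] { return 0; };

#if defined(CUDA_VERSION) && (CUDA_VERSION >= 12000)
  // Only inside this guard would real code reference cuTensorMapEncodeTiled,
  // so older toolkits never see the symbol at compile or link time.
  GlobalRegistry()["runtime.cuTensorMapEncodeTiled"] = [] { return 0; };
#endif
}
```

With this shape, a build against CUDA 11.8 would skip the guarded block entirely, and callers would see a missing registry entry at run time instead of a build failure.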

@tqchen
Member

tqchen commented Jul 2, 2025

fixed by #18107 thanks @w1049

ShiboXing pushed a commit to ShiboXing/tvm that referenced this pull request Aug 10, 2025
This PR introduces CUtensorMap support in the runtime module. It enables calling kernels whose arguments are CUtensorMap; these arguments are passed as a handle (address) and associated with arg_extra_tags that indicate they are tensor maps. The TensorMap is allocated on the stack with a runtime API.
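
One way to picture the tagging described in that commit message is the sketch below. It is an illustration under stated assumptions, not the code from this PR: `ArgExtraTag`, `ArgSlot`, `FakeTensorMap`, and `BuildKernelParams` are hypothetical names, and `FakeTensorMap` merely stands in for the opaque `CUtensorMap` struct. The assumption is that the launcher builds a `void*` array with one entry per kernel parameter, and that a tensor-map argument's handle already is the address of the struct the kernel should receive.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-argument tag (assumption; mirrors the idea of arg_extra_tags).
enum class ArgExtraTag { kNone, kTensorMap };

// Stand-in for the opaque tensor map struct, used only for illustration.
struct alignas(64) FakeTensorMap {
  std::uint64_t opaque[16];
};

// One packed argument slot: either a scalar value or a handle (address).
union ArgSlot {
  std::int64_t v_int64;
  void* v_handle;
};

// Build the per-parameter pointer array a kernel launch would consume.
std::vector<void*> BuildKernelParams(std::vector<ArgSlot>& slots,
                                     const std::vector<ArgExtraTag>& tags) {
  std::vector<void*> params(slots.size());
  for (std::size_t i = 0; i < slots.size(); ++i) {
    if (tags[i] == ArgExtraTag::kTensorMap) {
      // The handle is the address of the tensor map itself, so pass it through.
      params[i] = slots[i].v_handle;
    } else {
      // Ordinary arguments are passed as a pointer to their slot.
      params[i] = &slots[i];
    }
  }
  return params;
}
```

The design point is that the tag changes only how the pointer for that parameter is derived; the argument still travels through the same handle-typed slot as any other address.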
4 participants