[LLVM] Fix offload and update CUDA ABI for all SM values #159354
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -931,6 +931,12 @@ enum : unsigned { | |
// Processor selection mask for EF_CUDA_SM* values prior to blackwell. | ||
EF_CUDA_SM = 0xff, | ||
|
||
// Processor selection mask for EF_CUDA_SM* values following blackwell. | ||
EF_CUDA_SM_MASK = 0xff00, | ||
|
||
// Bit offset of the EF_CUDA_SM* field selected by EF_CUDA_SM_MASK. | ||
EF_CUDA_SM_OFFSET = 8, | ||
|
||
// SM based processor values. | ||
EF_CUDA_SM20 = 0x14, | ||
EF_CUDA_SM21 = 0x15, | ||
|
@@ -950,9 +956,15 @@ enum : unsigned { | |
EF_CUDA_SM80 = 0x50, | ||
EF_CUDA_SM86 = 0x56, | ||
EF_CUDA_SM87 = 0x57, | ||
EF_CUDA_SM88 = 0x58, | ||
EF_CUDA_SM89 = 0x59, | ||
// The sm_90a variant uses the same machine flag. | ||
EF_CUDA_SM90 = 0x5a, | ||
EF_CUDA_SM100 = 0x64, | ||
EF_CUDA_SM101 = 0x65, | ||
About that sm_101.

I don't have easy access to CUDA 13.0 yet, just an ELF someone else gave me which these work on. Will it be sufficient to just handle both cases?

Here's the dump of ELF headers for sm_110/a/f with cuda-13 and 101/a/f with cuda-12.9: https://gist.github.com/Artem-B/1995e3bd80a06b3bee33196e8753d73b
The definitions look fine.
||
EF_CUDA_SM103 = 0x67, | ||
EF_CUDA_SM110 = 0x6e, | ||
EF_CUDA_SM120 = 0x78, | ||
EF_CUDA_SM121 = 0x79, | ||
|
||
// Unified texture binding is enabled. | ||
EF_CUDA_TEXMODE_UNIFIED = 0x100, | ||
|
@@ -968,17 +980,7 @@ enum : unsigned { | |
// Virtual processor selection mask for EF_CUDA_VIRTUAL_SM* values. | ||
EF_CUDA_VIRTUAL_SM = 0xff0000, | ||
|
||
// Processor selection mask for EF_CUDA_SM* values following blackwell. | ||
EF_CUDA_SM_MASK = 0xff00, | ||
|
||
// SM based processor values. | ||
EF_CUDA_SM100 = 0x6400, | ||
EF_CUDA_SM101 = 0x6500, | ||
EF_CUDA_SM103 = 0x6700, | ||
EF_CUDA_SM120 = 0x7800, | ||
EF_CUDA_SM121 = 0x7900, | ||
|
||
// Set when using an accelerator variant like sm_100a. | ||
// Set when using an accelerator variant like sm_100a in the new ABI. | ||
EF_CUDA_ACCELERATORS = 0x8, | ||
}; | ||
|
||
|
NVIDIA has added sm_88, too. https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#changes-in-ptx-isa-version-9-0