[SYCL][FPGA] Allowing max-concurrency attribute on functions. #3362
template <int NT>
[[intel::reqd_sub_group_size(NT)]] void func() {}
Did you mean to use [[intel::max_concurrency(NT)]] here?
[[intel::max_concurrency(2)]] for (int i = 0; i != 10; ++i) a[i] = 0;
}

[[intel::component_max_concurrency(2)]] void foo1 { }
You are adding a new spelling here, but you did not add it to the Attr.td file, and the CodeGen test uses the existing loop attribute spelling [[intel::max_concurrency()]]. Are we going to use the same spelling for the function attribute as for the loop attribute?
// CHECK: !17 = !{i32 2}
// CHECK: !18 = !{i32 3}

template <typename name, typename Func>
Please follow the updated guidelines for new FE tests: see the section “Guidelines for adding DPC++ in-tree LIT tests (DPC++ Clang FE tests)” in https://github.com/intel/llvm/blob/sycl/CONTRIBUTING.md.
if (const SYCLIntelFPGAMaxConcurrencyAttr *A =
        FD->getAttr<SYCLIntelFPGAMaxConcurrencyAttr>()) {
  const auto *CE = dyn_cast<ConstantExpr>(A->getNThreadsExpr());
You are missing an assert here, before ArgVal:
assert(CE && "Not an integer constant expression");
- const auto *CE = dyn_cast<ConstantExpr>(A->getNThreadsExpr());
+ const auto *CE = cast<ConstantExpr>(A->getNThreadsExpr());
No need for a dyn_cast<> followed by an assert.
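The distinction behind this suggestion can be sketched outside of LLVM. In LLVM, dyn_cast<> returns a null pointer when the object is not of the requested type, while cast<> asserts that it is. The types and helpers below (Expr, ConstantExpr, dynCastModel, castModel) are hypothetical stand-ins written for illustration only; they are not Clang's classes or LLVM's casting templates.

```cpp
#include <cassert>

// Hypothetical stand-in for an expression hierarchy; not Clang's Expr.
struct Expr {
  enum Kind { ConstantKind, OtherKind };
  Kind K;
  explicit Expr(Kind K) : K(K) {}
};

// Hypothetical stand-in for a constant expression holding its value.
struct ConstantExpr : Expr {
  long long Value;
  explicit ConstantExpr(long long V) : Expr(ConstantKind), Value(V) {}
  // LLVM-style type test used by the helpers below.
  static bool classof(const Expr *E) { return E->K == ConstantKind; }
};

// Models dyn_cast<>: returns nullptr when the type does not match,
// so the caller must check the result before using it.
template <typename To, typename From>
const To *dynCastModel(const From *V) {
  return To::classof(V) ? static_cast<const To *>(V) : nullptr;
}

// Models cast<>: the type check is an assertion inside the helper,
// so no separate assert is needed at the call site.
template <typename To, typename From>
const To *castModel(const From *V) {
  assert(To::classof(V) && "cast to an incompatible type");
  return static_cast<const To *>(V);
}
```

When the code already knows the expression must be a constant expression, the castModel form states that assumption in one place, which is the point of the review comment.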
void bar() {
  [[intel::max_concurrency(N)]] for(;;) { }
}
[[intel::component_max_concurrency(N)]] void bar1() { }
Same here: this is a new attribute spelling, and you did not use it anywhere else.
clang/lib/Sema/SemaSYCL.cpp
Outdated
if (auto *A = FD->getAttr<SYCLIntelFPGAMaxConcurrencyAttr>())
  Attrs.insert(A);
Are we going to propagate this attribute to the caller?
Could you please provide a description for this PR? We also need a SemaSYCL test here. Thanks.
All of the sema tests are missing for testing the warning situations (applies to functions and not other types of declarations, etc).
I think we're also missing a "merge" function for the attribute as functions can have redeclarations. e.g., we want to catch things like:
[[intel::max_concurrency(4)]] void func();
[[intel::max_concurrency(5)]] void func() {}
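The redeclaration check described above can be sketched as a small standalone model. This is a rough illustration, not Clang code: FunctionDeclModel, MergeResult, and mergeMaxConcurrency are hypothetical names, and the policy shown (identical values are accepted, differing values are reported as a conflict) is an assumption about how such a merge might behave.

```cpp
#include <optional>

// Hypothetical stand-in for a function declaration that may already carry
// a max_concurrency value from an earlier declaration.
struct FunctionDeclModel {
  std::optional<unsigned> MaxConcurrency;
};

// Outcome of merging an attribute value from a redeclaration.
enum class MergeResult { Added, AlreadyPresent, Conflict };

// Merge the value seen on a redeclaration into the existing declaration.
// Assumed policy: first value wins, a repeated identical value is a no-op,
// and a different value is a conflict the caller should diagnose.
MergeResult mergeMaxConcurrency(FunctionDeclModel &D, unsigned NewValue) {
  if (!D.MaxConcurrency) {
    D.MaxConcurrency = NewValue;
    return MergeResult::Added;
  }
  if (*D.MaxConcurrency == NewValue)
    return MergeResult::AlreadyPresent;
  return MergeResult::Conflict;
}
```

With this model, merging 4 and then 5 onto the same declaration yields Conflict, which corresponds to the func() example in the comment above.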
clang/include/clang/Basic/Attr.td
Outdated
  CXX11<"intel","max_concurrency">];
- let Subjects = SubjectList<[ForStmt, CXXForRangeStmt, WhileStmt, DoStmt],
+ let Subjects = SubjectList<[ForStmt, CXXForRangeStmt, WhileStmt, DoStmt, Function],
  ErrorDiag, "'for', 'while', and 'do' statements">;
The text of the diagnostic needs to be updated as well.
clang/include/clang/Basic/Attr.td
Outdated
}

- def SYCLIntelFPGAMaxConcurrency : StmtAttr {
+ def SYCLIntelFPGAMaxConcurrency : InheritableAttr {
- def SYCLIntelFPGAMaxConcurrency : InheritableAttr {
+ def SYCLIntelFPGAMaxConcurrency : DeclOrStmtAttr {
clang/include/clang/Sema/Sema.h
Outdated
/// declaration.
void addSYCLIntelPipeIOAttr(Decl *D, const AttributeCommonInfo &CI, Expr *ID);

/// AddSYCLIntelFPGAMaxConcurrencyAttr - Adds a max_component attribute to a
max_component -- did you mean max_concurrency?
Fn->setMetadata("stall_enable", llvm::MDNode::get(Context, AttrMDArgs));
}

if (const SYCLIntelFPGAMaxConcurrencyAttr *A =
- if (const SYCLIntelFPGAMaxConcurrencyAttr *A =
+ if (const auto *A =
Because the type is spelled out explicitly in the initializer, this is a good time to use auto.
if (const SYCLIntelFPGAMaxConcurrencyAttr *A =
        FD->getAttr<SYCLIntelFPGAMaxConcurrencyAttr>()) {
  const auto *CE = dyn_cast<ConstantExpr>(A->getNThreadsExpr());
  Optional<llvm::APSInt> ArgVal = CE->getResultAsAPSInt();
- Optional<llvm::APSInt> ArgVal = CE->getResultAsAPSInt();
+ llvm::APSInt ArgVal = CE->getResultAsAPSInt();
The function doesn't return an Optional, so I assume this was unintentional.
@AaronBallman Forgot to work on this. Sorry.
clang/lib/Sema/SemaSYCL.cpp
Outdated
case attr::Kind::SYCLIntelFPGAMaxConcurrency: {
  auto *SIMCA = cast<SYCLIntelFPGAMaxConcurrencyAttr>(A);
  if (auto *Existing =
          SYCLKernel->getAttr<SYCLIntelFPGAMaxConcurrencyAttr>()) {
    ASTContext &Ctx = getASTContext();
    if (Existing->getNThreadsExpr() > SIMCA->getNThreadsExpr()) {
      Diag(SYCLKernel->getLocation(),
           diag::err_conflicting_sycl_kernel_attributes);
      Diag(Existing->getLocation(), diag::note_conflicting_attribute);
      Diag(SIMCA->getLocation(), diag::note_conflicting_attribute);
      SYCLKernel->setInvalidDecl();
    } else {
      SYCLKernel->addAttr(A);
    }
  } else {
    SYCLKernel->addAttr(A);
  }
  break;
}
I am wondering what the purpose of adding this diagnostic here is. I do not see any test for it.
I added it to be able to check for cases like this, where an earlier declaration of the function already carries the attribute:
[[intel::max_concurrency(4)]] void func();
[[intel::max_concurrency(5)]] void func() {}
I need to add a test case for this.
You can add a "merge" function to check this, instead of the SemaSYCL code here:
SYCLIntelFPGAMaxConcurrencyAttr *Sema::MergeSYCLIntelFPGAMaxConcurrencyAttr(
    Decl *D, const SYCLIntelFPGAMaxConcurrencyAttr &A) {
Ok. Studying that! Thanks.
I think adding the code here only handles diagnostics when applying attributes to a SYCL kernel, i.e. when attributes are propagated to the kernel from device functions called by the kernel. I don't think duplicate attributes on any other function will be handled by this.
Based on the code in L563, it looks like the attribute can be explicitly specified on a kernel, but it is not propagated to the kernel from device functions. In that case, I do not think this block of code is required. I think the "merge" function that @smanna12 suggested should cover kernel functions as well. However, please verify with tests for both 'normal' and kernel functions.
void bar() {
  [[intel::max_concurrency(N)]] for(;;) { }
}
[[intel::max_concurrency(N)]] void bar1() { }
This example is missing the template keyword that introduces N.
E = Res.get();

// This attribute requires a strictly positive value.
if (ArgVal <= 0) {
Document description says non-negative values are allowed, and that zero means unbounded.
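A check that matches the documented behavior (negative values rejected, zero meaning unbounded) could look roughly like the sketch below. It is a standalone model under stated assumptions: ConcurrencyLimit and checkMaxConcurrencyArg are hypothetical names, and the real Sema code would operate on an llvm::APSInt and emit diagnostics rather than return an optional.

```cpp
#include <limits>
#include <optional>

// Hypothetical result type: either "unbounded" or a concrete thread limit.
struct ConcurrencyLimit {
  bool Unbounded;
  unsigned MaxThreads; // Only meaningful when Unbounded is false.
};

// Validate a max_concurrency argument under the assumption stated in the
// review comment: negative values are invalid, zero means unbounded, and
// any positive value is an explicit limit.
std::optional<ConcurrencyLimit> checkMaxConcurrencyArg(long long ArgVal) {
  if (ArgVal < 0)
    return std::nullopt; // Would be diagnosed as an error.
  if (ArgVal == 0)
    return ConcurrencyLimit{true, 0};
  // Reject values that do not fit the unsigned field of this model.
  if (ArgVal > static_cast<long long>(std::numeric_limits<unsigned>::max()))
    return std::nullopt;
  return ConcurrencyLimit{false, static_cast<unsigned>(ArgVal)};
}
```

The key difference from the snippet under review is the comparison: `ArgVal < 0` instead of `ArgVal <= 0`, so that zero is accepted.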
@@ -0,0 +1,89 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -disable-llvm-passes -triple spir64-unknown-unknown-sycldevice -sycl-std=2020 -emit-llvm -o - %s | FileCheck %s
Please add a comment here describing what the test does.
int main() {
  queue q;

  kernel_single_task_1<class kernel_function>([]() {
New guidelines suggest using kernel_single_task from the sycl.hpp header instead.
func<2>();
});

});
Could you also add an example that uses '0'?
@@ -0,0 +1,125 @@
// RUN: %clang_cc1 -fsycl-is-device -internal-isystem %S/Inputs -sycl-std=2020 -fsyntax-only -ast-dump -verify -pedantic %s | FileCheck %s
Please add a comment here indicating what this test does.
[[intel::max_concurrency(8)]] void dup();
[[intel::max_concurrency(9)]] void dup() {} // expected-error {{duplicate Intel FPGA function attribute 'max_concurrency'}}
Could you also add an example where the declaration and definition use the same value?
d54f77c5 ("[NFC] Split of SPT and SPIR-V in header parsing (#2316)", 2024-03-11) made a copy of the error log, with the presumably unintended consequence that errors are no longer propagated back to the SPIRVModule itself. Original commit: KhronosGroup/SPIRV-LLVM-Translator@3d58c69cf2f3704
This patch implements support for a new FPGA function attribute, 'max_concurrency'. The attribute already exists for loops. It takes a single unsigned integer argument and has the following syntax:
[[intel::component_max_concurrency(n)]]
An example of its use:
[[intel::component_max_concurrency(n)]]
void foo() {
}
The LLVM IR representation will be function metadata:
!component_max_concurrency !0
!0 = !{i32 n}
The max_concurrency attribute applies to functions in device code. It is not propagated to the caller.