diff --git a/sycl/doc/SYCLCompilerAndRuntimeDesign.md b/sycl/doc/SYCLCompilerAndRuntimeDesign.md
index 7d51ed02229a5..182b37bdf0375 100644
--- a/sycl/doc/SYCLCompilerAndRuntimeDesign.md
+++ b/sycl/doc/SYCLCompilerAndRuntimeDesign.md
@@ -394,6 +394,51 @@ llvm-no-spir-kernel host.bc
 It returns 0 if no kernels are present and 1 otherwise.

+#### Device code split
+
+Putting all device code into a single SPIR-V module does not work well in the
+following cases:
+1. Thousands of kernels are defined and only a small part of them is used at
+run-time. Having them all in one SPIR-V module significantly increases JIT time.
+2. Device code can be specialized for different devices. For example, kernels
+that are supposed to be executed only on FPGA can use extensions available for
+FPGA only. This causes JIT compilation to fail on other devices even if these
+kernels are never called there.
+
+To resolve these problems, the compiler can split a single module into smaller
+ones. The following features are supported:
+* Emitting a separate module for each source (translation unit)
+* Emitting a separate module for each kernel
+
+The current approach is:
+* Generate special metadata with a translation unit ID for each kernel in the
+SYCL front-end. This ID is used to group kernels per translation unit
+* Link all device LLVM modules using llvm-link
+* Perform the split on the fully linked module
+* Generate a symbol table (list of kernels) for each produced device module for
+proper module selection at run-time
+* Perform SPIR-V translation and AOT compilation (if requested) on each produced
+module separately
+* Add information about the contained kernels to the wrapper object for each
+device image
+
+Device code splitting process:
+![Device code splitting](images/DeviceCodeSplit.svg)
+
+The "split" box is implemented by the dedicated tool
+`sycl-post-link`.
+The tool runs a set of LLVM passes to split the input module and
+generates a symbol table (list of kernels) for each produced device module.
+
+To enable device code split, a special option must be passed to the clang
+driver:
+
+`-fsycl-device-code-split=`
+
+There are three possible values for this option:
+* `per_source` - enables emitting a separate module for each source (translation
+unit)
+* `per_kernel` - enables emitting a separate module for each kernel
+* `off` - disables device code split

 ### Integration with SPIR-V format

diff --git a/sycl/doc/images/DeviceCodeSplit.svg b/sycl/doc/images/DeviceCodeSplit.svg
new file mode 100755
index 0000000000000..d619c3b87a310
--- /dev/null
+++ b/sycl/doc/images/DeviceCodeSplit.svg
@@ -0,0 +1,1306 @@
[SVG markup not recoverable; only the figure's text labels survive. Figure "Device code splitting":
 - Inputs: S1.cpp contains `void a() {…}` and `Kernel x() {b();}`; S2.cpp contains `void b() {…}` and `Kernel y() {a();}`.
 - Front-end: produces S1.bc and S2.bc, with kernel x tagged 'S1' and kernel y tagged 'S2'.
 - llvm-link: produces S1S2.bc containing both kernels (with their tags) and both functions.
 - split: produces S1.bc (`void b() {…}`, `Kernel x() {b();}`) with symbol table S1.txt ("x"), and S2.bc (`void a() {…}`, `Kernel y() {a();}`) with symbol table S2.txt ("y").
 - llvm-spirv: produces S1.spv and S2.spv; aot: produces S1.bin and S2.bin.
 - clang-offload-wrapper: receives the outputs for each device image.]
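
Note on the grouping step: the split described in the "Device code split" section can be illustrated with a small Python sketch. This is not the `sycl-post-link` implementation. It assumes each kernel is represented as a hypothetical `(name, translation_unit_id)` pair and only mirrors how the three `-fsycl-device-code-split` values would group kernels into per-module symbol tables. The function name `split_kernels`, the `"all"` module label, and the extra kernel `z` are invented for illustration.

```python
from collections import defaultdict


def split_kernels(kernels, mode):
    """Group kernel names into per-module symbol tables.

    kernels: list of (kernel_name, translation_unit_id) pairs (hypothetical input shape)
    mode: "per_source", "per_kernel", or "off"
    Returns a dict mapping a module label to the sorted kernel names it contains.
    """
    if mode not in ("per_source", "per_kernel", "off"):
        raise ValueError(f"unknown split mode: {mode}")

    groups = defaultdict(list)
    for name, tu_id in kernels:
        if mode == "per_source":
            key = tu_id   # one module per translation unit
        elif mode == "per_kernel":
            key = name    # one module per kernel
        else:
            key = "all"   # no split: a single module (label is an assumption)
        groups[key].append(name)

    return {key: sorted(names) for key, names in groups.items()}


# Kernels x (from S1) and y (from S2) follow the figure; z is a hypothetical extra.
kernels = [("x", "S1"), ("y", "S2"), ("z", "S1")]

for mode in ("per_source", "per_kernel", "off"):
    print(mode, split_kernels(kernels, mode))
```

Each value in the returned dictionary plays the role of one symbol table (such as `S1.txt` in the figure). The real tool also has to copy the functions each kernel calls into the same module, which this sketch leaves out.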