Using a central installation of the compiler with the CUDA backend turned on #2218

@krasznaa

Dear All,

I ran into an interesting issue just now, which I thought could be worth discussing a bit...

I realised that if I build the project with its CUDA backend enabled, I can only use the compiler on a machine that has an NVidia GPU installed (or at least has the necessary driver installed, even if it doesn't have a GPU). This is disappointing, as I was hoping to share a single installation between a number of my machines, some of which have both NVidia and Intel devices in them, and some of which have only Intel ones.

But when I use the compiler on a machine without an NVidia driver installed (because it doesn't have an NVidia card), I run into the following:

[bash][atlas]:build > ldd -r /opt/intel-clang/12.0.0/x86_64-centos8/lib/libsycl.so
        linux-vdso.so.1 (0x00007ffe7f2de000)
        libOpenCL.so.1 => /opt/intel-clang/12.0.0/x86_64-centos8/lib/libOpenCL.so.1 (0x00007ff00841b000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007ff008217000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007ff007ff7000)
        libpi_cuda.so => /opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so (0x00007ff007dbc000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007ff007ba4000)
        libcuda.so.1 => not found
        libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007ff00780f000)
        libm.so.6 => /lib64/libm.so.6 (0x00007ff00748d000)
        libc.so.6 => /lib64/libc.so.6 (0x00007ff0070cb000)
        /lib64/ld-linux-x86-64.so.2 (0x00007ff008b2f000)
        libcuda.so.1 => not found
undefined symbol: cuSurfObjectDestroy   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuGetErrorName        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxDestroy_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxSetCurrent       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuArray3DCreate_v2    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuEventCreate (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyDtoDAsync_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpy2DAsync_v2    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDeviceGet   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuLinkCreate_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemHostUnregister   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemFreeHost (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuStreamCreate        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemAlloc_v2 (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyAtoHAsync_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuArrayGetDescriptor_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuEventElapsedTime    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemAllocHost_v2     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemFree_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDeviceTotalMem_v2   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyHtoAAsync_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyHtoA_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyAtoA_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemAllocManaged     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyDtoHAsync_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemsetD8Async       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDriverGetVersion    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuArray3DGetDescriptor_v2     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDeviceGetName       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpy3D_v2 (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemHostGetDevicePointer_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxPopCurrent_v2    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuLinkDestroy (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuLinkComplete        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyHtoDAsync_v2  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxGetCurrent       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuGetErrorString      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuLaunchKernel        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuSurfObjectCreate    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuModuleUnload        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemsetD2D32Async    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuStreamWaitEvent     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxSynchronize      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuModuleLoadDataEx    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDeviceGetAttribute  (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemPrefetchAsync    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDeviceGetCount      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyHtoD_v2       (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpyAsync (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuFuncGetAttribute    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuInit        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuEventSynchronize    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuArrayDestroy        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuEventRecord (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxCreate_v2        (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpy3DAsync_v2    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuModuleGetFunction   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuStreamSynchronize   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuCtxPushCurrent_v2   (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuLinkAddData_v2      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDevicePrimaryCtxRetain      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuPointerGetAttribute (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemsetD32Async      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemsetD16Async      (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuMemcpy2D_v2 (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuStreamDestroy_v2    (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuEventDestroy_v2     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
undefined symbol: cuDevicePrimaryCtxRelease     (/opt/intel-clang/12.0.0/x86_64-centos8/lib/libpi_cuda.so)
[bash][atlas]:build >

(In this case this is actually happening inside a Docker image rather than with an installation on a network file system, but I was also thinking about installing the compiler on a network drive.)

The issue is clearly that in this setup libsycl.so always needs libpi_cuda.so at load time, and libpi_cuda.so in turn needs libcuda.so.1. But wouldn't it be possible to make it use libpi_cuda.so more as a "plugin"? That way, if the plugin can't be loaded or used (for whatever reason), it would still be able to compile code at least for platforms that don't need libpi_cuda.so.

I guess at that point it would be better to handle every backend in the same way, so that if the OpenCL-based backend has issues but the CUDA-based one does not, the compiler could still function... (Even though the OpenCL-based one is less of an issue, as you can very easily install libOpenCL.so without a compatible piece of hardware.)
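To make the "plugin" idea a bit more concrete, here is a minimal sketch of what I have in mind, assuming the runtime would try to open each backend library with dlopen() instead of being linked against it. The function and struct names are made up for illustration and are not taken from the libsycl.so sources; the only real calls are dlopen() and dlerror().

```cpp
// Sketch only: load backend plugins at runtime instead of linking them in.
// LoadedPlugin and load_available_plugins are hypothetical names; this is
// not how libsycl.so is actually implemented.
#include <dlfcn.h>

#include <string>
#include <vector>

struct LoadedPlugin {
    std::string name;  // file name that was passed to dlopen()
    void* handle;      // opaque handle returned by dlopen()
};

// Try every candidate library. Keep the ones that load, and optionally
// record an error message for each one that does not.
std::vector<LoadedPlugin> load_available_plugins(
    const std::vector<std::string>& candidates,
    std::vector<std::string>* errors = nullptr) {
    std::vector<LoadedPlugin> loaded;
    for (const std::string& name : candidates) {
        // dlopen() returns a null pointer if the library, or one of the
        // libraries it depends on, cannot be found or loaded. RTLD_NOW also
        // makes unresolved symbols fail here rather than at first use.
        void* handle = dlopen(name.c_str(), RTLD_NOW | RTLD_LOCAL);
        if (handle != nullptr) {
            loaded.push_back({name, handle});
        } else if (errors != nullptr) {
            const char* msg = dlerror();
            errors->push_back(name + ": " + (msg ? msg : "unknown error"));
        }
    }
    return loaded;
}
```

With something like this, the runtime could call load_available_plugins({"libpi_opencl.so", "libpi_cuda.so"}) (the first file name is just my assumption for the OpenCL plugin) and carry on with whichever backends loaded. On a machine without libcuda.so.1 the CUDA entry would simply be skipped, with the reason kept in the error list. On older glibc versions this would need to be linked with -ldl.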

What do you think? 🤔

Cheers,
Attila

P.S. Pinging @cleggett, @vpascuzz, @fwyzard, @ivorobts.
