[BUG] ptxas error : Value of threads per SM for entry is out of the range. #2003

Open
yangjianscut opened this issue Nov 16, 2023 · 0 comments
Labels
bug Something isn't working


Describe the bug
I am building the RAFT C++ shared library from source. Running ./build.sh libraft --compile-lib fails with:
ptxas error : Value of threads per SM for entry _ZN4raft9neighbors12experimental10nn_descent6detail17local_join_kernelIiNS3_12InternalID_tIiEEEEvPKT_S9_PK4int2S9_S9_SC_iPK6__halfiPT0_PfiPiSI_ is out of range. .minnctapersm will be ignored.

Steps/Code to reproduce bug
user@myserver:~/raft$ ./build.sh libraft --compile-lib
Building for the architecture of the GPU in the system...
-- Auto detection of gpu-archs: 89
-- CPM: Using local package Thrust@1.17.2.0
-- CPM: Using local package rmm@23.12.0
-- CPM: Adding package NvidiaCutlass@2.10.0 (v2.10.0)
-- CMake Version: 3.28.0-rc5
-- CUDART: /usr/local/cuda-11.8/lib64/libcudart.so
-- CUDA Driver: /usr/local/cuda-11.8/lib64/stubs/libcuda.so
-- NVRTC: /usr/local/cuda-11.8/lib64/libnvrtc.so
-- Default Install Location: /home/bld/raft_build
-- CUDA Compilation Architectures: 53;60;61;70;72;75;80;86
-- Enable caching of reference results in conv unit tests
-- Enable rigorous conv problem sizes in conv unit tests
-- Using NVCC flags: -DCUTLASS_TEST_LEVEL=0;-DCUTLASS_TEST_ENABLE_CACHED_RESULTS=1;-DCUTLASS_CONV_UNIT_TEST_RIGOROUS_SIZE_ENABLED=1;-DCUTLASS_DEBUG_TRACE_LEVEL=0;$<$BOOL:1:-Xcompiler=-Wconversion>;$<$BOOL:1:-Xcompiler=-fno-strict-aliasing>
-- CUTLASS Revision: 31fcbf1
-- Configuring cublas ...
-- cuBLAS Disabled.
-- Configuring cuBLAS ... done.
-- CPM: Using local package cuco@0.0.1
-- _RAPIDS_POLICY_CALLERS_VERSION: 23.12
-- _RAPIDS_POLICY_REMOVED_IN: 24.02
CMake Deprecation Warning at build/_deps/rapids-cmake-src/rapids-cmake/cmake/detail/policy.cmake:57 (message):
rapids-cmake policy [deprecated=23.12 removed=24.02]: Usage of
rapids_export_find_package_file without an explicit EXPORT_SET key has
been deprecated.
Call Stack (most recent call first):
build/_deps/rapids-cmake-src/rapids-cmake/export/find_package_file.cmake:73 (rapids_cmake_policy)
build/_deps/rapids-cmake-src/rapids-cmake/find/generate_module.cmake:216 (rapids_export_find_package_file)
CMakeLists.txt:535 (rapids_find_generate_module)

-- _RAPIDS_POLICY_CALLERS_VERSION: 23.12
-- _RAPIDS_POLICY_REMOVED_IN: 24.02
CMake Deprecation Warning at build/_deps/rapids-cmake-src/rapids-cmake/cmake/detail/policy.cmake:57 (message):
rapids-cmake policy [deprecated=23.12 removed=24.02]: Usage of
rapids_export_find_package_file without an explicit EXPORT_SET key has
been deprecated.
Call Stack (most recent call first):
build/_deps/rapids-cmake-src/rapids-cmake/export/find_package_file.cmake:73 (rapids_cmake_policy)
build/_deps/rapids-cmake-src/rapids-cmake/find/generate_module.cmake:223 (rapids_export_find_package_file)
CMakeLists.txt:535 (rapids_find_generate_module)

-- Configuring done (1.0s)
CMake Warning (dev) at CMakeLists.txt:254 (target_link_libraries):
The library that is being linked to, CUDA::nvToolsExt, is marked as being
deprecated by the owner. The message provided by the developer is:

nvToolsExt has been superseded by nvtx3 since CUDA 10.0 and CMake 3.25.
Use CUDA::nvtx3 and include <nvtx3/nvToolsExt.h> instead.

This warning is for project developers. Use -Wno-dev to suppress it.

CMake Warning (dev) at CMakeLists.txt:254 (target_link_libraries):
The library that is being linked to, CUDA::nvToolsExt, is marked as being
deprecated by the owner. The message provided by the developer is:

nvToolsExt has been superseded by nvtx3 since CUDA 10.0 and CMake 3.25.
Use CUDA::nvtx3 and include <nvtx3/nvToolsExt.h> instead.

This warning is for project developers. Use -Wno-dev to suppress it.

-- Generating done (0.0s)
-- Build files have been written to: /home/bld/raft/cpp/build
-- Compiling targets: ;raft_lib, verbose=
[1/4] Building CUDA object CMakeFiles/raft_objs.dir/src/raft_runtime/neighbors/cagra_build.cu.o
FAILED: CMakeFiles/raft_objs.dir/src/raft_runtime/neighbors/cagra_build.cu.o
/usr/local/cuda-11.8/bin/nvcc -forward-unknown-to-host-compiler -DCUTLASS_NAMESPACE=raft_cutlass -DFMT_HEADER_ONLY=1 -DNVTX_ENABLED -DRAFT_COMPILED -DRAFT_EXPLICIT_INSTANTIATE_ONLY -DRAFT_SYSTEM_LITTLE_ENDIAN=1 -DSPDLOG_FMT_EXTERNAL -DTHRUST_DEVICE_SYSTEM=THRUST_DEVICE_SYSTEM_CUDA -DTHRUST_HOST_SYSTEM=THRUST_HOST_SYSTEM_CPP -I/home/bld/raft/cpp/include -I/home/bld/raft/cpp/build/_deps/thrust-src -I/home/bld/raft/cpp/build/_deps/thrust-src/dependencies/cub -I/home/bld/anaconda3/include/rapids/libcudacxx -I/home/bld/raft/cpp/build/_deps/nvidiacutlass-src/include -I/home/bld/raft/cpp/build/_deps/nvidiacutlass-build/include -I/usr/local/cuda-11.8/include -isystem /home/bld/anaconda3/include -isystem /usr/local/cuda-11.8/targets/x86_64-linux/include -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_89,code=[sm_89]" -Xcompiler=-fPIC -Xcompiler=-Wno-deprecated-declarations -Xcompiler=-Wall,-Werror,-Wno-error=deprecated-declarations -Werror=all-warnings --expt-extended-lambda --expt-relaxed-constexpr -DCUDA_API_PER_THREAD_DEFAULT_STREAM -Xfatbin=-compress-all -Xcompiler=-fopenmp -MD -MT CMakeFiles/raft_objs.dir/src/raft_runtime/neighbors/cagra_build.cu.o -MF CMakeFiles/raft_objs.dir/src/raft_runtime/neighbors/cagra_build.cu.o.d -x cu -c /home/bld/raft/cpp/src/raft_runtime/neighbors/cagra_build.cu -o CMakeFiles/raft_objs.dir/src/raft_runtime/neighbors/cagra_build.cu.o
ptxas error : Value of threads per SM for entry _ZN4raft9neighbors12experimental10nn_descent6detail17local_join_kernelIiNS3_12InternalID_tIiEEEEvPKT_S9_PK4int2S9_S9_SC_iPK6__halfiPT0_PfiPiSI_ is out of range. .minnctapersm will be ignored
ptxas fatal : Ptx assembly aborted due to errors
ninja: build stopped: subcommand failed.
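
Note on the likely cause: as far as I understand it, ptxas prints this message when a kernel's __launch_bounds__(maxThreadsPerBlock, minBlocksPerSM) asks for more resident threads per SM (maxThreadsPerBlock * minBlocksPerSM) than the target architecture allows. It is normally only a warning, but the command line above passes -Werror=all-warnings, which would explain why it is reported as an error here. The sketch below only illustrates the arithmetic. The 512 x 4 launch bounds are hypothetical values chosen for illustration, not taken from the RAFT source, and the helper names are made up. The per-SM limits are the ones I know from the compute-capability table in the CUDA C++ Programming Guide.

```cpp
// Maximum resident threads per SM for a few compute capabilities
// (per the CUDA C++ Programming Guide's compute-capability table).
// Returns -1 for architectures this sketch does not know about.
constexpr int max_threads_per_sm(int sm_arch) {
  switch (sm_arch) {
    case 75: return 1024;
    case 86:
    case 89: return 1536;
    case 70:
    case 80:
    case 90: return 2048;
    default: return -1;
  }
}

// True when __launch_bounds__(max_threads_per_block, min_blocks_per_sm)
// requests no more resident threads than one SM of sm_arch can hold.
// Hypothetical helper, not part of CUDA or RAFT.
constexpr bool launch_bounds_fit(int max_threads_per_block,
                                 int min_blocks_per_sm,
                                 int sm_arch) {
  return max_threads_per_sm(sm_arch) > 0 &&
         max_threads_per_block * min_blocks_per_sm <= max_threads_per_sm(sm_arch);
}

// Hypothetical bounds of 512 threads x 4 blocks = 2048 threads per SM:
static_assert(launch_bounds_fit(512, 4, 80), "fits on sm_80 (limit 2048)");
static_assert(!launch_bounds_fit(512, 4, 89), "exceeds sm_89 (limit 1536)");
static_assert(launch_bounds_fit(512, 3, 89), "512 x 3 = 1536 fits on sm_89");
```

If this is what is happening, it would explain why the build fails when the detected architecture is 89 (RTX 4090), while the same kernel could compile cleanly for an architecture with a 2048-thread limit such as sm_80.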

Expected behavior
The build should finish without errors.

Environment details (please complete the following information):

  • Environment location: Bare-metal
  • Method of RAFT install: source
  • System: Ubuntu 22.04
  • GCC: 11.4.0
  • CUDA: 11.8
  • NVIDIA driver: 535.86.05
  • GPU: RTX 4090
@yangjianscut yangjianscut added the bug Something isn't working label Nov 16, 2023