Website | Install | Tutorial | Examples | Documentation | API Reference | Forum
CuPy is a NumPy/SciPy-compatible array library for GPU-accelerated computing with Python. CuPy acts as a drop-in replacement to run existing NumPy/SciPy code on NVIDIA CUDA or AMD ROCm platforms.
```python
>>> import cupy as cp
>>> x = cp.arange(6).reshape(2, 3).astype('f')
>>> x
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.]], dtype=float32)
>>> x.sum(axis=1)
array([  3.,  12.], dtype=float32)
```
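Because the API mirrors NumPy, moving data between the host and the GPU is explicit but simple. The snippet below is a minimal sketch that converts between NumPy and CuPy arrays with `cp.asarray` and `cp.asnumpy`; the variable names are illustrative.

```python
import numpy as np
import cupy as cp

# Copy a host (NumPy) array to the GPU, compute there, then copy back.
a_host = np.arange(6, dtype=np.float32).reshape(2, 3)
a_gpu = cp.asarray(a_host)     # host -> device
row_sums = a_gpu.sum(axis=1)   # runs on the GPU
print(cp.asnumpy(row_sums))    # device -> host: [ 3. 12.]
```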
CuPy also provides access to low-level CUDA features. You can pass an ndarray to existing CUDA C/C++ programs via RawKernels, use Streams for performance, or even call CUDA Runtime APIs directly.
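As a rough sketch of these low-level features, the example below compiles a small CUDA C kernel with `cp.RawKernel`, launches it on a non-default `cp.cuda.Stream`, and queries the CUDA Runtime API. The kernel source and names (`my_add`, etc.) are illustrative and not part of CuPy itself.

```python
import cupy as cp

# Compile a CUDA C kernel at runtime (the kernel body is illustrative).
add_kernel = cp.RawKernel(r'''
extern "C" __global__
void my_add(const float* x1, const float* x2, float* y) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    y[tid] = x1[tid] + x2[tid];
}
''', 'my_add')

x1 = cp.arange(25, dtype=cp.float32).reshape(5, 5)
x2 = cp.arange(25, dtype=cp.float32).reshape(5, 5)
y = cp.zeros((5, 5), dtype=cp.float32)

# Launch asynchronously on a non-default stream: (grid, block, kernel args).
stream = cp.cuda.Stream(non_blocking=True)
with stream:
    add_kernel((5,), (5,), (x1, x2, y))
stream.synchronize()

# Call a CUDA Runtime API directly.
print(cp.cuda.runtime.getDeviceCount(), 'GPU(s) visible')
```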
Binary packages (wheels) are available for Linux and Windows on PyPI. Choose the right package for your platform.
| Platform | Architecture | Command |
| --- | --- | --- |
| CUDA 11.x (11.2+) | x86_64 / aarch64 | `pip install cupy-cuda11x` |
| CUDA 12.x | x86_64 / aarch64 | `pip install cupy-cuda12x` |
| ROCm 4.3 (experimental) | x86_64 | `pip install cupy-rocm-4-3` |
| ROCm 5.0 (experimental) | x86_64 | `pip install cupy-rocm-5-0` |
Note
To install pre-releases, append `--pre -U -f https://pip.cupy.dev/pre` (e.g., `pip install cupy-cuda11x --pre -U -f https://pip.cupy.dev/pre`).
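Once a wheel is installed, a quick way to confirm that CuPy can see your GPU is to import it and print its build/runtime configuration. This is just a sanity-check sketch, not an official verification procedure.

```python
import cupy as cp

# Print the CUDA/ROCm, driver, and library versions detected by CuPy.
cp.show_config()

# Run a trivial computation on the first visible device.
print(cp.arange(10).sum())  # expected output: 45
```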
Binary packages are also available for Linux and Windows on Conda-Forge.
| Platform | Architecture | Command |
| --- | --- | --- |
| CUDA | x86_64 / aarch64 / ppc64le | `conda install -c conda-forge cupy` |
If you need a slim installation (without also getting CUDA dependencies installed), you can do `conda install -c conda-forge cupy-core`.

If you need to use a particular CUDA version (say 12.0), you can use the `cuda-version` metapackage to select the version, e.g., `conda install -c conda-forge cupy cuda-version=12.0`.
Note
If you encounter any problem with CuPy installed from `conda-forge`, please feel free to report it to cupy-feedstock, and we will help investigate whether it is just a packaging issue in `conda-forge`'s recipe or a real issue in CuPy.
Use NVIDIA Container Toolkit to run CuPy container images.
```
$ docker run --gpus all -it cupy/cupy
```
- Installation Guide - instructions on building from source
- Release Notes
- Projects using CuPy
- Contribution Guide
- GPU Acceleration in Python using CuPy and Numba (GTC November 2021 Technical Session)
- GPU-Acceleration of Signal Processing Workflows using CuPy and cuSignal[^1] (ICASSP'21 Tutorial)
MIT License (see `LICENSE` file).

CuPy is designed based on NumPy's API and SciPy's API (see `docs/source/license.rst` file).
CuPy is being developed and maintained by Preferred Networks and community contributors.
Ryosuke Okuta, Yuya Unno, Daisuke Nishino, Shohei Hido and Crissman Loomis. CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations. Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS), (2017). [PDF]
```bibtex
@inproceedings{cupy_learningsys2017,
  author    = "Okuta, Ryosuke and Unno, Yuya and Nishino, Daisuke and Hido, Shohei and Loomis, Crissman",
  title     = "CuPy: A NumPy-Compatible Library for NVIDIA GPU Calculations",
  booktitle = "Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS)",
  year      = "2017",
  url       = "http://learningsys.org/nips17/assets/papers/paper_16.pdf"
}
```
[^1]: cuSignal is now part of CuPy starting with v13.0.0.