Stars
Watches files and records, or triggers actions, when they change.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
FUSE-based file system backed by Amazon S3
Tutorials, examples, discussions, research proposals, and other resources related to fuzzing
Run TensorFlow models in C++ without installation and without Bazel
A scalable inference server for models optimized with OpenVINO™
gmonitor is a GPU monitor (NVIDIA only at the moment)
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note that XPU is already supported in stock DeepSpeed (upstream).