Manages Unified Access to Generative AI Services built on Envoy Gateway

Go 1,104 108 Updated Oct 10, 2025

Repository for open inference protocol specification

59 12 Updated May 12, 2025

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,117 877 Updated Oct 8, 2025

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Python 848 211 Updated Oct 6, 2025

This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline.

Jupyter Notebook 301 137 Updated Oct 9, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and support state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorR…

C++ 11,800 1,788 Updated Oct 9, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 59,715 10,588 Updated Oct 9, 2025

Port of OpenAI's Whisper model in C/C++

C++ 43,714 4,799 Updated Oct 9, 2025

LLM inference in C/C++

C++ 87,404 13,263 Updated Oct 9, 2025

Open source FPGA-based NIC and platform for in-network compute

Verilog 2,020 482 Updated Jul 5, 2024

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,363 945 Updated Sep 23, 2025

Mastering Computer Vision with PyTorch 2.0, published by Orange, AVA®

Jupyter Notebook 2 4 Updated Jan 18, 2025

Implement Neural Networks in CUDA from Scratch

C++ 24 2 Updated May 17, 2024

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

Go 1,394 355 Updated Oct 8, 2025

Neural network from scratch in CUDA/C++

Cuda 1 Updated Jan 17, 2025

Neural network from scratch in CUDA/C++

Cuda 86 17 Updated Sep 8, 2025

cuML - RAPIDS Machine Learning Library

C++ 4,952 594 Updated Oct 9, 2025

Slides and other materials from CppCon 2018

C++ 1,443 179 Updated Apr 11, 2019

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.

C++ 1,987 183 Updated Sep 16, 2022

Jupyter Notebook 89 6 Updated Feb 29, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 40,346 4,575 Updated Oct 9, 2025

Ongoing research training transformer models at scale

Python 13,775 3,145 Updated Oct 9, 2025

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,312 1,086 Updated Sep 26, 2025

Open MPI jobs on Kubernetes

Makefile 119 25 Updated Apr 17, 2018

Open Fabric Interfaces

C 704 445 Updated Oct 9, 2025

This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.

C++ 185 74 Updated Oct 1, 2025

Open source project for data preparation for GenAI applications

HTML 815 221 Updated Oct 5, 2025

CUDA Core Compute Libraries

C++ 1,960 284 Updated Oct 9, 2025

cuVS - a library for vector search and clustering on the GPU

Cuda 537 130 Updated Oct 9, 2025

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 941 215 Updated Oct 8, 2025