Manages Unified Access to Generative AI Services built on Envoy Gateway

Go 1,758 276 Updated Jun 16, 2026

Repository for open inference protocol specification

72 14 Updated May 12, 2025

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,675 970 Updated Jun 3, 2026

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Python 889 235 Updated Jun 16, 2026

This NVIDIA RAG blueprint serves as a reference solution for a foundational Retrieval Augmented Generation (RAG) pipeline.

Python 661 286 Updated Jun 17, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,889 2,472 Updated Jun 17, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,104 18,128 Updated Jun 17, 2026

Port of OpenAI's Whisper model in C/C++

C++ 50,786 5,668 Updated Jun 16, 2026

LLM inference in C/C++

C++ 116,873 19,647 Updated Jun 16, 2026

Open source FPGA-based NIC and platform for in-network compute

Verilog 2,364 536 Updated Jul 5, 2024

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,969 1,054 Updated May 7, 2026

Mastering Computer Vision with PyTorch 2.0, published by Orange, AVA®

Jupyter Notebook 3 5 Updated Jan 18, 2025

Implement Neural Networks in CUDA from Scratch

C++ 23 2 Updated May 17, 2024

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

Go 1,952 495 Updated Jun 17, 2026

Neural network from scratch in CUDA/C++

Cuda 1 Updated Jan 17, 2025

Neural network from scratch in CUDA/C++

Cuda 94 19 Updated Sep 8, 2025

cuML - RAPIDS Machine Learning Library

Python 5,209 631 Updated Jun 16, 2026

Slides and other materials from CppCon 2018

C++ 1,450 177 Updated Apr 11, 2019

BlazingSQL is a lightweight, GPU accelerated, SQL engine for Python. Built on RAPIDS cuDF.

C++ 2,012 181 Updated Sep 16, 2022

Jupyter Notebook 92 7 Updated Feb 29, 2024

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python 42,528 4,858 Updated Jun 16, 2026

Ongoing research training transformer models at scale

Python 16,726 4,088 Updated Jun 17, 2026

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

Python 7,441 1,116 Updated Jun 11, 2026

Open MPI jobs on Kubernetes

Makefile 120 25 Updated Apr 17, 2018

Open Fabric Interfaces

C 801 508 Updated Jun 16, 2026

This is a plugin which lets EC2 developers use libfabric as network provider while running NCCL applications.

C++ 219 99 Updated Jun 16, 2026

Open source project for data preparation for GenAI applications

HTML 940 251 Updated Jun 4, 2026

CUDA Core Compute Libraries

C++ 2,383 412 Updated Jun 16, 2026

cuVS - a library for vector search and clustering on the GPU

Cuda 782 194 Updated Jun 17, 2026

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing …

Cuda 1,016 237 Updated Jun 16, 2026