kumare3
👋
Check out https://flyte.org


Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Rust 2,137 94 Updated Mar 21, 2026

FastAPI-compatible Python framework with Zig HTTP core; 7x faster, free-threading native

Python 916 27 Updated Apr 14, 2026

Type-safe, distributed orchestration of agents, ML pipelines, and real-time inference — in pure Python with async/await.

Python 108 35 Updated Apr 14, 2026

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 191 12 Updated Mar 12, 2026

Rust-based high-performance Apache Uniffle shuffle-server

Rust 64 5 Updated Apr 9, 2026

JAX in JavaScript – ML library for the web, running on WebGPU & Wasm

TypeScript 786 44 Updated Apr 14, 2026

Open-Source Frontier Voice AI

Python 39,351 4,560 Updated Apr 14, 2026

🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…

Rust 25,636 1,093 Updated Apr 14, 2026

Autocomp: Optimize any AI kernel, anywhere.

Python 112 7 Updated Apr 14, 2026

Read-through cache for object storage

Rust 582 13 Updated Apr 8, 2026

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and m…

C 3,423 122 Updated Mar 23, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,464 350 Updated Apr 14, 2026

BARCH is a local L1 + remote L2 cache with Valkey and a multi-language L1 interface providing low-latency ordered access

C++ 15 Updated Apr 13, 2026

[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 10,796 949 Updated Apr 13, 2026

Official inference framework for 1-bit LLMs

Python 38,228 3,430 Updated Mar 10, 2026

MLX: An array framework for Apple silicon

C++ 25,385 1,688 Updated Apr 14, 2026

Nano vLLM

Python 12,867 1,920 Updated Apr 13, 2026

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 5,148 230 Updated Apr 4, 2026

Format click help output nicely with rich.

Python 800 48 Updated Jan 31, 2026

🧱 secure, local, cross-platform and programmable sandboxes for AI agents

Rust 5,334 254 Updated Apr 14, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,976 407 Updated Apr 14, 2026

A unified inference and post-training framework for accelerated video generation.

Python 3,373 316 Updated Apr 14, 2026

An extremely fast Python type checker and language server, written in Rust.

Python 18,288 282 Updated Apr 14, 2026

Programmatic sandboxing tool

Rust 276 15 Updated Apr 13, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,546 1,020 Updated Apr 14, 2026

Hybrid in-memory and disk cache in Rust

Rust 1,681 80 Updated Mar 5, 2026

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Python 97,171 9,068 Updated Apr 14, 2026

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,948 442 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,804 1,030 Updated Mar 30, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 9,118 1,148 Updated Apr 14, 2026