👋
Check out https://flyte.org

Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Rust 2,134 94 Updated Mar 21, 2026

FastAPI-compatible Python framework with Zig HTTP core; 7x faster, free-threading native

Python 916 27 Updated Apr 13, 2026

Type-safe, distributed orchestration of agents, ML pipelines, and real-time inference — in pure Python with async/await.

Python 108 35 Updated Apr 13, 2026

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 191 12 Updated Mar 12, 2026

Rust-based, high-performance Apache Uniffle shuffle server

Rust 64 5 Updated Apr 9, 2026

JAX in JavaScript – ML library for the web, running on WebGPU & Wasm

TypeScript 786 44 Updated Apr 13, 2026

Open-Source Frontier Voice AI

Python 39,281 4,549 Updated Apr 10, 2026

🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…

Rust 25,595 1,091 Updated Apr 13, 2026

Autocomp: Optimize any AI kernel, anywhere.

Python 112 7 Updated Apr 13, 2026

Read-through cache for object storage

Rust 581 13 Updated Apr 8, 2026

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and m…

C 3,423 122 Updated Mar 23, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,462 350 Updated Apr 13, 2026

BARCH is a local L1 + remote L2 cache with Valkey and a multi-language L1 interface, providing low-latency ordered access

C++ 15 Updated Apr 10, 2026

[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 10,796 949 Updated Apr 13, 2026

Official inference framework for 1-bit LLMs

Python 38,219 3,429 Updated Mar 10, 2026

MLX: An array framework for Apple silicon

C++ 25,377 1,685 Updated Apr 13, 2026

Nano vLLM

Python 12,861 1,919 Updated Apr 13, 2026

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 5,146 230 Updated Apr 4, 2026

Format click help output nicely with rich.

Python 800 48 Updated Jan 31, 2026

🧱 secure, local, cross-platform and programmable sandboxes for AI agents

Rust 5,331 253 Updated Apr 13, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,975 406 Updated Apr 13, 2026

A unified inference and post-training framework for accelerated video generation.

Python 3,372 316 Updated Apr 11, 2026

An extremely fast Python type checker and language server, written in Rust.

Python 18,284 282 Updated Apr 13, 2026

Programmatic sandboxing tool

Rust 276 15 Updated Apr 13, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,546 1,019 Updated Apr 13, 2026

Hybrid in-memory and disk cache in Rust

Rust 1,680 80 Updated Mar 5, 2026

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Python 97,164 9,068 Updated Apr 13, 2026

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,948 442 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,804 1,031 Updated Mar 30, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 9,117 1,148 Updated Apr 9, 2026