kumare3
👋
Check out https://flyte.org

Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Rust 2,127 94 Updated Mar 21, 2026

FastAPI-compatible Python framework with Zig HTTP core; 7x faster, free-threading native

Python 912 27 Updated Apr 13, 2026

Type-safe, distributed orchestration of agents, ML pipelines, and real-time inference — in pure Python with async/await.

Python 108 35 Updated Apr 12, 2026

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 189 12 Updated Mar 12, 2026

Rust-based high-performance Apache Uniffle shuffle server

Rust 64 5 Updated Apr 9, 2026

JAX in JavaScript – ML library for the web, running on WebGPU & Wasm

TypeScript 784 43 Updated Apr 13, 2026

Open-Source Frontier Voice AI

Python 39,097 4,517 Updated Apr 10, 2026

🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…

Rust 25,344 1,078 Updated Apr 13, 2026

Autocomp: Optimize any AI kernel, anywhere.

Python 104 7 Updated Apr 13, 2026

Read-through cache for object storage

Rust 581 13 Updated Apr 8, 2026

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and m…

C 3,422 122 Updated Mar 23, 2026

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,449 350 Updated Apr 13, 2026

BARCH is a local L1 + remote L2 cache with Valkey and a multi-language L1 interface, providing low-latency ordered access

C++ 15 Updated Apr 10, 2026

[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 10,792 947 Updated Apr 10, 2026

Official inference framework for 1-bit LLMs

Python 38,189 3,424 Updated Mar 10, 2026

MLX: An array framework for Apple silicon

C++ 25,353 1,681 Updated Apr 11, 2026

Nano vLLM

Python 12,834 1,913 Updated Nov 3, 2025

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 5,145 230 Updated Apr 4, 2026

Format click help output nicely with rich.

Python 800 48 Updated Jan 31, 2026

🧱 secure, local, cross-platform and programmable sandboxes for AI agents

Rust 5,316 249 Updated Apr 13, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 2,967 405 Updated Apr 12, 2026

A unified inference and post-training framework for accelerated video generation.

Python 3,370 315 Updated Apr 11, 2026

An extremely fast Python type checker and language server, written in Rust.

Python 18,272 281 Updated Apr 13, 2026

Programmatic sandboxing tool

Rust 276 15 Updated Apr 9, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,545 1,014 Updated Apr 13, 2026

Hybrid in-memory and disk cache in Rust

Rust 1,680 81 Updated Mar 5, 2026

FastAPI framework, high performance, easy to learn, fast to code, ready for production

Python 97,126 9,063 Updated Apr 10, 2026

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,947 442 Updated Mar 5, 2025

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 9,802 1,031 Updated Mar 30, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 9,116 1,148 Updated Apr 9, 2026