kumare3

👋

Check out https://flyte.org

Ketan Umare kumare3

Founder and TSC chair @flyteorg and Co-founder and CEO @unionai

58 followers · 12 following

Achievements

x3 x4

Lists (2)

✨ Inspiration

🚀 My stack

Stars

kforeman / union-mac-app

A small Mac app that displays the health of your Union.ai cluster as a menu bar icon.

Python 5 1 Updated Apr 27, 2026

ingero-io / ingero

eBPF-based GPU causal observability agent

Go 62 6 Updated Apr 26, 2026

florianmattana / sass-king

Reverse engineering NVIDIA SASS instruction dictionary, kernel audits and pattern recognition across GPU architectures.

Sass 193 10 Updated Apr 28, 2026

WeianMao / triattention

TriAttention — Efficient long reasoning with trigonometric KV cache compression. Enables OpenClaw local deployment on memory-constrained GPUs.

Python 660 53 Updated Apr 23, 2026

zerobootdev / zeroboot

Sub-millisecond VM sandboxes for AI agents via copy-on-write forking

Rust 2,253 96 Updated Mar 21, 2026

justrach / turboAPI

FastAPI-compatible Python framework with Zig HTTP core; 7x faster, free-threading native

Zig 963 27 Updated Apr 27, 2026

flyteorg / flyte-sdk

Type-safe, distributed orchestration of agents, ML pipelines, and real-time inference — in pure Python with async/await.

Python 112 37 Updated Apr 29, 2026

NVIDIA / nsight-python

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python 197 13 Updated Apr 24, 2026

zuston / riffle

Rust based high-performance Apache Uniffle shuffle-server

Rust 64 5 Updated Apr 24, 2026

ekzhang / jax-js

JAX in JavaScript – ML library for the web, running on WebGPU & Wasm

TypeScript 799 47 Updated Apr 15, 2026

microsoft / VibeVoice

Open-Source Frontier Voice AI

Python 45,090 4,996 Updated Apr 24, 2026

rustfs / rustfs

🚀2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…

Rust 26,763 1,143 Updated Apr 29, 2026

ucb-bar / autocomp

Autocomp: Optimize any AI kernel, anywhere.

Python 126 8 Updated Apr 26, 2026

s2-streamstore / cachey

Read-through cache for object storage

Rust 581 13 Updated Apr 26, 2026

ashvardanian / StringZilla

Up to 100x faster strings for C, C++, CUDA, Python, Rust, Swift, JS, & Go, leveraging NEON, AVX2, AVX-512, SVE, GPGPU, & SWAR to accelerate search, hashing, sorting, edit distances, sketches, and m…

C 3,444 123 Updated Mar 23, 2026

NVIDIA / Model-Optimizer

A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …

Python 2,581 371 Updated Apr 29, 2026

tjizep / barch

BARCH is a local L1 + remote L2 cache with Valkey and a multi-language L1 interface, providing low-latency ordered access

C++ 15 Updated Apr 28, 2026

yichuan-w / LEANN

[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 10,930 957 Updated Apr 24, 2026