rk119
🤺 Busy


Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,350 529 Updated Jun 13, 2026

A framework for few-shot evaluation of language models.

Python 12,944 3,335 Updated Jun 2, 2026

InferX: Inference as a Service Platform

Rust 217 25 Updated Jun 12, 2026

🪢 Open source AI engineering platform: LLM evals, observability, metrics, prompt management, playground, datasets. Integrates with OpenTelemetry, LangChain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 29,026 3,010 Updated Jun 13, 2026

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

Rust 11,588 898 Updated Jun 11, 2026

Training/Fine-tuning at the speed of light

C++ 794 5 Updated Jun 10, 2026

Learn it. Build it. Ship it for others.

Python 31,955 5,232 Updated Jun 12, 2026

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,899 1,305 Updated Jun 13, 2026

Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.

Python 2,160 232 Updated Jun 12, 2026

A curated list of awesome open source and commercial platforms for serving models in production 🚀

52 5 Updated Apr 20, 2022

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Go 5,573 1,523 Updated Jun 12, 2026

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Python 889 235 Updated Jun 13, 2026

An awesome repository of local AI tools

1,976 218 Updated Nov 13, 2024

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 28,970 6,515 Updated Jun 14, 2026

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,673 968 Updated Jun 3, 2026

Hundreds of models & providers. One command to find what runs on your hardware.

Rust 27,845 1,705 Updated Jun 13, 2026

Supercomputing for Artificial Intelligence

Jupyter Notebook 81 21 Updated Apr 20, 2026

Implementation of my AI agent system for ERC 3: https://erc.timetoact-group.at

Python 65 18 Updated Dec 28, 2025

Implementation of my RAG system that won all categories in Enterprise RAG Challenge 2

Python 2,365 485 Updated May 12, 2025

Needle simplifies building RAG pipelines.

Python 46 2 Updated Jul 27, 2025

Universal memory layer for AI Agents

Python 58,495 6,722 Updated Jun 13, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 378,580 79,179 Updated Jun 14, 2026

Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.

Python 1,213 92 Updated Jun 13, 2026

A curated list of materials on AI efficiency

221 20 Updated Jun 2, 2026

The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol

TypeScript 35,010 4,363 Updated Jun 13, 2026

🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.

C++ 1,512 489 Updated Jun 14, 2026

Hexagon-MLIR is a compiler toolchain for compiling and executing AI kernels and models on Qualcomm Hexagon Neural Processing Units (NPUs).

C++ 155 41 Updated Jun 3, 2026

FlashInfer: Kernel Library for LLM Serving

Python 5,787 1,047 Updated Jun 14, 2026

Open-source localization engineering tools. Connects to Lingo.dev localization engineering platform for consistent, quality translations.

TypeScript 5,404 824 Updated Jun 12, 2026

Multi-Agent Harness for Production AI

Python 10,529 5,651 Updated May 29, 2026