rk119

🤺

Busy

51 followers · 184 following

Dubai

Stars

llm-d / llm-d

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,350 529 Updated Jun 13, 2026

EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 12,944 3,335 Updated Jun 2, 2026

inferx-net / inferx

InferX: Inference as a Service Platform

Rust 217 25 Updated Jun 12, 2026

langfuse / langfuse

🪢 Open source AI engineering platform: LLM evals, observability, metrics, prompt management, playground, datasets. Integrates with OpenTelemetry, LangChain, OpenAI SDK, LiteLLM, and more. 🍊YC W23

TypeScript 29,026 3,010 Updated Jun 13, 2026

tensorzero / tensorzero

TensorZero is an open-source LLMOps platform that unifies an LLM gateway, observability, evaluation, optimization, and experimentation.

Rust 11,588 898 Updated Jun 11, 2026

invergent-ai / surogate

Training/Fine-tuning at the speed of light

C++ 794 5 Updated Jun 10, 2026

rohitg00 / ai-engineering-from-scratch

Learn it. Build it. Ship it for others.

Python 31,955 5,232 Updated Jun 12, 2026

LMCache / LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 8,899 1,305 Updated Jun 13, 2026

dstackai / dstack

Vendor-agnostic orchestration for training, inference and agentic workloads across NVIDIA, AMD, TPU, and Tenstorrent on clouds, Kubernetes, and bare metal.

Python 2,160 232 Updated Jun 12, 2026

awesome-mlops / awesome-ml-serving

A curated list of awesome open source and commercial platforms for serving models in production 🚀

52 5 Updated Apr 20, 2022

kserve / kserve

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Go 5,573 1,523 Updated Jun 12, 2026

SeldonIO / MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more

Python 889 235 Updated Jun 13, 2026

janhq / awesome-local-ai

An awesome repository of local AI tools

1,976 218 Updated Nov 13, 2024

sgl-project / sglang

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 28,970 6,515 Updated Jun 14, 2026

bentoml / BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,673 968 Updated Jun 3, 2026

AlexsJones / llmfit

Hundreds of models & providers. One command to find what runs on your hardware.

Rust 27,845 1,705 Updated Jun 13, 2026

jorditorresBCN / supercomputing-for-ai

Supercomputing for Artificial Intelligence

Jupyter Notebook 81 21 Updated Apr 20, 2026

IlyaRice / Enterprise-RAG-Challenge-3-AI-Agents

Implementation of my AI agent system for ERC 3: https://erc.timetoact-group.at

Python 65 18 Updated Dec 28, 2025

IlyaRice / RAG-Challenge-2

Implementation of my RAG system that won all categories in Enterprise RAG Challenge 2

Python 2,365 485 Updated May 12, 2025

needle-ai / needle-python

Needle simplifies building RAG pipelines.

Python 46 2 Updated Jul 27, 2025

mem0ai / mem0

Universal memory layer for AI Agents

Python 58,495 6,722 Updated Jun 13, 2026

openclaw / openclaw

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 378,580 79,179 Updated Jun 14, 2026

PrunaAI / pruna

Pruna is a model optimization framework built for developers, enabling you to deliver faster, more efficient models with minimal overhead.

Python 1,213 92 Updated Jun 13, 2026

PrunaAI / awesome-ai-efficiency

A curated list of materials on AI efficiency

221 20 Updated Jun 2, 2026

CopilotKit / CopilotKit

The Frontend Stack for Agents & Generative UI. React, Angular, Mobile, Slack, and more. Makers of the AG-UI Protocol

TypeScript 35,010 4,363 Updated Jun 13, 2026

tenstorrent / tt-metal

🤘 TT-NN operator library, and TT-Metalium low level kernel programming model.

C++ 1,512 489 Updated Jun 14, 2026

qualcomm / hexagon-mlir

Hexagon-MLIR is a compiler toolchain for compiling and executing AI kernels and models on Qualcomm Hexagon Neural Processing Units (NPUs).

C++ 155 41 Updated Jun 3, 2026

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Python 5,787 1,047 Updated Jun 14, 2026

lingodotdev / lingo.dev

Open-source localization engineering tools. Connects to Lingo.dev localization engineering platform for consistent, quality translations.

TypeScript 5,404 824 Updated Jun 12, 2026

aden-hive / hive

Multi-Agent Harness for Production AI

Python 10,529 5,651 Updated May 29, 2026