- South Korea, Seoul
Stars
Onboarding game built for OpenAI Agent Hackathon NYC
agentsculptor is an experimental AI-powered development agent designed to analyze, refactor, and extend Python projects automatically. It uses an OpenAI-like planner–executor loop on top of a vLLM …
Fastest enterprise AI gateway (50x faster than LiteLLM) with adaptive load balancer, cluster mode, guardrails, support for 1000+ models, and <100 µs overhead at 5k RPS.
Comprehensive Claude Code project configuration example with hooks, skills, agents, commands, and GitHub Actions workflows
Financial Services Interest Group
TypeScript/React library for AI chat 💬🚀
FlashInfer: Kernel Library for LLM Serving
Explains complex systems using visuals and simple terms. Helps you prepare for system design interviews.
If you want to become good at system design, join this newsletter now 👇
Universal Python SDK to run AI workloads on Kubernetes
A Datacenter Scale Distributed Inference Serving Framework
The universal tool suite for vector database management. Manage Pinecone, Chroma, Qdrant, Weaviate and more vector databases with ease.
Generate spreadsheets based on GitHub contributions
Learn how to design systems at scale and prepare for system design interviews
Manages Unified Access to Generative AI Services built on Envoy Gateway
Efficient and easy multi-instance LLM serving
HAMi-core compiles libvgpu.so, which enforces a hard limit on GPU resources inside a container
GenAI inference performance benchmarking tool
Repository for the next iteration of composite service (e.g. Ingress) and load balancing APIs.
Gateway API Inference Extension
Calculate tokens/s and GPU memory requirements for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM