- Hangzhou, China
- (UTC +08:00)
- https://www.rockdai.com
- https://orcid.org/0009-0006-4171-1273
Starred repositories
A unified library of SOTA model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks …
Nexus is a local web console for managing multiple CLI AI Agent instances in parallel.
Use Garry Tan's exact Claude Code setup: 15 opinionated tools that serve as CEO, Designer, Eng Manager, Release Manager, Doc Engineer, and QA
Run OpenClaw more securely inside NVIDIA OpenShell with managed inference
Miu2D is a 2D RPG game engine built with Rust + TypeScript + React + Canvas, designed for the Web platform. A web remake of 剑侠情缘2 / 月影传说 / 新剑侠情缘.
A dedicated effort to make an optimized, bleeding edge vLLM image using Docker to support DGX comprehensively
Complete guide to running Qwen3.5-35B-A3B on NVIDIA DGX Spark (GB10) with vLLM - installation, benchmarks, vision features, and troubleshooting
Build and run agents you can see, understand and trust.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Transform your favorite cities into beautiful, minimalist designs. MapToPoster lets you create and export visually striking map posters with code.
Docker configuration for running vLLM on dual DGX Sparks
Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, load balancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthr…
MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B
Blazing fast and accurate glob matcher written in JavaScript, with no dependencies and full support for standard and extended Bash glob features, including braces, extglobs, POSIX brackets, and regula…
AI agent toolkit: coding agent CLI, unified LLM API, TUI & web UI libraries, Slack bot, vLLM pods
A high-throughput and memory-efficient inference and serving engine for LLMs
JWA, JWS, JWE, JWT, JWK, JWKS for Node.js, Browser, Cloudflare Workers, Deno, Bun, and other Web-interoperable runtimes
A blazing fast inference solution for text embeddings models
Production-ready platform for agentic workflow development.
👑 JavaScript ORM for MySQL, PostgreSQL, and SQLite.
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…