GuangZhou, China

Starred repositories

LLAMA Turboquant implementation with CUDA support

C++ 93 5 Updated Mar 27, 2026

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 7,005 747 Updated Mar 26, 2026

An incremental parsing system for programming tools

Rust 24,377 2,513 Updated Mar 27, 2026

State-of-the-Art Text Embeddings

Python 18,455 2,766 Updated Mar 25, 2026

NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments

Go 239 59 Updated Mar 25, 2026

A Claude Code skill to generate images with Nano Banana

220 28 Updated Feb 19, 2026

Tooling for optimized, validated, and reproducible GPU-accelerated AI runtime in Kubernetes

Go 220 22 Updated Mar 27, 2026

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

Python 10,472 1,737 Updated Mar 27, 2026

AI agents running research on single-GPU nanochat training automatically

Python 58,302 8,081 Updated Mar 26, 2026

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.

Python 23,130 2,267 Updated Feb 2, 2026

Lightweight coding agent that runs in your terminal

Rust 67,992 9,111 Updated Mar 27, 2026

An open-source, code-first Go toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

Go 7,254 595 Updated Mar 26, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,991 633 Updated Mar 27, 2026

A security-focused library OS supporting kernel- and user-mode execution

Rust 2,537 112 Updated Mar 27, 2026

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Python 874 63 Updated Mar 4, 2026

Safe Rust wrapper around the CUDA toolkit

Rust 1,083 144 Updated Mar 25, 2026

Fast, small, and fully autonomous AI personal assistant infrastructure, ANY OS, ANY PLATFORM — deploy anywhere, swap anything 🦀

Rust 28,972 4,022 Updated Mar 27, 2026

Tiny, Fast, and Deployable anywhere — automate the mundane, unleash your creativity

Go 26,342 3,678 Updated Mar 27, 2026

Distributed KV cache scheduling & offloading libraries

Go 122 105 Updated Mar 26, 2026

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 4,151 299 Updated Jan 14, 2026

OpenClaw-RL: Train any agent simply by talking

Python 4,306 424 Updated Mar 27, 2026

happy happy happyclaw~

TypeScript 526 80 Updated Mar 27, 2026

The awesome collection of OpenClaw skills. 5,400+ skills filtered and categorized from the official OpenClaw Skills Registry.🦞

42,463 4,047 Updated Mar 26, 2026

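Based on Garmin data, provides runners with scientific training advice and data insights through professional running-fitness (VDOT) analysis, heart rate zone distribution, pace trends, and other multi-dimensional data.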

TypeScript 33 4 Updated Feb 24, 2026

Flash Attention in ~100 lines of CUDA (forward pass only)

Cuda 1,101 110 Updated Dec 30, 2024

A Model Context Protocol (MCP) server that enables secure interaction with MySQL databases

Python 1,186 228 Updated Jun 5, 2025

OpenAI ChatGPT, GPT-5, GPT-Image-1, Whisper API clients for Go

Go 10,603 1,686 Updated Oct 21, 2025

A Model Context Protocol server that provides read-only access to MySQL databases. This server enables LLMs to inspect database schemas and execute read-only queries.

JavaScript 1,436 184 Updated Mar 10, 2026