withlin

🧸

Jinlin withlin

AI Infra, Docker, Kubernetes, SRE, eBPF, Observability, Middleware, OpenTelemetry, Go, C#, Java, Rust, TypeScript.

438 followers · 700 following

Guangzhou, China

Achievements

x3 x3

Organizations

Lists (17)

✨ Inspiration

interview

5 repositories

iterm

1 repository

k8s

25 repositories

k8s-network

2 repositories

k8s-operator

1 repository

kernel

4 repositories

leetcode

1 repository

network

47 repositories

remote

1 repository

rust

23 repositories

WebAssembly

Blockchain

5 repositories

Starred repositories

mindfold-ai / Trellis

The best agent harness.

TypeScript 10,375 574 Updated Jun 15, 2026

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,571 1,064 Updated Jul 1, 2024

bcefghj / learn-minimind

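📖 From zero background to passing interviews: 22 lessons to thoroughly understand large language models | Learn MiniMind: a systematic study of the full LLM training pipeline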

TypeScript 372 34 Updated Apr 1, 2026

microsoft / agent-lightning

The absolute trainer to light up AI agents.

Python 17,310 1,515 Updated Apr 29, 2026

weiruihhh / cs336_note_and_hw

My notes and assignments from studying CS336

Python 912 29 Updated May 2, 2026

mocibb / cs336

C++ 97 3 Updated Jul 20, 2025

mcxiaoxiao / annotated-transformer-Chinese

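annotated-transformer-Chinese: the Chinese edition of Harvard's classic introductory Transformer tutorial. A PyTorch implementation of the Transformer paper "Attention is All You Need" with Chinese code comments, translated from harvardnlp/annotated-transformer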

Jupyter Notebook 90 12 Updated Jan 19, 2025

skyzh / tiny-llm

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 4,278 332 Updated Jun 13, 2026

NVIDIA-NeMo / ProRL-Agent-Server

Agentic RL on Any Harness at Scale

Python 554 57 Updated Jun 13, 2026

dqbd / tiktokenizer

Online playground for OpenAI tokenizers

TypeScript 1,618 176 Updated Apr 24, 2025

VectifyAI / OpenKB

OpenKB: Open LLM Knowledge Base

Python 2,059 230 Updated Jun 15, 2026

nexu-io / open-design

🎨 Local-first, open-source Claude Design alternative. 🖥️ Native desktop app. ⚡ 259+ Skills · ✨ 142+ Design Systems 🖼️ Web · desktop · mobile prototypes · slides · images · videos · HyperFrames 📦 Sa…

TypeScript 64,915 7,274 Updated Jun 15, 2026

catswe / flash-attention-residuals

Triton kernels and PyTorch ops for Block Attention Residuals (AttnRes)

Python 82 6 Updated May 29, 2026

PaddleJitLab / CUDATutorial

A self-learning tutorial for CUDA High Performance Programming.

JavaScript 1,016 103 Updated Jan 14, 2026

deepseek-ai / TileKernels

A kernel library written in tilelang

Python 1,586 138 Updated Apr 23, 2026

llm-d / llm-d-inference-payload-processor

Inference payload processor for llm-d

Go 9 21 Updated Jun 14, 2026

MoonshotAI / FlashKDA

FlashKDA: high-performance Kimi Delta Attention kernels

Cuda 449 38 Updated May 26, 2026

tsinghua-ideal / flash-topk-attention

Efficient and unified implementations for TopK-based sparse attention

Cuda 35 1 Updated Apr 20, 2026

kunchenguid / axi

Design principles for agent ergonomics. Higher accuracy with lower token cost than both MCP and regular CLI.

TypeScript 861 34 Updated Jun 11, 2026

amitshekhariitbhu / llm-internals

Learn LLM internals step by step - from tokenization to attention to inference optimization.

1,071 94 Updated Jun 14, 2026

toon-format / spec

Official specification for Token-Oriented Object Notation (TOON)

JavaScript 294 34 Updated Jun 12, 2026

inclusionAI / cuLA

CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

Python 519 64 Updated Jun 12, 2026

openai / codex-plugin-cc

Use Codex from Claude Code to review code or delegate tasks.

JavaScript 20,980 1,268 Updated Jun 14, 2026

chenglou / pretext

Fast, accurate & comprehensive text measurement & layout

TypeScript 48,425 2,695 Updated Jun 12, 2026

deusyu / harness-engineering

Harness Engineering study guide: an in-depth learning archive, from conceptual understanding to independent practice

Shell 3,786 335 Updated Jun 11, 2026

teremterem / claude-code-gpt-5-codex

Run Anthropic's Claude Code CLI with OpenAI models such as GPT-5-Codex, GPT-5.1, and others via a local LiteLLM proxy.

Python 235 25 Updated Jan 4, 2026

duoan / TorchCode

🔥 LeetCode for PyTorch — practice implementing softmax, attention, GPT-2 and more from scratch with instant auto-grading. Jupyter-based, self-hosted or try online.

Jupyter Notebook 4,165 355 Updated May 25, 2026

LMCache / LMCache

LMCache: Supercharge Your LLM with the Fastest KV Cache Layer

Python 9,063 1,320 Updated Jun 15, 2026

spiritbuun / buun-llama-cpp

Forked from TheTom/llama-cpp-turboquant

LLAMA Turboquant implementation with CUDA support

C++ 657 71 Updated Jun 4, 2026

gamogestionweb / Turboquant-llama

Shell 24 1 Updated Mar 26, 2026

Lists (17)

AI

ai-learing

ebpf

go-hack

✨ Inspiration

interview

iterm

k8s

k8s-network

k8s-operator

kernel

leetcode

network

remote

rust

WebAssembly

Blockchain

Starred repositories

Data structures

Algorithm