Lyxien

Li Xin Lyxien

5 followers · 26 following

Stars

mit-han-lab / ncu-report-skill

Python 130 17 Updated May 24, 2026

inclusionAI / cuLA

CUDA kernels for linear attention variants, written in CuTe DSL and CUTLASS C++.

Python 523 65 Updated Jun 17, 2026

TheTom / turboquant_plus

Python 6,955 925 Updated Jun 12, 2026

titanwings / colleague-skill

Transforming cold farewells into warm skills? It's giving rebirth era. Welcome to Digital Life 1.0. 🫶

Python 19,649 1,943 Updated Jun 1, 2026

ultraworkers / claw-code

An agent-managed museum exhibit, built in Rust with Gajae-Code / LazyCodex — developed and maintained with no human intervention.

Rust 194,120 109,928 Updated Jun 8, 2026

brucefan1983 / CUDA-Programming

Sample codes for my CUDA programming book

Cuda 2,072 388 Updated Dec 14, 2025

EdwardChasel / BinaryAttention

[CVPR2026] BinaryAttention: One-Bit QK-Attention for Vision and Diffusion Transformers

Python 37 3 Updated Mar 17, 2026

nunchaku-ai / deepcompressor

Model Compression Toolbox for Large Language Models and Diffusion Models

Python 788 95 Updated Aug 14, 2025

hao-ai-lab / flash-attention-fp4

NVFP4 Flash-Attention 4 on Blackwell

Python 13 1 Updated Jun 21, 2026

SandAI-org / MagiCompiler

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 314 23 Updated May 31, 2026

RightNow-AI / autokernel

Autoresearch for GPU kernels. Give it any PyTorch model, go to sleep, wake up to optimized Triton kernels.

Python 1,418 143 Updated Mar 19, 2026

maxiaosong1124 / ncu-cuda-profiling-skill

Lets coding agents use ncu skills to analyze CUDA programs automatically!

Shell 112 8 Updated May 25, 2026

stas00 / ml-engineering

Machine Learning Engineering Open Book

Python 18,156 1,152 Updated May 18, 2026

xiaoju111a / OpenLovart

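OpenLovart is an AI-based design platform that makes creative design simple yet powerful. Quickly bring your design ideas to life through AI conversation and a smart canvas.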

TypeScript 261 75 Updated Jan 23, 2026

skyzh / tiny-llm

A course on LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 4,295 333 Updated Jun 13, 2026

thu-ml / TurboDiffusion

TurboDiffusion: 100–200× Acceleration for Video Diffusion Models

Python 3,536 265 Updated Jun 17, 2026

Danielohayon / Block-Sparse-Flash-Attention

C++ 34 Updated Dec 10, 2025

Tencent-Hunyuan / flex-block-attn

flex-block-attn: an efficient block sparse attention computation library

Jupyter Notebook 130 14 Updated Dec 26, 2025

Tencent-Hunyuan / HunyuanVideo-1.5

HunyuanVideo-1.5: A leading lightweight video generation model

Python 4,492 229 Updated Apr 10, 2026

Lyxien / GPU-Performance-Glossary

1 Updated Nov 19, 2025

gau-nernst / learn-cuda

Learn CUDA with PyTorch

Cuda 336 50 Updated Jun 1, 2026

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 30,286 3,658 Updated Jun 26, 2025

Lyxien / cutlass_learn

Cuda 2 Updated Sep 22, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 83,482 18,291 Updated Jun 21, 2026

sankeerth95 / FG-Attn

9 Updated Nov 10, 2025

lx0623 / UltraPrecise

C++ 3 3 Updated Nov 5, 2023

xlite-dev / Awesome-DiT-Inference

📚A curated list of Awesome Diffusion Inference Papers with Codes: Sampling, Cache, Quantization, Parallelism, etc.🎉

Python 569 27 Updated Jun 13, 2026

AmberLJC / LLMSys-PaperList

Large Language Model (LLM) Systems Paper List

2,141 110 Updated Jun 21, 2026

oahzxl / Awesome-Efficient-Video-Generation

A curated list of recent efficient video generation methods.

72 3 Updated Oct 7, 2025

Lyxien / Triton-Puzzles-Lite

Puzzles for learning Triton; play them with minimal environment configuration!

Python 2 Updated Aug 6, 2025
