Stars
DFlash: Block Diffusion for Flash Speculative Decoding
Build academic conference posters as a single HTML/CSS file, rendered to print-ready PDF via headless Chromium. A coding-agent skill.
Official repository for Parallax (Parameterized Local Linear Attention)
[MLSys 26] 🥇 Solution for the Gated Delta Net track of the MLSys 26 FlashInfer competition
PolyGLU: a drop-in replacement for SwiGLU in transformer FFN blocks. Inspired by neurotransmitter-receptor diversity, each neuron routes between 4 qualitatively distinct activation functions via hy…
The official implementation of the W&B Models and Weave MCP server.
Official PyTorch Implementation of Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention
PyTorch implementation of Efficient Infinite Context Transformers with Infini-attention, plus a QwenMoE implementation, a training script, and 1M-context passkey retrieval
Skills for writing TileLang and debugging with CUDA toolkits.
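MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, targeting the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also niche subcultures, and even text written in "Martian script" (火星文). It includes plain-text Chinese data in every form: news, essays, novels, books, magazines, papers, scripts, forum posts, wiki, classical poetry, lyrics, product descriptions, jokes, embarrassing anecdotes, chat logs, and more.
High-performance linear attention kernel library built on TileLang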
Claude Code skill that removes signs of AI-generated writing from text
An agentic skills framework & software development methodology that works.
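Fully featured mobile and web client for Codex and Claude Code, with realtime voice and encryption
Extracts WeChat chat history and exports it to HTML, Word, and CSV documents for permanent storage; analyzes the chat history to generate an annual chat report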
Python tool for converting files and office documents to Markdown.
A unified library of SOTA model optimization techniques like quantization, distillation, pruning, neural architecture search, speculative decoding, etc. It compresses deep learning models for downs…
Understand and test language model architectures on synthetic tasks.
FlashInfer: Kernel Library for LLM Serving