JiwenJ
🎯 Focusing

Starred repositories

Cyber Su Shen (赛博苏神): your great research teacher

HTML 6 2 Updated Jun 16, 2026

[CVPR 2026 Best Paper Finalist] Pixel Diffusion Transformers for Image Generation

Python 835 63 Updated Jun 16, 2026
Cuda 12 Updated Jun 16, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 28 24 Updated Jun 20, 2026

🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models

Python 10,512 1,119 Updated Jun 19, 2026

Accepted to MLSys 2026

Python 88 7 Updated Apr 19, 2026
Python 341 30 Updated Jun 15, 2026

Drop-in TaylorSeer/HiCache basis upgrade — training-free diffusion acceleration via a Dynamic Mode Decomposition (Prony) exponential feature-forecast basis. Not the SGLang KV-cache HiCache.

Python 8 1 Updated Jun 15, 2026

Official repository of the xLSTM.

Python 2,175 184 Updated May 28, 2026

Agentic Kernel Optimization — advanced & eXtensible: a closed-loop, campaign-based multi-agent system for optimizing GPU kernels (benchmark-swappable; default flashinfer-bench).

Python 55 10 Updated May 31, 2026

Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language

Python 299 21 Updated May 31, 2026

Ongoing research training transformer models at scale

Python 10 Updated Jun 22, 2026

[AAAI 2026] Official implementation of "FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models". If you find this repository helpful, please consider starring 🌟 it to support the p…

Python 17 2 Updated May 1, 2026

HiCache: Hermite Polynomial-based Feature Cache for diffusion inference

Python 14 1 Updated Jan 27, 2026

Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, an…

Python 4,395 499 Updated Jun 22, 2026

Analyze computation-communication overlap in V3/R1.

1,168 148 Updated Mar 21, 2025

An agent harness that compiles a model into one provably-correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS at batch-1 LLM decode, paper: https://arxiv.org/abs/2606.09682

Python 70 9 Updated Jun 18, 2026

Modular Markdown-based audio skills for AI agents and developers, covering signal processing, synthesis, effects, analysis, and spatial audio.

Shell 14 1 Updated May 21, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,327 223 Updated Jun 19, 2026

Open-source framework for the research and development of foundation models.

Python 1,128 133 Updated Jun 22, 2026

An LLM post-training framework with vLLM for RL Scaling

Python 290 30 Updated Jun 22, 2026

Fast and memory-efficient exact attention

Python 8 2 Updated Jun 22, 2026

An implementation of the all-new RoPE from Jianlin

Python 9 Updated Oct 6, 2025

UniRL is a Framework for Unified Multimodal Model Reinforcement Learning

Python 683 43 Updated Jun 22, 2026

MiMo Code: Where Models and Agents Co-Evolve

TypeScript 10,266 959 Updated Jun 22, 2026
Python 80 9 Updated Feb 5, 2026

Muon in Int8 Precision Made Possible

Python 19 1 Updated Jun 18, 2026

Official repository for Parallax (Parameterized Local Linear Attention)

Python 61 5 Updated Jun 20, 2026