- Institute of Automation, Chinese Academy of Sciences
- Beijing
- UTC +08:00
- https://jiwenj.github.io/
- https://www.zhihu.com/people/JiwenJ
Starred repositories
[CVPR 2026 Best Paper Finalist] Pixel Diffusion Transformers for Image Generation
sgl-project / DeepGEMM
Forked from deepseek-ai/DeepGEMM
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
🤗 ml-intern: an open-source ML engineer that reads papers, trains models, and ships ML models
Drop-in TaylorSeer/HiCache basis upgrade — training-free diffusion acceleration via a Dynamic Mode Decomposition (Prony) exponential feature-forecast basis. Not the SGLang KV-cache HiCache.
Agentic Kernel Optimization — advanced & eXtensible: a closed-loop, campaign-based multi-agent system for optimizing GPU kernels (benchmark-swappable; default flashinfer-bench).
Agentic Kernel Optimization for All — automated GPU kernel optimization for any kernel, any hardware, any language
ISEEKYAN / mlite
Forked from NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
[AAAI 2026] Official implementation of "FlashSVD: Memory-Efficient Inference with Streaming for Low-Rank Models". If you find this repository helpful, please consider starring 🌟 it to support the p…
HiCache: Hermite Polynomial-based Feature Cache for diffusion inference
Omnigent is an open-source AI agent framework and meta-harness: orchestrate Claude Code, Codex, Cursor, Pi, and custom agents — swap harnesses without rewriting, enforce policies and sandboxing, an…
Analyze computation-communication overlap in V3/R1.
An agent harness that compiles a model into one provably-correct, self-retargeting CUDA megakernel and self-tunes it past cuBLAS at batch-1 LLM decode. Paper: https://arxiv.org/abs/2606.09682
Modular Markdown-based audio skills for AI agents and developers, covering signal processing, synthesis, effects, analysis, and spatial audio.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Open-source framework for the research and development of foundation models.
An LLM post-training framework with vLLM for RL Scaling
Fast and memory-efficient exact attention
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
MiMo Code: Where Models and Agents Co-Evolve
Official repository for Parallax (Parameterized Local Linear Attention)