  • Shenzhen
  • 12:35 (UTC +08:00)

Organizations

@opencv


AccelOpt: Self-improving Agents for AI Accelerator Kernel Optimization

Python 21 Updated Jan 28, 2026

Our first fully AI-generated deep learning system

Python 510 36 Updated Feb 2, 2026

Everyone Can Use English

TypeScript 33,525 4,713 Updated Feb 3, 2026

An unbiased CPU benchmark by OpenCV that provides an evaluation of different CPUs under real-world computer vision and AI workloads.

Python 5 Updated Feb 4, 2026

[ICML2025] SpargeAttention: A training-free sparse attention that accelerates any model inference.

Cuda 931 84 Updated Dec 31, 2025

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,382 418 Updated Feb 8, 2026

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Python 5,144 756 Updated Jan 27, 2026

Light Image Video Generation Inference Framework

Python 1,929 159 Updated Feb 6, 2026

SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation

Python 572 36 Updated Dec 23, 2025

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 21,071 2,520 Updated Jun 30, 2025

Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 3,271 390 Updated Jun 11, 2025

FlagGems is an operator library for large language models implemented in the Triton Language.

Python 897 237 Updated Feb 9, 2026

CUDA & Triton Learning Project: Exploring Flash Attention Implementations

Python 23 4 Updated Aug 14, 2025

LeetGPU Challenges

Python 616 52 Updated Feb 4, 2026

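微舆: a multi-agent public opinion analysis assistant that anyone can use. It breaks through information cocoons, restores the true picture of public opinion, predicts future trends, and supports decision-making. Built from scratch, with no dependence on any framework.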

Python 35,404 6,793 Updated Jan 20, 2026

Simple high-throughput inference library

Python 155 10 Updated May 14, 2025

📰 Must-read papers on KV Cache Compression (constantly updating 🤗).

658 21 Updated Sep 30, 2025

[ICML 2025] RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression

Python 32 3 Updated Aug 7, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda 3,149 338 Updated Jan 17, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,106 1,691 Updated Dec 17, 2025

LeetGPU Solutions

Python 107 5 Updated Oct 9, 2025

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

C++ 2,122 173 Updated Jan 29, 2026

Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels

Python 5,106 439 Updated Feb 9, 2026

Triton with an OpenCL backend, using mlir-translate to emit OpenCL source code

MLIR 24 4 Updated Aug 27, 2025

The simplest, fastest repository for training/finetuning small-sized VLMs.

Python 4,631 462 Updated Oct 27, 2025

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

Python 7,191 539 Updated May 5, 2025

Fork of the Triton language and compiler for Windows support and easy installation

MLIR 1,840 94 Updated Feb 8, 2026

A collection of compiler learning resources.

Python 2,679 365 Updated Mar 19, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 8,967 1,090 Updated Feb 5, 2026

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 9,390 1,015 Updated Dec 4, 2025