Starred repositories
[HPCA 2026] Official implementation of "Focus: A Streaming Concentration Architecture for Efficient Vision-Language Models"
✨✨Latest Advances on Multimodal Large Language Models
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
🚀 Efficient implementations of state-of-the-art linear attention models
A clone of POCL that includes RISC-V newlib device support and Vortex
yisier / nps
Forked from ehang-io/nps. A secondary development based on NPS version 0.26.10; an NPS continuation project.
A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention
Deep Learning Primitives and Mini-Framework for OpenCL
DLPrimitives/OpenCL out-of-tree backend for PyTorch
Mobile-Agent: The Powerful GUI Agent Family
Chat with AI large language models running natively in your browser. Enjoy private, server-free, seamless AI conversations.
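🎯 Say goodbye to information overload: AI helps you make sense of trending news, with simple public-opinion monitoring and analysis - multi-platform trending-topic aggregation plus an MCP-based AI analysis tool. Monitors 35 platforms (Douyin, Zhihu, Bilibili, Wallstreetcn, Cailian Press, etc.), with smart filtering, automatic push, and AI conversational analysis (dig into the news in natural language with 13 tools, including trend tracking, sentiment analysis, and similarity search). Supports push via WeCom/personal WeChat/Feishu/DingTalk/Telegram/email/ntfy/bark/slack, 1-minute phone notifications, no need for…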
A small OpenCL benchmark program to measure peak GPU/CPU performance.
Scientific computing with Metal in C++: Matrix multiplication example
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
A curated collection of fun and creative examples generated with Nano Banana & Nano Banana Pro🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the commu…
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Demonstration of running a native LLM on an Android device.
Self-implemented NN operators for Qualcomm's Hexagon NPU
Run Stable Diffusion on Android Devices with Snapdragon NPU acceleration. Also supports CPU/GPU inference.
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Cloud Mail - Simple Email Service on Cloudflare | A simple, responsive email service built on Cloudflare | Cloudflare Email Mail