Stars
A powerful AI coding agent. Built for the terminal.
A domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
A data visualization and analytics component, especially well-suited for large and/or streaming datasets.
[3DV 2026] Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting
(ICCV2025) EEdit⚡: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
[ICCV2025] From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Official PyTorch implementation of the paper "dLLM-Cache: Accelerating Diffusion Large Language Models with Adaptive Caching".
A Conversational Speech Generation Model
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
yqxu / FlashMLA
Forked from deepseek-ai/FlashMLA
FlashMLA: Efficient MLA Decoding Kernel for Hopper GPUs
DeepEP: an efficient expert-parallel communication library
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
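"ECMAScript 6入门" ("Introduction to ECMAScript 6") is an open-source JavaScript language tutorial that comprehensively covers the syntax features added in ECMAScript 6.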
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Superagent protects your AI applications against prompt injections, data leaks, and harmful outputs. Embed safety directly into your app and prove compliance to your customers.
RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
A Netty-like asynchronous network I/O library based on TCP/UDP/WebSocket; a bidirectional RPC framework based on JSON/Protobuf; a microservice framework based on ZooKeeper/etcd
Chinese translation of "The Way to Go"; official Chinese title: 《Go 入门指南》
ScaleCube Cluster is a lightweight Java VM implementation of SWIM: Scalable Weakly-consistent Infection-style Process Group Membership Protocol. Features cluster membership, failure detection, and …
10 differentiable physical simulators built with Taichi differentiable programming (DiffTaichi, ICLR 2020)
Grammars written for ANTLR v4; the grammars are expected to be free of actions.
ebpf-go is a pure-Go library to read, modify and load eBPF programs and attach them to various hooks in the Linux kernel.