ResearchClaw is a personal AI assistant built for research: fast to set up, easy to run locally or in the cloud, and ready to integrate with the chat apps you already use. With extensible skills, i…
Vue + SpringBoot + RocketMQ + Redis intelligent AI video analysis platform. Users can extract audio and text and generate AI summaries with one click, from either local videos or online links. Covers message-queue-based async processing, distributed locks for concurrency control, chunked upload with resumable transfer, and more.
Code and Feishu Q&A for CS33 Assignment 2. This assignment was brutal; it definitely took the longest of all the assignments.
The latest Docker container technology, with best practices learned from real-world cases! | Learn and understand Docker & Container technologies, with real DevOps practice!
slime is an LLM post-training framework for RL Scaling.
Sharing AI Infra knowledge & code exercises: introductions to the PyTorch/vLLM/SGLang frameworks⚡️, performance acceleration🚀, LLM fundamentals🧠, AI hardware and software🔧, and more.
AIInfra (AI infrastructure) refers to the full AI system stack, from low-level hardware such as chips up to the software layers that support training and inference of large AI models.
Open deep learning compiler stack for Kendryte AI accelerators ✨
A list of awesome compiler projects and papers for tensor computation and deep learning.
Backward compatible ML compute opset inspired by HLO/MHLO
Distributed Compiler based on Triton for Parallel Systems
Domain-specific language designed to streamline the development of high-performance GPU/CPU/Accelerators kernels
triton-lang / triton-cpu
Forked from triton-lang/triton. An experimental CPU backend for Triton.
OneDiff: An out-of-the-box acceleration library for diffusion models.
PyGim is the first runtime framework to efficiently execute Graph Neural Networks (GNNs) on real Processing-in-Memory systems. It provides a high-level Python interface, currently integrated with P…
PiDRAM is the first flexible end-to-end framework that enables system integration studies and evaluation of real Processing-using-Memory techniques. Prototype on a RISC-V rocket chip system impleme…
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
Ongoing research training transformer models at scale
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Open-source Framework for HPCA2024 paper: Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators