Starred repositories
Toolchain built around Megatron-LM for distributed training
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
This script allows you to download VS Code extensions as VSIX files directly from the Visual Studio Marketplace.
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Samples for CUDA developers demonstrating features in the CUDA Toolkit
How to optimize common algorithms in CUDA.
Puzzles for learning Triton
A web-based markdown viewer optimized for Obsidian
Obsidian installed in Docker, so it can be deployed on a server and accessed through a web page
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Baichuan-Omni: Towards Capable Open-source Omni-modal LLM 🌊
Zero Bubble Pipeline Parallelism
deepspeedai / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
LLMs interview notes and answers: a repository that mainly records interview questions and reference answers for large language model (LLM) algorithm engineers
Tutel MoE: Optimized Mixture-of-Experts Library, supporting GptOss/DeepSeek/Kimi-K2/Qwen3 using FP8/NVFP4/MXFP4
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Reading list of hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
Ongoing research training transformer language models at scale, including: BERT & GPT-2
A Data Streaming Library for Efficient Neural Network Training
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
BELLE: Be Everyone's Large Language Model Engine (an open-source Chinese dialogue large language model)