Stars
taozha2 / cutlass-fork
Forked from intel/sycl-tla
CUDA Templates for Linear Algebra Subroutines
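A good project for campus recruiting (autumn and spring hiring rounds) and internships: build, from scratch, a large language model inference framework that supports LLama2/3 and Qwen2.5.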
A light llama-like llm inference framework based on the triton kernel.
OpenAI Triton backend for Intel® GPUs
intel / sycl-tla
Forked from NVIDIA/cutlass
SYCL* Templates for Linear Algebra (SYCL*TLA) - SYCL based CUTLASS implementation for Intel GPUs
LuFinch / pytorch
Forked from pytorch/pytorch
Tensors and Dynamic neural networks in Python with strong GPU acceleration
How to optimize some algorithms in CUDA.
A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations
Tools to run and parse MKL verbose mode
sanchitintel / benchmark
Forked from pytorch/benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
《Machine Learning Systems: Design and Implementation》 (V2 is launching soon)
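AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.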
The Tensor Algebra SuperOptimizer for Deep Learning
A list of awesome compiler projects and papers for tensor computation and deep learning.
LightSeq: A High Performance Library for Sequence Processing and Generation
A compiler from Doxygen XML to reStructuredText -- hence, the name. It parses XML databases generated by Doxygen and produces reStructuredText for the Python documentation generator Sphinx.
oneAPI Deep Neural Network Library (oneDNN)
Detailed comments for ORB-SLAM2 with trouble-shooting, key formula derivation, and diagrammatic drawing
mjanderson09 / libxsmm
Forked from egeor/libxsmm
Library targeting Intel Architecture for small, dense or sparse matrix multiplications, and small convolutions.
model optimization, model compression, model pruning
A repository of different Algorithms and Data Structures implemented in many programming languages.
This Repo consists of Data structures and Algorithms
Data Structures and Algorithms implemented In Python, C, C++, Java or any other languages. Aimed to help strengthen the concepts of DSA. Give a Star 🌟 if it helps you.