Institute of Computing Technology, CAS
- https://tfruan2000.github.io/
Starred repositories
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
A high-throughput and memory-efficient inference and serving engine for LLMs
Scrapy, a fast high-level web crawling & scraping framework for Python.
Making large AI models cheaper, faster and more accessible
Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!
Deep learning for image processing, including classification, object detection, etc.
Fast and memory-efficient exact attention
DeepSeek Coder: Let the Code Write Itself
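Use ChatGPT to summarize the arXiv papers. Accelerates the full research workflow: uses ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.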
Datasets, Transforms and Models specific to Computer Vision
Ongoing research training transformer models at scale
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
[ICLR 2024] Efficient Streaming Language Models with Attention Sinks
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
FlashInfer: Kernel Library for LLM Serving
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
Training and serving large-scale neural networks with auto parallelization.
AMD-SHARK Studio -- Web UI for SHARK+IREE High Performance Machine Learning Distribution
Distributed Compiler based on Triton for Parallel Systems
LLMPerf is a library for validating and benchmarking LLMs
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
Reinforcement learning environments for compiler and program optimization tasks
FlagGems is an operator library for large language models implemented in the Triton Language.