A parallel programming training mini app simulating weather-like flows

C++ 170 79 Updated Aug 11, 2025

BGHT: High-performance static GPU hash tables.

C++ 71 8 Updated Jul 2, 2025

Scalable radix top-k selection on GPUs.

Cuda 19 2 Updated Jan 27, 2025

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda 1,208 178 Updated Jul 29, 2023

[TMLR 2024] Efficient Large Language Models: A Survey

1,240 97 Updated Jun 23, 2025

LLM inference in C/C++

C++ 91,789 14,183 Updated Dec 22, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,922 12,118 Updated Dec 22, 2025

List of papers related to neural network quantization in recent AI conferences and journals.

771 59 Updated Mar 27, 2025

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache

Python 343 36 Updated Nov 20, 2025

[NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization

Python 395 38 Updated Aug 13, 2024

[ICML 2024] BiLLM: Pushing the Limit of Post-Training Quantization for LLMs

Python 227 16 Updated Jan 11, 2025

📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉

Cuda 9,010 881 Updated Dec 4, 2025

A post-modern modal text editor.

Rust 41,979 3,216 Updated Dec 19, 2025

CUDA on non-NVIDIA GPUs

Rust 13,684 880 Updated Dec 19, 2025

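AISystem mainly refers to AI systems, covering full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks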

Jupyter Notebook 15,898 2,275 Updated Sep 3, 2025

zTT: Learning-based DVFS with Zero Thermal Throttling for Mobile Devices [MobiSys'21] - Artifact Evaluation

C 27 17 Updated May 10, 2021

A Model-Free GPU Online Energy Optimization (MF-GPOEO) framework

C++ 4 2 Updated Dec 11, 2023

XiTAO is a lightweight layer built on top of modern C++ features with the goals of being low-overhead and serving as a development platform for testing scheduling and resource management algorithms.

C++ 2 1 Updated Jun 2, 2021

Material for gpu-mode lectures

Jupyter Notebook 5,441 552 Updated Dec 8, 2025

My research experience

9,197 502 Updated Dec 12, 2025

A quick survival guide for international students who come to Chalmers.

SCSS 4 2 Updated Nov 18, 2023

Wiki for HPC

Python 123 12 Updated Jul 23, 2025

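😏 Outstanding computer science courses from China and abroad, including top CS schools worldwide such as MIT and CMU. 🔥🔥 Covers core CS subjects (operating systems, computer networks, compilers, databases, data structures and algorithms, etc.) as well as advanced topics such as artificial intelligence. Contributions via PR are welcome!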

1,682 190 Updated Apr 18, 2023

hpc-learning

766 47 Updated May 30, 2024

My curriculum vitae (CV) written using LaTeX.

TeX 872 263 Updated Sep 11, 2024

Project level config for insanely fast feedback loops

Rust 21,179 841 Updated Dec 21, 2025

A programmer's guide to living longer

34,613 2,369 Updated May 19, 2025

A guide to CS study-abroad programs in Europe, Hong Kong, and Singapore

HTML 767 62 Updated Aug 25, 2025