Stars
AISystem mainly refers to AI systems, covering the full low-level AI stack: AI chips, AI compilers, AI inference and training frameworks, and related infrastructure
fmchisel: Efficient Compression and Training Algorithms for Foundation Models
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs.
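As a hedged sketch of what that Python API looks like in recent TensorRT-LLM releases (the high-level LLM API; class and argument names may differ across versions, and the model id is just a placeholder):

```python
# Minimal sketch of TensorRT-LLM's high-level LLM API; verify names
# against the installed version. The model id is a placeholder.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")  # HF id or local path
params = SamplingParams(max_tokens=64, temperature=0.8)

for output in llm.generate(["What is speculative decoding?"], params):
    print(output.outputs[0].text)  # first candidate completion
```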
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
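Not COAT's actual recipe, but a minimal sketch of the per-tensor FP8 quantize/dequantize step such memory-efficient training methods build on, assuming a recent PyTorch with float8 dtypes:

```python
import torch

# Per-tensor symmetric FP8 (E4M3) quantization: scale so the largest
# magnitude maps near the E4M3 max (~448), cast down, keep the scale.
def fp8_quantize(x: torch.Tensor):
    amax = x.abs().max().clamp(min=1e-12)
    scale = 448.0 / amax
    return (x * scale).to(torch.float8_e4m3fn), scale

def fp8_dequantize(x_fp8: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return x_fp8.to(torch.float32) / scale

x = torch.randn(1024)
q, s = fp8_quantize(x)
print((x - fp8_dequantize(q, s)).abs().max())  # quantization error
```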
A PyTorch native platform for training generative AI models
Using PyTorch autograd to compute Hessian of Perplexity for Large Language Models
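The standard autograd trick behind that kind of Hessian computation is a double backward pass; here is a minimal, generic PyTorch sketch with a toy quadratic loss, not the repo's perplexity pipeline:

```python
import torch

def hvp(loss_fn, params: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    # Hessian-vector product: differentiate grad(loss) . v a second time.
    loss = loss_fn(params)
    (g,) = torch.autograd.grad(loss, params, create_graph=True)
    (Hv,) = torch.autograd.grad(g @ v, params)
    return Hv

p = torch.randn(4, requires_grad=True)
v = torch.randn(4)
print(hvp(lambda x: (x * x).sum(), p, v))  # quadratic: H = 2I, so Hv ≈ 2v
```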
LLM quantization (compression) toolkit with hardware-acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.
[ICLR 2025] TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention
Official inference framework for 1-bit LLMs
Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, and Assembly, from numerics and SIMD to coroutines, ranges, exception handling, networking, and user-space IO
Calculating the actual value of your job beyond just salary
RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.
My learning notes for ML systems (MLSys).
Codebase for the Progressive Mixed-Precision Decoding paper.
[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
fwtan / any-precision-llm
Forked from SNU-ARC/any-precision-llm: [ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
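As I understand the any-precision idea (one high-bit parent model whose lower-bit variants come from dropping least-significant bits rather than re-quantizing), a toy illustration, not the paper's actual codebase:

```python
import numpy as np

# Parent weights stored as 8-bit codes; an n-bit child model just keeps
# the top n bits of each code, so all precisions share one set of weights.
parent8 = np.random.randint(0, 256, size=6, dtype=np.uint8)
child4 = parent8 >> 4          # 4-bit model: codes in [0, 16)
child2 = parent8 >> 6          # 2-bit model: codes in [0, 4)
print(parent8, child4, child2)
```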
Lightning fast C++/CUDA neural network framework
Efficient Triton Kernels for LLM Training
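For flavor, a minimal Triton kernel in the standard tutorial style (a plain vector add, far simpler than the fused LLM-training kernels the repo actually ships):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                    # one program per block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                    # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    grid = (triton.cdiv(x.numel(), 1024),)
    add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=1024)
    return out
```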
FlashInfer: Kernel Library for LLM Serving
Dynamic Memory Management for Serving LLMs without PagedAttention
Medusa: Accelerating Serverless LLM Inference with Materialization [ASPLOS'25]
A highly optimized LLM inference acceleration engine for Llama and its variants.