shifeiwen

3 followers · 3 following

Beijing

Stars

facebookresearch / SpinQuant

Code repo for the paper "SpinQuant: LLM quantization with learned rotations"

Python · 405 stars · 90 forks · Updated Feb 14, 2025

saic-fi / MobileQuant

[EMNLP Findings 2024] MobileQuant: Mobile-friendly Quantization for On-device Language Models

Python · 68 stars · 6 forks · Updated Sep 22, 2024

qualcomm / ai-hub-models

Qualcomm® AI Hub Models is our collection of state-of-the-art machine learning models optimized for performance (latency, memory, etc.) and ready to deploy on Qualcomm® devices.

Python · 1,133 stars · 202 forks · Updated Jun 23, 2026

pytorch / executorch

On-device AI across mobile, embedded and edge for PyTorch

Python · 4,747 stars · 1,041 forks · Updated Jun 23, 2026

ggml-org / llama.cpp

LLM inference in C/C++

C++ · 117,748 stars · 19,835 forks · Updated Jun 23, 2026

huggingface / text-generation-inference

Large Language Model Text Generation Inference

Python · 10,864 stars · 1,270 forks · Updated Mar 21, 2026

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 83,608 stars · 18,346 forks · Updated Jun 23, 2026
