IST Austria & Neural Magic
Efficient non-uniform quantization with GPTQ for GGUF
QuTLASS: CUTLASS-Powered Quantized BLAS for Deep Learning
Code for data-aware compression of DeepSeek models
An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs
Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek-R1, Qwen3, Gemma 3, TTS 2x faster with 70% less VRAM.
An Open Large Reasoning Model for Real-World Solutions
A high-throughput and memory-efficient inference and serving engine for LLMs
List of (mostly ML) papers where the description of the method could be shortened significantly
Fast Matrix Multiplications for Lookup Table-Quantized LLMs
Code for the EMNLP 2024 paper "Mathador-LM: A Dynamic Benchmark for Mathematical Reasoning on LLMs".
Efficient Triton Kernels for LLM Training
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
A fast inference library for running LLMs locally on modern consumer-class GPUs
Vector Approximate Message Passing inference framework for GWAS
Official implementation of the ICML 2024 paper RoSA (Robust Adaptation)
Code for the paper "QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models".
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
4-bit quantization of LLaMA using GPTQ
A collection of libraries to optimize AI model performance
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
Code for ICML 2022 paper "SPDY: Accurate Pruning with Speedup Guarantees"
PyTorch distributed backend extension with compression support