Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning in text, vision, audio, and multimodal domains, for both inference and training.
A high-throughput and memory-efficient inference and serving engine for LLMs
Fast and memory-efficient exact attention
A framework for few-shot evaluation of language models.
A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.
A curated list for Efficient Large Language Models
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
A low-latency & high-throughput serving engine for LLMs
A baseline repository of Auto-Parallelism in Training Neural Networks
Official repository for the paper "SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention"
Sirius, an efficient correction mechanism that significantly boosts contextual sparsity models on reasoning tasks while maintaining their efficiency gains.
Codebase for MUSTAFAR: Promoting Unstructured Sparsity for KV Pruning in LLM Inference