Stars
Disaggregated serving system for Large Language Models (LLMs).
Deep Learning papers reading roadmap for anyone eager to learn this amazing tech!
A machine learning accelerator core designed for energy-efficient AI at the edge.
A heterogeneous architecture timing model simulator.
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
A Datacenter Scale Distributed Inference Serving Framework
LLM serving cluster simulator
Official code repository for "Pimba: A Processing-in-Memory Acceleration for Post-Transformer Large Language Model Serving [MICRO'25]"
Code for the paper "Chiplet-Gym: An RL-based Optimization Framework for Chiplet-based AI Accelerator"
A toolchain for rapid design space exploration of chiplet architectures
Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"
A high-throughput and memory-efficient inference and serving engine for LLMs
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
A collection of AWESOME things about mixture-of-experts
Training Sparse Autoencoders on Language Models
ONNXim is a fast cycle-level simulator that can model multi-core NPUs for DNN inference
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
A collection of diffusion model papers, categorized by subarea