Gyu1291
🚀
Let's rocket
  • KAIST
  • Seoul, Korea

Starred repositories

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

C++ 12,073 1,850 Updated Nov 8, 2025

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 3,627 373 Updated Nov 9, 2025

A fully open source biomolecular structure prediction model based on AlphaFold3

Python 367 28 Updated Nov 7, 2025

PyTorchSim is a Comprehensive, Fast, and Accurate NPU Simulation Framework

Python 42 3 Updated Nov 6, 2025

A machine learning accelerator core designed for energy-efficient AI at the edge.

Emacs Lisp 1,743 176 Updated Nov 6, 2025

Python 20 4 Updated Oct 21, 2025

A simulator for SK hynix AiM PIM architecture based on Ramulator 2.0

C++ 42 8 Updated Jul 22, 2025

Cross-Platform, GPU Accelerated Whisper 🏎️

TypeScript 1,805 83 Updated Feb 27, 2024

A curated list of open-source projects that help leverage CXL technology.

22 Updated Sep 26, 2024

Run LLMs with MLX

Python 2,804 299 Updated Nov 7, 2025

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,391 229 Updated Nov 2, 2025

Created and enhanced a local LLM training system on Apple Silicon with MLX and Metal API, overcoming the absence of CUDA support. Fine-tuned the Llama3 model on 16 GPUs for streamlined solution of …

Python 20 5 Updated May 29, 2024

Fast Multimodal LLM on Mobile Devices

C++ 1,167 141 Updated Nov 8, 2025

A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)

Python 26 1 Updated Oct 17, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 17,005 1,297 Updated Nov 3, 2025

A Lossless Compression Library for AI pipelines

Python 285 31 Updated Jul 3, 2025

GPU-Accelerated Lossless Data Compressors Survey

Cuda 121 11 Updated Sep 10, 2020

Open Source Specialized Computing Stack for Accelerating Deep Neural Networks.

Jupyter Notebook 224 74 Updated Apr 22, 2019

PyTorch implementation of AlphaZero Chess from scratch

Python 176 34 Updated Aug 7, 2024

Dynamic Memory Management for Serving LLMs without PagedAttention

C 434 33 Updated May 30, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,536 11,127 Updated Nov 9, 2025

The Artifact of NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering

59 5 Updated Aug 11, 2024

Open-source high-performance RISC-V processor

Scala 6,723 832 Updated Nov 8, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 20,054 3,309 Updated Nov 9, 2025

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 460 105 Updated Nov 9, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 1,978 220 Updated Nov 5, 2025

RISC-V Integrated Matrix Development Repository

TeX 18 Updated Oct 13, 2025

A matrix extension proposal for AI applications under RISC-V architecture

Makefile 155 29 Updated Feb 11, 2025

AMD Ryzen™ AI Software includes the tools and runtime libraries for optimizing and deploying AI inference on AMD Ryzen™ AI powered PCs.

Python 686 107 Updated Nov 5, 2025

HBM2-PIM Simulator for lecture at the KAIST AI-PIM Center

C++ 6 1 Updated Jul 3, 2024