Gyu1291
🚀
Let's rocket
  • KAIST
  • Seoul, Korea
  • 07:51 (UTC +09:00)

Starred repositories

RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.

Python 3,503 350 Updated Nov 5, 2025

A fully open source biomolecular structure prediction model based on AlphaFold3

Python 343 25 Updated Nov 4, 2025

PyTorchSim is a Comprehensive, Fast, and Accurate NPU Simulation Framework

Python 42 4 Updated Nov 5, 2025

A machine learning accelerator core designed for energy-efficient AI at the edge.

Emacs Lisp 1,721 170 Updated Nov 3, 2025

Python 19 4 Updated Oct 21, 2025

A simulator for SK hynix AiM PIM architecture based on Ramulator 2.0

C++ 42 8 Updated Jul 22, 2025

Cross-Platform, GPU Accelerated Whisper 🏎️

TypeScript 1,805 83 Updated Feb 27, 2024

A curated list of open-source projects that help leverage CXL technology.

22 Updated Sep 26, 2024

Run LLMs with MLX

Python 2,764 298 Updated Nov 5, 2025

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,385 228 Updated Nov 2, 2025

Created and enhanced a local LLM training system on Apple Silicon with MLX and Metal API, overcoming the absence of CUDA support. Fine-tuned the Llama3 model on 16 GPUs for streamlined solution of …

Python 20 5 Updated May 29, 2024

Fast Multimodal LLM on Mobile Devices

C++ 1,162 141 Updated Nov 5, 2025

A simple Python script for running LLMs on Intel's Neural Processing Units (NPUs)

Python 24 1 Updated Oct 17, 2025

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 16,927 1,292 Updated Nov 3, 2025

A Lossless Compression Library for AI pipelines

Python 285 31 Updated Jul 3, 2025

GPU-Accelerated Lossless Data Compressors Survey

Cuda 121 11 Updated Sep 10, 2020

Open Source Specialized Computing Stack for Accelerating Deep Neural Networks.

Jupyter Notebook 224 74 Updated Apr 22, 2019

PyTorch implementation of AlphaZero Chess from scratch

Python 176 34 Updated Aug 7, 2024

Dynamic Memory Management for Serving LLMs without PagedAttention

C 434 33 Updated May 30, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 62,154 11,046 Updated Nov 5, 2025

The Artifact of NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering

59 5 Updated Aug 11, 2024

Open-source high-performance RISC-V processor

Scala 6,718 830 Updated Nov 5, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 19,765 3,273 Updated Nov 5, 2025

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python 457 105 Updated Nov 5, 2025

Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).

Python 1,971 220 Updated Nov 5, 2025

RISC-V Integrated Matrix Development Repository

TeX 18 Updated Oct 13, 2025

A matrix extension proposal for AI applications under RISC-V architecture

Makefile 154 29 Updated Feb 11, 2025

AMD Ryzen™ AI Software includes the tools and runtime libraries for optimizing and deploying AI inference on AMD Ryzen™ AI powered PCs.

Python 683 107 Updated Nov 5, 2025

HBM2-PIM Simulator for lecture at the KAIST AI-PIM Center

C++ 6 1 Updated Jul 3, 2024

A repository where GPU applications are aggregated using a common build flow that supports multiple CUDA versions.

Cuda 81 54 Updated Oct 27, 2025