Organizations

@FastLM

ICCAD'23 Best Paper Award candidate: Robust GNN-based Representation Learning for HLS

LLVM 27 5 Updated May 23, 2024

[ICML2026] Auto-Regressive Long Video Generation via 2-Bit KV-Cache Quantization

Python 48 4 Updated Jun 4, 2026

[ICML2025, NeurIPS2025 Spotlight] Sparse VideoGen 1 & 2: Accelerating Video Diffusion Transformers with Sparse Attention

Python 680 46 Updated Mar 6, 2026

Fast and memory-efficient exact kmeans

Python 576 33 Updated Jun 3, 2026

Opal (O.P.A.L. - Open simulator Platform for distributed AI and LLM workflows) is an LLM platform-level simulator written purely in Python. It can be used to explore policies, deployment configurat…

Python 5 2 Updated Jun 11, 2026
Python 4 1 Updated May 28, 2026

[NeurIPS 2024] Efficient LLM Scheduling by Learning to Rank

Python 80 17 Updated Nov 4, 2024
C++ 144 21 Updated Jan 30, 2025

This is the documentation repository for SGLang. It is auto-generated from https://github.com/sgl-project/sglang

Jupyter Notebook 134 38 Updated Jun 10, 2026

My learning notes for ML SYS.

Python 6,499 440 Updated Jun 8, 2026

[ICML 2026] Code for Equilibrium Reasoners: learning attractor dynamics for scalable reasoning

Python 38 5 Updated Jun 1, 2026
Rust 10 4 Updated Nov 9, 2023
Python 8 Updated Feb 18, 2025
Rust 16 3 Updated Apr 13, 2024

[EMNLP 2025 Main Conference] QSpec: Speculative Decoding with Complementary Quantisation Schemes

Python 7 1 Updated Mar 9, 2026
Python 16 1 Updated Feb 20, 2024
Cuda 7 2 Updated Sep 5, 2025

Official Implementation for the paper "d1: Scaling Reasoning in Diffusion Large Language Models via Reinforcement Learning"

Python 447 52 Updated Jan 26, 2026

Thoughts-as-Planning: Latent World Models for Chain-of-Thoughts Optimization via Reinforcement Planning

C++ 3 Updated Jun 1, 2026

OLIVE: Online Low-Rank Incremental Learning for Efficient Adaptive Exoskeletons

C++ 2 Updated Apr 19, 2026

[TMLR 2025] On Memorization in Diffusion Models

Python 31 3 Updated Oct 5, 2023

A novel and efficient post-training INT6 quantization framework tailored for LLM inference.

Jupyter Notebook 11 1 Updated Dec 26, 2025

Ranking, acceptance rate, deadline, and publication tips

Python 341 39 Updated Mar 25, 2021

Run LLMs on AMD Ryzen™ AI NPUs in minutes. Just like Ollama - but purpose-built and deeply optimized for the AMD NPUs.

C++ 1,464 102 Updated Jun 9, 2026

Collection of kernel accelerators optimised for LLM execution

C++ 32 4 Updated Feb 26, 2026

[CVPR 2026] Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

Python 123 11 Updated Mar 24, 2026

T2I-Adapter

Python 3,802 230 Updated Jun 21, 2024

SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis

Python 69 Updated Jul 24, 2025

[ICML 2025 Spotlight] RAPID: Long-Context Inference with Retrieval-Augmented Speculative Decoding

Python 23 Updated Mar 2, 2025