🎯 Focusing
Ph.D. @ HKUST; B.Eng @ BUAA
- The Hong Kong University of Science and Technology, Hong Kong SAR
- (UTC +08:00)
- https://harahan.github.io/
- https://orcid.org/0009-0002-7898-8402
Stars (3 repositories, written in C++)
- FlashMLA: Efficient Multi-head Latent Attention Kernels
- [MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
- [ACL 2025] RetroLLM: Empowering LLMs to Retrieve Fine-grained Evidence within Generation