- Huazhong University of Science and Technology
- Wuhan, China
- https://jianyue.tech
Starred repositories
AG2 (formerly AutoGen): The Open-Source AgentOS. Join us at: https://discord.gg/pAbnFJrkgZ
🚀 Efficient implementations of state-of-the-art linear attention models
Computer Networks: A Systems Approach -- Textbook
Official PyTorch implementation for "Large Language Diffusion Models"
FlameScope is a visualization tool for exploring different time ranges as Flame Graphs.
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
MoBA: Mixture of Block Attention for Long-Context LLMs
Official repository for the BPF Performance Tools book
An Open Large Reasoning Model for Real-World Solutions
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
[NeurIPS'24 Spotlight, ICLR'25, ICML'25] To speed up long-context LLMs' inference, computes attention with approximate, dynamic sparsity, which reduces inference latency by up to 10x for pre-filli…
[ICLR 2025 Oral] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
[ACL'24 Outstanding] Data and code for L-Eval, a comprehensive long context language models evaluation benchmark
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
A step-by-step analysis of the LevelDB source code
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
Codes for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens": https://arxiv.org/abs/2402.13718
Extracts the historic word occurrence of a search term in academic papers
A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
[ICLR2025 Spotlight] MagicPIG: LSH Sampling for Efficient LLM Generation