Huazhong University of Science and Technology
- Wuhan, China
- https://jianyue.tech
Starred repositories
Persist and reuse KV Cache to speed up your LLM.
The code of our paper "InfLLM: Unveiling the Intrinsic Capacity of LLMs for Understanding Extremely Long Sequences with Training-Free Memory"
Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message…
Tongyi Deep Research, the Leading Open-source Deep Research Agent
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
🚀 Efficient implementations of state-of-the-art linear attention models
Universal LLM Deployment Engine with ML Compilation
Lightweight coding agent that runs in your terminal
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
An AI agent development platform with all-in-one visual tools, simplifying agent creation, debugging, and deployment like never before. Coze your way to AI Agent creation.
🚀 The fast, Pythonic way to build MCP servers and clients
Developer-friendly, embedded retrieval engine for multimodal AI. Search More; Manage Less.
This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding code links.
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
A minimal GPU design in Verilog to learn how GPUs work from the ground up
Dynamic Memory Management for Serving LLMs without PagedAttention
An open-source AI agent that brings the power of Gemini directly into your terminal.
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.