Skip to content

Dominic789654/Dominic789654

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 

Repository files navigation

Xiang Liu - Efficient LLM Inference and Agent Systems

Homepage Google Scholar X

Ph.D. student @ HKUST(GZ) · Research Intern @ Mind Lab
Efficient and reliable LLMs: inference, long context, KV cache, retrieval, and agentic workflows.


40+ stars
personal public non-fork repos
8.4k+ / 1.1k+
contributed projects: LMFlow / kvpress
benchmark → method → artifact
how I like research to ship

Current Focus

Inference efficiency
KV-cache compression, token-efficient reasoning, energy-to-token evaluation, serving bottlenecks.
Long-context evaluation
Generation-focused benchmarks, dense reasoning integrity, multi-turn coherence.
Agent systems
Tool use, post-training, harness design, local-first agent workflow infrastructure.
Research infrastructure
Reproducible artifacts, project pages, scholar tracking, figure and report tooling.

Selected Work

Contributed to an extensible toolkit for fine-tuning and inference of large foundation models.

stars Python

Long-context generation benchmark for coherent, context-aware long-form responses.

stars paper

Policy-conditioned live-market evaluation for LLM trading agents. Benchmark the policy, not just the model.

Python agents

Local-first agent task hub with SQLite queueing, dependency-aware dispatch, templates, and dashboards.

SQLite workflow

Adapters between XML-like tool calls and OpenAI-style structured tool-call histories.

Python tool use

Project page for evaluating LLM inference as energy-to-token production.

HTML serving

Research Map

long-context generation ──┬── LongGenBench
                          ├── semantic integrity under KV compression
                          └── multi-turn coherence / FlowKV

agent capability eval ────┬── QuantArena
                          ├── tool-use adapters
                          └── local-first agent workflow runtime

efficient inference ──────┬── ChunkKV / KV compression
                          ├── token-efficient reasoning
                          └── energy-to-token production

Stack

Python PyTorch TypeScript React SQLite LaTeX

GitHub stats Top languages

repositories · publications · citations

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors