MPhil Student @ HKUST(GZ) | Researching LLM Agents & Reinforcement Learning
I am a research-oriented MPhil student passionate about building intelligent systems that can reason, learn, and act autonomously. My work lies at the intersection of Large Language Model (LLM) Agents and Reinforcement Learning (RL), driven by a systems engineering philosophy: I enjoy not only exploring novel AI concepts but also architecting robust, end-to-end solutions. My goal is to bridge the gap between theoretical research and practical, impactful applications.
🔭 I am actively seeking a research internship where I can tackle challenging problems and contribute to a world-class team. I'm particularly interested in:
- Autonomous Agent Architectures & Planning
- Multi-Agent Systems & Collaboration
- Human-AI Interaction & Tool Usage
- Efficient Fine-tuning & RLHF for Agents
Here are some of the projects I'm proud of. They showcase my skills across research, systems engineering, and product thinking.
| Project | Description | Technologies Used |
|---|---|---|
| 🧠 LLM Mathematical Reasoning | Led research on enhancing LLM math skills using multi-dimensional rewards & RL. Achieved a 4.03% accuracy gain on GSM8K with a single GPU. Filed one national invention patent. | PyTorch, Unsloth, QLoRA, TRL, Transformers, W&B |
| 🛠️ JChat - AI Super Terminal | An AI productivity engine for developers, featuring an ultra-long context editor (Monaco), workflow automation, and a 100% local-first architecture. Live Demo | Next.js, TypeScript, React, IndexedDB, Monaco, Zustand |
| 🛡️ Multimodal Risk Identification | Architected an award-winning workflow to detect risky content using LLMs, dynamic tree parsing, and staged inference. (National 2nd Prize) | LLM, CoT, Dify, LMDeploy, AWQ, Python |
| ⚙️ C++ Unix-like File System | Implemented a POSIX-compatible file system from scratch in C++, covering inodes, block management, Dentry caching, and a full CI/CD pipeline. GitHub Repo | C++, POSIX, Filesystem, CI/CD, Gitea Actions |
My technical skills are broad, allowing me to move fluidly between research and implementation.
- AI & Machine Learning: PyTorch, HuggingFace Transformers, Unsloth, TRL, scikit-learn, Pandas, NumPy
- Backend & Systems: Python (Expert), C++ (Proficient), Shell, Node.js, Django, SQL, Docker, Nginx
- Frontend & Data Viz: TypeScript, Next.js, React, Matplotlib, Streamlit, HTML/CSS
- DevOps & Collaboration: Git, GitHub Actions, CI/CD, PostgreSQL, pgvector, Redis, Linux, Vim