AI Agents & LLM Infrastructure · Deep Learning for Wireless Sensing · Embedded Systems
Open to full-time SWE / AI Engineer / ML Engineer roles — graduating Dec 2026.
Engineer who connects hardware signals to intelligent software, and who ships systems honestly — including when the simple baseline wins. Recently I've contributed merged fixes to several leading LLM-infrastructure projects (SGLang, LiteLLM, LangChain), built embedded RTOS firmware sampling RF at 77 kHz (3x prior published rates), trained deep-learning models that recover signals lost to aliasing with 0.986 R2 on chirp recovery, and shipped full-stack LLM agents live on the Chrome Web Store and in production.
- Contributed to leading LLM-infrastructure projects — merged PRs into SGLang (~29k★ serving framework), LiteLLM (50k★ gateway), and LangChain, spanning multi-tenant batching, multi-region routing, and prompt-encoding bugs (details below)
- Built a physics-informed neural network on NVIDIA B200 reconstructing aliased RF signals with 0.986 R2 on chirp recovery
- Custom Zephyr RTOS firmware on nRF54L15 hitting 77 kHz BLE RSSI sampling with <0.01% drop rate
- Shipped Archiagents (https://archiagents.com/) — an end-to-end AI agent for architectural design that takes project briefs through to IFC4 BIM models and photorealistic renders. Owned engineering implementation and VPS deployment (2-person team)
- Deployed a Claude-powered learning agent live on Chrome Web Store + HuggingFace, with a 4-policy benchmark and an honestly-reported finding that a rule-based heuristic outperformed Q-learning on short-horizon tasks
- Shipped RepoAgentBench, an open-source toolkit that mines merged PRs into reproducible coding-agent benchmarks; tested 4 frontier LLMs across claude-code and aider with real API spend
- Running a production LLM API gateway (https://api.manxuezhida.com) with multi-provider routing, load balancing, and key management — serves my downstream products
- Summer 2026 intern at Halo Microelectronics — full-stack AI agent system for analog IC design (RAG + agent orchestration)
Interests: LLM serving infrastructure, edge AI, wireless sensing, LLM agents, signal processing, sim-to-real for robotics.
sgl-project/sglang (~29k★) — high-performance LLM/multimodal inference-serving framework
- PR #26971 (merged): Fixed a batched multi-tenant cache-routing crash. `GenerateReqInput.extra_key` wasn't indexed per sub-request, so the whole list was passed to `RadixKey.child_key()`, crashing prefix-cache matching with `TypeError: unhashable type: 'list'`. Added `_normalize_extra_key()` (scalar broadcast / list-length validation / parallel-sample expansion) + a 6-path regression test; passed 121 CI checks.
- PR #25975 (merged, co-author): Prefill-delayer monitoring-metric fix. The `prefill_delayer_wait_*` histogram was stuck at 0 because the release path read `next_state=None`; the maintainer adopted the `prev_state` approach and credited me as co-author.
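The three behaviours named for PR #26971 (scalar broadcast, list-length validation, parallel-sample expansion) can be illustrated with a small standalone sketch. This is not the merged SGLang code: the function name `normalize_extra_key`, its parameters, and the ordering of the expanded keys are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Union

ExtraKey = Optional[Union[str, Sequence[Optional[str]]]]


def normalize_extra_key(
    extra_key: ExtraKey,
    batch_size: int,
    parallel_samples: int = 1,
) -> List[Optional[str]]:
    """Hypothetical helper: return one hashable key per sub-request.

    Illustrative only; names and expansion order are assumptions,
    not SGLang's actual implementation.
    """
    if extra_key is None or isinstance(extra_key, str):
        # Scalar broadcast: every sub-request shares the same key.
        per_request = [extra_key] * batch_size
    else:
        # List-length validation: one key per request in the batch.
        if len(extra_key) != batch_size:
            raise ValueError(
                f"extra_key has {len(extra_key)} entries, "
                f"expected {batch_size}"
            )
        per_request = list(extra_key)

    # Parallel-sample expansion: repeat each key once per sample
    # (assumed here to be grouped by request).
    return [key for key in per_request for _ in range(parallel_samples)]


if __name__ == "__main__":
    print(normalize_extra_key("tenant-a", batch_size=3))
    print(normalize_extra_key(["a", "b"], batch_size=2, parallel_samples=2))
```

Each element of the returned list is a plain string or `None`, so it can be used as a dictionary or cache key, which a raw list cannot.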
BerriAI/litellm (50k★) — LLM gateway/proxy unifying 100+ providers
- PR #29707 (merged): Diagnosed a Vertex AI context-caching 404 on multi-region (eu/us) endpoints — the caching path hardcoded the single-region host instead of the multi-region REP host the inference path already used — and contributed the merged parametrized regression suite locking the corrected host-resolution invariant. 49 green CI checks.
langchain-ai/langchain-aws — AWS/Bedrock integrations for LangChain
- PR #1085 (merged): Repo-wide static analysis caught `ensure_ascii=True` defaults in `json.dumps` across Bedrock converters, tool-schema serializers, and stream parsers, silently escaping CJK/emoji to `\uXXXX` and inflating prompt token cost ~6x. Fixed across 11 sites in 3 modules.
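The escaping behaviour behind PR #1085 is easy to reproduce with the standard library alone. The sketch below uses a made-up payload and compares character counts only; actual token cost depends on the tokenizer, so it does not attempt to reproduce the ~6x figure.

```python
import json

# Hypothetical payload containing CJK text and an emoji.
payload = {"prompt": "你好 🌍"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are emitted as-is.
literal = json.dumps(payload, ensure_ascii=False)

if __name__ == "__main__":
    print(escaped)
    print(literal)
    print(len(escaped), "vs", len(literal), "characters")
```

Both strings decode to the same object, so the change is lossless; only the serialized form sent to the model gets shorter.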
RepoAgentBench — my open-source CLI on PyPI for reproducible, contamination-free coding-agent benchmarks.
Languages
AI / ML
Backend & Web
Infrastructure
Embedded & Hardware
| Project | Description | Stack |
|---|---|---|
| Archiagents — https://archiagents.com/ | End-to-end AI agent for architectural design (2-person team). Ingests project briefs + CAD/DWG/IFC/Revit files, conducts requirement dialogue, generates design schemes, renders photorealistic visualizations (gpt-image-1), and outputs IFC4 BIM models with embedded Autodesk APS viewer. Multi-LLM backend (Claude / GPT / Gemini); deployed on custom domain via VPS. | Vercel AI SDK, shadcn/ui, gpt-image-1, Autodesk APS, IFC4 |
| LLM API Gateway — https://api.manxuezhida.com | Production LLM API proxy serving multiple providers (Claude / GPT / Gemini) with load balancing, API key management, and request routing. Powers SmartStudy Agent, Archiagents, and other downstream products. Custom domain on VPS. | Node.js, Express, VPS |
| SmartStudy Agent (Web · Chrome Extension) | Closed-loop POMDP learning agent with 4-policy benchmark (Random / Rule-based / LinUCB Bandit / Q-learning) over 30 simulated students x 30 sessions. Honestly reported finding: rule-based heuristic +35% over random vs Q-learning +18% — RL is defensible but not dominant in short-horizon regime. Live on Chrome Web Store + HuggingFace; 8-page Streamlit UI; 3 pluggable LLM backends. | Python, Claude API, Streamlit, SQLite, Chrome MV3 |
| RepoAgentBench | Open-source CLI that mines merged GitHub PRs into reproducible, contamination-free coding-agent benchmarks. Adapters for claude-code and aider; tested with 4 frontier LLMs (Opus 4.7 / GPT-5.5 / Sonnet 4.6 / Gemini 3.1 Pro) using real API spend. | Python, Click, PyPI, JSONL, GitHub API |
| NeuroUnfold | Physics-informed DL recovering 406 kHz LoRa chirps from 5.3x aliased BLE RSSI with 0.986 R2 on chirp recovery. Branch disambiguation enables BLE-only wireless sensing at 5 m. | Python, PyTorch, NumPy |
| High-Speed BLE RSSI Firmware | Custom Zephyr RTOS firmware on nRF54L15 hitting 77 kHz sampling (3x prior published), bypassing BLE protocol layer for raw energy detection. | C, Zephyr RTOS, DMA |
| Agentic Weather Assistant | Full-stack agentic web app with 3-service architecture: React frontend + FastAPI backend (LangChain ReAct agent + LangGraph) + custom MCP microservice wrapping a public REST API. Pydantic-validated typed tool-calling across services. | React, FastAPI, LangChain, LangGraph, MCP |
| Dual-Stream Gesture Transformer | Real-time hand gesture recognition via a Dual-Stream Spatiotemporal Transformer on MediaPipe skeletons. 557 FPS GPU (1.79 ms latency), 88.2% accuracy with 35 labeled samples via Sim-to-Real training. | Python, PyTorch, MediaPipe |
| Deep Learning for BLE Sensing | End-to-end super-resolution pipeline recovering wideband LoRa channel responses from narrowband BLE RSSI via progressive sub-pixel convolution. | Python, PyTorch, C |
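For the aliasing figures in the NeuroUnfold and firmware rows, a short worked calculation shows where the 5.3x ratio comes from and where a tone would land after sampling. As a simplifying assumption, the 406 kHz figure is treated here as a single real tone; the table does not say whether it is a bandwidth or a frequency offset, and the function name is hypothetical.

```python
def aliased_frequency(f_signal_hz: int, f_sample_hz: int) -> int:
    """Fold a real tone into the first Nyquist zone [0, fs/2]."""
    f_mod = f_signal_hz % f_sample_hz
    # Frequencies above fs/2 mirror back around fs.
    return min(f_mod, f_sample_hz - f_mod)


F_SIGNAL_HZ = 406_000  # figure from the NeuroUnfold row, treated as a tone
F_SAMPLE_HZ = 77_000   # RSSI sampling rate from the firmware row

if __name__ == "__main__":
    ratio = F_SIGNAL_HZ / F_SAMPLE_HZ
    print(f"signal / sample-rate ratio: {ratio:.1f}x")
    print(f"apparent frequency: {aliased_frequency(F_SIGNAL_HZ, F_SAMPLE_HZ)} Hz")
```

Under that assumption the tone folds to 21 kHz, and any frequency of the form k * 77 kHz ± 21 kHz produces the same samples, which is the ambiguity that branch disambiguation has to resolve.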
- Robotic Manipulation RL — Sim-to-Real on Franka & xArm (paper in preparation): Contact-rich policy training in Isaac Lab with sim-to-real transfer to physical hardware.
- Peer Reviewer, AgentSkills Workshop, ACM CAIS 2026 (ACM Conference on AI and Agentic Systems)
- Peer Reviewer, IEEE Wireless Communications Letters
- 2 Chinese patents accepted on mixed-signal circuit techniques
- Provincial Second Prize, China Undergraduate Mathematical Contest in Modeling