HumphreySun98/README.md

Hi, I'm Haofei Sun

AI Agents & LLM Infrastructure · Deep Learning for Wireless Sensing · Embedded Systems

Open to full-time SWE / AI Engineer / ML Engineer roles — graduating Dec 2026.

LinkedIn · Email · SmartStudy on Chrome Web Store · Archiagents Live · LLM API Gateway · SGLang PRs Merged · LiteLLM PR Merged · LangChain PR Merged


About Me

I'm an engineer who connects hardware signals to intelligent software and reports results honestly, including when the simple baseline wins. Recently I have contributed merged fixes to several LLM-infrastructure projects (SGLang, LiteLLM, LangChain), built embedded RTOS firmware that samples BLE RSSI at 77 kHz (3× prior published rates), trained deep-learning models that recover aliased signals with an R² of 0.986 on chirp recovery, and shipped full-stack LLM agents on the Chrome Web Store and in production.

  • Contributed to leading LLM-infrastructure projects: merged PRs into SGLang (~29k★ serving framework), LiteLLM (50k★ gateway), and LangChain (langchain-aws), spanning multi-tenant batching, multi-region routing, and prompt-encoding bugs (details below)
  • Built a physics-informed neural network on NVIDIA B200 that reconstructs aliased RF signals with an R² of 0.986 on chirp recovery
  • Wrote custom Zephyr RTOS firmware for the nRF54L15 that reaches 77 kHz BLE RSSI sampling with a <0.01% drop rate
  • Shipped Archiagents (https://archiagents.com/), an end-to-end AI agent for architectural design that takes project briefs through to IFC4 BIM models and photorealistic renders. Owned engineering implementation and VPS deployment (2-person team)
  • Deployed a Claude-powered learning agent live on Chrome Web Store + HuggingFace, with a 4-policy benchmark and an honestly reported finding that a rule-based heuristic outperformed Q-learning on short-horizon tasks
  • Shipped RepoAgentBench, an open-source toolkit that mines merged PRs into reproducible coding-agent benchmarks; tested 4 frontier LLMs across claude-code and aider with real API spend
  • Running a production LLM API gateway (https://api.manxuezhida.com) with multi-provider routing, load balancing, and key management; it serves my downstream products
  • Summer 2026 intern at Halo Microelectronics: full-stack AI agent system for analog IC design (RAG + agent orchestration)

Interests: LLM serving infrastructure, edge AI, wireless sensing, LLM agents, signal processing, sim-to-real for robotics.


Open Source — LLM Infrastructure Contributions

sgl-project/sglang (~29k★) — high-performance LLM/multimodal inference-serving framework

  • PR #26971 (merged): Fixed a batched multi-tenant cache-routing crash — GenerateReqInput.extra_key wasn't indexed per sub-request, so the whole list was passed to RadixKey.child_key(), crashing prefix-cache matching with TypeError: unhashable type: 'list'. Added _normalize_extra_key() (scalar broadcast / list-length validation / parallel-sample expansion) + a 6-path regression test; passed 121 CI checks.
  • PR #25975 (merged, co-author): Prefill-delayer monitoring-metric fix — prefill_delayer_wait_* histogram stuck at 0 because the release path read next_state=None; maintainer adopted the prev_state approach and credited me as co-author.
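The three behaviors named for the extra_key fix in PR #26971 (scalar broadcast, list-length validation, parallel-sample expansion) can be illustrated with a minimal sketch. This is not the code merged into SGLang: the function name `normalize_extra_key`, its parameters, and the expansion order (each key repeated consecutively for its samples) are assumptions chosen for illustration.

```python
from typing import List, Optional, Union

ExtraKey = Optional[str]


def normalize_extra_key(
    extra_key: Union[ExtraKey, List[ExtraKey]],
    batch_size: int,
    parallel_samples: int = 1,
) -> List[ExtraKey]:
    """Return exactly one hashable key per sub-request (hypothetical helper)."""
    if isinstance(extra_key, list):
        # List-length validation: one key per request in the batch.
        if len(extra_key) != batch_size:
            raise ValueError(
                f"extra_key has {len(extra_key)} entries, expected {batch_size}"
            )
        per_request = list(extra_key)
    else:
        # Scalar broadcast: the same key (or None) applies to every request.
        per_request = [extra_key] * batch_size

    # Parallel-sample expansion (assumed order): repeat each key once per sample.
    return [key for key in per_request for _ in range(parallel_samples)]


# Each sub-request now receives a scalar, which is hashable, instead of the
# whole list, which would raise "TypeError: unhashable type: 'list'".
keys = normalize_extra_key(["tenant-a", "tenant-b"], batch_size=2, parallel_samples=2)
cache_buckets = {key: [] for key in keys}
```

With this shape, indexing by sub-request position always yields a single key, so a cache lookup keyed on it cannot receive a list.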

BerriAI/litellm (50k★) — LLM gateway/proxy unifying 100+ providers

  • PR #29707 (merged): Diagnosed a Vertex AI context-caching 404 on multi-region (eu/us) endpoints: the caching path hardcoded the single-region host instead of the multi-region REP host that the inference path already used. Contributed the merged parametrized regression suite that locks in the corrected host-resolution invariant; 49 CI checks passed.

langchain-ai/langchain-aws — AWS/Bedrock integrations for LangChain

  • PR #1085 (merged): Repo-wide static analysis caught ensure_ascii=True defaults in json.dumps across Bedrock converters, tool-schema serializers, and stream parsers — silently escaping CJK/emoji to \uXXXX and inflating prompt token cost ~6x. Fixed across 11 sites in 3 modules.
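The escaping behavior behind PR #1085 is easy to reproduce with the standard library alone. The sketch below uses a made-up payload and compares character counts only; the ~6x token-cost figure above depends on the tokenizer and is not reproduced here.

```python
import json

# Hypothetical payload containing CJK text and an emoji.
payload = {"prompt": "你好，世界 😀"}

# ensure_ascii defaults to True: every non-ASCII character becomes a
# \uXXXX escape (an emoji becomes a surrogate pair of two escapes).
escaped = json.dumps(payload)

# ensure_ascii=False keeps the original characters in the output string.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(readable)
print(len(escaped), len(readable))

# Both forms decode to the same object, so the change is lossless.
assert json.loads(escaped) == json.loads(readable) == payload
```

Because both strings parse back to the same data, switching to `ensure_ascii=False` changes only the size of the serialized prompt, not its meaning.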

RepoAgentBench — my open-source CLI on PyPI for reproducible, contamination-free coding-agent benchmarks.


Tech Stack

Languages

Python · C · C++ · Verilog · CUDA · SQL · JavaScript · TypeScript · Bash

AI / ML

PyTorch · Claude · GPT · Gemini · LangChain · SGLang · MCP · Vercel AI SDK · Transformers · NumPy · Pandas

Backend & Web

FastAPI · Node.js · Express · React · Streamlit · shadcn/ui · Tailwind · PostgreSQL · SQLite

Infrastructure

Docker · Linux · Git · AWS · VPS · OpenMP · MPI

Embedded & Hardware

Zephyr · nRF · MATLAB · Isaac Lab · Autodesk APS · IFC4 BIM


Featured Projects

| Project | Description | Stack |
| --- | --- | --- |
| Archiagents (https://archiagents.com/) | End-to-end AI agent for architectural design (2-person team). Ingests project briefs + CAD/DWG/IFC/Revit files, conducts requirement dialogue, generates design schemes, renders photorealistic visualizations (gpt-image-1), and outputs IFC4 BIM models with embedded Autodesk APS viewer. Multi-LLM backend (Claude / GPT / Gemini); deployed on custom domain via VPS. | Vercel AI SDK, shadcn/ui, gpt-image-1, Autodesk APS, IFC4 |
| LLM API Gateway (https://api.manxuezhida.com) | Production LLM API proxy serving multiple providers (Claude / GPT / Gemini) with load balancing, API key management, and request routing. Powers SmartStudy Agent, Archiagents, and other downstream products. Custom domain on VPS. | Node.js, Express, VPS |
| SmartStudy Agent (Web · Chrome Extension) | Closed-loop POMDP learning agent with 4-policy benchmark (Random / Rule-based / LinUCB Bandit / Q-learning) over 30 simulated students × 30 sessions. Honestly reported finding: rule-based heuristic +35% over random vs Q-learning +18%; RL is defensible but not dominant in the short-horizon regime. Live on Chrome Web Store + HuggingFace; 8-page Streamlit UI; 3 pluggable LLM backends. | Python, Claude API, Streamlit, SQLite, Chrome MV3 |
| RepoAgentBench | Open-source CLI that mines merged GitHub PRs into reproducible, contamination-free coding-agent benchmarks. Adapters for claude-code and aider; tested with 4 frontier LLMs (Opus 4.7 / GPT-5.5 / Sonnet 4.6 / Gemini 3.1 Pro) using real API spend. | Python, Click, PyPI, JSONL, GitHub API |
| NeuroUnfold | Physics-informed DL recovering 406 kHz LoRa chirps from 5.3× aliased BLE RSSI with an R² of 0.986 on chirp recovery. Branch disambiguation enables BLE-only wireless sensing at 5 m. | Python, PyTorch, NumPy |
| High-Speed BLE RSSI Firmware | Custom Zephyr RTOS firmware on nRF54L15 hitting 77 kHz sampling (3× prior published), bypassing the BLE protocol layer for raw energy detection. | C, Zephyr RTOS, DMA |
| Agentic Weather Assistant | Full-stack agentic web app with 3-service architecture: React frontend + FastAPI backend (LangChain ReAct agent + LangGraph) + custom MCP microservice wrapping a public REST API. Pydantic-validated typed tool-calling across services. | React, FastAPI, LangChain, LangGraph, MCP |
| Dual-Stream Gesture Transformer | Real-time hand gesture recognition via a Dual-Stream Spatiotemporal Transformer on MediaPipe skeletons. 557 FPS on GPU (1.79 ms latency), 88.2% accuracy with 35 labeled samples via Sim-to-Real training. | Python, PyTorch, MediaPipe |
| Deep Learning for BLE Sensing | End-to-end super-resolution pipeline recovering wideband LoRa channel responses from narrowband BLE RSSI via progressive sub-pixel convolution. | Python, PyTorch, C |
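The aliasing that NeuroUnfold has to undo can be illustrated with standard sampling arithmetic. The sketch below is not taken from the project: it treats the 406 kHz chirp as a single tone, assumes the sampling rate is exactly 77 kHz, and uses hypothetical function names. It shows where such a tone lands after sampling and why several source frequencies ("branches") are indistinguishable from the samples alone.

```python
from typing import List


def aliased_frequency(f_signal_hz: int, f_sample_hz: int) -> int:
    """Apparent frequency of a real tone after sampling at f_sample_hz."""
    folded = f_signal_hz % f_sample_hz
    # Real signals fold around the Nyquist frequency (f_sample_hz / 2).
    return min(folded, f_sample_hz - folded)


def candidate_sources(f_alias_hz: int, f_sample_hz: int, f_max_hz: int) -> List[int]:
    """All tones up to f_max_hz that alias to f_alias_hz (the ambiguous branches)."""
    candidates = set()
    for k in range(f_max_hz // f_sample_hz + 2):
        for f in (k * f_sample_hz + f_alias_hz, k * f_sample_hz - f_alias_hz):
            if 0 <= f <= f_max_hz:
                candidates.add(f)
    return sorted(candidates)


F_SIGNAL_HZ = 406_000  # LoRa chirp frequency quoted above, treated as one tone
F_SAMPLE_HZ = 77_000   # BLE RSSI sampling rate quoted above, assumed exact

alias = aliased_frequency(F_SIGNAL_HZ, F_SAMPLE_HZ)
branches = candidate_sources(alias, F_SAMPLE_HZ, f_max_hz=450_000)

print(f"ratio: {F_SIGNAL_HZ / F_SAMPLE_HZ:.2f}")  # matches the 5.3x figure
print(f"alias: {alias} Hz")
print(f"candidate sources: {branches}")
```

The true 406 kHz tone is only one of the candidates, so recovering it needs extra information beyond the samples, which is the role the description above assigns to branch disambiguation.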

Research & Publications

  • Robotic Manipulation RL — Sim-to-Real on Franka & xArm (paper in preparation): Contact-rich policy training in Isaac Lab with sim-to-real transfer to physical hardware.
  • Peer Reviewer, AgentSkills Workshop, ACM CAIS 2026 (ACM Conference on AI and Agentic Systems)
  • Peer Reviewer, IEEE Wireless Communications Letters
  • 2 Chinese patents accepted on mixed-signal circuit techniques
  • Provincial Second Prize, China Undergraduate Mathematical Contest in Modeling

Pinned

  1. langchain-ai/langchain-aws (Public)

    Build LangChain Applications on AWS

    Python 328 294

  2. physical-informed-Deep-Learning-for-wireless-sensing (Public)

    NeuroUnfold: Physics-informed deep learning for chirp unfolding in cross-technology wireless sensing. Recovers 406 kHz LoRa beat chirps from 77 kHz aliased BLE RSSI via branch disambiguation, enabl…

    Python 32

  3. Smart-Study-Agent (Public)

    🎓 Adaptive AI study agent with POMDP belief state: OPEAA loop, Q-learning + LinUCB bandit policies, SM-2 spaced repetition, concept DAG. Streamlit web app + Chrome extension (MV3). Claude & free H…

    JavaScript 61

  4. repoagentbench (Public)

    SWE-bench for your codebase: mine your merged PRs into local, contamination-free coding-agent benchmarks. Adapters: claude-code, aider (Opus 4.7 / GPT-5.5 / Sonnet 4.6 / Gemini 3.1 Pro).

    Python 32

  5. High-speed-BLE-RSSI-sampling-rate (Public)

    High-speed BLE RSSI sampling on nRF54L15 (~77 kHz, 3× prior published rates) for wireless sensing applications. Custom Zephyr firmware that bypasses BLE protocol processing for raw energy detection…

    C 2

  6. dual-stream-gesture-transformer (Public)

    Real-time hand gesture recognition via a Dual-Stream Spatiotemporal Transformer (DSST) on MediaPipe skeletons. Decouples static pose and motion velocity into two Transformer streams with late fusio…

    Python 2