Junfan Zhu is an AI Engineer with 5 years of experience in Agentic RL, multi-modal agent reasoning, and scalable LLM/VLM systems. With a strong background in speculative decoding, distributed training, and high-throughput inference, Junfan also brings expertise in multi-agent orchestration, Transformer architectures, mixture-of-experts (MoE), and algorithmic optimization, enabling him to design and scale advanced AI systems end-to-end. He holds Master's degrees in CS from Georgia Tech and in Mathematics from UChicago, and is recognized as a resilient collaborator and builder of high-impact AI systems.
Key Strengths: distributed training (FSDP, MoE), speculative decoding, RAG pipelines, multi-hop reasoning, stochastic optimization, observability, vLLM customization, and high-throughput inference orchestration.
Long-term thinker, resilient collaborator, and builder of high-impact AI systems.
My portfolio boasts pioneering projects in MoE and attention for scalable LLMs, reflective multi-agent orchestration, and full-stack GenAI applications.
- 1. Awesome-AI-Engineer-Review
In-depth review of industry trends in AI, LLMs, Machine Learning, Computer Science, and Quantitative Finance.
- 2. MiniGPT-and-DeepSeek-MLA-Multi-Head-Latent-Attention
Memory-efficient multi-head latent attention in PyTorch that uses low-rank approximation and decoupled rotary positional embeddings to compress key-value representations, reducing inference memory while maintaining high performance in long-context language models.
- 3. DeepSeek-MoE-Mixture-of-Experts-in-PyTorch
Implemented a scalable 8-expert MoE model with top-k routing, expert load balancing, and capacity-aware gating; enabled parallel sparse activation and DeepSeek-R1-style distributed training scalability.
- 4. MCP-MultiServer-Interoperable-Agent2Agent-LangGraph-AI-System
A decoupled real-time agent architecture connecting LangGraph agents to remote tools served by custom MCP servers via SSE and STDIO, enabling a scalable multi-agent system for LLM workflows. The design supports flexible multi-server connectivity and lays the groundwork for an Agent2Agent protocol, fostering seamless, cloud-deployable interoperability across diverse AI systems.
- 5. LangGraph-Reflection-Researcher
Engineered a LangGraph-based multi-agent system with self-reflection and retrieval-grounded alignment; integrated LangSmith tracing for reasoning introspection, cutting hallucination by 40% with iterative expert routing.
- 6. Cognito-LangGraph-RAG-Chatbot
Advanced Retrieval-Augmented Generation (RAG) chatbot that uses LangGraph to improve answer accuracy and minimize hallucinations in LLM outputs.
- 7. Cursor-FullStack-AI-App
Cursor Vibe Engineering: full-stack micro-SaaS AI application that processes GitHub URLs to generate insightful JSON reports powered by AI analytics.
- 8. Cryptocurrency-Blockchain-FullStack
Comprehensive decentralized blockchain platform demonstrating practical applications of core blockchain concepts through a modular, full-stack approach.
I'm a traveler
I summited 🇹🇿 Kilimanjaro Uhuru Peak (5895 m) at the Roof of Africa, trekked 🇳🇵 Annapurna Base Camp, hiked 🇬🇹 Volcán de Fuego, traversed a desert in 🇨🇳 Inner Mongolia, and completed 2 marathons (PB within 5h).
My expeditions have taken me on beautiful, adventurous journeys to places such as 🇳🇴 Longyearbyen & Barentsburg (🥶 icebreaker) in Svalbard, 🇨🇱 Rapa Nui, 🇨🇦 Iqaluit, Nunavut, 🇦🇷 Ushuaia, 🇺🇸 Unalaska & Cold Bay in the Aleutian Islands / Utqiaġvik & Prudhoe Bay, Alaska, 🇨🇳 Tibet, 🇵🇫 Bora Bora 🪸, 🇺🇸 Molokaʻi, 🇪🇨 the middle of the Earth, and so on. These experiences have shaped my adaptability, problem-solving skills, and global perspective.
Motto: "Every man carries within him a world, composed of all that he has seen and loved, and it is to this world that he constantly returns, even when he seems to be journeying and living in another different world." (Chateaubriand, "Voyages en Italie")