I’m a Machine Learning Engineer focused on AI infrastructure, large-scale systems, and core AI research.
- TorchDigest – A curated digest of technical insights on PyTorch and inference optimization.
- Ephemera – A different way to listen to music on YouTube. I built it because having access to every song ever recorded makes it harder to actually choose one. You describe a moment, get three songs, and just listen. A strict 10-minute cooldown adds a bit of artificial scarcity, so you stop scrolling and actually notice what’s playing.
- mmoe-recsys – A modular PyTorch implementation of Multi-gate Mixture-of-Experts (MMoE) for multi-task recommendation, designed to mitigate negative task transfer in complex ranking pipelines.
- Engineering Blog – Deep dives into inference optimization, recommendation architecture, and the intersection of LLMs and systems design.
- Inference Engineering: LLM serving, PagedAttention, and custom kernel optimization.
- Recommendation Systems: Deep retrieval, multi-task ranking, and vector search at scale.
- Research Frontiers: World models and agentic planning.