I am a first-year M.S. student in Computer Science at UC San Diego, advised by Prof. Hao Zhang. I hold a B.S. from ShanghaiTech University, where I was advised by Prof. Kewei Tu. My research lies at the intersection of Natural Language Processing and Machine Learning Systems. I am particularly passionate about designing efficient architectures for Long-Context Modeling and exploring the frontiers of World Models, with the goal of bridging system efficiency and model capability.
Currently, I focus on scalable training and inference for generative models. I am the lead author of FlashMHF (under review), which proposes a novel Multi-Head FFN architecture backed by IO-aware Triton/CUDA kernels. I am also a core contributor to FastVideo at Hao AI Lab, where I work on integrating new models and optimizing kernel implementations to accelerate video generation systems.
Looking ahead, I aim to extend FlashMHF to broader LLM backbones and to explore World Models more deeply within the FastVideo framework. I am also actively investigating retrieval-based methods and Continual Learning to address the challenges of long-context understanding in foundation models.
Technical Focus: NLP · Triton/CUDA · LLM Architecture · Video Generation