Neil Shah
Director of Research, Senior Principal Scientist at Snap.
Bellevue, WA
neil at nshah dot net
nshah at snap dot com
I currently lead a team of scientists, engineers, interns, and collaborators on fundamental and applied research around modeling users, content, and their interactions at scale. I am broadly interested in advancing the state of the art across the machine learning technologies underpinning this work, including graph and sequential representation learning, generative recommendation, and large language and foundation models for content and user understanding. At Snap, my team’s work has led to multiple step-function research platform capabilities and 90+ launches with topline business impact across our Growth, Content, Ads, Lenses, and Safety ML surfaces.
Prior to Snap, I got my PhD in the Computer Science Department at Carnegie Mellon University, where I worked on modeling and discovery of various abuse vectors in large online platforms. I was fortunate to have been advised by Christos Faloutsos. Earlier, I received my B.S. in Computer Science from the Department of Computer Science at North Carolina State University. There, I worked with Nagiza Samatova on reduction, indexing, and storage systems for large-scale scientific data.
news
| Date | News |
|---|---|
| May 08, 2026 | Sharing a new preprint on expressiveness limits in Semantic ID-based GR, and how latent tokens can alleviate them. |
| May 07, 2026 | Excited to share that our work on plain transformers as scalable link predictors on graphs was accepted to ICML 2026 – see you in Seoul! |
| Apr 28, 2026 | Three papers accepted at ACL 2026 in San Diego on training-free LLM embeddings, sparse attention, and collaborative memory for agentic recommendation. |
| Apr 27, 2026 | Two papers accepted at SIGIR 2026 in Melbourne! New work on multimodal generative retrieval with vision-language semantic IDs, and an industry paper on deploying semantic IDs for recommendation at Snapchat. |
| Dec 28, 2025 | Sharing a new preprint on hierarchical token prepending, a new training-free method for getting strong LLM embeddings for retrieval. |
| Nov 28, 2025 | Sharing a new preprint on model-scaling behavior in generative recommendation methods, which shows scaling limitations in existing semantic ID-based methods. |
| Oct 28, 2025 | Excited to share two new works at CIKM 2025 on generative recommendation, covering the newest open-source reproducibility tooling (GRID) and meta-item embeddings for cold-start learning. |
| Oct 27, 2025 | Excited to share our new work at LoG 2025 on GNN distillation to MLPs, which shows that stronger models aren’t always stronger teachers. |
selected publications
A curated cross-section of my work across graph machine learning, recommendation systems, and trust & safety. See publications for the full list, or Google Scholar for citations.
- [SIGIR] Semantic IDs for Recommender Systems at Snapchat: Use Cases, Technical Challenges, and Design Choices. In ACM SIGIR Conference on Research and Development in Information Retrieval, 2026
- [WSDM] Sequential Data Augmentation for Generative Recommendation. In ACM International Conference on Web Search and Data Mining, 2026
- [CIKM] Generative Recommendation with Semantic IDs: A Practitioner’s Handbook. In ACM International Conference on Information and Knowledge Management, 2025
- [KDD] GiGL: Large-Scale Graph Neural Networks at Snapchat. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2025
- [SIRIP] Learning Universal User Representations Leveraging Cross-domain User Intent at Snapchat. In ACM SIGIR Conference on Research and Development in Information Retrieval, 2025
- [SIRIP] Embedding-based Retrieval in Friend Recommendation. In ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
- [ICLR] MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization. In International Conference on Learning Representations, 2023
- [ICLR] Graph-less Neural Networks: Teaching Old MLPs New Tricks via Distillation. In International Conference on Learning Representations, 2022
- [WWW] Graph Neural Networks for Friend Ranking in Large-scale Social Platforms. In The Web Conference, 2021
- [DSAA] SliceNDice: Mining Suspicious Multi-attribute Entity Groups with Multi-view Graphs. In IEEE International Conference on Data Science and Advanced Analytics, 2019
- [KDD] Modeling Dwell Time Engagement on Visual Multimedia. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2019
- [WWW] FLOCK: Combating Astroturfing on Livestreaming Platforms. In ACM World Wide Web Conference, 2017
- [KDD] FRAUDAR: Bounding Graph Fraud in the Face of Camouflage. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2016