Robustness study: how small LLMs handle misleading or contradictory retrieved evidence in RAG. Detection and intervention policies across BM25, dense, and hybrid retrieval.
LLM-based user simulation for conversational search evaluation. Tests multiple prompting strategies on TREC CAsT 2020 topics with teacher-forced replay (TREC 2026 User Simulation track).