Yilun Zhou (YilunZhou)

SalesforceAIResearch/jetts-benchmark (Public)

Code repository for the paper "Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators"

Python, 5 stars

champ-dataset (Public)

Code repository for the ACL 2024 (Findings) paper "CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities"

Python, 9 stars, 1 fork

solvability-explainer (Public)

Code repository for the EACL 2023 (Findings) paper "The Solvability of Interpretability Evaluation Metrics"

Python, 3 stars

ExSum (Public)

Code repository for the NAACL 2022 paper "ExSum: From Local Explanations to Model Understanding"

Python, 64 stars, 5 forks

feature-attribution-evaluation (Public)

Code repository for the AAAI 2022 paper "Do Feature Attribution Methods Correctly Attribute Features?"

Python, 21 stars

RoCUS (Public)

Code repository for the CoRL 2021 paper "RoCUS: Robot Controller Understanding via Sampling"

Python, 11 stars, 4 forks