A repository containing recent explainable AI / interpretable ML approaches.

| Title | Venue | Year | Code | Keywords | Summary |
|---|---|---|---|---|---|
| Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission | KDD | 2015 | N/A | `` | |
| Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model | arXiv | 2015 | N/A | `` | |
| Interpretable Decision Sets: A Joint Framework for Description and Prediction | KDD | 2016 | N/A | `` | |
| "Why Should I Trust You?": Explaining the Predictions of Any Classifier | KDD | 2016 | N/A | `` | |
| Towards A Rigorous Science of Interpretable Machine Learning | arXiv | 2017 | N/A | Review Paper | |
| Transparency: Motivations and Challenges | arXiv | 2017 | N/A | Review Paper | |
| A Unified Approach to Interpreting Model Predictions | NeurIPS | 2017 | N/A | `` | |
| SmoothGrad: removing noise by adding noise | ICML (Workshop) | 2017 | Github | `` | |
| Axiomatic Attribution for Deep Networks | ICML | 2017 | N/A | `` | |
| Learning Important Features Through Propagating Activation Differences | ICML | 2017 | N/A | `` | |
| Understanding Black-box Predictions via Influence Functions | ICML | 2017 | N/A | `` | |
| Network Dissection: Quantifying Interpretability of Deep Visual Representations | CVPR | 2017 | N/A | `` | |
| Explainable Prediction of Medical Codes from Clinical Text | NAACL | 2018 | N/A | `` | |
| Interpretability Beyond Feature Attribution: Quantitative Testing with Concept Activation Vectors (TCAV) | ICML | 2018 | N/A | `` | |
| Counterfactual Explanations without Opening the Black Box: Automated Decisions and the GDPR | Harvard Journal of Law & Technology | 2018 | N/A | `` | |
| Sanity Checks for Saliency Maps | NeurIPS | 2018 | N/A | `` | |
| Deep Learning for Case-Based Reasoning through Prototypes: A Neural Network that Explains Its Predictions | AAAI | 2018 | N/A | `` | |
| The Mythos of Model Interpretability | arXiv | 2018 | N/A | Review Paper | |
| Human Evaluation of Models Built for Interpretability | AAAI | 2019 | N/A | Human in the loop | |
| Data Shapley: Equitable Valuation of Data for Machine Learning | ICML | 2019 | N/A | `` | |
| Attention is not Explanation | NAACL | 2019 | N/A | `` | |
| Actionable Recourse in Linear Classification | FAccT | 2019 | N/A | `` | |
| Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead | Nature Machine Intelligence | 2019 | N/A | `` | |
| Explanations can be manipulated and geometry is to blame | NeurIPS | 2019 | N/A | `` | |
| Learning Optimized Risk Scores | JMLR | 2019 | N/A | `` | |
| Explain Yourself! Leveraging Language Models for Commonsense Reasoning | ACL | 2019 | N/A | `` | |
| Deep Neural Networks Constrained by Decision Rules | AAAI | 2019 | N/A | `` | |
| Towards Automatic Concept-based Explanations | NeurIPS | 2019 | Github | `` | |
| A Learning Theoretic Perspective on Local Explainability | ICLR (Poster) | 2021 | N/A | `` | |
| Do Input Gradients Highlight Discriminative Features? | NeurIPS | 2021 | N/A | `` | |
| Explaining by Removing: A Unified Framework for Model Explanation | JMLR | 2021 | N/A | `` | |
| Explainable Active Learning (XAL): An Empirical Study of How Local Explanations Impact Annotator Experience | PACMHCI | 2021 | N/A | `` | |
| Towards Robust and Reliable Algorithmic Recourse | NeurIPS | 2021 | N/A | `` | |
| A Framework to Learn with Interpretation | NeurIPS | 2021 | N/A | `` | |
| Algorithmic Recourse: from Counterfactual Explanations to Interventions | FAccT | 2021 | N/A | `` | |
| Manipulating and Measuring Model Interpretability | CHI | 2021 | N/A | `` | |
| Explainable Reinforcement Learning via Model Transforms | NeurIPS | 2021 | N/A | `` | |
| Aligning Artificial Neural Networks and Ontologies towards Explainable AI | AAAI | 2021 | N/A | `` | |
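
Several of the attribution papers listed above (e.g. SmoothGrad, Integrated Gradients, DeepLIFT) share the same basic recipe: differentiate a class score with respect to the input and post-process the resulting gradient. As a rough illustration only, a minimal SmoothGrad-style saliency sketch in PyTorch might look like the code below; `model`, `x`, and `target_class` are placeholder names assumed for the example, and this is not the reference implementation from any of the papers.

```python
# Illustrative SmoothGrad-style saliency sketch (assumed setup: `model` is a
# differentiable torch.nn.Module returning class logits for a batched input,
# `x` is a single unbatched input tensor). Not taken from any official code.
import torch

def smoothgrad_saliency(model, x, target_class, n_samples=25, noise_std=0.15):
    """Average input gradients over noisy copies of x, then take magnitudes."""
    model.eval()
    grads = torch.zeros_like(x)
    for _ in range(n_samples):
        # Perturb the input with Gaussian noise and track gradients w.r.t. it.
        noisy = (x.detach() + noise_std * torch.randn_like(x)).requires_grad_(True)
        score = model(noisy.unsqueeze(0))[0, target_class]  # scalar class score
        score.backward()
        grads += noisy.grad.detach()
    # The averaged gradient is typically less noisy than a single raw gradient.
    return (grads / n_samples).abs()
```

Averaging over perturbed copies of the input is the core idea of SmoothGrad: a single raw input gradient is often visually noisy, and the mean over nearby inputs tends to yield a cleaner saliency map.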