MENTOR: Fixing Introductory Programming Assignments With Formula-Based Fault Localization and LLM-Driven Program Repair
This is the official Git repository for “MENTOR: Fixing Introductory Programming Assignments With Formula-Based Fault Localization and LLM-Driven Program Repair”, published in the Journal of Systems and Software (JSS), 2026.
MENTOR [1] is a semantic automated program repair (APR) framework designed to provide automated feedback on introductory programming assignments (IPAs). It leverages previous student submissions and integrates program clustering [2], variable alignment [3], fault localization [4], and Large Language Models (LLMs) [5] to guide program repair.
MENTOR can provide feedback in two ways: by highlighting the faulty statements in a student's program, or by repairing the incorrect program and presenting the student with the fixed version.
Over 70% of students found MENTOR's feedback helpful in understanding and correcting their programming mistakes [6].
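To make the highlighting mode of feedback more concrete, the sketch below marks a set of faulty lines in a small C program. It is only an illustration: the function name `highlight_faulty_lines`, the marker text, and the reported line number are hypothetical and do not reflect MENTOR's actual API or output format.

```python
from typing import Iterable


def highlight_faulty_lines(source: str, faulty_lines: Iterable[int]) -> str:
    """Return `source` with a marker appended to each 1-indexed faulty line."""
    faulty = set(faulty_lines)
    marked = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Hypothetical marker; a real tool may report faults differently.
        suffix = "  /* <-- possibly faulty */" if number in faulty else ""
        marked.append(f"{line}{suffix}")
    return "\n".join(marked)


# Toy student submission: the loop should sum 1..n but stops before n.
student_program = """int sum(int n) {
    int i, s = 0;
    for (i = 1; i < n; i++)
        s += i;
    return s;
}"""

# Assume a fault localizer reported line 3 (a made-up result for illustration).
print(highlight_faulty_lines(student_program, {3}))
```

In the second feedback mode, the student would instead receive the repaired program, for example with the loop condition changed to `i <= n`.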
This repository contains several directories, scripts, and submodules:
- program-clustering/: Clustering-based program repair module.
- variable-mapping/: Submodule for variable alignment in programs.
- fault-localization/: Fault localization submodule.
- C-Pack-IPAs/: Submodule with a benchmark of introductory programming assignments (IPAs) [7].
- mentor/: Core implementation of MENTOR.
- LLMs/: Contains components for LLM-based repair processes.
- code_metrics/: Provides code evaluation metrics.
- database/: Stores and manages relevant data for program repair.
- utils/: Utility scripts and helper functions.
- LLM-CEGIS-Repair.md: README for the LLM-based repair approach.
- how_to_run_LLMs.sh: Script explaining how to run LLM-based repair.
- how_to_run_RepairAgents.sh: Script explaining how to run repair agents.
- repair.py: The main script to run MENTOR.
- repair_CPackIPAs.sh: Script to run MENTOR on the entire C-Pack-IPAs benchmark using different prompt configurations.
- requirements.txt: Lists required dependencies for MENTOR.
MENTOR relies on multiple submodules, each containing its own requirements and implementation instructions. To set up the project, follow these steps:
git clone --recurse-submodules git@github.com:pmorvalho/MENTOR.git
cd MENTOR
pip install -r requirements.txt
For additional dependencies, check the requirements files inside each submodule.
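Because each submodule ships its own requirements, it can help to list those files before installing anything. The following is a minimal sketch, assuming the files follow the common `requirements*.txt` naming convention; the helper `find_requirements_files` is hypothetical and not part of MENTOR.

```python
from pathlib import Path
from typing import List


def find_requirements_files(repo_root: str = ".") -> List[Path]:
    """Return all requirements*.txt files below repo_root, skipping .git."""
    root = Path(repo_root)
    found = [
        path
        for path in root.rglob("requirements*.txt")
        # Ignore anything stored inside Git's own metadata directory.
        if path.is_file() and ".git" not in path.relative_to(root).parts
    ]
    return sorted(found)


if __name__ == "__main__":
    for path in find_requirements_files("."):
        # Each listed file can then be installed with: pip install -r <path>
        print(path)
```

Run from the root of the cloned repository, this prints one path per requirements file, each of which can be passed to `pip install -r`.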
To learn how to run MENTOR on an individual program repair task, run:
python repair.py -h
To learn how to run MENTOR on the entire C-Pack-IPAs benchmark with different prompt configurations, run:
./repair_CPackIPAs.sh -h
For LLM-based repair, refer to the how_to_run_LLMs.sh script. For running repair agents, use how_to_run_RepairAgents.sh.
If you use MENTOR in your research, please cite the following paper:
@article{OrvalhoJM26,
author = {Pedro Orvalho and
Mikol{\'{a}}s Janota and
Vasco Manquinho},
title = {{MENTOR: Fixing Introductory Programming Assignments with Formula-Based Fault Localization and LLM-Driven Program Repair}},
journal = {Journal of Systems and Software},
year = {2026},
publisher = {Elsevier},
issn = {0164-1212},
doi = {https://doi.org/10.1016/j.jss.2025.112690},
url = {https://www.sciencedirect.com/science/article/pii/S0164121225003590}
}
Contributions are welcome! Please follow the standard GitHub workflow:
- Fork the repository.
- Create a feature branch.
- Commit your changes.
- Open a pull request.
MENTOR is actively maintained and used in ongoing research. We continue to develop the tool and build upon it, and we are open to collaborations.
If you run into a problem, please open an issue on this repository. You may also email us so that we do not miss it.
This project is licensed under the terms of the MIT License.
[1] P. Orvalho, M. Janota, and V. Manquinho. MENTOR: Fixing Introductory Programming Assignments with Formula-Based Fault Localization and LLM-Driven Program Repair. The Journal of Systems & Software, JSS 2026. PDF. GitHub.
[2] P. Orvalho, M. Janota, and V. Manquinho. InvAASTCluster: On Applying Invariant-Based Program Clustering to Introductory Programming Assignments. arXiv 2022. PDF. GitHub.
[3] P. Orvalho, J. Piepenbrock, M. Janota, and V. Manquinho. Graph Neural Networks For Mapping Variables Between Programs. ECAI 2023. PDF. GitHub.
[4] P. Orvalho, M. Janota, and V. Manquinho. CFaults: Model-Based Diagnosis for Fault Localization in C with Multiple Test Cases. The 26th International Symposium on Formal Methods, FM 2024. PDF. GitHub.
[5] P. Orvalho, M. Janota, and V. Manquinho. Counterexample Guided Program Repair Using Zero-Shot Learning and MaxSAT-based Fault Localization. In the 39th Annual AAAI Conference on Artificial Intelligence, AAAI 2025. PDF. GitHub.
[6] P. Orvalho, M. Janota, and V. Manquinho. GitSEED: A Git-backed Automated Assessment Tool for Software Engineering and Programming Education. The 1st ACM Virtual Global Computing Education Conference, SIGCSE Virtual 2024. PDF. GitLab.
[7] P. Orvalho, M. Janota, and V. Manquinho. C-Pack of IPAs: A C90 Program Benchmark of Introductory Programming Assignments. In the 5th International Workshop on Automated Program Repair, APR 2024, co-located with ICSE 2024. PDF. GitHub.