skip to main content
10.1145/3650212.3680381acmconferencesArticle/Chapter ViewAbstractPublication PagesisstaConference Proceedingsconference-collections
research-article

One Size Does Not Fit All: Multi-granularity Patch Generation for Better Automated Program Repair

Published: 11 September 2024 Publication History

Abstract

Automated program repair aims to automate bug correction and alleviate the burden of manual debugging, which plays a crucial role in software development and maintenance. Recent studies reveal that learning-based approaches have outperformed conventional APR techniques (e.g., search-based APR). Existing learning-based APR techniques mainly center on treating program repair either as a translation task or a cloze task. The former primarily emphasizes statement-level repair, while the latter concentrates on token-level repair, as per our observations. In practice, however, patches may manifest at various repair granularity, including statement, expression, or token levels. Consequently, merely generating patches from a single granularity would be ineffective to tackle real-world defects. Motivated by this observation, we propose Mulpor, a multi-granularity patch generation approach designed to address the diverse nature of real-world bugs. Mulpor comprises three components: statement-level, expression-level, and token-level generator, each is pre-trained to generate correct patches at its respective granularity. The approach involves generating candidate patches from various granularities, followed by a re-ranking process based on a heuristic to prioritize patches. Experimental results on the Defects4J dataset demonstrate that Mulpor correctly repair 92 bugs on Defects4J-v1.2, which achieves 27.0% (20 bugs) and 12.2% (10 bugs) improvement over the previous state-of-the-art NMT-style Rap-Gen and Cloze-style GAMMA. We also studied the generalizability of Mulpor in repairing vulnerabilities, revealing a notable 51% increase in the number of correctly-fixed patches compared with state-of-the-art vulnerability repair approaches. This paper underscores the importance of considering multiple granularities in program repair techniques for a comprehensive strategy to address the diverse nature of real-world software defects. Mulpor, as proposed herein, exhibits promising results in achieving effective and diverse bug fixes across various program repair scenarios.

References

[1]
Rui Abreu, Peter Zoeteweij, and Arjan JC Van Gemund. 2007. On the accuracy of spectrum-based fault localization. In Testing: Academic and Industrial Conference Practice and Research Techniques-MUTATION. 89–98. https://doi.org/10.1109/TAIC.PART.2007.13
[2]
Samuel Benton, Xia Li, Yiling Lou, and Lingming Zhang. 2020. On the effectiveness of unified debugging: An extensive study on 16 program repair systems. In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. 907–918.
[3]
Berkay Berabi, Jingxuan He, Veselin Raychev, and Martin Vechev. 2021. Tfix: Learning to fix coding errors with a text-to-text transformer. In International Conference on Machine Learning. 780–791.
[4]
Guru Bhandari, Amara Naseer, and Leon Moonen. 2021. CVEfixes: automated collection of vulnerabilities and their fixes from open-source software. In Proceedings of the 17th International Conference on Predictive Models and Data Analytics in Software Engineering. 30–39.
[5]
José Campos, André Riboira, Alexandre Perez, and Rui Abreu. 2012. GZoltar: an eclipse plug-in for testing and debugging. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering. 378–381. https://doi.org/10.1145/2351676.2351752
[6]
Saikat Chakraborty, Toufique Ahmed, Yangruibo Ding, Premkumar T. Devanbu, and Baishakhi Ray. 2022. NatGen: Generative Pre-Training by “naturalizing” Source Code. ESEC/FSE 2022. Association for Computing Machinery, New York, NY, USA. 18–30. isbn:9781450394130 https://doi.org/10.1145/3540250.3549162
[7]
Zimin Chen, Steve Kommrusch, and Martin Monperrus. 2022. Neural transfer learning for repairing security vulnerabilities in c code. IEEE Transactions on Software Engineering, 49, 1 (2022), 147–165.
[8]
Zimin Chen, Steve James Kommrusch, Michele Tufano, Louis-Noël Pouchet, Denys Poshyvanyk, and Martin Monperrus. 2019. Sequencer: Sequence-to-sequence learning for end-to-end program repair. IEEE Trans. on Software Engineering.
[9]
Jianlei Chi, Yu Qu, Ting Liu, Qinghua Zheng, and Heng Yin. 2022. Seqtrans: automatic vulnerability fix via sequence to sequence learning. IEEE Transactions on Software Engineering, 49, 2 (2022), 564–585.
[10]
Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. 2014. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
[11]
Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 4171–4186. https://doi.org/10.18653/v1/n19-1423
[12]
Thomas Durieux, Fernanda Madeiral, Matias Martinez, and Rui Abreu. 2019. Empirical Review of Java Program Repair Tools: A Large-Scale Experiment on 2,141 Bugs and 23,551 Repair Attempts. In Proceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. ACM, 302–313. https://doi.org/10.1145/3338906.3338911
[13]
Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. 2014. Fine-grained and accurate source code differencing. In Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering. 313–324. https://doi.org/10.1145/2642937.2642982
[14]
Jiahao Fan, Yi Li, Shaohua Wang, and Tien N Nguyen. 2020. AC/C++ code vulnerability dataset with code changes and CVE summaries. In Proceedings of the 17th International Conference on Mining Software Repositories. 508–512.
[15]
Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, and Daxin Jiang. 2020. CodeBERT: A Pre-Trained Model for Programming and Natural Languages. In Findings of the Association for Computational Linguistics: EMNLP 2020. 1536–1547.
[16]
Michael Fu, Chakkrit Tantithamthavorn, Trung Le, Van Nguyen, and Dinh Phung. 2022. VulRepair: A T5-Based Automated Software Vulnerability Repair. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022). Association for Computing Machinery, New York, NY, USA. 935–947. isbn:9781450394130 https://doi.org/10.1145/3540250.3549098
[17]
Luca Gazzola, Daniela Micucci, and Leonardo Mariani. 2019. Automatic software repair: A survey. IEEE Transactions on Software Engineering, 45, 1 (2019), 34–67. https://doi.org/10.1109/TSE.2017.2755013
[18]
Ali Ghanbari, Samuel Benton, and Lingming Zhang. 2019. Practical program repair via bytecode mutation. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. 19–30. https://doi.org/10.1145/3293882.3330559
[19]
Claire Le Goues, Michael Pradel, and Abhik Roychoudhury. 2019. Automated program repair. Commun. ACM, 62, 12 (2019), 56–65.
[20]
Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin. 2022. Unixcoder: Unified cross-modal pre-training for code representation. arXiv preprint arXiv:2203.03850.
[21]
Daya Guo, Shuo Ren, Shuai Lu, Zhangyin Feng, Duyu Tang, Shujie Liu, Long Zhou, Nan Duan, Alexey Svyatkovskiy, and Shengyu Fu. 2021. GraphCodeBERT: Pre-training Code Representations with Data Flow. In ICLR.
[22]
Rahul Gupta, Soham Pal, Aditya Kanade, and Shirish Shevade. 2017. DeepFix: Fixing common C language errors by deep learning. In Proceedings of the 31st AAAI Conference on Artificial Intelligence. AAAI, 1345–1351.
[23]
Dan Hao, Lu Zhang, Lei Zang, Yanbo Wang, Xingxia Wu, and Tao Xie. 2015. To be optimal or not in test-case prioritization. IEEE Transactions on Software Engineering, 42, 5 (2015), 490–505.
[24]
Kai Huang, Xiangxin Meng, Jian Zhang, Yang Liu, Wenjie Wang, Shuhao Li, and Yuqing Zhang. 2023. An Empirical Study on Fine-tuning Large Language Models of Code for Automated Program Repair. In Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering. 1–13.
[25]
Hamel Husain, Ho-Hsiang Wu, Tiferet Gazit, Miltiadis Allamanis, and Marc Brockschmidt. 2019. Codesearchnet challenge: Evaluating the state of semantic code search. arXiv preprint arXiv:1909.09436.
[26]
Jiajun Jiang, Yingfei Xiong, Hongyu Zhang, Qing Gao, and Xiangqun Chen. 2018. Shaping program repair space with existing patches and similar code. In Proceedings of the 27th ACM SIGSOFT International Symposium on Software Testing and Analysis. 298–309. https://doi.org/10.1145/3213846.3213871
[27]
Nan Jiang, Thibaud Lutellier, and Lin Tan. 2021. Cure: Code-aware neural machine translation for automatic program repair. In 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE). 1161–1173.
[28]
Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S Weld, Luke Zettlemoyer, and Omer Levy. 2020. Spanbert: Improving pre-training by representing and predicting spans. Transactions of the association for computational linguistics, 8 (2020), 64–77.
[29]
René Just, Darioush Jalali, and Michael D Ernst. 2014. Defects4J: A database of existing faults to enable controlled testing studies for Java programs. In Proceedings of the 23rd International Symposium on Software Testing and Analysis. 437–440. https://doi.org/10.1145/2610384.2628055
[30]
Anil Koyuncu, Kui Liu, Tegawendé F. Bissyandé, Dongsun Kim, Jacques Klein, Martin Monperrus, and Yves Le Traon. 2020. FixMiner: Mining relevant fix patterns for automated program repair. Empirical Software Engineering, 25, 3 (2020), 1980–2024. https://doi.org/10.1007/s10664-019-09780-z
[31]
Xuan-Bach D Le, Duc-Hiep Chu, David Lo, Claire Le Goues, and Willem Visser. 2017. S3: syntax-and semantic-guided repair synthesis via programming by examples. In Proceedings of the 11th Joint Meeting on Foundations of Software Engineering. 593–604. https://doi.org/10.1145/3106237.3106309
[32]
Xuan Bach D Le, David Lo, and Claire Le Goues. 2016. History driven program repair. In Proceedings of the 23rd IEEE International Conference on Software Analysis, Evolution, and Reengineering. 213–224. https://doi.org/10.1109/SANER.2016.76
[33]
Claire Le Goues, ThanhVu Nguyen, Stephanie Forrest, and Westley Weimer. 2012. GenProg: A generic method for automatic software repair. IEEE Transactions on Software Engineering, 38, 1 (2012), 54–72. https://doi.org/10.1109/TSE.2011.104
[34]
Yi Li, Shaohua Wang, and Tien N. Nguyen. 2020. DLFix: Context-Based Code Transformation Learning for Automated Program Repair. ICSE ’20. Association for Computing Machinery, New York, NY, USA. 602–614. isbn:9781450371216 https://doi.org/10.1145/3377811.3380345
[35]
Yi Li, Shaohua Wang, and Tien N Nguyen. 2022. Dear: A novel deep learning-based approach for automated program repair. In Proceedings of the 44th International Conference on Software Engineering. 511–523.
[36]
Zhiyu Li, Shuai Lu, Daya Guo, Nan Duan, Shailesh Jannu, Grant Jenks, Deep Majumder, Jared Green, Alexey Svyatkovskiy, Shengyu Fu, and Neel Sundaresan. 2022. Automating Code Review Activities by Large-Scale Pre-Training. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022). 1035–1047. https://doi.org/10.1145/3540250.3549081
[37]
Kui Liu, Anil Koyuncu, Dongsun Kim, and Tegawendé F. Bissyandé. 2019. TBar: Revisiting Template-based Automated Program Repair. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis. 31–42. https://doi.org/10.1145/3293882.3330577
[38]
Xuliang Liu and Hao Zhong. 2018. Mining stackoverflow for program repair. In Proceedings of the 25th IEEE International Conference on Software Analysis, Evolution and Reengineering. 118–129. https://doi.org/10.1109/SANER.2018.8330202
[39]
Fan Long and Martin Rinard. 2015. Staged program repair with condition synthesis. In Proceedings of the 10th Joint Meeting on Foundations of Software Engineering. 166–178. https://doi.org/10.1145/2786805.2786811
[40]
Thibaud Lutellier, Hung Viet Pham, Lawrence Pang, Yitong Li, Moshi Wei, and Lin Tan. 2020. CoCoNuT: combining context-aware neural translation models using ensemble for program repair. In Proceedings of the 29th ACM SIGSOFT International Symposium on Software Testing and Analysis. 101–114. https://doi.org/10.1145/3395363.3397369
[41]
Sergey Mechtaev, Jooyong Yi, and Abhik Roychoudhury. 2016. Angelix: Scalable multiline program patch synthesis via symbolic analysis. In Proceedings of the 38th International Conference on Software Engineering. 691–701. https://doi.org/10.1145/2884781.2884807
[42]
Xiangxin Meng, Xu Wang, Hongyu Zhang, Hailong Sun, and Xudong Liu. 2022. Improving fault localization and program repair with deep semantic features and transferred knowledge. In Proceedings of the 44th International Conference on Software Engineering. 1169–1180.
[43]
Martin Monperrus. 2018. Automatic software repair: A bibliography. Comput. Surveys, 51, 1 (2018), 17:1–17:24. https://doi.org/10.1145/3105906
[44]
Hoang Duong Thien Nguyen, Dawei Qi, Abhik Roychoudhury, and Satish Chandra. 2013. SemFix: Program repair via semantic analysis. In Proceedings of the 35th International Conference on Software Engineering. 772–781. https://doi.org/10.1109/ICSE.2013.6606623
[45]
Yuhua Qi, Xiaoguang Mao, Yan Lei, Ziying Dai, and Chengsong Wang. 2014. The strength of random search on automated program repair. In Proceedings of the 36th International Conference on Software Engineering. 254–265. https://doi.org/10.1145/2568225.2568254
[46]
Alec Radford, Karthik Narasimhan, Tim Salimans, and Ilya Sutskever. 2018. Improving language understanding by generative pre-training.
[47]
Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2020. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. Journal of Machine Learning Research, 21 (2020), 1–67.
[48]
Victor Sobreira, Thomas Durieux, Fernanda Madeiral, Martin Monperrus, and Marcelo de Almeida Maia. 2018. Dissection of a bug dataset: Anatomy of 395 patches from Defects4J. In Proceedings of the 25th International Conference on Software Analysis, Evolution and Reengineering. 130–140. https://doi.org/10.1109/SANER.2018.8330203
[49]
Michele Tufano, Cody Watson, Gabriele Bavota, Massimiliano Di Penta, Martin White, and Denys Poshyvanyk. 2019. An empirical study on learning bug-fixing patches in the wild via neural machine translation. ACM Transactions on Software Engineering and Methodology, 28, 4 (2019), 19:1–19:29. https://doi.org/10.1145/3340544
[50]
Shangwen Wang, Ming Wen, Bo Lin, Hongjun Wu, Yihao Qin, Deqing Zou, Xiaoguang Mao, and Hai Jin. 2020. Automated Patch Correctness Assessment: How Far are We? In Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering. ACM.
[51]
Weishi Wang, Yue Wang, Shafiq Joty, and Steven CH Hoi. 2023. RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic Program Repair. arXiv preprint arXiv:2309.06057.
[52]
Yue Wang, Weishi Wang, Shafiq Joty, and Steven CH Hoi. 2021. CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. 8696–8708.
[53]
Yuxiang Wei, Chunqiu Steven Xia, and Lingming Zhang. 2023. Copiloting the Copilots: Fusing Large Language Models with Completion Engines for Automated Program Repair. In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 172–184.
[54]
Westley Weimer, ThanhVu Nguyen, Claire Le Goues, and Stephanie Forrest. 2009. Automatically finding patches using genetic programming. In Proceedings of the 31st International Conference on Software Engineering. 364–374. https://doi.org/10.1109/ICSE.2009.5070536
[55]
Ming Wen, Junjie Chen, Rongxin Wu, Dan Hao, and Shing-Chi Cheung. 2018. Context-aware patch generation for better automated program repair. In Proceedings of the 40th International Conference on Software Engineering. 1–11. https://doi.org/10.1145/3180155.3180233
[56]
Alexander Wettig, Tianyu Gao, Zexuan Zhong, and Danqi Chen. 2022. Should you mask 15% in masked language modeling? arXiv preprint arXiv:2202.08005.
[57]
Martin White, Michele Tufano, Matias Martinez, Martin Monperrus, and Denys Poshyvanyk. 2019. Sorting and transforming program repair ingredients via deep learning code similarities. In Proceedings of the IEEE 26th International Conference on Software Analysis, Evolution and Reengineering. 479–490. https://doi.org/10.1109/SANER.2019.8668043
[58]
Chunqiu Steven Xia and Lingming Zhang. 2022. Less training, more repairing please: revisiting automated program repair via zero-shot learning. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 959–971.
[59]
Jifeng Xuan, Matias Martinez, Favio Demarco, Maxime Clement, Sebastian Lamelas Marcote, Thomas Durieux, Daniel Le Berre, and Martin Monperrus. 2017. Nopol: Automatic repair of conditional statement bugs in java programs. IEEE Transactions on Software Engineering, 43, 1 (2017), 34–55. https://doi.org/10.1109/TSE.2016.2560811
[60]
He Ye, Matias Martinez, Xiapu Luo, Tao Zhang, and Martin Monperrus. 2022. Selfapr: Self-supervised program repair with test execution diagnostics. In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering. 1–13.
[61]
He Ye, Matias Martinez, and Martin Monperrus. 2022. Neural program repair with execution-based backpropagation. In Proceedings of the 44th International Conference on Software Engineering. 1506–1518.
[62]
Wei Yuan, Quanjun Zhang, Tieke He, Chunrong Fang, Nguyen Quoc Viet Hung, Xiaodong Hao, and Hongzhi Yin. 2022. CIRCLE: Continual repair across programming languages. In Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis. 678–690.
[63]
Quanjun Zhang, Chunrong Fang, Tongke Zhang, Bowen Yu, Weisong Sun, and Zhenyu Chen. 2023. Gamma: Revisiting template-based automated program repair via mask prediction. arXiv preprint arXiv:2309.09308.
[64]
Qihao Zhu, Zeyu Sun, Yuan-an Xiao, Wenjie Zhang, Kang Yuan, Yingfei Xiong, and Lu Zhang. 2021. A syntax-guided edit decoder for neural program repair. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering. 341–353.

Cited By

View all
  • (2024)Divide-and-Conquer: Automating Code Revisions via Localization-and-RevisionACM Transactions on Software Engineering and Methodology10.1145/3697013Online publication date: 24-Sep-2024
  • (2024)Towards Understanding the Effectiveness of Large Language Models on Directed Test Input GenerationProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695513(1408-1420)Online publication date: 27-Oct-2024
  • (2024)Unveiling the Characteristics and Impact of Security Patch EvolutionProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695488(1094-1106)Online publication date: 27-Oct-2024
  • Show More Cited By

Index Terms

  1. One Size Does Not Fit All: Multi-granularity Patch Generation for Better Automated Program Repair

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    ISSTA 2024: Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis
    September 2024
    1928 pages
    ISBN:9798400706127
    DOI:10.1145/3650212
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 September 2024

    Permissions

    Request permissions for this article.

    Check for updates

    Badges

    Author Tags

    1. Automated Program Repair
    2. Deep Learning
    3. Pre-Training

    Qualifiers

    • Research-article

    Funding Sources

    • National Key R&D Program of China
    • National Natural Science Foundation of China
    • Research Foundation from NUDT

    Conference

    ISSTA '24
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 58 of 213 submissions, 27%

    Upcoming Conference

    ISSTA '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)265
    • Downloads (Last 6 weeks)60
    Reflects downloads up to 23 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Divide-and-Conquer: Automating Code Revisions via Localization-and-RevisionACM Transactions on Software Engineering and Methodology10.1145/3697013Online publication date: 24-Sep-2024
    • (2024)Towards Understanding the Effectiveness of Large Language Models on Directed Test Input GenerationProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695513(1408-1420)Online publication date: 27-Oct-2024
    • (2024)Unveiling the Characteristics and Impact of Security Patch EvolutionProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695488(1094-1106)Online publication date: 27-Oct-2024
    • (2024)Effective Vulnerable Function Identification based on CVE Description Empowered by Large Language ModelsProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering10.1145/3691620.3695013(393-405)Online publication date: 27-Oct-2024

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media