
MuTAP: A prompt-based learning technique to automatically generate test cases with Large Language Models


1. Initial prompt on the LLM (Codex and Llama-2-chat) and syntax fixer

At the end of this step, a unit test containing several test cases has been generated for the Program Under Test (PUT), and any syntax errors in the test cases have been fixed.

$ python generate_test_oracle.py "fewshot" "HumanEval" x "script_NDS_" "T_O_FS_synxfixed_" "fewshot_synx_fix.csv"

Here, x is the index of the PUT in the benchmark.
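The syntax fixer can be illustrated with the minimal sketch below; this is not the repository's implementation, and it assumes the common failure mode is a broken or truncated trailing statement in the LLM's completion.

# Minimal, illustrative syntax-fixer sketch (not the repository's code):
# drop trailing lines until the generated test compiles.
def fix_syntax(test_code: str) -> str:
    lines = test_code.rstrip().splitlines()
    while lines:
        candidate = "\n".join(lines)
        try:
            compile(candidate, "<generated_test>", "exec")
            return candidate
        except SyntaxError:
            lines.pop()  # remove the last (broken) line and retry
    return ""  # nothing compilable was left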

2. Functional repair

semantic_err_correction.py repairs functional errors in the assertions, i.e., assertions whose expected value does not match the output of the correct PUT.

$ python semantic_err_correction.py "HumanEval" x "T_O_FS_synxfixed_" "T_O_FS_semticfixed_" "fewshot_semantic_fix.csv"
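As an illustration of the idea only (not semantic_err_correction.py itself), a functional repair can recompute the expected value of each "assert call == expected" assertion by executing the call against the correct PUT; the function and parameter names below are assumptions.

# Illustrative sketch of functional repair: recompute each assertion's
# expected value by running the call against the correct PUT, which is
# available in `namespace` (e.g., {"any_int": any_int}).
def repair_assertions(test_code: str, namespace: dict) -> str:
    fixed = []
    for line in test_code.splitlines():
        stripped = line.strip()
        if stripped.startswith("assert ") and "==" in stripped:
            call_expr = stripped[len("assert "):].split("==")[0].strip()
            try:
                actual = eval(call_expr, namespace)
                indent = line[: len(line) - len(line.lstrip())]
                fixed.append(f"{indent}assert {call_expr} == {actual!r}")
                continue
            except Exception:
                pass  # keep the original assertion if the call itself fails
        fixed.append(line)
    return "\n".join(fixed)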

3. Generate mutants

We use MutPy to generate mutants of the PUT.

$ python Mutants_generation.py "HumanEval" x "script_NDS_" "mutant_type.csv"
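MutPy mutates the PUT by applying small syntactic changes (mutation operators), such as swapping arithmetic or relational operators. The toy transformer below only illustrates the idea of one such operator; it is neither MutPy nor Mutants_generation.py.

# Toy illustration of a single mutation operator (not MutPy itself):
# replace '+' with '-' in the PUT's AST to obtain a mutant. Real tools
# such as MutPy create one mutant per mutation site; for brevity this
# sketch flips every '+' in one pass. Requires Python 3.9+ (ast.unparse).
import ast

class AddToSubMutator(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()  # the mutation: x + y  ->  x - y
        return node

def mutate(source: str) -> str:
    tree = AddToSubMutator().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)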

4. Calculate Mutation Score (MS)

After generating mutants, Mutation_Score.py calculates the MS of the initial unit test (IUT).

$ python Mutation_Score.py "HumanEval" x "T_O_FS_semticfixed_" "fewshot_mutant_score.csv"
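The MS is the fraction of mutants killed by the test, where a mutant is killed when the test fails on it. The helper below is an illustrative computation, not Mutation_Score.py; its arguments are assumptions.

# Illustrative Mutation Score computation (not Mutation_Score.py):
# test_fn runs the unit test against one implementation of the PUT,
# mutants is a list of mutated implementations.
def mutation_score(test_fn, mutants) -> float:
    killed = 0
    for mutant in mutants:
        try:
            test_fn(mutant)   # run the test cases against this mutant
        except Exception:
            killed += 1       # any failure (assertion error or crash) kills it
    return 100.0 * killed / len(mutants) if mutants else 0.0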

5. Prompt augmentation

The mutants that survive in step 4 indicate weaknesses of the IUT. In our prompt-based learning technique, MuTAP uses those surviving mutants to augment the prompt and improve the effectiveness of the IUT. The final output is called the Augmented Unit Test (AUT).

$ python augmented_prompt.py "fewshot" "HumanEval" x "T_O_FS_semticfixed_" "test_oracle_FS_Mut_" "fewshot_mutant_score.csv"
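The exact augmented-prompt template is defined in augmented_prompt.py and in the paper; the snippet below is only a rough sketch of the idea of feeding a surviving mutant back to the LLM, and its wording is made up.

# Rough sketch of building an augmented prompt from a surviving mutant
# (the real template lives in augmented_prompt.py; this wording is made up).
def build_augmented_prompt(put_source: str, surviving_mutant: str,
                           initial_test: str) -> str:
    return (
        "# Program under test:\n" + put_source + "\n\n"
        "# A buggy version that the current test does not detect:\n"
        + surviving_mutant + "\n\n"
        "# Existing test:\n" + initial_test + "\n\n"
        "# Add assertions that pass on the correct program but fail on the buggy version:\n"
    )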

6. Merge test cases

This step merges the IUT with the AUT.

$ python Merge_all_mut.py "HumanEval" x "T_O_FS_semticfixed_" "test_oracle_FS_Mut_" "T_O_FS_Mut_all_"
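Conceptually, merging amounts to collecting the assertions of both tests into a single test function. The sketch below illustrates this (it is not Merge_all_mut.py) and drops exact duplicate assertions while preserving order.

# Illustrative merge of the IUT and the AUT (not Merge_all_mut.py):
# gather the assertion lines of both tests into one test function.
def merge_tests(iut_code: str, aut_code: str) -> str:
    seen, assertions = set(), []
    for code in (iut_code, aut_code):
        for line in code.splitlines():
            stripped = line.strip()
            if stripped.startswith("assert ") and stripped not in seen:
                seen.add(stripped)
                assertions.append("    " + stripped)
    return "def test():\n" + "\n".join(assertions)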

7. Greedy minimization

This step minimizes the number of assertions while keeping the MS as high as possible; a sketch of the greedy loop is shown below. greedy_test_generator.py runs on all PUTs:

$ python greedy_test_generator.py "HumanEval" "greedy_FS_results.csv"
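The greedy loop can be sketched as follows, assuming a precomputed kill matrix from step 4 that maps each assertion to the set of mutant ids it kills; the function and variable names are illustrative, not those of greedy_test_generator.py.

# Illustrative greedy minimization: repeatedly keep the assertion that
# kills the most not-yet-covered mutants, so the MS is preserved with
# fewer assertions. kill_matrix: {assertion (str): set of mutant ids}.
def greedy_minimize(kill_matrix: dict) -> list:
    remaining = set().union(*kill_matrix.values()) if kill_matrix else set()
    selected = []
    while remaining:
        best = max(kill_matrix, key=lambda a: len(kill_matrix[a] & remaining))
        selected.append(best)
        remaining -= kill_matrix[best]
    return selected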
As an example of the final output, consider PUT 92 from the HumanEval benchmark:

# the PUT (HumanEval problem 92)
def any_int(x, y, z):
    if isinstance(x, int) and isinstance(y, int) and isinstance(z, int):
        if (x + y == z) or (x + z == y) or (y + z == x):
            return True
        return False
    return False

The final test case that MuTAP generates for this example PUT is as follows:

def test():
    assert any_int(3, 2, 5) == True
    assert any_int(-3, -2, 1) == True
    assert any_int(3, 2, 2) == False
    assert any_int(3, 2, 1) == True
    assert any_int(3.6, -2.2, 2) == False
    

Load Data

Test cases generated by both initial prompt types, before and after augmentation, are stored in two JSONL files: Codex_test.jsonl.gz and llama2_test.jsonl.gz. Read_json_data.py loads all test cases into a pickle file.
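The snippet below is a minimal sketch of loading these files (not Read_json_data.py itself), assuming one JSON object per line in each gzipped JSONL file; the output file name is illustrative.

# Minimal sketch: read both gzipped JSONL files and dump the records
# into a single pickle file (the output file name is illustrative).
import gzip, json, pickle

def load_tests(paths=("Codex_test.jsonl.gz", "llama2_test.jsonl.gz"),
               out_file="all_tests.pkl"):
    records = []
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            records.extend(json.loads(line) for line in fh if line.strip())
    with open(out_file, "wb") as fh:
        pickle.dump(records, fh)
    return records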


Citation

Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing

@article{dakhel2023effective,
  title={Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing},
  author={Dakhel, Arghavan Moradi and Nikanjam, Amin and Majdinasab, Vahid and Khomh, Foutse and Desmarais, Michel C.},
  journal={arXiv preprint arXiv:2308.16557},
  year={2023}
}
