
MuTAP: A prompt-based learning technique to automatically generate test cases with Large Language Models


1. Initial prompt on the LLM (Codex and Llama-2-chat) and syntax fixer

At the end of this step, a unit test containing several test cases has been generated for the Program Under Test (PUT), and any syntax errors in the test cases have been fixed.

$ python generate_test_oracle.py "fewshot" "HumanEval" x "script_NDS_" "T_O_FS_synxfixed_" "fewshot_synx_fix.csv"

Here, x is the index of the PUT in the benchmark.
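The syntax fixer can be illustrated with the minimal sketch below; this is not the repository's implementation, and it assumes the common failure mode is a broken or truncated trailing statement in the LLM's completion.

# Minimal, illustrative syntax-fixer sketch (not the repository's code):
# drop trailing lines until the generated test compiles.
def fix_syntax(test_code: str) -> str:
    lines = test_code.rstrip().splitlines()
    while lines:
        candidate = "\n".join(lines)
        try:
            compile(candidate, "<generated_test>", "exec")
            return candidate
        except SyntaxError:
            lines.pop()  # remove the last (broken) line and retry
    return ""  # nothing compilable was left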

2. Functional repair

semantic_err_correction.py repairs functional errors in the assertions, i.e., assertions whose expected value does not match the output of the correct PUT.

$ python semantic_err_correction.py "HumanEval" x "T_O_FS_synxfixed_" "T_O_FS_semticfixed_" "fewshot_semantic_fix.csv"
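As an illustration of the idea only (not semantic_err_correction.py itself), a functional repair can recompute the expected value of each "assert call == expected" assertion by executing the call against the correct PUT; the function and parameter names below are assumptions.

# Illustrative sketch of functional repair: recompute each assertion's
# expected value by running the call against the correct PUT, which is
# available in `namespace` (e.g., {"any_int": any_int}).
def repair_assertions(test_code: str, namespace: dict) -> str:
    fixed = []
    for line in test_code.splitlines():
        stripped = line.strip()
        if stripped.startswith("assert ") and "==" in stripped:
            call_expr = stripped[len("assert "):].split("==")[0].strip()
            try:
                actual = eval(call_expr, namespace)
                indent = line[: len(line) - len(line.lstrip())]
                fixed.append(f"{indent}assert {call_expr} == {actual!r}")
                continue
            except Exception:
                pass  # keep the original assertion if the call itself fails
        fixed.append(line)
    return "\n".join(fixed)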

3. Generate mutants

We use MutPy to generate mutants of the PUT.

$ python Mutants_generation.py "HumanEval" x "script_NDS_" "mutant_type.csv"
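MutPy mutates the PUT by applying small syntactic changes (mutation operators), such as swapping arithmetic or relational operators. The toy transformer below only illustrates the idea of one such operator; it is neither MutPy nor Mutants_generation.py.

# Toy illustration of a single mutation operator (not MutPy itself):
# replace '+' with '-' in the PUT's AST to obtain a mutant. Real tools
# such as MutPy create one mutant per mutation site; for brevity this
# sketch flips every '+' in one pass. Requires Python 3.9+ (ast.unparse).
import ast

class AddToSubMutator(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()  # the mutation: x + y  ->  x - y
        return node

def mutate(source: str) -> str:
    tree = AddToSubMutator().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)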

4. Calculate Mutation Score (MS)

After generating mutants, Mutation_Score.py calculates the MS of the initial unit test (IUT).

$ python Mutation_Score.py "HumanEval" x "T_O_FS_semticfixed_" "fewshot_mutant_score.csv"
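The MS is the fraction of mutants killed by the test, where a mutant is killed when the test fails on it. The helper below is an illustrative computation, not Mutation_Score.py; its arguments are assumptions.

# Illustrative Mutation Score computation (not Mutation_Score.py):
# test_fn runs the unit test against one implementation of the PUT,
# mutants is a list of mutated implementations.
def mutation_score(test_fn, mutants) -> float:
    killed = 0
    for mutant in mutants:
        try:
            test_fn(mutant)   # run the test cases against this mutant
        except Exception:
            killed += 1       # any failure (assertion error or crash) kills it
    return 100.0 * killed / len(mutants) if mutants else 0.0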

5. Prompt augmentation

The mutants that survive in step 4 indicate weaknesses of the IUT. In our prompt-based learning technique, MuTAP uses those surviving mutants to augment the prompt and improve the effectiveness of the IUT. The final output is called the Augmented Unit Test (AUT).

$ python augmented_prompt.py "fewshot" "HumanEval" x "T_O_FS_semticfixed_" "test_oracle_FS_Mut_" "fewshot_mutant_score.csv"
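The exact augmented-prompt template is defined in augmented_prompt.py and in the paper; the snippet below is only a rough sketch of the idea of feeding a surviving mutant back to the LLM, and its wording is made up.

# Rough sketch of building an augmented prompt from a surviving mutant
# (the real template lives in augmented_prompt.py; this wording is made up).
def build_augmented_prompt(put_source: str, surviving_mutant: str,
                           initial_test: str) -> str:
    return (
        "# Program under test:\n" + put_source + "\n\n"
        "# A buggy version that the current test does not detect:\n"
        + surviving_mutant + "\n\n"
        "# Existing test:\n" + initial_test + "\n\n"
        "# Add assertions that pass on the correct program but fail on the buggy version:\n"
    )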

6. Merge test cases

This step merges the IUT with the AUT.

$ python Merge_all_mut.py "HumanEval" x "T_O_FS_semticfixed_" "test_oracle_FS_Mut_" "T_O_FS_Mut_all_"
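Conceptually, merging amounts to collecting the assertions of both tests into a single test function. The sketch below illustrates this (it is not Merge_all_mut.py) and drops exact duplicate assertions while preserving order.

# Illustrative merge of the IUT and the AUT (not Merge_all_mut.py):
# gather the assertion lines of both tests into one test function.
def merge_tests(iut_code: str, aut_code: str) -> str:
    seen, assertions = set(), []
    for code in (iut_code, aut_code):
        for line in code.splitlines():
            stripped = line.strip()
            if stripped.startswith("assert ") and stripped not in seen:
                seen.add(stripped)
                assertions.append("    " + stripped)
    return "def test():\n" + "\n".join(assertions)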

7. Greedy minimization

This step minimizes the number of assertions while keeping the MS as high as possible; a sketch of the greedy loop is shown below. greedy_test_generator.py runs on all PUTs:

$ python greedy_test_generator.py "HumanEval" "greedy_FS_results.csv"
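The greedy loop can be sketched as follows, assuming a precomputed kill matrix from step 4 that maps each assertion to the set of mutant ids it kills; the function and variable names are illustrative, not those of greedy_test_generator.py.

# Illustrative greedy minimization: repeatedly keep the assertion that
# kills the most not-yet-covered mutants, so the MS is preserved with
# fewer assertions. kill_matrix: {assertion (str): set of mutant ids}.
def greedy_minimize(kill_matrix: dict) -> list:
    remaining = set().union(*kill_matrix.values()) if kill_matrix else set()
    selected = []
    while remaining:
        best = max(kill_matrix, key=lambda a: len(kill_matrix[a] & remaining))
        selected.append(best)
        remaining -= kill_matrix[best]
    return selected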
As an example of the final output, consider PUT 92 from the HumanEval benchmark:

# the PUT (HumanEval problem 92)
def any_int(x, y, z):
    if isinstance(x, int) and isinstance(y, int) and isinstance(z, int):
        if (x + y == z) or (x + z == y) or (y + z == x):
            return True
        return False
    return False

The final test case that MuTAP generates for this example PUT is as follows:

def test():
    assert any_int(3, 2, 5) == True
    assert any_int(-3, -2, 1) == True
    assert any_int(3, 2, 2) == False
    assert any_int(3, 2, 1) == True
    assert any_int(3.6, -2.2, 2) == False
    

Load Data

Test cases generated by both initial prompt types, before and after augmentation, are stored in two JSONL files: Codex_test.jsonl.gz and llama2_test.jsonl.gz. Read_json_data.py loads all test cases into a pickle file.
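The snippet below is a minimal sketch of loading these files (not Read_json_data.py itself), assuming one JSON object per line in each gzipped JSONL file; the output file name is illustrative.

# Minimal sketch: read both gzipped JSONL files and dump the records
# into a single pickle file (the output file name is illustrative).
import gzip, json, pickle

def load_tests(paths=("Codex_test.jsonl.gz", "llama2_test.jsonl.gz"),
               out_file="all_tests.pkl"):
    records = []
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            records.extend(json.loads(line) for line in fh if line.strip())
    with open(out_file, "wb") as fh:
        pickle.dump(records, fh)
    return records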


Citation

Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing

@article{dakhel2023effective,
  title={Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing},
  author={Dakhel, Arghavan Moradi and Nikanjam, Amin and Majdinasab, Vahid and Khomh, Foutse and Desmarais, Michel C.},
  journal={arXiv preprint arXiv:2308.16557},
  year={2023}
}
