Tests
Ivadomed has two main types of tests:
Unit Tests
- Unit tests typically test a small portion of code, such as a single function.
- They test individual modules/functions in the `ivadomed` Python API, or isolated functions in CLI scripts (but not the full script).
Functional Tests
- Functional tests aim to test a specific feature/outcome from the code.
- Often multiple functions/scripts are involved.
- We don't care about the implementation of each function, as long as the desired outcome is achieved.
- The nomenclature is unfortunately confusing: the word "function" in "functional" testing refers to a purpose or outcome, whereas the word "function" in unit testing refers to a single programmatic method.
- We test `ivadomed`'s CLI scripts in full (as if called by a user).
Tests are run in parallel using the `pytest` command. The default configuration uses the `xdist` plugin to perform distributed testing (`num_workers == num_vcpu`). Tests are grouped by module for test functions and by class for test methods.
- Ensure your tests clean up after themselves, so that no new files are left behind. Ideally, the test data directory (`testing_data/`) should be left in the same state it was in prior to the test.
- The built-in pytest fixture `tmp_path` provides a place to output files and takes care of cleanup automatically.
- Use logging in tests (don't print).
- It is a good idea to test the `fail` case as well as the `pass` case.
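The guidelines above can be sketched in a small test file. Note that `write_report` is a hypothetical function invented here for illustration; it is not part of the ivadomed API.

```python
import logging

import pytest

logger = logging.getLogger(__name__)


def write_report(out_dir, text):
    # Hypothetical function under test: writes a small report file
    # into the given directory and returns its path.
    path = out_dir / "report.txt"
    path.write_text(text)
    return path


def test_write_report_pass(tmp_path):
    # tmp_path is a fresh temporary directory that pytest removes for
    # us, so testing_data/ is never touched and no files are left behind.
    out = write_report(tmp_path, "ok")
    logger.info("wrote %s", out)  # log instead of print
    assert out.read_text() == "ok"


def test_write_report_fail(tmp_path):
    # Also exercise the fail case: writing into a missing directory
    # should raise FileNotFoundError.
    with pytest.raises(FileNotFoundError):
        write_report(tmp_path / "missing", "ok")
```

Both the pass and fail paths are covered, and all output lands in `tmp_path`, so no cleanup code is needed.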
- Unit tests should be located in `testing/unit_tests/test_${module}.py`.
- Unit tests should aim to be simple and specific.
- Unit tests should aim to avoid coupling/interdependence.
- Unit tests should not cause unexpected side effects (modify an input file, create unwanted artefacts on the filesystem, etc).
- It is preferable to generate input data instead of relying on external input data where possible. This avoids regressions caused by the external data changing without the test being updated.
- It is a good idea to parameterize tests.
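A minimal sketch of a parameterized unit test that generates its own input data (the `normalize` helper is hypothetical, used here only to illustrate the pattern):

```python
import pytest


def normalize(values):
    # Hypothetical helper under test: rescale values to the range [0, 1].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Each (values, expected) pair becomes its own test case, and the input
# data is generated inline rather than read from testing_data/, so a
# change to external data cannot silently invalidate the test.
@pytest.mark.parametrize("values, expected", [
    ([0, 5, 10], [0.0, 0.5, 1.0]),
    ([2, 4], [0.0, 1.0]),
])
def test_normalize(values, expected):
    assert normalize(values) == expected
```

The test has no side effects, no coupling to other tests, and each parameter set is reported by pytest as a separate test.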
- Functional tests should be located in `testing/functional_tests/test_${script_to_test}.py`.
- Functional tests should call the script using `script_to_test.main(args)` instead of using `subprocess.run`. This allows things like `pytest.raises` to be used when checking for an expected exception.
- Shared initialization/setup logic should be implemented in a fixture.
- Sets of `(input_data, expected_results)` can be passed to the functional test via parameterization.
- Functional tests should make a best effort at testing the main functionalities provided by the script.
- Functional tests should not call logic from another function/module. That would be a higher-level functional test (TBD where to put?).
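Under these conventions, a functional test might look like the following sketch. Here `main` is a hypothetical stand-in for an ivadomed CLI script's entry point; a real test would import the actual script module instead of defining it inline.

```python
import pytest


def main(args):
    # Hypothetical stand-in for script_to_test.main(args); invented
    # here so the sketch is self-contained.
    if not args:
        raise ValueError("no arguments given")
    return f"processed {args[0]}"


@pytest.fixture
def config_arg(tmp_path):
    # Shared initialization/setup logic lives in a fixture.
    cfg = tmp_path / "config.json"
    cfg.write_text("{}")
    return [str(cfg)]


def test_main_runs(config_arg):
    # Calling main() in-process (not via subprocess) keeps fixtures
    # usable and keeps the test fast.
    assert main(config_arg).startswith("processed")


def test_main_rejects_empty_args():
    # pytest.raises works precisely because main() runs in the test
    # process rather than in a subprocess.
    with pytest.raises(ValueError):
        main([])
```

(As described below, scripts that use multiprocessing are the exception and need a different approach.)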
One thing to note is that when testing any of the command line scripts that use multiprocessing, you should use `script_runner` to call the module instead of `script_to_test.main(args)`. This is because if you do the latter, the worker processes aren't closed between the tests in your script.
For example, let's say you have a script, `mp_script.py`:
```python
import multiprocessing as mp
import time


def sleep_for_a_bit(seconds):
    print(f"Sleeping for {seconds} second(s)")
    time.sleep(seconds)
    print("Done sleeping!")
    print(mp.current_process())


def main():
    pool = mp.Pool(processes=2)
    pool.map(func=sleep_for_a_bit, iterable=[1 for i in range(0, 2)])
    pool.close()


if __name__ == "__main__":
    main()
```
Let's say you want to test this script using `script_to_test.main(args)`:
```python
import mp_script


def test_mp_script_a():
    mp_script.main()


def test_mp_script_b():
    mp_script.main()
```
If you run this test, you will get something like:
```
Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-2' parent=91203 started daemon>
Sleeping for 1 second(s)
Sleeping for 1 second(s)
Done sleeping!
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-4' parent=91203 started daemon>
<SpawnProcess name='SpawnPoolWorker-3' parent=91203 started daemon>
```
Note how the `SpawnPoolWorker` processes aren't reset between the two separate tests, `test_mp_script_a` and `test_mp_script_b`. In this example, the tests pass fine, because the script isn't doing anything other than sleeping. However, if your script relies on the processes being distinct, there could be a problem, as in `automate_training.py`.
To test this correctly, you will need to use `script_runner` from `pytest-console-scripts`:
```python
import pytest


@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_a(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success


@pytest.mark.script_launch_mode('subprocess')
def test_mp_script_b(script_runner):
    ret = script_runner.run('mp_script')
    print(f"{ret.stdout}")
    print(f"{ret.stderr}")
    assert ret.success
```
If you run this test, you should see something like:
```
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-1' parent=91378 started daemon>
Sleeping for 1 second(s)
Done sleeping!
<SpawnProcess name='SpawnPoolWorker-2' parent=91378 started daemon>
```
As you can see, the `SpawnPoolWorker` processes are no longer carried over between tests.