Limit cleanup loop in test_tbb_fork to 5 iterations.#2041
Limit cleanup loop in test_tbb_fork to 5 iterations.#2041petterreinholdtsen wants to merge 1 commit into
Conversation
09026ab to
f1a2853
Compare
| ASSERT(0, "sigwait failed"); | ||
| } else | ||
| break; | ||
| sleep(1); |
There was a problem hiding this comment.
I agree, with limited number of iterations the test is more reliable.
But why do we need sleep() call?
|
[Alexandr-Konovalov]
But why do we need `sleep()` call?
I added the sleep call to make sure the loop was not a tight loop
blocking other processes on single threaded and slow machines to run and
wrap up their stuff before the five iterations were done.
I doubt it is _needed_, but suspect it is smart to give the test 5
seconds to complete instead of being done in a fraction of a second.
…--
Happy hacking
Petter Reinholdtsen
|
According to my understanding of the test, a child process is already terminated at this point, and we are just reading pending signals from the kernel. Is it not true? |
|
[Alexandr-Konovalov]
According to my understanding of the test, a child process is already
terminated at this point, and we are just reading pending signals from
the kernel. Is it not true?
It sure can be. Or an alarm might have triggered. Hard to know how
different systems fail, and I decided for what I believe is a safer
approach. If you believe it is better to drop the sleep(), I am not
going to argue. I do not see the advantage of taking the chance,
though.
--
Happy hacking
Petter Reinholdtsen
|
Without any doubt, busy waiting is evil, especially in 1-CPU configuration. But is there a busy waiting that we may want to improve by Did you see improvements in passrate after adding |
|
[Alexandr-Konovalov]
Did you see improvements in passrate after adding `sleep(1)` or is
this just in case?
I did not measure the effect with or without sleep(1). I started with a
success rate of around 2/3 using the sleep(), which I added because I
knew Hurd do not handle multiple cores or CPUs, so I suspected a busy
wait loop would complete before yielding the CPU to any other process in
time for the loop to get any useful updates while the loop is running.
The most important part of the change is to get rid of the endless loop,
which tended to be the 1/3 failing cases. :)
…--
Happy hacking
Petter Reinholdtsen
|
So, are you for checking exact effect of
Sure, the infinite loop is not reliable, let's get rid of it. |
|
[Alexandr-Konovalov]
So, are you for checking exact effect of `sleep(0)`? Or is it better
to just remove it?
Once the hurd patches are in master, I'll run a new test, and provide
more patches if it is needed. Feel free to use my patch, make a
different patch with sleep(0) or without sleep(). I'll circle back to
the hurd port when there is reason to believe the master branch should
work, and in the mean time I focus on other projects.
…--
Happy hacking
Petter Reinholdtsen
|
There is no reason why the loop to clean up any pending signals should need more than two iterations to complete any SIGALRM and SIGCHLD remaining. Break the loop after 5 iterations and abort if it ever take this long, and sleep for one second every iteration to give even slow machines time to process the child. On GNU Hurd, the loop some times got stuck (around 1/3 of runs during my testing) in a endless stream of SIGCHLD signals. This code change ensure the test exits with an error instead of getting stuck forever.
f1a2853 to
cccb28f
Compare
Description
There is no reason why the loop to clean up any pending signals should need more than two iterations to complete any SIGALRM and SIGCHLD remaining. Break the loop after 5 iterations and abort if it ever take this long, and sleep for one second every iteration to give even slow machines time to process the child.
On GNU Hurd, the loop some times got stuck (around 1/3 of runs during my testing) in a endless stream of SIGCHLD signals. This code change ensure the test exits with an error instead of getting stuck forever.
Type of change
Choose one or multiple, leave empty if none of the other choices apply
Add a respective label(s) to PR if you have permissions
Tests
Documentation
Breaks backward compatibility