Limit cleanup loop in test_tbb_fork to 5 iterations. by petterreinholdtsen · Pull Request #2041 · uxlfoundation/oneTBB

petterreinholdtsen · 2026-04-01T20:22:24Z

Description

There is no reason why the loop to clean up any pending signals should need more than two iterations to complete any SIGALRM and SIGCHLD remaining. Break the loop after 5 iterations and abort if it ever take this long, and sleep for one second every iteration to give even slow machines time to process the child.

On GNU Hurd, the loop some times got stuck (around 1/3 of runs during my testing) in a endless stream of SIGCHLD signals. This code change ensure the test exits with an error instead of getting stuck forever.

Type of change

Choose one or multiple, leave empty if none of the other choices apply

Add a respective label(s) to PR if you have permissions

bug fix - change that fixes an issue
new feature - change that adds functionality
tests - change in tests
infrastructure - change in infrastructure and CI
documentation - documentation update

Tests

added - required for new features and some bug fixes
not needed

Documentation

updated in # - add PR number
needs to be updated
not needed

Breaks backward compatibility

Yes
No
Unknown

Alexandr-Konovalov · 2026-04-08T12:32:14Z

                                ASSERT(0, "sigwait failed");
                        } else
                            break;
+                        sleep(1);


I agree, with limited number of iterations the test is more reliable.

But why do we need sleep() call?

petterreinholdtsen · 2026-04-08T18:49:56Z

[Alexandr-Konovalov]

But why do we need `sleep()` call?

I added the sleep call to make sure the loop was not a tight loop blocking other processes on single threaded and slow machines to run and wrap up their stuff before the five iterations were done. I doubt it is _needed_, but suspect it is smart to give the test 5 seconds to complete instead of being done in a fraction of a second.

…

-- Happy hacking Petter Reinholdtsen

Alexandr-Konovalov · 2026-04-10T17:59:15Z

I added the sleep call to make sure the loop was not a tight loop blocking other processes on single threaded and slow machines to run and wrap up their stuff before the five iterations were done.

According to my understanding of the test, a child process is already terminated at this point, and we are just reading pending signals from the kernel. Is it not true?

petterreinholdtsen · 2026-04-11T10:41:12Z

[Alexandr-Konovalov]

According to my understanding of the test, a child process is already terminated at this point, and we are just reading pending signals from the kernel. Is it not true?

It sure can be. Or an alarm might have triggered. Hard to know how different systems fail, and I decided for what I believe is a safer approach. If you believe it is better to drop the sleep(), I am not going to argue. I do not see the advantage of taking the chance, though. -- Happy hacking Petter Reinholdtsen

Alexandr-Konovalov · 2026-04-14T16:54:30Z

[Alexandr-Konovalov]
According to my understanding of the test, a child process is already terminated at this point, and we are just reading pending signals from the kernel. Is it not true?
It sure can be. Or an alarm might have triggered. Hard to know how different systems fail, and I decided for what I believe is a safer approach. If you believe it is better to drop the sleep(), I am not going to argue. I do not see the advantage of taking the chance, though. -- Happy hacking Petter Reinholdtsen

Without any doubt, busy waiting is evil, especially in 1-CPU configuration. But is there a busy waiting that we may want to improve by sleep() adding? Wrt alarm, the alarm can be triggered already and here we want only collect it status, not wait for something.

Did you see improvements in passrate after adding sleep(1) or is this just in case?

petterreinholdtsen · 2026-04-14T18:11:21Z

[Alexandr-Konovalov]

Did you see improvements in passrate after adding `sleep(1)` or is this just in case?

I did not measure the effect with or without sleep(1). I started with a success rate of around 2/3 using the sleep(), which I added because I knew Hurd do not handle multiple cores or CPUs, so I suspected a busy wait loop would complete before yielding the CPU to any other process in time for the loop to get any useful updates while the loop is running. The most important part of the change is to get rid of the endless loop, which tended to be the 1/3 failing cases. :)

…

-- Happy hacking Petter Reinholdtsen

Alexandr-Konovalov · 2026-04-15T09:01:25Z

[Alexandr-Konovalov]
Did you see improvements in passrate after adding sleep(1) or is this just in case?
I did not measure the effect with or without sleep(1). I started with a success rate of around 2/3 using the sleep(), which I added because I knew Hurd do not handle multiple cores or CPUs, so I suspected a busy wait loop would complete before yielding the CPU to any other process in time for the loop to get any useful updates while the loop is running.

So, are you for checking exact effect of sleep(0)? Or is it better to just remove it?

The most important part of the change is to get rid of the endless loop, which tended to be the 1/3 failing cases. :)

Sure, the infinite loop is not reliable, let's get rid of it.

petterreinholdtsen · 2026-04-15T10:33:53Z

[Alexandr-Konovalov]

So, are you for checking exact effect of `sleep(0)`? Or is it better to just remove it?

Once the hurd patches are in master, I'll run a new test, and provide more patches if it is needed. Feel free to use my patch, make a different patch with sleep(0) or without sleep(). I'll circle back to the hurd port when there is reason to believe the master branch should work, and in the mean time I focus on other projects.

…

-- Happy hacking Petter Reinholdtsen

There is no reason why the loop to clean up any pending signals should need more than two iterations to complete any SIGALRM and SIGCHLD remaining. Break the loop after 5 iterations and abort if it ever take this long, and sleep for one second every iteration to give even slow machines time to process the child. On GNU Hurd, the loop some times got stuck (around 1/3 of runs during my testing) in a endless stream of SIGCHLD signals. This code change ensure the test exits with an error instead of getting stuck forever.

petterreinholdtsen force-pushed the test-fork-no-endless-loop branch 3 times, most recently from 09026ab to f1a2853 Compare April 3, 2026 07:02

Alexandr-Konovalov reviewed Apr 8, 2026

View reviewed changes

petterreinholdtsen force-pushed the test-fork-no-endless-loop branch from f1a2853 to cccb28f Compare April 24, 2026 06:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Limit cleanup loop in test_tbb_fork to 5 iterations.#2041

Limit cleanup loop in test_tbb_fork to 5 iterations.#2041
petterreinholdtsen wants to merge 1 commit into
uxlfoundation:masterfrom
petterreinholdtsen:test-fork-no-endless-loop

petterreinholdtsen commented Apr 1, 2026 •

edited

Loading

Uh oh!

Alexandr-Konovalov Apr 8, 2026

Uh oh!

petterreinholdtsen commented Apr 8, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 10, 2026

Uh oh!

petterreinholdtsen commented Apr 11, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 14, 2026

Uh oh!

petterreinholdtsen commented Apr 14, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 15, 2026

Uh oh!

petterreinholdtsen commented Apr 15, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

petterreinholdtsen commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of change

Tests

Documentation

Breaks backward compatibility

Uh oh!

Alexandr-Konovalov Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

petterreinholdtsen commented Apr 8, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 10, 2026

Uh oh!

petterreinholdtsen commented Apr 11, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 14, 2026

Uh oh!

petterreinholdtsen commented Apr 14, 2026 via email

Uh oh!

Alexandr-Konovalov commented Apr 15, 2026

Uh oh!

petterreinholdtsen commented Apr 15, 2026 via email

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

petterreinholdtsen commented Apr 1, 2026 •

edited

Loading