Backport of #108773 to release/9.0
/cc @Maoni0
Customer Impact
This was found as a hang during the build process, always while executing this command:
"some_dir\dotnet.exe" "repo_root\artifacts\bin\ILLink.Tasks\Release\net9.0\illink.dll" @"C:\Users\andya\AppData\Local\Temp\MSBuildTemp\tmp.rsp"
some_dir could be repo_root\.dotnet or C:\Program Files\dotnet. This uses the runtime from the 9.0 RC1 build.
This issue can cause applications to deadlock on startup when using DATAS (on by default in 9.0 for SVR workloads).
Regression
This was a regression introduced in #105545. It only manifests at the beginning of a process, when we need to grow the number of heaps (and can grow) and we haven't done a gen2 GC yet, so we arrange to do one.
The problem is that the check I made did not include the ephemeral GC that may happen at the beginning of a BGC, before we set gc_background_running to true. So at the end of that ephemeral GC, we are in calculate_new_heap_count and set trigger_initial_gen2_p to true, not realizing we are already in a BGC.
Then, during joined_generation_to_condemn at the beginning of the next GC, if our conclusion was still to do an ephemeral GC, we'd make it a BGC due to trigger_initial_gen2_p, which obviously causes a problem if a BGC is already in progress (I do have an assert for this, but we haven't seen it in a debug build...).
The fix is simply to use the right check: is_bgc_in_progress(), which includes that ephemeral GC case, instead of background_running_p().
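To make the sequence above concrete, here is a minimal C++ sketch of the state involved. It is a simplified model, not the runtime's code: the gc_model struct, the bgc_initial_eph_gc_p flag, the use_fixed_check parameter, and all function bodies are assumptions made for illustration. Only the identifier names quoted in the description come from the PR.

```cpp
// Simplified model of the sequence described above. NOT the runtime's code:
// the struct, the bgc_initial_eph_gc_p flag, and the function bodies are
// illustrative assumptions.
struct gc_model
{
    bool gc_background_running = false;  // set only after the BGC's initial ephemeral GC
    bool bgc_initial_eph_gc_p = false;   // hypothetical: a BGC has begun and its initial ephemeral GC is running
    bool trigger_initial_gen2_p = false;

    // Old check: misses the ephemeral GC at the start of a BGC.
    bool background_running_p() const { return gc_background_running; }

    // New check: also covers that ephemeral GC.
    bool is_bgc_in_progress() const
    {
        return background_running_p() || bgc_initial_eph_gc_p;
    }

    // End of a GC: request the initial gen2 GC unless a BGC is believed to be active.
    void calculate_new_heap_count(bool use_fixed_check)
    {
        bool bgc_active = use_fixed_check ? is_bgc_in_progress() : background_running_p();
        if (!bgc_active)
            trigger_initial_gen2_p = true;
    }

    // Start of the next GC: an ephemeral GC is turned into a BGC when the flag is set.
    // Returns true if that would start a BGC while one is already in progress.
    bool joined_generation_to_condemn_conflicts() const
    {
        return trigger_initial_gen2_p && is_bgc_in_progress();
    }
};

// Replays the sequence: a BGC begins, its initial ephemeral GC finishes,
// and then the next GC decides what to condemn.
bool replay_conflicts(bool use_fixed_check)
{
    gc_model gc;
    gc.bgc_initial_eph_gc_p = true;               // BGC has begun; gc_background_running is still false
    gc.calculate_new_heap_count(use_fixed_check); // runs at the end of that ephemeral GC
    gc.bgc_initial_eph_gc_p = false;              // ephemeral GC is done
    gc.gc_background_running = true;              // BGC is now marked as running
    return gc.joined_generation_to_condemn_conflicts();
}
```

In this model, replay_conflicts(false) reports a conflict because the old check cannot see the BGC that has already begun, while replay_conflicts(true) does not, because trigger_initial_gen2_p is never set.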
Testing
#105545 went through the GC team's stress testing, but this particular issue was not observed (it manifests as a hang in retail and an assert in debug, but neither was seen). It basically only happens at the beginning of a process. I could repro it with that particular illink command, so I tested the fix using that command. I will be adding more testing specifically for this area in .NET 10.
Risk
Low. The original change missed a case where we should be checking for a possible ephemeral GC at the beginning of a BGC, and the fix is to add that check.
IMPORTANT: If this backport is for a servicing release, please verify that the PR target branch is release/X.0-staging, not release/X.0.