fix: align output kind with LD in more cases #1279

Merged
mati865 merged 24 commits into wild-linker:main from mati865:push-rowqlvownpro
Nov 8, 2025
@mati865
Member

@mati865 mati865 commented Nov 7, 2025

When testing eyra, I once more encountered a problem similar to #879 and decided to solve it properly this time.
FYI, this time the problem was caused by Wild outputting a static PIE even though a dynamic linker was set.

Spending countless hours on reverse engineering the output kind based on dozens of scenarios probably wasn't the best way to spend that time, but hopefully this is now robust enough that we won't have to deal with all this mess again.

Wild will now turn a static PIE into a dynamic one when a dynamic linker is set, even if no DSO is linked (see the pie-over-shared test).
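
As a rough illustration of the rule described above (not the code from this PR), the decision for PIE outputs could be sketched as follows. `PieKind` and `pie_kind` are hypothetical names invented for this example, and the sketch only covers the PIE case.

```rust
/// Hypothetical model of the two flavours of position-independent executable.
#[derive(Debug, PartialEq)]
enum PieKind {
    /// Self-contained PIE; no dynamic linker was requested.
    Static,
    /// PIE that is started through a dynamic linker.
    Dynamic,
}

/// Decide the PIE flavour from what was passed on the command line.
///
/// `dynamic_linker` stands for the path given via `--dynamic-linker <path>`;
/// `dso_count` is the number of shared objects passed as inputs.
fn pie_kind(dynamic_linker: Option<&str>, dso_count: usize) -> PieKind {
    // A requested dynamic linker is enough to make the PIE dynamic,
    // even when no DSO is linked (the pie-over-shared scenario).
    if dynamic_linker.is_some() || dso_count > 0 {
        PieKind::Dynamic
    } else {
        PieKind::Static
    }
}
```

With this sketch, `pie_kind(Some("/lib64/ld-linux-x86-64.so.2"), 0)` yields `PieKind::Dynamic`, which is the case that previously came out as a static PIE.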

@mati865
Member Author

mati865 commented Nov 7, 2025

These CI errors surely will be fun.

@mati865 mati865 marked this pull request as draft November 7, 2025 14:50
@mati865
Member Author

mati865 commented Nov 7, 2025

> These CI errors surely will be fun.

The failures I was talking about:

 ---- integration_test::program_name_49___libc_integration_c__ stdout ----
Error: Test failed: `libc-integration.c` with linker `wild` config `gcc-dynamic-no-pie`
  Caused by:
    Linker failed:
wild: error: Tried to create dynamic symbol index 0
collect2: error: ld returned 255 exit status

Relink with:
WILD_WRITE_LAYOUT=1 WILD_WRITE_TRACE=1 OUT=/__w/wild/wild/wild/tests/build/libc-integration.c-gcc-dynamic-no-pie-host.wild /__w/wild/wild/wild/tests/build/libc-integration.c-gcc-dynamic-no-pie-host.save/run-with cargo run --bin wild --



failures:
    integration_test::program_name_49___libc_integration_c__
    integration_test::program_name_53___cpp_integration_cc__

https://github.com/davidlattimore/wild/actions/runs/19172217881/job/54807556902?pr=1279

Some context:

I noticed that adding a DSO with --as-needed should not turn a static executable into a dynamic one if the final output doesn't actually need it. The previous logic would turn the output into a dynamic executable without taking that into account.
This can be reproduced with the Config:non-loaded-dso test.
I tried to fix that with 4cd1cff (#1279) but misunderstood how it works and plugged it in too early in the chain.

This worked on Arch and openSUSE but failed on Ubuntu. As it turns out, GCC on Ubuntu prepends --as-needed before the DSOs from our tests, so libc-integration-0.gcc-dynamic-no-pie-host-2d856e7ac0d580d9.wild.so had no modifiers on Arch Linux and as_needed on Ubuntu. I minimised it as the Config:wip test.
Obviously, the new logic fell apart because it didn't check whether a DSO passed for linking actually ended up loaded.
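
To illustrate why the same test behaved differently per distro, here is a minimal sketch of how a position-dependent `--as-needed` flag attaches to the DSOs that follow it. `DsoInput` and `collect_dsos` are hypothetical names for this example and are not Wild's actual argument parser; recognising DSOs by a `.so` suffix is a simplification.

```rust
/// Hypothetical record of a shared object seen on the command line.
#[derive(Debug, PartialEq)]
struct DsoInput {
    path: String,
    /// True if `--as-needed` was in effect when this DSO was encountered.
    as_needed: bool,
}

/// Walk the arguments in order, tracking the current `--as-needed` state.
fn collect_dsos(args: &[&str]) -> Vec<DsoInput> {
    let mut as_needed = false;
    let mut dsos = Vec::new();
    for arg in args {
        match *arg {
            "--as-needed" => as_needed = true,
            "--no-as-needed" => as_needed = false,
            // Simplification: treat anything ending in `.so` as a DSO.
            path if path.ends_with(".so") => dsos.push(DsoInput {
                path: path.to_string(),
                as_needed,
            }),
            // Object files and other flags are ignored in this sketch.
            _ => {}
        }
    }
    dsos
}
```

The same `libfoo.so` input ends up with `as_needed == false` for `["main.o", "libfoo.so"]` and `as_needed == true` for `["--as-needed", "main.o", "libfoo.so"]`, which mirrors the Arch versus Ubuntu difference described above.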

The dubious fix:

I tried to fix it with commit 4654328 (#1279), but that logic is complex, probably too complex.

It looks like the logic from #879 wasn't too far off: upon getting a DSO as an argument, we should try to produce a dynamic executable. However, if no DSO ends up being used after the symbol_db build and resolution phases (both of which behave differently depending on the resulting binary type), we have to go back to the static executable format.
I wish I'm wrong somewhere, and it's not as complicated as I made it, so I could use some help with this maze.
Also, there is Alpine failure, which increases the likelihood of something being wrong with this attempt.
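
The "try dynamic, then fall back to static" flow described in this comment can be sketched roughly as below. This models the behaviour as explained here, not the code in the commits mentioned; `ExeKind`, `Dso`, `provisional_kind`, and `final_kind` are hypothetical names, and "loaded" is reduced to a single boolean per DSO.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExeKind {
    Static,
    Dynamic,
}

/// Hypothetical view of a DSO after symbol resolution has run.
struct Dso {
    /// The DSO was passed under `--as-needed`.
    as_needed: bool,
    /// At least one undefined symbol was resolved against this DSO.
    provides_used_symbol: bool,
}

/// Before resolution: any DSO on the command line makes us aim for dynamic.
fn provisional_kind(dsos: &[Dso]) -> ExeKind {
    if dsos.is_empty() {
        ExeKind::Static
    } else {
        ExeKind::Dynamic
    }
}

/// After resolution: revert to static if no DSO actually ended up loaded.
fn final_kind(provisional: ExeKind, dsos: &[Dso]) -> ExeKind {
    // A DSO counts as loaded if it was not `--as-needed`, or if it was needed.
    let any_loaded = dsos
        .iter()
        .any(|d| !d.as_needed || d.provides_used_symbol);
    if provisional == ExeKind::Dynamic && !any_loaded {
        ExeKind::Static
    } else {
        provisional
    }
}
```

The awkward part is visible in the signature: `final_kind` can contradict `provisional_kind`, so anything computed between the two calls may have been based on the wrong output kind.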

@davidlattimore
Member

Looking through the callers of args.output_kind, I'm concerned about switching output kinds so late. Even where it was being done before is dubious, since there's things in parsing that check if we're producing a static executable. The things in parsing are for the prelude and I suspect might be able to be moved later. However, during symbol loading, we compute symbol interposability based on whether the output kind is a static executable. It might be possible to move some of that computation later. Whether we can do that without a performance impact is unclear.

I guess the ideal would be to make the decision at one specific point in the link and have nothing prior to that point in linking depend on the output kind. Possibly we should even remove output_kind from Args so that nothing can even try to make use of the output kind until it's actually available. However, some refactorings will be needed first to remove / move those early uses of output_kind.

It's a shame that GNU ld has such vague semantics with regard to determining the output type. Does lld do the same thing?

So when eyra builds, it passes --as-needed shared objects to the linker, then doesn't use any of them, and expects to get a statically linked executable?

@davidlattimore
Member

> Does lld do the same thing?

To answer my own question, I just tried and if I enable lld for the test config non-loaded-dso, lld, unlike GNU ld, produces a dynamically linked executable. Does the problem you encountered when building eyra happen if you use lld?

@mati865
Member Author

mati865 commented Nov 7, 2025

IIRC, for eyra the problem was that Wild created a static PIE when given --dynamic-linker <path> -pie without any DSO. This hack did help: mati865@ea506a1, but I decided to make it more comprehensible. I'm sure LLD also creates a dynamic PIE if passed --dynamic-linker <path> -pie without any DSO.
For that case, just https://github.com/davidlattimore/wild/pull/1279/files#diff-2e883322d393dfb4895ead5feb987783ec39481a8d43481a6215c157464b9477R452-R458 would be enough, and it doesn't complicate the code much. Unless it breaks other tests without pulling in the other changes...

The test case for that should be pie-over-shared from this PR.

> Looking through the callers of args.output_kind, I'm concerned about switching output kinds so late. Even where it was being done before is dubious, since there's things in parsing that check if we're producing a static executable.

> It's a shame that GNU ld has such vague semantics with regard to determining the output type.

Yeah, hence my "I wish I'm wrong somewhere, and it's not as complicated as I made it, so I could use some help with this maze."
This just feels wrong, and I have no problem with sticking closer to LLD than LD in some cases, like I did in pie-default-dynamic-linker.

> I guess the ideal would be to make the decision at one specific point in the link and have nothing prior to that point in linking depend on the output kind. Possibly we should even remove output_kind from Args so that nothing can even try to make use of the output kind until it's actually available. However, some refactorings will be needed first to remove / move those early uses of output_kind.

Yes, that's my goal too, but I didn't want to cram too much into a single PR; thanks to LD's decisions, it's complicated enough already.
As a starter, I'd move OutputKind out of Args and turn hacks like this one: https://github.com/mati865/wild/blob/297c563b224f2f6bddffc5c3bf81e06fd9209fa3/libwild/src/args.rs#L1562
into is_shared_object = false, or rather use an enum with states: unspecified, shared, executable. Then, inside lib.rs, determine the output kind based on that info, once and for all*, even without the current AtomicBool called is_dynamic_executable.
If you want, I can do that refactor first.

*unless we want to match LD's auto revert to static executable if no DSO is loaded...
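
A minimal sketch of the proposed shape, assuming a tri-state request recorded during argument parsing and a single later decision point. All names here (`RequestedOutput`, `determine_output_kind`, and this simplified `OutputKind`) are hypothetical stand-ins for illustration, not Wild's real types, and the sketch deliberately does not revert to static when a DSO turns out to be unused.

```rust
/// What the command line asked for; recorded during parsing, nothing more.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestedOutput {
    Unspecified,
    Shared,
    Executable,
}

/// Simplified stand-in for the final output kind.
#[derive(Debug, PartialEq)]
enum OutputKind {
    SharedObject,
    StaticExecutable,
    DynamicExecutable,
}

/// Decide the output kind exactly once, after parsing has finished.
fn determine_output_kind(requested: RequestedOutput, dso_on_command_line: bool) -> OutputKind {
    match requested {
        RequestedOutput::Shared => OutputKind::SharedObject,
        // Assumption for this sketch: an unspecified request means an executable.
        RequestedOutput::Unspecified | RequestedOutput::Executable => {
            if dso_on_command_line {
                OutputKind::DynamicExecutable
            } else {
                OutputKind::StaticExecutable
            }
        }
    }
}
```

Because the function takes only values known once parsing has finished, no mutable flag needs to be flipped later in the link.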

@davidlattimore
Member

I agree that this is a case where we should probably match lld's behaviour rather than GNU ld's.

I imagine that GNU ld isn't so much reverting to static, but rather its behaviour is a result of where in the link process it decides to output a shared object. Presumably it's making that decision after --as-needed shared objects have been selected.

I'd be happy with bugfix then refactoring or refactoring then bugfix - whatever is easiest. Let me know if you'd like any help.

@mati865
Member Author

mati865 commented Nov 8, 2025

> I agree that this is a case where we should probably match lld's behaviour rather than GNU ld's.

Sure, everything is so much easier now.

> I imagine that GNU ld isn't so much reverting to static, but rather its behaviour is a result of where in the link process it decides to output a shared object. Presumably it's making that decision after --as-needed shared objects have been selected.

I didn't word it properly, my bad. I meant that upon noticing a DSO, LD would try to resolve symbols as if it were creating a dynamic executable. If no DSO ends up loaded, it'd revert to static executable output and then proceed to figure out symbol details (considering how slow it is, it might as well resolve the symbols again).

> Also, there is Alpine failure, which increases the likelihood of something being wrong with this attempt.

> Even where it was being done before is dubious, since there's things in parsing that check if we're producing a static executable.

Yeah, the decision whether to try proceeding as a dynamic executable really needs to happen before parsing.


The current version passes almost all the tests with Eyra (ctor ordering doesn't work yet, so there is a single failure) but shows a rather confusing diff:

rel.R_X86_64_PC32.R_X86_64_PC32
  `./home/mateusz/Projects/eyra/example-crates/no-std/target/x86_64-unknown-linux-gnu/debug/deps/libc_scape-c81b701b4d7cddaf.rlib` @ `c_scape-c81b701b4d7cddaf.c_scape.30858df2f97925db-cgu.10.rcgu.o` .text.dl_iterate_phdr dl_iterate_phdr
  ORIG 0x0007d: [ 48 8b 05 00 00 00 00 ] mov 0x84,%rax
                           ^^^^^^^^^^^ R_X86_64_GOTPCREL
  ORIG __executable_start -4
  wild 0x1f79d: [ 48 8d 05 5c 08 fe ff ] lea 0,%rax
                           ^^^^^^^^^^^ R_X86_64_PC32 MovIndirectToLea
  wild 0x0 (symbol is too far away)
  wild TRACE: relaxation applied relaxation=MovIndirectToLea, flags=NON_INTERPOSABLE | DIRECT,
  wild TRACE: rel_kind=Relative,
  wild TRACE: value=0xfffffffffffe085c, symbol_name=__executable_start
  ref  0x0cfed: [ 48 8d 05 0c 30 ff ff ] lea 0,%rax
                           ^^^^^^^^^^^ R_X86_64_PC32 MovIndirectToLea
  ref  __executable_start

Not that it matters at all.

I'll clean up the comments and tests tomorrow.

@mati865 mati865 marked this pull request as ready for review November 8, 2025 12:48
@mati865 mati865 merged commit 4f897ea into wild-linker:main Nov 8, 2025
20 checks passed
@mati865 mati865 deleted the push-rowqlvownpro branch November 8, 2025 23:18
@mati865
Member Author

mati865 commented Nov 8, 2025

Thank you, @davidlattimore, for the insights!
