Implement more arguments for exporting symbols#974
Conversation
b84b171 to
bcbf0fd
Compare
--export-dynamic-symbol and --export-dynamic-symbol-list--export-dynamic-symbol and --export-dynamic-symbol-list
4ed0f0f to
29eec02
Compare
mati865
left a comment
There was a problem hiding this comment.
There is more code repetition than I'd like, but apart from the missing unit tests, this is quite serviceable now. I don't know if this impacts performance yet.
| if !export_all_dynamic | ||
| && !&symbol_db | ||
| .explicitly_explicitly_export_symbol_list_data_list | ||
| .contains(&UnversionedSymbolName::prehashed( |
There was a problem hiding this comment.
prehashed is called quite often now:
❯ rg "::prehashed\("
libwild/src/version_script.rs
77: self.exact.insert(UnversionedSymbolName::prehashed(exact));
98: .contains(&UnversionedSymbolName::prehashed(demangled_name.as_bytes()))
532: let Some(m) = script.find_match(&UnversionedSymbolName::prehashed(name.as_bytes())) else {
libwild/src/diagnostics.rs
64: .get_unversioned(&UnversionedSymbolName::prehashed(name.as_bytes()));
libwild/src/symbol_db.rs
1119: let name = UnversionedSymbolName::prehashed(info.name_bytes);
1231: let name = UnversionedSymbolName::prehashed(name_bytes);
1553: Self::from_prehashed(symbol_id, UnversionedSymbolName::prehashed(name))
1572: name: VersionedSymbolName::prehashed(name, version),
libwild/src/symbol.rs
86: let name = UnversionedSymbolName::prehashed(name_info.name_bytes);
88: PreHashedSymbolName::Versioned(VersionedSymbolName::prehashed(name, version))
libwild/src/layout.rs
3173: .get_unversioned(&UnversionedSymbolName::prehashed(
4644: .contains(&UnversionedSymbolName::prehashed(
libwild/src/resolution.rs
627: .get_unversioned(&UnversionedSymbolName::prehashed(name))
libwild/src/elf_writer.rs
2807: .version_for_symbol(&UnversionedSymbolName::prehashed(sym_def.name))
but I don't know if this is avoidable.
There was a problem hiding this comment.
Might be something that can be included into refactoring as mentioned in #994.
There was a problem hiding this comment.
While multiple calls for the same symbol could be bad for performance, I'd say what perhaps matters most here is whether we actually make that call most of the time. I'd assume that in the vast majority of linker usages, the list of explicit exports will be empty. So, if we can avoid calling prehashed in that case, then we're probably OK.
There was a problem hiding this comment.
I was wondering if it would be possible and sensible to cache symbol_id -> symbol name lookups in additional hashmap/BTreeMap.
All the calls of prehashed in this PR should be avoidable unless we are given names of the symbols to export.
There was a problem hiding this comment.
We could. I'm unsure if it would be a performance win or not though. How many times do we hash a symbol name in the common case? I'd kind hope that it would still be 1. For the extra calls that we make in less common cases, I guess we could run some benchmarks that hit those cases and see how much of a problem it is.
It may be possible to get rid of some unnecessary calls via other means. e.g. the call you're adding in compute_value_flags - that function is called from load_symbols which is the primary place where symbol name hashes are computed. So in that case, I could imagine passing a &mut SymbolInfo to compute_value_flags where SymbolInfo contains the &crate::elf::Symbol that we currently pass, but also an Option<Prehashed<UnversionedSymbolName>>.
There was a problem hiding this comment.
Based on samply profile (with 10 000 Hz) when linking Clang using main branch with a single thread, prehashed() is called from:
libwild::symbol::UnversionedSymbolName::prehashedinsidelibwild::symbol::PreHashedSymbolName::from_rawcalled bylibwild::resolution::resolve_symbolinsidelibwild::resolution::resolve_symbols- 7.9% of sampleslibwild::symbol::UnversionedSymbolName::prehashedcalled bylibwild::symbol_db::SymbolLoader::load_symbols- 3.2% of sampleslibwild::symbol::UnversionedSymbolName::prehashedcalled bylibwild::symbol_db::SymbolLoader::load_symbols(again, looks like a different object given to Rayon?) - 1.6% of samples-
libwild::symbol::UnversionedSymbolName::prehashedcalled bylibwild::symbol_db::SymbolLoader::load_symbols(again, looks like a different object given to Rayon?) - 0.8% of samples
- ... and so on with
libwild::symbol_db::SymbolLoader::load_symbols
So, looks like 2 calls sites. One of them is parallel already and the other one will be after #1003
There was a problem hiding this comment.
Thanks for looking at that. Those two call sites should in theory amount to calling once per symtab entry. The call in resolve_symbol should only be for undefined symbols (references) and the call in load_symbols should only be for definitions.
--export-dynamic-symbol and --export-dynamic-symbol-list729c917 to
d7bbb12
Compare
8c3347d to
44dbb8d
Compare
|
This should be probably ready for the review, but I want to test it with Qt first. If somebody has ideas for better naming of these things, don't hesitate to leave a comment. |
| return Ok(ParsedSymbolMatcher::CxxMatchers(matchers)); | ||
| } | ||
|
|
||
| // TODO: This is the only significant divergence between `parse_matcher` from this file and version_script.rs |
There was a problem hiding this comment.
Is this a TODO that you intend to resolve before merging, or something for later?
There was a problem hiding this comment.
This was kind of "loud thinking" in case I or somebody who does the review gets any idea on how to improve it.
There was a problem hiding this comment.
Would changing the version in version_script.rs to accept from_arg be an option?
There was a problem hiding this comment.
Yeah, while not useful for version scripts, it deduplicates things a lot. So, probably worth the tradeoff.
| version_script_data: Option<VersionScriptData<'data>>, | ||
| args: &'data Args, | ||
| linker_scripts: &[InputLinkerScript<'data>], | ||
| explicitly_explicitly_export_symbol_list_data_list_data: Option<ExportListData<'data>>, |
There was a problem hiding this comment.
This name repeats "explicitly" and "list_data" twice. The field above also has "explicitly" twice. I haven't thought deeply about the name, so feel free to disregard, but what about just export_list, explicit_export_list, explicit_symbol_exports or some similar variant?
There was a problem hiding this comment.
Yeah, I had misclicked something in Helix and didn't notice the change (yay, feels like the early days in Vim). I have fixed it before seeing your comment, I think I'll go ahead with export_list.
| if !export_all_dynamic | ||
| && !&symbol_db | ||
| .explicitly_explicitly_export_symbol_list_data_list | ||
| .contains(&UnversionedSymbolName::prehashed( |
There was a problem hiding this comment.
While multiple calls for the same symbol could be bad for performance, I'd say what perhaps matters most here is whether we actually make that call most of the time. I'd assume that in the vast majority of linker usages, the list of explicit exports will be empty. So, if we can avoid calling prehashed in that case, then we're probably OK.
|
Turns out Qt also uses |
760a176 to
b522808
Compare
bd48b4c to
22d159e
Compare
| struct MatchRules<'data> { | ||
| general: BasicMatchRules<'data>, | ||
| cxx: BasicMatchRules<'data>, | ||
| pub(crate) struct MatchRules<'data> { |
There was a problem hiding this comment.
Dunno if we should somehow reorganise or perhaps extract the code that is shared between version scripts and export list.
There was a problem hiding this comment.
Since there is other work around this area that is right now blocked by this PR, let's leave this discussion for later.
22d159e to
31d9982
Compare
davidlattimore
left a comment
There was a problem hiding this comment.
Looks good to me! Only one very minor comment, that you could ignore if you wanted.
31d9982 to
430fb26
Compare
|
Thanks for the reviews. |
For example used by deno:
https://github.com/denoland/deno/blob/45dfae18eec45c91146231aa05f4b249378691b5/ext/napi/generated_symbol_exports_list_linux.def
Fixes #955