Skip to content

Conversation

@mcourteaux
Copy link
Contributor

@mcourteaux mcourteaux commented Aug 2, 2025

I implemented this as a target feature flag. This way we can still disable this for actual performance optimized releases of software.

We could use Feature::Debug as another reason to enable this LLVM flag.

Fixes #8440
Fixes #8702
Fixes #8754

@mcourteaux mcourteaux requested a review from abadams August 2, 2025 17:32
@mcourteaux mcourteaux added enhancement New user-visible features or improvements to existing features. usability labels Aug 2, 2025
@mcourteaux
Copy link
Contributor Author

@alexreinking Can you fix this? https://buildbot.halide-lang.org/master/#/builders/71/builds/223/steps/9/logs/stdio

@alexreinking
Copy link
Member

@alexreinking Can you fix this? https://buildbot.halide-lang.org/master/#/builders/71/builds/223/steps/9/logs/stdio

I'm working on it -- sorry, I figured the weekend would be an OK time to mess with the bots.

@mcourteaux
Copy link
Contributor Author

Aha hahahah! Carry on. 🫡 Lemme know once you think it's fixed, then I'll retrigger the bots. Thanks Alex!

for (const char *s : {"halide_use_soft_float_abi",
"halide_use_pic"}) {
"halide_use_pic",
"halide_keep_frame_pointer"}) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I very much prefer the term "keep frame pointer", but in being consistent with compilers out there, should we change to "no_omit_frame_pointer"?

@slomp
Copy link
Contributor

slomp commented Aug 3, 2025

I just gave the fn.addFnAttr("frame-pointer", "all") a try on Windows, and it did not give me what I was expecting.
Still only get the FunctionName_par_for_output__1_s0_y_y, and no sign of any of its callers (halide_do_task, halide_do_par_for, the "FunctionName" Halide interface, and the application code that calls it).

In samples where there's no Halide code executing, but is in the Halide runtime, the stack unwind stops right there at the Halide runtime (the halide_mutex_lock you see there is not actually from the Halide runtime, but from the detour I installed):
image

This seems to be more indicative that we also need to emit frame pointers in the runtime as well.

@shoaibkamil
Copy link
Contributor

I wonder if we need to add -fno-omit-frame-pointer here:

@slomp
Copy link
Contributor

slomp commented Aug 4, 2025

Oh, and we are also purposedly stripping unwind information from the runtime with -fno-unwind-tables:

-fno-unwind-tables

I think unwind tables are there to assist unwinding callstacks (maybe exceptions too?) when frame pointers are omitted.

@abadams
Copy link
Member

abadams commented Aug 4, 2025

Compiler flags in the cmakefile should be irrelevant here, if we do things right. There are other ways to build libHalide, so these should be controlled in code, not in the build. These flags should just set the initial state of the function attributes in the runtime functions, but they should get clobbered by CodeGen_LLVM.cpp line 629.

These functions are currently marked nounwind because it's impossible to throw exceptions safely through Halide-generated code (we'd leak all over the place), and according to the llvm docs, nounwind is for functions you can't throw exceptions through. There's another attribute "uwtable" which indicates "the ABI being targeted requires that an unwind table entry be produced for this function even if we can show that no exceptions passes by it." We could try adding that, with fn.addFnAttr("uwtable") in CodeGen_Internal.cpp in set_function_attributes_from_halide_target_options.

@slomp
Copy link
Contributor

slomp commented Aug 4, 2025

Yeah, I was doing some reading, and found out that the unwind tables for callstack unwinding just piggy-backs on the infrastructure arounf exception handling .eh_frame sections, which we can't rely on anyway.

@shoaibkamil I tried adding -fno-omit-frame-pointer to the runtime flags, but still no stack frames past the first runtime symbol that shows up.
Not sure what else we may need to change to ensure we emit frame pointers for things generated by Halide::compile_standalone_runtime().

@mcourteaux
Copy link
Contributor Author

I just gave the fn.addFnAttr("frame-pointer", "all") a try on Windows, and it did not give me what I was expecting.

We could add those flags for the runtime functions too. I'll try that tomorrow and push it. You can try again. I'll ping you when I did this.

@abadams
Copy link
Member

abadams commented Aug 4, 2025

Working with Marcos, the thing that worked on windows was:

    fn.addFnAttr(llvm::Attribute::getWithUWTableKind(module.getContext(), llvm::UWTableKind::Default));

Turns out windows x64 doesn't really do frame pointers at all.

@mcourteaux
Copy link
Contributor Author

mcourteaux commented Aug 5, 2025

Turns out windows x64 doesn't really do frame pointers at all.

What does that mean? The windows debugger ignores them for back tracing the call stack?


I see two options, and the "New Proposal" would get my preference.

Backwards-compatible way forward

What do we do with this flag? Unwind tables are big to store in the .eh_frame section. This should probably not be the default in Halide? Thoughts? Did you compare file size before and after?

I'm assuming Unwind Tables generated by LLVM would replace the need for the frame pointer. This is kinda pity, because on Linux, it does suffice to produce a backtrace.

So do we keep both flags as feature flags? KeepFramePointer/NoEmitFramePointer and UnwindTables?

New Proposal

I'm sort of thinking that the Debug feature flag should enable the required flags to make it debuggable. But right now, we use that for verbose logging from the runtime. I'm thinking that perhaps we enable all the verbose debug logging when enabling a new VerboseDebug/VerboseRuntime feature flag, and have the unwind-tables/frame-pointer stuff be influenced by Debug. That would be my preferred way forward. I never use the Debug flag right now, as it just spams so much stuff in the console, that it's not useful unless you're working on Halide GPU backends itself. With my proposed flags, it would start to make sense to enable the Debug feature flag in your development/debug builds of your software/application for increased debug-ability.

@slomp @shoaibkamil @abadams @alexreinking @steven-johnson @vksnk

@slomp
Copy link
Contributor

slomp commented Aug 5, 2025

What does that mean? The windows debugger ignores them for back tracing the call stack?

According to the MSVC compiler: "/Oy enables frame-pointer omission and /Oy- disables omission. In x64 compilers, /Oy and /Oy- are not available"
https://learn.microsoft.com/en-us/cpp/build/reference/oy-frame-pointer-omission?view=msvc-170

Going a bit deeper into the Windows 64 ABI, for proper unwinding, the PE image must include pdata and xdata sections that provide the necessary information. No frame pointers are needed.
https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170
(I think this is the same mechanism they use for SEH (structured exception handling), the Windows-equivalent to synchronous signals/traps).

Now, whether or not providing frame pointers on top of that will work and have benefits, is unclear. I am sure third-party libraries could use them to do the unwind, but I can't find much about the behavior of the "Windows unwinder" when frame pointers are used.

@slomp
Copy link
Contributor

slomp commented Aug 5, 2025

I'm sort of thinking that the Debug feature flag should enable the required flags to make it debuggable.

The thing is: callstacks are also invaluable for profiling, and profiling Debug modules is not ideal.

What do we do with this flag?

The safe thing to do for now is for unwinding to be "opt-in".

We would need to do some research to see if the impact of adding .eh_frame sections for Halide code and runtime modules is either onerous or negligible. The same goes for frame-pointer (since this works on Linux, and probably on Mac too).

@abadams
Copy link
Member

abadams commented Aug 5, 2025

My inclination would be to have a single flag that turns on frame pointers and unwind tables, and call it something to do with backtraces or call stacks. EnableBacktraces? PreserveCallStack?

@mcourteaux mcourteaux force-pushed the keep-frame-pointers branch from 61ea8e1 to 4246ad3 Compare August 6, 2025 08:32
…efault in that case. Also enable that feature when Target::Feature::Debug is set. Cleanup some helper functions.
@mcourteaux mcourteaux force-pushed the keep-frame-pointers branch from 4246ad3 to f3dc266 Compare August 6, 2025 08:48
@mcourteaux
Copy link
Contributor Author

mcourteaux commented Aug 6, 2025

Update: changed the flag name to EnableBacktraces and added the UWTable fn attribute, and additionally also enable this when Debug feature flag is set. Instead of what @abadams wrote, I used this helper function:

https://llvm.org/doxygen/IR_2Function_8h_source.html#l00687

@mcourteaux mcourteaux requested a review from slomp August 6, 2025 09:53
@alexreinking
Copy link
Member

That correctness_register_shuffle failure seems specific to LLVM 21, which we weren't testing until this weekend.

fn.addFnAttr("target-features", mattrs);
if (enable_bt) {
fn.addFnAttr("frame-pointer", "all");
fn.setUWTableKind(llvm::UWTableKind::Default);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Setting both frame pointer and unwind tables is fine for now, but I'd add a comment mentioning that having both set is not necessary, and which one is more appropriate depends on the system/arch/abi.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well both is nice, because unwinding is very fast if you have frame pointers enabled. Unwinding via unwind information in .eh_frame is very slow in comparison. Although I don't think it would be a big deal, as the callstack depth in Halide routines is never very big. So perhaps its better to only rely on the unwind table...?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this stage, we don't need to overthink too much, since it's opt-in.
Have both enabled, and we can tune things later, once we start using and having a better understanding of the implications in practice.

@slomp slomp changed the title New feature flag to keep the frame pointer for LLVM-generated functions. New feature flag to allow for stack backtrace/unwind Aug 6, 2025
@mcourteaux
Copy link
Contributor Author

@slomp is this working for you, on Windows?

@slomp
Copy link
Contributor

slomp commented Aug 6, 2025

@slomp is this working for you, on Windows?

Yes, just tested the fn.setUWTableKind(llvm::UWTableKind::Default); and it worked.

@slomp
Copy link
Contributor

slomp commented Aug 13, 2025

Where do we all stand with this one?
I'm inclined in merging, we just need to all agree on the name/label of the feature flag.
(I'm OK with the current "enable_backtraces".)

@mcourteaux
Copy link
Contributor Author

I'm also in favor of merging as is. Waiting for approval.

@mcourteaux
Copy link
Contributor Author

@abadams @alexreinking @slomp Approval to land? I'm juggling PRs.

halide_target_feature_no_bounds_query, ///< Disable the bounds querying functionality.
halide_target_feature_jit = 0, ///< Generate code that will run immediately inside the calling process.
halide_target_feature_debug, ///< Turn on debug info and output for runtime code.
halide_target_feature_enable_backtraces, ///< Keep the frame pointer in tact in functions produced by LLVM.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be something about unwind tables. Maybe "Preserve frame pointers and include unwind tables to support accurate backtraces for debugging and profiling."

Copy link
Member

@abadams abadams left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me with a minor comment about the description of the feature flag

@alexreinking
Copy link
Member

@mcourteaux - go ahead and make the documentation tweak @abadams suggested and then just merge.

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a one-line change for Python 😭

@alexreinking alexreinking self-requested a review August 15, 2025 01:38
Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, let's get this in

@alexreinking alexreinking merged commit 6709900 into halide:main Aug 15, 2025
5 of 17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New user-visible features or improvements to existing features. usability

Projects

None yet

5 participants