build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz #31191

maflcko · 2024-10-31T13:00:35Z

g_fuzzing is used inside Assume at runtime, causing significant overhead in hot paths. See #31178

One could simply remove the g_fuzzing check from the Assume, but this would make fuzzing a bit less useful. Also, it would be unclear if g_fuzzing adds a runtime overhead in other code paths today or in the future.

Fix all issues by making G_FUZZING equal to the build option BUILD_FOR_FUZZING, and for consistency in fuzzing, require it to be set when executing any fuzz target.

Fixes #31178

Temporarily this drops fuzzing from two CI tasks, but they can be re-added in a follow-up with something like #31073

The fuzz binary is still compiled. This is required for the next commit.

… fuzz target

DrahtBot · 2024-10-31T13:00:38Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31191.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	dergoegge, marcofleon, davidgumberg, ryanofsky

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#31161 (cmake: Set top-level target output locations by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

dergoegge

utACK fafbf8a

src/util/check.h

DrahtBot · 2024-10-31T14:41:51Z

🚧 At least one of the CI tasks failed.
_{Debug: https://github.com/bitcoin/bitcoin/runs/32337137613}

Hints

Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

marcofleon · 2024-10-31T15:50:31Z

Tested ACK fafbf8a

davidgumberg · 2024-10-31T22:49:34Z

Tested ACK fafbf8a

This solves #31178

./build/src/bench/bench_bitcoin -filter=LinearizeOptimallyExample11 -min-time=30000

branch	ns/op	op/s	err%	ins/op	cyc/op	IPC	bra/op	miss%	total	benchmark
master (`f07a533`)	380,031,566.62	2.63	0.3%	7,324,884,836.62	1,321,463,005.62	5.543	898,388,779.12	0.2%	32.69	`LinearizeOptimallyExample11`
branch (`fafbf8a`)	301,497,436.82	3.32	0.2%	4,990,499,128.40	1,046,022,020.91	4.771	352,054,052.20	0.3%	32.53	`LinearizeOptimallyExample11`

It also seems to reasonably solve #30950 & #31057 which were the motivation of #31093 which introduced the regression.

I don't have a view on this, but just want to document that this PR does not address the described use case of building for fuzzing without building the fuzz binary (#31057 (comment)) which is what motivated making g_fuzzing a runtime check.

BUILD_FOR_FUZZING:BOOL=OFF
BUILD_FUZZ_BINARY:BOOL=ON
I build with these options to be able to be able to know if changes not related to fuzzing will break the build.

maflcko · 2024-11-01T07:08:06Z

I don't have a view on this, but just want to document that this PR does not address the described use case of building for fuzzing without building the fuzz binary (#31057 (comment)) which is what motivated making g_fuzzing a runtime check.
BUILD_FOR_FUZZING:BOOL=OFF
BUILD_FUZZ_BINARY:BOOL=ON
I build with these options to be able to be able to know if changes not related to fuzzing will break the build.

Can you explain this a bit better? This pull request does not change anything about being able to build with these options and be able to see if the fuzz binary build breaks.

Specifically,

Everything in an if constexpr is still looked at by the compiler and any code issues should be detected, even with the above build options, before and after this pull request. See also refactor: Compile unreachable walletdb code #29315 for a different change using if constexpr for that purpose (as opposed to preprocessor directives).
All fuzz pre-run options such as PRINT_ALL_FUZZ_TARGETS_AND_ABORT, or WRITE_ALL_FUZZ_TARGETS_AND_ABORT, etc will still work normally before and after this pull request.
Only executing a fuzz target with the above build options is not possible, which is the reason for the first temporary CI commit.

Maybe I am misunderstanding, so it would help to share exact steps to reproduce of the use case that you are claiming is not addressed.

ryanofsky · 2024-11-01T13:22:49Z

Code review ACK fafbf8a but approach -0, because this approach means libraries built for fuzz testing do not function correctly if used in a release, and libraries built for releases are mostly useless for fuzz testing. So I would like to at least consider other solutions to this problem even if we go with this one.

#31178 makes it pretty clear that if we want to be able to write Assume() statements in hot paths, we need to be able to compile them out in release builds to avoid impacting performance, and leave them compiled into builds that are used for fuzzing. So I do think the build system should support compiling libraries differently for fuzz testing and releases. But I don't think the build system should go so far as to make libraries built for fuzz testing and libraries built for releases mutually incompatible.

Also I don't really buy the idea that if you put an Assume() statement in a hot path, you just be to assume it has no cost and that compiler will optimize it out. I think majority of Assume() statements are not in hot paths, and main use-case for Assume() is not to be a faster Assert() that you use in performance-critical code, but to be a safer Assert() that you can use to catch bugs during development, but will not crash the entire program and provide a terrible user experience when bugs (inevitably) occur in production.

So I think a better alternative to this PR might just be to provide a better alternative to Assume(). I think most uses of Assume() are fine as they are, but a few really are on hot paths, and I tried adding a simple counter to identify them in the linearizeoptimallyexample11 benchmark. If I disable these checks, the speedup is equivalent to this PR without drawbacks of this PR. The change I would propose based on this is:

diff

--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -225,6 +225,7 @@ if(BUILD_FOR_FUZZING)
   set(BUILD_GUI_TESTS OFF)
   set(BUILD_BENCH OFF)
   set(BUILD_FUZZ_BINARY ON)
+  target_compile_definitions(core_interface INTERFACE ENABLE_SLOWCHECK)
 endif()
 
 include(ProcessConfigurations)
--- a/cmake/module/ProcessConfigurations.cmake
+++ b/cmake/module/ProcessConfigurations.cmake
@@ -126,6 +126,7 @@ target_compile_definitions(core_interface_debug INTERFACE
   DEBUG_LOCKCONTENTION
   RPC_DOC_CHECK
   ABORT_ON_FAILED_ASSUME
+  ENABLE_SLOWCHECK
 )
 # We leave assertions on.
 if(MSVC)
--- a/src/cluster_linearize.h
+++ b/src/cluster_linearize.h
@@ -337,7 +337,7 @@ struct SetInfo
     /** Add a transaction to this SetInfo (which must not yet be in it). */
     void Set(const DepGraph<SetType>& depgraph, ClusterIndex pos) noexcept
     {
-        Assume(!transactions[pos]);
+        SLOWCHECK(!transactions[pos]);
         transactions.Set(pos);
         feerate += depgraph.FeeRate(pos);
     }
--- a/src/util/bitset.h
+++ b/src/util/bitset.h
@@ -281,7 +281,7 @@ class MultiIntBitSet
         /** Progress to the next 1 bit (only if != IteratorEnd). */
         constexpr Iterator& operator++() noexcept
         {
-            Assume(m_idx < N);
+            SLOWCHECK(m_idx < N);
             m_val &= m_val - I{1U};
             if (m_val == 0) {
                 while (true) {
@@ -301,7 +301,7 @@ class MultiIntBitSet
         /** Get the current bit position (only if != IteratorEnd). */
         constexpr unsigned operator*() const noexcept
         {
-            Assume(m_idx < N);
+            SLOWCHECK(m_idx < N);
             return m_pos;
         }
     };
@@ -316,7 +316,7 @@ public:
     /** Set a bit to 1. */
     void constexpr Set(unsigned pos) noexcept
     {
-        Assume(pos < MAX_SIZE);
+        SLOWCHECK(pos < MAX_SIZE);
         m_val[pos / LIMB_BITS] |= I{1U} << (pos % LIMB_BITS);
     }
     /** Set a bit to the specified value. */
--- a/src/util/check.h
+++ b/src/util/check.h
@@ -81,6 +81,12 @@ constexpr T&& inline_assertion_check(LIFETIMEBOUND T&& val, [[maybe_unused]] con
 /**
  * Assume is the identity function.
  *
+ * Assume() should be used instead of Assert() in cases where if the condition
+ * is not true, it indicates there is a bug, and you want to handle the bug by
+ * crashing in debug builds and in fuzz tests, but want to avoid unnecessary
+ * crashes in release builds and handle it with logging, warning, or other
+ * fallback behavior.
+ *
  * - Should be used to run non-fatal checks. In debug builds it behaves like
  *   Assert()/assert() to notify developers and testers about non-fatal errors.
  *   In production it doesn't warn or log anything.
@@ -90,6 +96,24 @@ constexpr T&& inline_assertion_check(LIFETIMEBOUND T&& val, [[maybe_unused]] con
  */
 #define Assume(val) inline_assertion_check<false>(val, __FILE__, __LINE__, __func__, #val)
 
+/**
+ * SLOWCHECK() can be used to perform checks that are enabled in debug and fuzz
+ * builds but are skipped in release builds. It is meant to be used for checks
+ * that are slow, either because they occur in hot code paths, or because the
+ * checks themselves are expensive. Assert() should be preferred for critical
+ * checks which are not in performance-sensitive code.
+ *
+ * SLOWCHECK() is basically equivalent to the traditional C/C++ assert() macro
+ * For historic reasons, Bitcoin Core cannot be compiled with NDEBUG so there is
+ * no way to skip assert() checks in release builds, and SLOWCHECK() restores
+ * this lost functionality.
+ */
+#ifdef ENABLE_SLOWCHECK
+#define SLOWCHECK(val) assert(val);
+#else
+#define SLOWCHECK(val) assert(1 || (val));
+#endif
+
 /**
  * NONFATAL_UNREACHABLE() is a macro that is used to mark unreachable code. It throws a NonFatalCheckError.
  */
--- a/src/util/vecdeque.h
+++ b/src/util/vecdeque.h
@@ -74,7 +74,7 @@ class VecDeque
     /** What index in the buffer does logical entry number pos have? */
     size_t BufferIndex(size_t pos) const noexcept
     {
-        Assume(pos < m_capacity);
+        SLOWCHECK(pos < m_capacity);
         // The expression below is used instead of the more obvious (pos + m_offset >= m_capacity),
         // because the addition there could in theory overflow with very large deques.
         if (pos >= m_capacity - m_offset) {

I would curious to know from @sipa and others if they think this approach makes sense and does not add too much of a burden.

Specifically results I saw testing this with the linearizeoptimallyexample11 benchmark were 1.42 op/sec on master, 1.77 op/sec reverting 9f243cd (#31093), 1.77 op/sec cherry-picking fafbf8a from this PR, and 1.83 op/sec with the proposed change.

maflcko · 2024-11-01T14:40:46Z

this approach means libraries built for fuzz testing do not function correctly if used in a release, and libraries built for releases are mostly useless for fuzz testing.

I don't think it is a supported use-case to take libraries from one build (with different build options) and drop them into another build (with different build options) and expect it to work, or be a supported use-case. This has also been the case up until two weeks ago, before commit 9f243cd. Also your SLOWCHECK suggestion doesn't seem to change the fact that mix-matching differently compiled libraries can result in an unsafe or less useful binary using those libraries.

Also I don't really buy the idea that if you put an Assume() statement in a hot path, you just be to assume it has no cost and that compiler will optimize it out.

I don't think I've claimed this. It has always been the assumption that Assume(expr), Assert(expr), and assert(expr) have a runtime cost of at least (void)(expr). This is also explained in the dev notes: "the expression is always evaluated." On current master, as explained by you in #31178 (comment), the runtime cost of Assume(expr) in some cases increased by an additional check of bool{g_fuzzing}. This change here is simply restoring what has been the assumption in all previous releases of Bitcoin Core.

I think majority of Assume() statements are not in hot paths, and main use-case for Assume() is not to be a faster Assert() that you use in performance-critical code

I agree and I don't think I've claimed otherwise. I just don't see the use-case to mix-match libraries from different builds and I think this change makes sense even if you go ahead with your SLOWCHECK idea.

SLOWCHECK

It may be best to submit this as a separate pull request. I think it is fine if this one keeps sitting for a few more weeks and it can be closed/merged, possibly depending on the result of your SLOWCHECK pull request.

ryanofsky · 2024-11-01T15:38:45Z

I don't think it is a supported use-case to take libraries from one build (with different build options) and drop them into another build

Agree it should not be a generally supported use case. And I do think it's nice to have a BUILD_FOR_FUZZING option that enables the best options for builds specifically doing fuzzing.

But for normal builds, I don't think the whole codebase should need to be recompiled with different options just to get a useful fuzz binary. And I don't think it is good for BUILD_FOR_FUZZING option to create strange libraries that seem functional but skip proof of work checks. I did ACK this PR in case we think these problems are worth having so Assume() can be optimized out in hot code paths. But I think these are negative consequences of this PR, even if worth the tradeoff.

Also I don't really buy the idea that if you put an Assume() statement in a hot path, you just be to assume it has no cost and that compiler will optimize it out.

I don't think I've claimed this.

Yes sorry, the comments above about Assume() usage were meant for @sipa who wrote "I have been using Assume in many places assuming it'll be optimized out entirely in production code" and similar things in the other issues.

It may be best to submit this as a separate pull request.

Yes, will do if it seems like something that is helpful.

maflcko · 2024-11-01T16:45:53Z

And I don't think it is good for BUILD_FOR_FUZZING option to create strange libraries that seem functional but skip proof of work checks.

I don't think this conceptual issue is addressed by your alternative SLOWCHECK suggestion. If this was a supported use-case, someone could link a fuzzing-compiled SLOWCHECK library that seems functional, but is possibly slow, or may even crash in production, when normally it should not. So your use-case would also be a reason against SLOWCHECK. I don't think your use-case is possible to support, so I don't think it should be used as a reason for or against a change.

davidgumberg · 2024-11-01T22:31:24Z

Everything in an if constexpr is still looked at by the compiler and any code issues should be detected.

Didn't think of this, that makes sense to me. I also may have misunderstood or misrepresented the use-case/concern I tried to describe, but it's not my own and I don't have any view on it, so it seems silly that I even mentioned it.

I still ACK fafbf8a for fixing the regression measured in #31178.

ryanofsky · 2024-11-04T14:27:11Z

The use-case which I think this breaks and should be supported is the ability to use fuzzing in a normal build instead of a dedicated build.

As an analogy, if you want use a debugger, the best place to use it is in a dedicated debug build, but you should also be able to generate debug symbols and use debuggers with some limitations in release builds. Or if you want to check memory safety, the best way to do it may be to run MSan or valgrind in dedicated builds, but it should also be possible to use ASan in normal builds. Similarly, if you want to run fuzz tests, the best way to run them is in a dedicated build with -DBUILD_FOR_FUZZING=ON, but I think you should also be able to run fuzz tests in a normal build with other functioning executables. This flexibility lets you have a single build supporting most developer tools, even if it is not the ideal build for running every tool. I think it lowers the barrier to start trying new tools if using them only requires toggling build options not creating entirely new build configurations.

AFAICT before #31093, libraries with BUILD_FOR_FUZZING did not function normally because they skipped proof of work checks, and libraries without BUILD_FOR_FUZZING were not very useful for fuzzing because they skipped most Assume() checks. #31093 fixed both of those issues and this PR breaks them again. Maybe this is acceptable, but doesn't seem good, and an alternate approach like the one in #31191 (comment) might be better. I'd be happy to open a PR with that approach it makes sense, or to learn that I am wrong and it doesn't make sense, or to get feedback that it is not a good solution for other reasons.

I also think this PR is ok, but I don't like the tradeoff it is making, so I'm just pointing out the disadvantages I see here.

marcofleon · 2024-11-04T20:53:53Z

I think it lowers the barrier to start trying new tools if using them only requires toggling build options not creating entirely new build configurations.

I see what you're saying about tools being more accessible if they're able to be used within a single build. However, I do think with cmake it's straightforward to maintain separate builds for different purposes. And fuzz testing might be involved enough to warrant its own build.

I don't see many advantages to fuzzing in a normal build that isn't optimized for it. If we're still able to verify that the fuzz binary builds properly and run pre-fuzz checks, as mentioned in #31191 (comment), then to me it feels reasonable to require BUILD_FOR_FUZZING when we intend to do some actual fuzz testing.

I also think it's cleaner to isolate fuzz-specific code paths (like skipping the PoW check) to a dedicated build. Of course, I'm open to reviewing other potential solutions.

MarcoFalke added 2 commits October 31, 2024 13:51

ci: Temporarily disable macOS/Windows fuzz step

fae3cf0

The fuzz binary is still compiled. This is required for the next commit.

Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to execute a…

fafbf8a

… fuzz target

maflcko changed the title ~~Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz~~ build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz Oct 31, 2024

DrahtBot added the Build system label Oct 31, 2024

dergoegge approved these changes Oct 31, 2024

View reviewed changes

src/util/check.h Show resolved Hide resolved

maflcko force-pushed the 2410-fuzz-build branch from fafbf8a to fafbfdc Compare October 31, 2024 13:27

DrahtBot added the CI failed label Oct 31, 2024

maflcko force-pushed the 2410-fuzz-build branch from fafbfdc to fafbf8a Compare October 31, 2024 14:51

maflcko mentioned this pull request Oct 31, 2024

bench: linearizeoptimallyexample11 benchmark now running 4x slow than previously #31178

Open

DrahtBot mentioned this pull request Oct 31, 2024

cmake: Set top-level target output locations #31161

Open

DrahtBot removed the CI failed label Oct 31, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz #31191

build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz #31191

maflcko commented Oct 31, 2024

DrahtBot commented Oct 31, 2024 •

edited

Loading

dergoegge left a comment

DrahtBot commented Oct 31, 2024

marcofleon commented Oct 31, 2024

davidgumberg commented Oct 31, 2024 •

edited

Loading

maflcko commented Nov 1, 2024

ryanofsky commented Nov 1, 2024

maflcko commented Nov 1, 2024

ryanofsky commented Nov 1, 2024

maflcko commented Nov 1, 2024

davidgumberg commented Nov 1, 2024

ryanofsky commented Nov 4, 2024 •

edited

Loading

marcofleon commented Nov 4, 2024

build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz #31191

Are you sure you want to change the base?

build: Make G_FUZZING constexpr, require -DBUILD_FOR_FUZZING=ON to fuzz #31191

Conversation

maflcko commented Oct 31, 2024

DrahtBot commented Oct 31, 2024 • edited Loading

Code Coverage & Benchmarks

Reviews

Conflicts

dergoegge left a comment

Choose a reason for hiding this comment

DrahtBot commented Oct 31, 2024

marcofleon commented Oct 31, 2024

davidgumberg commented Oct 31, 2024 • edited Loading

maflcko commented Nov 1, 2024

ryanofsky commented Nov 1, 2024

maflcko commented Nov 1, 2024

ryanofsky commented Nov 1, 2024

maflcko commented Nov 1, 2024

davidgumberg commented Nov 1, 2024

ryanofsky commented Nov 4, 2024 • edited Loading

marcofleon commented Nov 4, 2024

DrahtBot commented Oct 31, 2024 •

edited

Loading

davidgumberg commented Oct 31, 2024 •

edited

Loading

ryanofsky commented Nov 4, 2024 •

edited

Loading