Clear HELD-BY-ROOTS slot of lock targets for all successful operations #88

Merged: karalekas merged 1 commit into main from clear-held-by-roots on Oct 12, 2025

@karalekas (Member) commented on Oct 12, 2025

In #80 we didn't go far enough: the held-by-roots slot should be unset at the end of every successful operation, not just multireweight operations. Otherwise, stale information in this slot can cause us to lock superfluous targets.

In the worst case, this can actually result in livelock! 🚨

For example, in the situation below, we get stuck on a GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123> that keeps failing.

This happens because the target-root #<93356> still has its held-by-roots slot set to (#<96678>) from an earlier failed HOLD operation. #<93356> eventually moves on and is pulled into an AUGMENT, but the slot remains set. Meanwhile, #<96678> is contracted into several nested layers of macrovertices, the topmost being #<372123>. When #<372123> then attempts GRAFT 1 with #<93356> as the target-root, the lock targets end up containing both #<372123> (a macrovertex) and #<96678> (a vertex contained in macrovertex #<372123>), so the lock process fails.
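To make the failure mode concrete, here is a toy model of the scenario above. It is not the project's code: `Node`, `lock_targets`, and `has_nested_pair` are hypothetical names, and the sketch assumes two things that the description only implies: that the lock targets are the two roots plus whatever the target-root's held-by-roots slot lists, and that locking fails whenever one target is contained in another.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a vertex or macrovertex."""
    name: str
    parent: Optional["Node"] = None  # enclosing macrovertex, if contracted
    held_by_roots: List["Node"] = field(default_factory=list)


def lock_targets(source_root: Node, target_root: Node) -> List[Node]:
    # Assumed rule: lock both roots plus every root recorded as holding the target.
    return [source_root, target_root, *target_root.held_by_roots]


def has_nested_pair(targets: List[Node]) -> bool:
    # True if some target sits (at any depth) inside another target.
    for target in targets:
        ancestor = target.parent
        while ancestor is not None:
            if any(ancestor is other for other in targets):
                return True
            ancestor = ancestor.parent
    return False


v96678 = Node("96678")
v93356 = Node("93356")
v93356.held_by_roots.append(v96678)  # left over from the failed HOLD

# Successive CONTRACTs nest 96678 inside 335214, 345528, 362271, then 372123.
chain = [Node(name) for name in ("335214", "345528", "362271", "372123")]
inner = v96678
for macro in chain:
    inner.parent = macro
    inner = macro
m372123 = chain[-1]

stale_targets = lock_targets(m372123, v93356)  # includes the stale 96678
v93356.held_by_roots.clear()                   # what a successful AUGMENT should have done
clean_targets = lock_targets(m372123, v93356)

print(has_nested_pair(stale_targets))  # True
print(has_nested_pair(clean_targets))  # False
```

With the stale entry present, the target list holds both #<372123> and a vertex nested four layers inside it. Once the slot is cleared, only the two roots remain.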

If we always unset held-by-roots of all lock targets when a supervisor operation is successful, this cannot happen.
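A minimal sketch of that rule, again with hypothetical names (`Root`, `close_operation`) rather than the project's actual API. It assumes a failed operation leaves the slot untouched, which matches the failed HOLD in the log below.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(eq=False)
class Root:
    """Hypothetical stand-in for a lock target."""
    name: str
    held_by_roots: List["Root"] = field(default_factory=list)


def close_operation(lock_targets: List[Root], success: bool) -> bool:
    # On success, unset held-by-roots on every lock target, whatever the
    # operation type was. On failure, leave the slots as they are (assumption).
    if success:
        for target in lock_targets:
            target.held_by_roots.clear()
    return success


holder = Root("96678")
target = Root("93356", held_by_roots=[holder])

close_operation([target], success=False)  # failed HOLD: slot stays set
after_failure = list(target.held_by_roots)

close_operation([target], success=True)   # successful AUGMENT: slot is cleared
after_success = list(target.held_by_roots)
```

Clearing unconditionally on success, rather than per operation type, is the point of the change: it means no successful operation can leave a stale entry behind.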

241.6: [#<D-S 257393>] got HOLD 0 #<96699>--->#<96598> from #<93356> w/ root-bucket (#<96678>)
242.2: [#<D-S 257393>] closing with failure

268.3: [#<D-S 268892>] got AUGMENT 0 #<90201>--->#<93356> from #<83940>
276.1: [#<D-S 268892>] closing with success

500.4: [#<D-S 333791>] got CONTRACT 0 #<96699>--->#<96678> from #<96662>
507.3: [#<D-N 335214>] completed setting up (peduncle: #<96598>--->#<96678>; match: [#<335214>: #<96678>-]-->#<96598>; children: ; parent: [#<335214>: #<96678>-]-->#<96598>; petals: #<96678>--->#<93352> #<93352>--->#<96699> #<96699>--->#<96678>; pistil: NIL)
508.3: [#<D-S 333791>] closing with success

549.4: [#<D-S 345061>] got CONTRACT 0 [#<335214>: #<93352>-]-->#<93333> from #<96662>
552.6: [#<D-N 345528>] completed setting up (peduncle: NIL; match: NIL; children: [#<345528>: #<93321>-]-->#<93251> [#<345528>: #<96662>-]-->#<96703>; parent: NIL; petals: #<96662>--->#<93331> #<93331>--->#<93321> #<93321>--->#<93245> #<93245>--->#<93327> #<93327>--->#<96598> #<96598>--[->#<96678> :#<335214>] [#<335214>: #<93352>-]-->#<93333> #<93333>--->#<93296> #<93296>--->#<96713> #<96713>--->#<96601> #<96601>--->#<96662>; pistil: NIL)
553.7: [#<D-S 345061>] closing with success

617.7: [#<D-S 361242>] got CONTRACT 0 #<93339>--->#<93222> from #<345528>
621.5: [#<D-N 362271>] completed setting up (peduncle: NIL; match: NIL; children: [#<362271>: #<93339>-]-->#<93316> [#<362271>: #<93222>-]-->#<90172>; parent: NIL; petals: [#<345528>: #<96662>-]-->#<96703> #<96703>--->#<93339> #<93339>--->#<93222> #<93222>--->#<93251> #<93251>--[->#<93321> :#<345528>]; pistil: NIL)
622.5: [#<D-S 361242>] closing with success

639.4: [#<D-S 367142>] got CONTRACT 0 #<86624>--->#<86667> from #<362271>
642.1: [#<D-N 367885>] completed setting up (peduncle: #<90157>--->#<86667>; match: [#<367885>: #<86667>-]-->#<90157>; children: ; parent: [#<367885>: #<86667>-]-->#<90157>; petals: #<86667>--->#<86661> #<86661>--->#<86624> #<86624>--->#<86667>; pistil: NIL)
643.2: [#<D-S 367142>] closing with success

656.1: [#<D-S 371609>] got CONTRACT 0 [#<367885>: #<86624>-]-[->#<93327> :#<362271>] from #<362271>
658.8: [#<D-N 372123>] completed setting up (peduncle: NIL; match: NIL; children: [#<372123>: #<93339>-]-->#<93316>; parent: NIL; petals: [#<362271>: #<93222>-]-->#<90172> #<90172>--->#<90207> #<90207>--->#<90157> #<90157>--[->#<86667> :#<367885>] [#<367885>: #<86624>-]-[->#<93327> :#<362271>]; pistil: NIL)
659.6: [#<D-S 371609>] closing with success

1001.: [#<D-S 460087>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1003.: [#<D-S 460087>] closing with failure
1009.: [#<D-S 461542>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1010.: [#<D-S 461542>] closing with failure
1016.: [#<D-S 462588>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1018.: [#<D-S 462588>] closing with failure
1024.: [#<D-S 463787>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1025.: [#<D-S 463787>] closing with failure
1031.: [#<D-S 465238>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1034.: [#<D-S 465238>] closing with failure
1040.: [#<D-S 466628>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1041.: [#<D-S 466628>] closing with failure
1047.: [#<D-S 467775>] got GRAFT 1 #<93356>--->#<90201> [#<372123>: #<93352>-]-->#<93356> from #<372123>
1049.: [#<D-S 467775>] closing with failure

@ecpeterson (Contributor) left a comment

nice find & diagnosis

@karalekas merged commit bf1f0dd into main on Oct 12, 2025 (1 check passed)
@karalekas deleted the clear-held-by-roots branch on October 12, 2025 20:56