-
Notifications
You must be signed in to change notification settings - Fork 26.9k
Teach git refs migrate to support worktrees
#2077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Welcome to GitGitGadgetHi @sambostock, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests. Please make sure that either:
You can CC potential reviewers by adding a footer to the PR description with the following syntax: NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description, Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:
It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code. Contributing the patchesBefore you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form Both the person who commented An alternative is the channel Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment If you want to see what email(s) would be sent for a After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail). If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the curl -g --user "<EMailAddress>:<Password>" \
--url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txtTo iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description): To send a new iteration, just add another PR comment with the contents: Need help?New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join. You may also be able to find help in real time in the developer IRC channel, |
|
There are issues in commit 8f1562d: |
8f1562d to
29deaed
Compare
|
/allow |
|
User sambostock is now allowed to use GitGitGadget. WARNING: sambostock has no public email address set on GitHub; GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use. |
3c42f80 to
5cd905f
Compare
|
/preview |
|
Preview email sent as pull.2077.git.git.1761586833769.gitgitgadget@gmail.com |
Remove the worktree limitation from `git refs migrate` by implementing migration support for repositories with linked worktrees. Previously, attempting to migrate a repository with worktrees would fail with "migrating repositories with worktrees is not supported yet". This limitation existed because each worktree has its own ref storage that needed to be migrated separately. Migration now uses a multi-phase approach to safely handle multiple worktrees: 1. Phase 1: Iterate through all worktrees and create temporary new ref storage for each in a staging directory. 2. Phase 2: For each worktree, backup the existing ref storage, then move the new storage into place. 3. Phase 3: Update the repository format config, clear cached ref stores, and delete all backups. On failure, restore from backups and report where the migrated refs can be found for manual recovery. This approach ensures that if migration fails partway through, the repository can be restored to its original state. Key implementation details: - For files backend: Create a commondir file in temp directories for linked worktrees so the files backend knows where the common git directory is located. - For linked worktrees: Use non-INITIAL transactions to avoid creating packed-refs files (linked worktrees should never have packed-refs). - Filter refs during iteration: Linked worktrees only migrate their per-worktree refs (refs/bisect/*, refs/rewritten/*, refs/worktree/*). Shared refs are migrated once in the main worktree. - Write per-worktree refs as loose files: The files backend's transaction_finish_initial() optimization writes most refs to packed-refs, but per-worktree refs must be stored as loose files to maintain proper worktree isolation. - Backup root refs: During Phase 2, backup all root refs (HEAD, ORIG_HEAD, etc.) by iterating files in the git directory and using is_root_ref() to identify them. This ensures safe rollback if migration fails. Tests are updated to expect migration with worktrees to succeed, and new tests verify: - Basic worktree migration in both directions (files ↔ reftable) - Migration with multiple worktrees - Dry-run mode with worktrees - Physical separation of per-worktree refs - Bare repository with worktrees Documentation is updated to reflect the worktree support and note that migration must be run from the main worktree. As the author's familiarity with git internals and C is limited, this change was made with the assistance of Claude Code. However, the author has carefully reviewed and iterated on the work to ensure quality to the best of their ability. Signed-off-by: Sam Bostock <sam.bostock@shopify.com> Co-Authored-By: Claude <noreply@anthropic.com>
5cd905f to
29ba72c
Compare
|
The CI failure here appears to be unrelated, and the same as what's failing on current win+Meson build |
|
/submit |
|
Submitted as pull.2077.git.git.1761589580028.gitgitgadget@gmail.com To fetch this version into To fetch this version to local tag |
Indeed. I fail to understand how the |
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Mon, Oct 27, 2025 at 06:26:20PM +0000, Sam Bostock via GitGitGadget wrote:
> From: Sam Bostock <sam.bostock@shopify.com>
>
> Remove the worktree limitation from `git refs migrate` by implementing
> migration support for repositories with linked worktrees.
>
> Previously, attempting to migrate a repository with worktrees would fail
> with "migrating repositories with worktrees is not supported yet". This
> limitation existed because each worktree has its own ref storage that
> needed to be migrated separately.
>
> Migration now uses a multi-phase approach to safely handle multiple
> worktrees:
>
> 1. Phase 1: Iterate through all worktrees and create temporary new ref
> storage for each in a staging directory.
>
> 2. Phase 2: For each worktree, backup the existing ref storage, then
> move the new storage into place.
>
> 3. Phase 3: Update the repository format config, clear cached ref stores,
> and delete all backups. On failure, restore from backups and report
> where the migrated refs can be found for manual recovery.
Makes sense.
> This approach ensures that if migration fails partway through, the
> repository can be restored to its original state.
>
> Key implementation details:
>
> - For files backend: Create a commondir file in temp directories for
> linked worktrees so the files backend knows where the common git
> directory is located.
Hm, okay. Not yet sure why we need this, but let's read on.
> - For linked worktrees: Use non-INITIAL transactions to avoid creating
> packed-refs files (linked worktrees should never have packed-refs).
Hm, this is unfortunate. The reason why we use initial transactions is
twofold:
- First, we want to avoid creating loose refs, only. This is indeed
something we must not do with worktrees, as you point out.
- But second, we also want to skip pointless checks like the conflict
checks. This results in quite a saving.
Would've been great to retain the second property, but I guess as long
as we only do this for worktrees it's okayish and something we can worry
about at a later point in time. Better to migrate the refs slowish than
not at all.
> - Filter refs during iteration: Linked worktrees only migrate their
> per-worktree refs (refs/bisect/*, refs/rewritten/*, refs/worktree/*).
> Shared refs are migrated once in the main worktree.
Makes sense.
> - Write per-worktree refs as loose files: The files backend's
> transaction_finish_initial() optimization writes most refs to
> packed-refs, but per-worktree refs must be stored as loose files
> to maintain proper worktree isolation.
Isn't this roughly the same as the second bullet point? Feels like they
should be merged together.
> diff --git a/Documentation/git-refs.adoc b/Documentation/git-refs.adoc
> index fa33680cc7..f6a3bf4f03 100644
> --- a/Documentation/git-refs.adoc
> +++ b/Documentation/git-refs.adoc
> @@ -30,7 +30,8 @@ COMMANDS
> --------
>
> `migrate`::
> - Migrate ref store between different formats.
> + Migrate ref store between different formats. Supports repositories
> + with worktrees; migration must be run from the main worktree.
It feels a bit weird to single our worktrees specifically. We don't say
that the tool supports bare and non-bare repositories, either, so the
only reason why we'd have the note about worktrees is historic legacy.
How about this instead:
Migrate ref storage between different formats. Must be run from the
main worktree in case the repository uses worktrees.
> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>
> The ref format migration has several known limitations in its current form:
>
> -* It is not possible to migrate repositories that have worktrees.
> +* Migration must be run from the main worktree.
>
I'd drop this bullet point entirely, as I don't really see this as a
limitation anymore.
> diff --git a/refs.c b/refs.c
> index 965381367e..0834329e5a 100644
> --- a/refs.c
> +++ b/refs.c
I'm sorry, but this is _extremely_ hard to review as we're changing
almost all of the implementation at once with random changes left and
right. Furthermore, I feel like we're getting way to intimate with the
different backends here -- that shouldn't be the case though, the logic
that is specific to the backends should really live in the backends
themselves.
The way I'd expect a series like this to look like is to have commits
that:
1. Do preparatory changes, e.g. teach the files backend to not create
a packed-refs file for worktrees during migration.
2. Pull out the logic to migrate a single reference backend that we
already have into a separate function that can be called in a loop.
The end result should be a function that accepts the old refdb as
input and that returns the new refdb.
3. Implement the logic that calls the function we introduced in (2)
for each worktree. This can be done by iterating through all the
worktrees, calling `get_worktree_ref_store()` on it and then
passing the refdb to the new function.
> @@ -2974,8 +2976,296 @@ struct migration_data {
[snip]
> +/*
> + * Create a commondir file in the temporary migration directory for a linked
> + * worktree. The files backend needs this to locate the common git directory.
> + * Returns 0 on success, -1 on failure.
> + */
> +static int create_commondir_file(const char *new_dir, const char *worktree_path,
> + struct strbuf *errbuf)
> +{
I still don't get why we need this. We should have access to both the
ref store of the worktree and the repository, and both of these are
handled in the same process. So there shouldn't be a need to propagate
the commondir via a file.
[snip]
> +/*
> + * Returns the list of ref storage items to backup/restore for a worktree.
> + * Main worktrees include packed-refs, linked worktrees do not.
> + */
> +static const char **get_ref_storage_items(int is_main_worktree)
> +{
> + static const char *main_items[] = {"refs", "logs", "reftable", "packed-refs", NULL};
> + static const char *linked_items[] = {"refs", "logs", "reftable", NULL};
> +
> + return is_main_worktree ? main_items : linked_items;
> +}
This here is what I was referring to as "too intimate with the
backends". This logic should be entirely self-contained in the backends,
and it already is for migrating the main worktree. We have
`ref_store_remove_on_disk()` to prune old data, and as the new ref
storage is written into a temporary directory we don't need to enumerate
its contents, but can instead move all of its entries into the gitdir
directly.
I guess the reason why you have this and all of the following functions
is to create the backups. But that logic must not live in "refs.c", but
it really should live in the backends. I could for example see a new
function that moves a ref store to a different directory.
An alternative would be to not do the backups at all. We only start
doing "destructive" operations when all the new backends have already
been created, so the last step would be to rename everything into place.
If this operation fails or gets cancelled we are left with a broken
repository, true. But the data is not lost, as we can in theory continue
to rename the remaining data into place.
So maybe that's good enough? The user would have to manually restore in
either of the cases, so we don't really gain that much by having a
backup in the first place.
> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8d7007f4aa..6be56274d3 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3210,11 +3210,13 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
> string_list_append(&refnames_to_check, update->refname);
>
> /*
> - * packed-refs don't support symbolic refs, root refs and reflogs,
> - * so we have to queue these references via the loose transaction.
> + * packed-refs don't support symbolic refs, root refs, per-worktree
> + * refs, and reflogs, so we have to queue these references via the
> + * loose transaction.
> */
> if (update->new_target ||
> is_root_ref(update->refname) ||
> + is_per_worktree_ref(update->refname) ||
> (update->flags & REF_LOG_ONLY)) {
> if (!loose_transaction) {
> loose_transaction = ref_store_transaction_begin(&refs->base, 0, err);
This is one of these changes where I think it would make sense to split
them out into separate commits so that they can be properly singled out
and explained.
Patrick |
|
User |
|
On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <ps@pks.im> writes:
>> `migrate`::
>> - Migrate ref store between different formats.
>> + Migrate ref store between different formats. Supports repositories
>> + with worktrees; migration must be run from the main worktree.
>
> It feels a bit weird to single our worktrees specifically. We don't say
> that the tool supports bare and non-bare repositories, either, so the
> only reason why we'd have the note about worktrees is historic legacy.
> How about this instead:
>
> Migrate ref storage between different formats. Must be run from the
> main worktree in case the repository uses worktrees.
Two thoughts.
* Would it be unacceptable if the primary repository and refstore
uses reftable backend, and a newly attached worktree to the
repository uses ref-files only for its per-worktree refs? If we
should allow it, then "if the ref store you are migrating is in a
repository with multiple worktrees, you must migrate from the
primary and migrate _all_ ref store for all worktrees at once,
into the same backend", which the design of this patch seems to
aim at, would contradict with it, no?
* If "you must do so from the primary worktree and we convert all
the worktrees attached to the same repository" is the only mode
of operation we support (which by the way I have no problem
with---the first bullet point above was asking question, not
suggesting change of design), then would it be easier for the
user to use if the command noticed that it is not in the primary
worktree and switched to it for the user, instead of complaining
and failing?
>> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>>
>> The ref format migration has several known limitations in its current form:
>>
>> -* It is not possible to migrate repositories that have worktrees.
>> +* Migration must be run from the main worktree.
>>
>
> I'd drop this bullet point entirely, as I don't really see this as a
> limitation anymore.
I agree that such a limitation should be lifted, but if we have to
say "you must do it this way, not that way", that is still a
limitation ;-). |
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
>
> >> `migrate`::
> >> - Migrate ref store between different formats.
> >> + Migrate ref store between different formats. Supports repositories
> >> + with worktrees; migration must be run from the main worktree.
> >
> > It feels a bit weird to single our worktrees specifically. We don't say
> > that the tool supports bare and non-bare repositories, either, so the
> > only reason why we'd have the note about worktrees is historic legacy.
> > How about this instead:
> >
> > Migrate ref storage between different formats. Must be run from the
> > main worktree in case the repository uses worktrees.
>
> Two thoughts.
>
> * Would it be unacceptable if the primary repository and refstore
> uses reftable backend, and a newly attached worktree to the
> repository uses ref-files only for its per-worktree refs? If we
> should allow it, then "if the ref store you are migrating is in a
> repository with multiple worktrees, you must migrate from the
> primary and migrate _all_ ref store for all worktrees at once,
> into the same backend", which the design of this patch seems to
> aim at, would contradict with it, no?
The problem we have here is backwards compatibility. Right now we assume
that `extensions.refStorage` applies to all worktrees, so if we wanted
to change it like you propose then we'd have to introduce a backwards
incompatible change.
I agree though that it would've been great if we would have said from
the beginning that the worktree-specific configuration is allowed to
override the ref storage format for a worktree. If so, we could easily
convert any of the worktrees (including the main one) by without having
any impact on all the other worktrees.
But we do not live in such a world right now, and getting there would
require some significant reworking of how we handle per-worktree
references. Unfortunate, but I also don't think there's a strong enough
reason to change this.
> * If "you must do so from the primary worktree and we convert all
> the worktrees attached to the same repository" is the only mode
> of operation we support (which by the way I have no problem
> with---the first bullet point above was asking question, not
> suggesting change of design), then would it be easier for the
> user to use if the command noticed that it is not in the primary
> worktree and switched to it for the user, instead of complaining
> and failing?
I'm not sure. The question is whether the user recognizes that migrating
references in the worktree would also migrate references in the main
repository. It might be surprising behaviour if we did that without
asking.
It might of course also be surprising if you do that from the main
working tree. But I think there's an argument to be made that it's at
least _less_ surprising.
> >> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
> >>
> >> The ref format migration has several known limitations in its current form:
> >>
> >> -* It is not possible to migrate repositories that have worktrees.
> >> +* Migration must be run from the main worktree.
> >>
> >
> > I'd drop this bullet point entirely, as I don't really see this as a
> > limitation anymore.
>
> I agree that such a limitation should be lifted, but if we have to
> say "you must do it this way, not that way", that is still a
> limitation ;-).
So with the above reasoning I'm not sure I'd call this a limitation.
It's rather a mechanism to protect users from unexpected consequences.
Patrick |
|
On the Git mailing list, "Kristoffer Haugsbakk" wrote (reply to this): On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
> On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> >> `migrate`::
>> >> - Migrate ref store between different formats.
>> >> + Migrate ref store between different formats. Supports repositories
>> >> + with worktrees; migration must be run from the main worktree.
>> >
>> > It feels a bit weird to single our worktrees specifically. We don't say
>> > that the tool supports bare and non-bare repositories, either, so the
>> > only reason why we'd have the note about worktrees is historic legacy.
>> > How about this instead:
>> >
>> > Migrate ref storage between different formats. Must be run from the
>> > main worktree in case the repository uses worktrees.
>>
>> Two thoughts.
>>
>> * Would it be unacceptable if the primary repository and refstore
>> uses reftable backend, and a newly attached worktree to the
>> repository uses ref-files only for its per-worktree refs? If we
>> should allow it, then "if the ref store you are migrating is in a
>> repository with multiple worktrees, you must migrate from the
>> primary and migrate _all_ ref store for all worktrees at once,
>> into the same backend", which the design of this patch seems to
>> aim at, would contradict with it, no?
>
> The problem we have here is backwards compatibility. Right now we assume
> that `extensions.refStorage` applies to all worktrees, so if we wanted
> to change it like you propose then we'd have to introduce a backwards
> incompatible change.
I don’t understand the motivation or use case for supporting different
backends for different worktrees. But Junio would have to explain that.
Maybe the motivation is this weird (from a user’s perspective) limi-
tation that you have to run a command from the main worktree? Okay,
that’s strange but you get the error and switch to wherever the main
worktree is (that the error message hopefully helpfully provides you
with) and run the command there. Then you forget that weird thing five
minutes later since this was a one-off command.
>
> I agree though that it would've been great if we would have said from
> the beginning that the worktree-specific configuration is allowed to
> override the ref storage format for a worktree. If so, we could easily
> convert any of the worktrees (including the main one) by without having
> any impact on all the other worktrees.
As a user I don’t understand why that is a great thing to have.
>
> But we do not live in such a world right now, and getting there would
> require some significant reworking of how we handle per-worktree
> references. Unfortunate, but I also don't think there's a strong enough
> reason to change this.
>
>> * If "you must do so from the primary worktree and we convert all
>> the worktrees attached to the same repository" is the only mode
>> of operation we support (which by the way I have no problem
>> with---the first bullet point above was asking question, not
>> suggesting change of design), then would it be easier for the
>> user to use if the command noticed that it is not in the primary
>> worktree and switched to it for the user, instead of complaining
>> and failing?
>
> I'm not sure. The question is whether the user recognizes that migrating
> references in the worktree would also migrate references in the main
> repository. It might be surprising behaviour if we did that without
> asking.
On the contrary, as a user I think it mattering what worktree I run this
command from sounds very weird. (But again I can tolerate it requiring
me to run it from the main worktree if there are technical difficulties/
limitations. But using different backends for different
worktrees is very weird, again.)
If I run `git gc` I don’t want it to do different things based on what
worktree I am. I want to operate on the repository, and the repository
is the same no matter what worktree I am in. The same principle applies
to this command in my mind.
Is the “main worktree” even something that makes sense from the user’s
perspective? It seems like it’s just a side-effect of the fact that the
repository itself has to live somewhere. Imagine I have one main
worktree and two linked ones. I delete the main worktree. Imagine that
it works because the repository itself is moved to one of the linked
worktrees (arbitrary). Which then becomes the main worktree. But the
user does not have to care as long the user does not poke inside the
`.git` directory. Which the user should not have to do (there should be
commands to answer whatever `.git`-poking motivations).
I am of course not suggesting such a change. But the point is that the
“main worktree” is not such a useful end-user concept.
Sure, I happen to use a “main worktree” in the informal sense that I
often have the original path where I cloned or created the repository
and I have the other ones in satellite locations with more
pointed/topical names (e.g. `git-mine` is the basename of the Git
worktree that I use to `make install`). But I never ever consult `git
worktree list` to remind me what the main worktree is.
Okay. Let’s say I get tripped up by the gitlink or whatever it is kind
of file that worktrees use for `.git`. Because I really want to poke at
the `.git` directory. Then I think “I need to find the main worktree”
because it happens to have the repository and the link to that directory
could not be implemented using a symlink, maybe because of Windows
filesystems, I don’t know. Again a technical limitation to my mind.
No worktree is special except because of technical limitations.
(The “main worktree” even becomes a technically contradictory concept in
the case when the “main worktree” is bare. And that is a popular
practice for some reason.)
And I wonder how many worktree users even actively think about the fact
that per-worktree refs exist. It’s the kind of thing that you have to
logically conclude *has* to be the case:
1. `HEAD` is a ref and you need that for a worktree
2. You can have a bisect session in a worktree and that uses refs under
the hood
But:
1. Conceptually I never really think about `HEAD` as a ref; “what
branch/commit am I on” is what I care about. It’s the only builtin
symref that I know of (or ref or symref depending on...). Not a usual
ref at all.
2. I use git-bisect(1) to find a commit given a criteria. Ones I have
it I note the commit. I don’t care that refs are used to store the
bisect state while a session is active.
According to gitglossary(7) these are currently the only per-worktree
refs. I do not know if you are allowed to use the `refs/worktree/`
hierarchy to create refs beyond that.
>
> It might of course also be surprising if you do that from the main
> working tree. But I think there's an argument to be made that it's at
> least _less_ surprising.
>
>> >> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>> >>
>> >> The ref format migration has several known limitations in its current form:
>> >>
>> >> -* It is not possible to migrate repositories that have worktrees.
>> >> +* Migration must be run from the main worktree.
>> >>
>> >
>> > I'd drop this bullet point entirely, as I don't really see this as a
>> > limitation anymore.
>>
>> I agree that such a limitation should be lifted, but if we have to
>> say "you must do it this way, not that way", that is still a
>> limitation ;-).
>
> So with the above reasoning I'm not sure I'd call this a limitation.
> It's rather a mechanism to protect users from unexpected consequences. |
|
User |
|
On the Git mailing list, Junio C Hamano wrote (reply to this): Patrick Steinhardt <ps@pks.im> writes:
> The problem we have here is backwards compatibility. Right now we assume
> that `extensions.refStorage` applies to all worktrees, so if we wanted
> to change it like you propose then we'd have to introduce a backwards
> incompatible change.
That settles it. If we have long declared that a set of worktrees
attached to a repository share the same backend, then we do not have
to worry about overlaying refs stored in a different backend on top
of the base set of refs at all. That simplifies things a lot, I
would imagine.
> So with the above reasoning I'm not sure I'd call this a limitation.
> It's rather a mechanism to protect users from unexpected consequences.
The need for that mechanism would imply that it may not be clear to
the users that worktrees of the same repository must use the same
ref backend. Some education is needed, and erroring this operation
out may be one of the ways to give that, perhaps.
Thanks. |
|
On the Git mailing list, Ben Knoble wrote (reply to this): > Le 29 oct. 2025 à 07:37, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
>
> On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
>>> On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
>>> Patrick Steinhardt <ps@pks.im> writes:
[snip]
>>> * If "you must do so from the primary worktree and we convert all
>>> the worktrees attached to the same repository" is the only mode
>>> of operation we support (which by the way I have no problem
>>> with---the first bullet point above was asking question, not
>>> suggesting change of design), then would it be easier for the
>>> user to use if the command noticed that it is not in the primary
>>> worktree and switched to it for the user, instead of complaining
>>> and failing?
>>
>> I'm not sure. The question is whether the user recognizes that migrating
>> references in the worktree would also migrate references in the main
>> repository. It might be surprising behaviour if we did that without
>> asking.
>
> On the contrary, as a user I think it mattering what worktree I run this
> command from sounds very weird. (But again I can tolerate it requiring
> me to run it from the main worktree if there are technical difficulties/
> limitations. But using different backends for different
> worktrees is very weird, again.)
[snip]
The fewer concepts we ask a user to manage at a time, likely the better. In this case, “migrate the refs” should probably just work. While things are experimental, rough edges are more tolerable of course, but as we are lifting limitations towards making things official I think polishing such edges is a good idea.
In sum, it can be done later, but I think automatically changing the process directory to the main worktree and carrying on is fine. The curious folks would even see that under the TRACE output ;) |
|
User |
|
On the Git mailing list, Patrick Steinhardt wrote (reply to this): On Wed, Oct 29, 2025 at 12:33:15PM +0100, Kristoffer Haugsbakk wrote:
> On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
> > On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
> >> Patrick Steinhardt <ps@pks.im> writes:
> >>
> >> >> `migrate`::
> >> >> - Migrate ref store between different formats.
> >> >> + Migrate ref store between different formats. Supports repositories
> >> >> + with worktrees; migration must be run from the main worktree.
> >> >
> >> > It feels a bit weird to single our worktrees specifically. We don't say
> >> > that the tool supports bare and non-bare repositories, either, so the
> >> > only reason why we'd have the note about worktrees is historic legacy.
> >> > How about this instead:
> >> >
> >> > Migrate ref storage between different formats. Must be run from the
> >> > main worktree in case the repository uses worktrees.
> >>
> >> Two thoughts.
> >>
> >> * Would it be unacceptable if the primary repository and refstore
> >> uses reftable backend, and a newly attached worktree to the
> >> repository uses ref-files only for its per-worktree refs? If we
> >> should allow it, then "if the ref store you are migrating is in a
> >> repository with multiple worktrees, you must migrate from the
> >> primary and migrate _all_ ref store for all worktrees at once,
> >> into the same backend", which the design of this patch seems to
> >> aim at, would contradict with it, no?
> >
> > The problem we have here is backwards compatibility. Right now we assume
> > that `extensions.refStorage` applies to all worktrees, so if we wanted
> > to change it like you propose then we'd have to introduce a backwards
> > incompatible change.
>
> I don’t understand the motivation or use case for supporting different
> backends for different worktrees. But Junio would have to explain that.
>
> Maybe the motivation is this weird (from a user’s perspective) limi-
> tation that you have to run a command from the main worktree? Okay,
> that’s strange but you get the error and switch to wherever the main
> worktree is (that the error message hopefully helpfully provides you
> with) and run the command there. Then you forget that weird thing five
> minutes later since this was a one-off command.
>
> >
> > I agree though that it would've been great if we would have said from
> > the beginning that the worktree-specific configuration is allowed to
> > override the ref storage format for a worktree. If so, we could easily
> > convert any of the worktrees (including the main one) by without having
> > any impact on all the other worktrees.
>
> As a user I don’t understand why that is a great thing to have.
I am commenting more from the developer side here. In the best case the
user wouldn't ever care what ref storage format they use. We simply pick
the best format available and the user lives happily ever after.
But from a developer standpoint it matters. If we had per-worktree ref
formats we would for example be able to make the ref migration code a
lot more robust, as we could now migrate worktrees one by one. In the
current situation we basically have to migrate all worktrees at once,
and that significantly increases the risk of the migration going wrong
at any point in time.
> >
> > But we do not live in such a world right now, and getting there would
> > require some significant reworking of how we handle per-worktree
> > references. Unfortunate, but I also don't think there's a strong enough
> > reason to change this.
> >
> >> * If "you must do so from the primary worktree and we convert all
> >> the worktrees attached to the same repository" is the only mode
> >> of operation we support (which by the way I have no problem
> >> with---the first bullet point above was asking question, not
> >> suggesting change of design), then would it be easier for the
> >> user to use if the command noticed that it is not in the primary
> >> worktree and switched to it for the user, instead of complaining
> >> and failing?
> >
> > I'm not sure. The question is whether the user recognizes that migrating
> > references in the worktree would also migrate references in the main
> > repository. It might be surprising behaviour if we did that without
> > asking.
>
> On the contrary, as a user I think it mattering what worktree I run this
> command from sounds very weird. (But again I can tolerate it requiring
> me to run it from the main worktree if there are technical difficulties/
> limitations. But using different backends for different
> worktrees is very weird, again.)
>
> If I run `git gc` I don’t want it to do different things based on what
> worktree I am. I want to operate on the repository, and the repository
> is the same no matter what worktree I am in. The same principle applies
> to this command in my mind.
It does though :) Only very slightly so, but for example maintenance of
references is dependent on the worktree you are in. We don't maintain
references from other worktrees. So it's not really a new thing that I'm
proposing here.
In any case, I'm happy to change my stance if the majority of folks
thinks that migrating the whole repository from secondary worktrees is
fine. I mostly wanted to avoid that operations that the user perform
have a wider blast radius than they understood, but if everyone agrees
that this is a non-issue then I don't mind much. It's only going to make
the implementation simpler.
Patrick |
cc: Patrick Steinhardt ps@pks.im
cc: "Kristoffer Haugsbakk" kristofferhaugsbakk@fastmail.com
cc: Ben Knoble ben.knoble@gmail.com