Skip to content

Conversation

@sambostock
Copy link

@sambostock sambostock commented Oct 21, 2025

cc: Patrick Steinhardt ps@pks.im
cc: "Kristoffer Haugsbakk" kristofferhaugsbakk@fastmail.com
cc: Ben Knoble ben.knoble@gmail.com

@gitgitgadget-git
Copy link

Welcome to GitGitGadget

Hi @sambostock, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about, and
  • the commit messages' body should be describing the "why?" of the change.
  • Finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

@gitgitgadget-git
Copy link

There are issues in commit 8f1562d:
refs: support migration with worktrees
Lines in the body of the commit messages should be wrapped between 60 and 76 characters.
Indented lines, and lines without whitespace, are exempt

@sambostock sambostock force-pushed the refs-migrate-worktree-support branch from 8f1562d to 29deaed Compare October 21, 2025 18:36
@olga-mcbfe
Copy link
Contributor

/allow

@gitgitgadget-git
Copy link

User sambostock is now allowed to use GitGitGadget.

WARNING: sambostock has no public email address set on GitHub; GitGitGadget needs an email address to Cc: you on your contribution, so that you receive any feedback on the Git mailing list. Go to https://github.com/settings/profile to make your preferred email public to let GitGitGadget know which email address to use.

@sambostock sambostock force-pushed the refs-migrate-worktree-support branch 3 times, most recently from 3c42f80 to 5cd905f Compare October 26, 2025 17:33
@sambostock
Copy link
Author

/preview

@gitgitgadget-git
Copy link

Preview email sent as pull.2077.git.git.1761586833769.gitgitgadget@gmail.com

Remove the worktree limitation from `git refs migrate` by implementing
migration support for repositories with linked worktrees.

Previously, attempting to migrate a repository with worktrees would fail
with "migrating repositories with worktrees is not supported yet". This
limitation existed because each worktree has its own ref storage that
needed to be migrated separately.

Migration now uses a multi-phase approach to safely handle multiple
worktrees:

1. Phase 1: Iterate through all worktrees and create temporary new ref
   storage for each in a staging directory.

2. Phase 2: For each worktree, backup the existing ref storage, then
   move the new storage into place.

3. Phase 3: Update the repository format config, clear cached ref stores,
   and delete all backups. On failure, restore from backups and report
   where the migrated refs can be found for manual recovery.

This approach ensures that if migration fails partway through, the
repository can be restored to its original state.

Key implementation details:

- For files backend: Create a commondir file in temp directories for
  linked worktrees so the files backend knows where the common git
  directory is located.

- For linked worktrees: Use non-INITIAL transactions to avoid creating
  packed-refs files (linked worktrees should never have packed-refs).

- Filter refs during iteration: Linked worktrees only migrate their
  per-worktree refs (refs/bisect/*, refs/rewritten/*, refs/worktree/*).
  Shared refs are migrated once in the main worktree.

- Write per-worktree refs as loose files: The files backend's
  transaction_finish_initial() optimization writes most refs to
  packed-refs, but per-worktree refs must be stored as loose files
  to maintain proper worktree isolation.

- Backup root refs: During Phase 2, backup all root refs (HEAD,
  ORIG_HEAD, etc.) by iterating files in the git directory and using
  is_root_ref() to identify them. This ensures safe rollback if
  migration fails.

Tests are updated to expect migration with worktrees to succeed, and
new tests verify:
- Basic worktree migration in both directions (files ↔ reftable)
- Migration with multiple worktrees
- Dry-run mode with worktrees
- Physical separation of per-worktree refs
- Bare repository with worktrees

Documentation is updated to reflect the worktree support and note that
migration must be run from the main worktree.

As the author's familiarity with git internals and C is limited, this
change was made with the assistance of Claude Code. However, the author
has carefully reviewed and iterated on the work to ensure quality to the
best of their ability.

Signed-off-by: Sam Bostock <sam.bostock@shopify.com>
Co-Authored-By: Claude <noreply@anthropic.com>
@sambostock sambostock force-pushed the refs-migrate-worktree-support branch from 5cd905f to 29ba72c Compare October 27, 2025 17:41
@sambostock
Copy link
Author

The CI failure here appears to be unrelated, and the same as what's failing on current master:

win+Meson build

...
[691/906] Compiling C object git.exe.p/builtin_replace.c.obj
[692/906] Compiling C object git.exe.p/builtin_replay.c.obj
[693/906] Compiling C object git.exe.p/builtin_rerere.c.obj
[694/906] Compiling C object git.exe.p/builtin_repo.c.obj
[695/906] Generating src/git_rs with a custom command
FAILED: [code=1] src/libgitcore.a 
"C:\Program Files\Git\bin\sh.exe" "D:/a/git/git/src/cargo-meson.sh" "D:/a/git/git" "D:/a/git/git/build/src" "--release"
cp: cannot stat 'D:/a/git/git/build/src/release/libgitcore.a': No such file or directory
[696/906] Compiling C object git.exe.p/builtin_reset.c.obj
[697/906] Compiling C object git.exe.p/builtin_rev-list.c.obj
[698/906] Compiling C object git.exe.p/builtin_revert.c.obj
[699/906] Compiling C object git.exe.p/builtin_repack.c.obj
[700/906] Compiling C object git.exe.p/builtin_rev-parse.c.obj
ninja: build stopped: subcommand failed.
Activating VS 17.14.18 (October 2025)
INFO: automatically activated MSVC compiler environment
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts\ninja.EXE -C D:/a/git/git/build
Error: Process completed with exit code 1.

@sambostock
Copy link
Author

/submit

@gitgitgadget-git
Copy link

Submitted as pull.2077.git.git.1761589580028.gitgitgadget@gmail.com

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-2077/sambostock/refs-migrate-worktree-support-v1

To fetch this version to local tag pr-git-2077/sambostock/refs-migrate-worktree-support-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-2077/sambostock/refs-migrate-worktree-support-v1

@dscho
Copy link
Member

dscho commented Oct 27, 2025

The CI failure here appears to be unrelated, and the same as what's failing on current master:

win+Meson build

...
[691/906] Compiling C object git.exe.p/builtin_replace.c.obj
[692/906] Compiling C object git.exe.p/builtin_replay.c.obj
[693/906] Compiling C object git.exe.p/builtin_rerere.c.obj
[694/906] Compiling C object git.exe.p/builtin_repo.c.obj
[695/906] Generating src/git_rs with a custom command
FAILED: [code=1] src/libgitcore.a 
"C:\Program Files\Git\bin\sh.exe" "D:/a/git/git/src/cargo-meson.sh" "D:/a/git/git" "D:/a/git/git/build/src" "--release"
cp: cannot stat 'D:/a/git/git/build/src/release/libgitcore.a': No such file or directory
[696/906] Compiling C object git.exe.p/builtin_reset.c.obj
[697/906] Compiling C object git.exe.p/builtin_rev-list.c.obj
[698/906] Compiling C object git.exe.p/builtin_revert.c.obj
[699/906] Compiling C object git.exe.p/builtin_repack.c.obj
[700/906] Compiling C object git.exe.p/builtin_rev-parse.c.obj
ninja: build stopped: subcommand failed.
Activating VS 17.14.18 (October 2025)
INFO: automatically activated MSVC compiler environment
INFO: autodetecting backend as ninja
INFO: calculating backend command to run: C:\hostedtoolcache\windows\Python\3.9.13\x64\Scripts\ninja.EXE -C D:/a/git/git/build
Error: Process completed with exit code 1.

Indeed. I fail to understand how the master branch is allowed to fail its own CI builds for several weeks. The fix seems not exactly to be fast-tracked, either (even if it was finally marked to be merged to master 4 days ago, as of time of writing).

@gitgitgadget-git
Copy link

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Mon, Oct 27, 2025 at 06:26:20PM +0000, Sam Bostock via GitGitGadget wrote:
> From: Sam Bostock <sam.bostock@shopify.com>
> 
> Remove the worktree limitation from `git refs migrate` by implementing
> migration support for repositories with linked worktrees.
> 
> Previously, attempting to migrate a repository with worktrees would fail
> with "migrating repositories with worktrees is not supported yet". This
> limitation existed because each worktree has its own ref storage that
> needed to be migrated separately.
> 
> Migration now uses a multi-phase approach to safely handle multiple
> worktrees:
> 
> 1. Phase 1: Iterate through all worktrees and create temporary new ref
>    storage for each in a staging directory.
> 
> 2. Phase 2: For each worktree, backup the existing ref storage, then
>    move the new storage into place.
> 
> 3. Phase 3: Update the repository format config, clear cached ref stores,
>    and delete all backups. On failure, restore from backups and report
>    where the migrated refs can be found for manual recovery.

Makes sense.

> This approach ensures that if migration fails partway through, the
> repository can be restored to its original state.
> 
> Key implementation details:
> 
> - For files backend: Create a commondir file in temp directories for
>   linked worktrees so the files backend knows where the common git
>   directory is located.

Hm, okay. Not yet sure why we need this, but let's read on.

> - For linked worktrees: Use non-INITIAL transactions to avoid creating
>   packed-refs files (linked worktrees should never have packed-refs).

Hm, this is unfortunate. The reason why we use initial transactions is
twofold:

  - First, we want to avoid creating loose refs, only. This is indeed
    something we must not do with worktrees, as you point out.

  - But second, we also want to skip pointless checks like the conflict
    checks. This results in quite a saving.

Would've been great to retain the second property, but I guess as long
as we only do this for worktrees it's okayish and something we can worry
about at a later point in time. Better to migrate the refs slowish than
not at all.

> - Filter refs during iteration: Linked worktrees only migrate their
>   per-worktree refs (refs/bisect/*, refs/rewritten/*, refs/worktree/*).
>   Shared refs are migrated once in the main worktree.

Makes sense.

> - Write per-worktree refs as loose files: The files backend's
>   transaction_finish_initial() optimization writes most refs to
>   packed-refs, but per-worktree refs must be stored as loose files
>   to maintain proper worktree isolation.

Isn't this roughly the same as the second bullet point? Feels like they
should be merged together.

> diff --git a/Documentation/git-refs.adoc b/Documentation/git-refs.adoc
> index fa33680cc7..f6a3bf4f03 100644
> --- a/Documentation/git-refs.adoc
> +++ b/Documentation/git-refs.adoc
> @@ -30,7 +30,8 @@ COMMANDS
>  --------
>  
>  `migrate`::
> -	Migrate ref store between different formats.
> +	Migrate ref store between different formats. Supports repositories
> +	with worktrees; migration must be run from the main worktree.

It feels a bit weird to single our worktrees specifically. We don't say
that the tool supports bare and non-bare repositories, either, so the
only reason why we'd have the note about worktrees is historic legacy.
How about this instead:

    Migrate ref storage between different formats. Must be run from the
    main worktree in case the repository uses worktrees.

> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>  
>  The ref format migration has several known limitations in its current form:
>  
> -* It is not possible to migrate repositories that have worktrees.
> +* Migration must be run from the main worktree.
>  

I'd drop this bullet point entirely, as I don't really see this as a
limitation anymore.

> diff --git a/refs.c b/refs.c
> index 965381367e..0834329e5a 100644
> --- a/refs.c
> +++ b/refs.c

I'm sorry, but this is _extremely_ hard to review as we're changing
almost all of the implementation at once with random changes left and
right. Furthermore, I feel like we're getting way to intimate with the
different backends here -- that shouldn't be the case though, the logic
that is specific to the backends should really live in the backends
themselves.

The way I'd expect a series like this to look like is to have commits
that:

  1. Do preparatory changes, e.g. teach the files backend to not create
     a packed-refs file for worktrees during migration.

  2. Pull out the logic to migrate a single reference backend that we
     already have into a separate function that can be called in a loop.
     The end result should be a function that accepts the old refdb as
     input and that returns the new refdb.

  3. Implement the logic that calls the function we introduced in (2)
     for each worktree. This can be done by iterating through all the
     worktrees, calling `get_worktree_ref_store()` on it and then
     passing the refdb to the new function.

> @@ -2974,8 +2976,296 @@ struct migration_data {
[snip]
> +/*
> + * Create a commondir file in the temporary migration directory for a linked
> + * worktree. The files backend needs this to locate the common git directory.
> + * Returns 0 on success, -1 on failure.
> + */
> +static int create_commondir_file(const char *new_dir, const char *worktree_path,
> +				  struct strbuf *errbuf)
> +{

I still don't get why we need this. We should have access to both the
ref store of the worktree and the repository, and both of these are
handled in the same process. So there shouldn't be a need to propagate
the commondir via a file.

[snip]
> +/*
> + * Returns the list of ref storage items to backup/restore for a worktree.
> + * Main worktrees include packed-refs, linked worktrees do not.
> + */
> +static const char **get_ref_storage_items(int is_main_worktree)
> +{
> +	static const char *main_items[] = {"refs", "logs", "reftable", "packed-refs", NULL};
> +	static const char *linked_items[] = {"refs", "logs", "reftable", NULL};
> +
> +	return is_main_worktree ? main_items : linked_items;
> +}

This here is what I was referring to as "too intimate with the
backends". This logic should be entirely self-contained in the backends,
and it already is for migrating the main worktree. We have
`ref_store_remove_on_disk()` to prune old data, and as the new ref
storage is written into a temporary directory we don't need to enumerate
its contents, but can instead move all of its entries into the gitdir
directly.

I guess the reason why you have this and all of the following functions
is to create the backups. But that logic must not live in "refs.c", but
it really should live in the backends. I could for example see a new
function that moves a ref store to a different directory.

An alternative would be to not do the backups at all. We only start
doing "destructive" operations when all the new backends have already
been created, so the last step would be to rename everything into place.
If this operation fails or gets cancelled we are left with a broken
repository, true. But the data is not lost, as we can in theory continue
to rename the remaining data into place.

So maybe that's good enough? The user would have to manually restore in
either of the cases, so we don't really gain that much by having a
backup in the first place.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 8d7007f4aa..6be56274d3 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -3210,11 +3210,13 @@ static int files_transaction_finish_initial(struct files_ref_store *refs,
>  		string_list_append(&refnames_to_check, update->refname);
>  
>  		/*
> -		 * packed-refs don't support symbolic refs, root refs and reflogs,
> -		 * so we have to queue these references via the loose transaction.
> +		 * packed-refs don't support symbolic refs, root refs, per-worktree
> +		 * refs, and reflogs, so we have to queue these references via the
> +		 * loose transaction.
>  		 */
>  		if (update->new_target ||
>  		    is_root_ref(update->refname) ||
> +		    is_per_worktree_ref(update->refname) ||
>  		    (update->flags & REF_LOG_ONLY)) {
>  			if (!loose_transaction) {
>  				loose_transaction = ref_store_transaction_begin(&refs->base, 0, err);

This is one of these changes where I think it would make sense to split
them out into separate commits so that they can be properly singled out
and explained.

Patrick

@gitgitgadget-git
Copy link

User Patrick Steinhardt <ps@pks.im> has been added to the cc: list.

@gitgitgadget-git
Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Patrick Steinhardt <ps@pks.im> writes:

>>  `migrate`::
>> -	Migrate ref store between different formats.
>> +	Migrate ref store between different formats. Supports repositories
>> +	with worktrees; migration must be run from the main worktree.
>
> It feels a bit weird to single our worktrees specifically. We don't say
> that the tool supports bare and non-bare repositories, either, so the
> only reason why we'd have the note about worktrees is historic legacy.
> How about this instead:
>
>     Migrate ref storage between different formats. Must be run from the
>     main worktree in case the repository uses worktrees.

Two thoughts.

 * Would it be unacceptable if the primary repository and refstore
   uses reftable backend, and a newly attached worktree to the
   repository uses ref-files only for its per-worktree refs?  If we
   should allow it, then "if the ref store you are migrating is in a
   repository with multiple worktrees, you must migrate from the
   primary and migrate _all_ ref store for all worktrees at once,
   into the same backend", which the design of this patch seems to
   aim at, would contradict with it, no?

 * If "you must do so from the primary worktree and we convert all
   the worktrees attached to the same repository" is the only mode
   of operation we support (which by the way I have no problem
   with---the first bullet point above was asking question, not
   suggesting change of design), then would it be easier for the
   user to use if the command noticed that it is not in the primary
   worktree and switched to it for the user, instead of complaining
   and failing?

>> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>>  
>>  The ref format migration has several known limitations in its current form:
>>  
>> -* It is not possible to migrate repositories that have worktrees.
>> +* Migration must be run from the main worktree.
>>  
>
> I'd drop this bullet point entirely, as I don't really see this as a
> limitation anymore.

I agree that such a limitation should be lifted, but if we have to
say "you must do it this way, not that way", that is still a
limitation ;-).

@gitgitgadget-git
Copy link

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >>  `migrate`::
> >> -	Migrate ref store between different formats.
> >> +	Migrate ref store between different formats. Supports repositories
> >> +	with worktrees; migration must be run from the main worktree.
> >
> > It feels a bit weird to single our worktrees specifically. We don't say
> > that the tool supports bare and non-bare repositories, either, so the
> > only reason why we'd have the note about worktrees is historic legacy.
> > How about this instead:
> >
> >     Migrate ref storage between different formats. Must be run from the
> >     main worktree in case the repository uses worktrees.
> 
> Two thoughts.
> 
>  * Would it be unacceptable if the primary repository and refstore
>    uses reftable backend, and a newly attached worktree to the
>    repository uses ref-files only for its per-worktree refs?  If we
>    should allow it, then "if the ref store you are migrating is in a
>    repository with multiple worktrees, you must migrate from the
>    primary and migrate _all_ ref store for all worktrees at once,
>    into the same backend", which the design of this patch seems to
>    aim at, would contradict with it, no?

The problem we have here is backwards compatibility. Right now we assume
that `extensions.refStorage` applies to all worktrees, so if we wanted
to change it like you propose then we'd have to introduce a backwards
incompatible change.

I agree though that it would've been great if we would have said from
the beginning that the worktree-specific configuration is allowed to
override the ref storage format for a worktree. If so, we could easily
convert any of the worktrees (including the main one) by without having
any impact on all the other worktrees.

But we do not live in such a world right now, and getting there would
require some significant reworking of how we handle per-worktree
references. Unfortunate, but I also don't think there's a strong enough
reason to change this.

>  * If "you must do so from the primary worktree and we convert all
>    the worktrees attached to the same repository" is the only mode
>    of operation we support (which by the way I have no problem
>    with---the first bullet point above was asking question, not
>    suggesting change of design), then would it be easier for the
>    user to use if the command noticed that it is not in the primary
>    worktree and switched to it for the user, instead of complaining
>    and failing?

I'm not sure. The question is whether the user recognizes that migrating
references in the worktree would also migrate references in the main
repository. It might be surprising behaviour if we did that without
asking.

It might of course also be surprising if you do that from the main
working tree. But I think there's an argument to be made that it's at
least _less_ surprising.

> >> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
> >>  
> >>  The ref format migration has several known limitations in its current form:
> >>  
> >> -* It is not possible to migrate repositories that have worktrees.
> >> +* Migration must be run from the main worktree.
> >>  
> >
> > I'd drop this bullet point entirely, as I don't really see this as a
> > limitation anymore.
> 
> I agree that such a limitation should be lifted, but if we have to
> say "you must do it this way, not that way", that is still a
> limitation ;-).

So with the above reasoning I'm not sure I'd call this a limitation.
It's rather a mechanism to protect users from unexpected consequences.

Patrick

@gitgitgadget-git
Copy link

On the Git mailing list, "Kristoffer Haugsbakk" wrote (reply to this):

On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
> On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
>> Patrick Steinhardt <ps@pks.im> writes:
>>
>> >>  `migrate`::
>> >> -	Migrate ref store between different formats.
>> >> +	Migrate ref store between different formats. Supports repositories
>> >> +	with worktrees; migration must be run from the main worktree.
>> >
>> > It feels a bit weird to single our worktrees specifically. We don't say
>> > that the tool supports bare and non-bare repositories, either, so the
>> > only reason why we'd have the note about worktrees is historic legacy.
>> > How about this instead:
>> >
>> >     Migrate ref storage between different formats. Must be run from the
>> >     main worktree in case the repository uses worktrees.
>>
>> Two thoughts.
>>
>>  * Would it be unacceptable if the primary repository and refstore
>>    uses reftable backend, and a newly attached worktree to the
>>    repository uses ref-files only for its per-worktree refs?  If we
>>    should allow it, then "if the ref store you are migrating is in a
>>    repository with multiple worktrees, you must migrate from the
>>    primary and migrate _all_ ref store for all worktrees at once,
>>    into the same backend", which the design of this patch seems to
>>    aim at, would contradict with it, no?
>
> The problem we have here is backwards compatibility. Right now we assume
> that `extensions.refStorage` applies to all worktrees, so if we wanted
> to change it like you propose then we'd have to introduce a backwards
> incompatible change.

I don’t understand the motivation or use case for supporting different
backends for different worktrees. But Junio would have to explain that.

Maybe the motivation is this weird (from a user’s perspective) limi-
tation that you have to run a command from the main worktree?  Okay,
that’s strange but you get the error and switch to wherever the main
worktree is (that the error message hopefully helpfully provides you
with) and run the command there.  Then you forget that weird thing five
minutes later since this was a one-off command.

>
> I agree though that it would've been great if we would have said from
> the beginning that the worktree-specific configuration is allowed to
> override the ref storage format for a worktree. If so, we could easily
> convert any of the worktrees (including the main one) by without having
> any impact on all the other worktrees.

As a user I don’t understand why that is a great thing to have.

>
> But we do not live in such a world right now, and getting there would
> require some significant reworking of how we handle per-worktree
> references. Unfortunate, but I also don't think there's a strong enough
> reason to change this.
>
>>  * If "you must do so from the primary worktree and we convert all
>>    the worktrees attached to the same repository" is the only mode
>>    of operation we support (which by the way I have no problem
>>    with---the first bullet point above was asking question, not
>>    suggesting change of design), then would it be easier for the
>>    user to use if the command noticed that it is not in the primary
>>    worktree and switched to it for the user, instead of complaining
>>    and failing?
>
> I'm not sure. The question is whether the user recognizes that migrating
> references in the worktree would also migrate references in the main
> repository. It might be surprising behaviour if we did that without
> asking.

On the contrary, as a user I think it mattering what worktree I run this
command from sounds very weird.  (But again I can tolerate it requiring
me to run it from the main worktree if there are technical difficulties/
limitations.  But using different backends for different
worktrees is very weird, again.)

If I run `git gc` I don’t want it to do different things based on what
worktree I am. I want to operate on the repository, and the repository
is the same no matter what worktree I am in.  The same principle applies
to this command in my mind.

Is the “main worktree” even something that makes sense from the user’s
perspective?  It seems like it’s just a side-effect of the fact that the
repository itself has to live somewhere.  Imagine I have one main
worktree and two linked ones.  I delete the main worktree.  Imagine that
it works because the repository itself is moved to one of the linked
worktrees (arbitrary).  Which then becomes the main worktree.  But the
user does not have to care as long the user does not poke inside the
`.git` directory.  Which the user should not have to do (there should be
commands to answer whatever `.git`-poking motivations).

I am of course not suggesting such a change.  But the point is that the
“main worktree” is not such a useful end-user concept.

Sure, I happen to use a “main worktree” in the informal sense that I
often have the original path where I cloned or created the repository
and I have the other ones in satellite locations with more
pointed/topical names (e.g. `git-mine` is the basename of the Git
worktree that I use to `make install`).  But I never ever consult `git
worktree list` to remind me what the main worktree is.

Okay.  Let’s say I get tripped up by the gitlink or whatever it is kind
of file that worktrees use for `.git`.  Because I really want to poke at
the `.git` directory.  Then I think “I need to find the main worktree”
because it happens to have the repository and the link to that directory
could not be implemented using a symlink, maybe because of Windows
filesystems, I don’t know.  Again a technical limitation to my mind.
No worktree is special except because of technical limitations.

(The “main worktree” even becomes a technically contradictory concept in
the case when the “main worktree” is bare. And that is a popular
practice for some reason.)

And I wonder how many worktree users even actively think about the fact
that per-worktree refs exist.  It’s the kind of thing that you have to
logically conclude *has* to be the case:

1. `HEAD` is a ref and you need that for a worktree
2. You can have a bisect session in a worktree and that uses refs under
   the hood

But:

1. Conceptually I never really think about `HEAD` as a ref; “what
   branch/commit am I on” is what I care about. It’s the only builtin
   symref that I know of (or ref or symref depending on...). Not a usual
   ref at all.
2. I use git-bisect(1) to find a commit given a criteria.  Ones I have
   it I note the commit.  I don’t care that refs are used to store the
   bisect state while a session is active.

According to gitglossary(7) these are currently the only per-worktree
refs.  I do not know if you are allowed to use the `refs/worktree/`
hierarchy to create refs beyond that.

>
> It might of course also be surprising if you do that from the main
> working tree. But I think there's an argument to be made that it's at
> least _less_ surprising.
>
>> >> @@ -95,7 +96,7 @@ KNOWN LIMITATIONS
>> >>
>> >>  The ref format migration has several known limitations in its current form:
>> >>
>> >> -* It is not possible to migrate repositories that have worktrees.
>> >> +* Migration must be run from the main worktree.
>> >>
>> >
>> > I'd drop this bullet point entirely, as I don't really see this as a
>> > limitation anymore.
>>
>> I agree that such a limitation should be lifted, but if we have to
>> say "you must do it this way, not that way", that is still a
>> limitation ;-).
>
> So with the above reasoning I'm not sure I'd call this a limitation.
> It's rather a mechanism to protect users from unexpected consequences.

@gitgitgadget-git
Copy link

User "Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> has been added to the cc: list.

@gitgitgadget-git
Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

Patrick Steinhardt <ps@pks.im> writes:

> The problem we have here is backwards compatibility. Right now we assume
> that `extensions.refStorage` applies to all worktrees, so if we wanted
> to change it like you propose then we'd have to introduce a backwards
> incompatible change.

That settles it.  If we have long declared that a set of worktrees
attached to a repository share the same backend, then we do not have
to worry about overlaying refs stored in a different backend on top
of the base set of refs at all.  That simplifies things a lot, I
would imagine.

> So with the above reasoning I'm not sure I'd call this a limitation.
> It's rather a mechanism to protect users from unexpected consequences.

The need for that mechanism would imply that it may not be clear to
the users that worktrees of the same repository must use the same
ref backend.  Some education is needed, and erroring this operation
out may be one of the ways to give that, perhaps.

Thanks.

@gitgitgadget-git
Copy link

On the Git mailing list, Ben Knoble wrote (reply to this):

> Le 29 oct. 2025 à 07:37, Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com> a écrit :
> 
> On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
>>> On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
>>> Patrick Steinhardt <ps@pks.im> writes:

[snip]

>>> * If "you must do so from the primary worktree and we convert all
>>>   the worktrees attached to the same repository" is the only mode
>>>   of operation we support (which by the way I have no problem
>>>   with---the first bullet point above was asking question, not
>>>   suggesting change of design), then would it be easier for the
>>>   user to use if the command noticed that it is not in the primary
>>>   worktree and switched to it for the user, instead of complaining
>>>   and failing?
>> 
>> I'm not sure. The question is whether the user recognizes that migrating
>> references in the worktree would also migrate references in the main
>> repository. It might be surprising behaviour if we did that without
>> asking.
> 
> On the contrary, as a user I think it mattering what worktree I run this
> command from sounds very weird.  (But again I can tolerate it requiring
> me to run it from the main worktree if there are technical difficulties/
> limitations.  But using different backends for different
> worktrees is very weird, again.)

[snip]

The fewer concepts we ask a user to manage at a time, likely the better. In this case, “migrate the refs” should probably just work. While things are experimental, rough edges are more tolerable of course, but as we are lifting limitations towards making things official I think polishing such edges is a good idea.

In sum, it can be done later, but I think automatically changing the process directory to the main worktree and carrying on is fine. The curious folks would even see that under the TRACE output ;)

@gitgitgadget-git
Copy link

User Ben Knoble <ben.knoble@gmail.com> has been added to the cc: list.

@gitgitgadget-git
Copy link

On the Git mailing list, Patrick Steinhardt wrote (reply to this):

On Wed, Oct 29, 2025 at 12:33:15PM +0100, Kristoffer Haugsbakk wrote:
> On Wed, Oct 29, 2025, at 11:10, Patrick Steinhardt wrote:
> > On Tue, Oct 28, 2025 at 09:00:43AM -0700, Junio C Hamano wrote:
> >> Patrick Steinhardt <ps@pks.im> writes:
> >>
> >> >>  `migrate`::
> >> >> -	Migrate ref store between different formats.
> >> >> +	Migrate ref store between different formats. Supports repositories
> >> >> +	with worktrees; migration must be run from the main worktree.
> >> >
> >> > It feels a bit weird to single our worktrees specifically. We don't say
> >> > that the tool supports bare and non-bare repositories, either, so the
> >> > only reason why we'd have the note about worktrees is historic legacy.
> >> > How about this instead:
> >> >
> >> >     Migrate ref storage between different formats. Must be run from the
> >> >     main worktree in case the repository uses worktrees.
> >>
> >> Two thoughts.
> >>
> >>  * Would it be unacceptable if the primary repository and refstore
> >>    uses reftable backend, and a newly attached worktree to the
> >>    repository uses ref-files only for its per-worktree refs?  If we
> >>    should allow it, then "if the ref store you are migrating is in a
> >>    repository with multiple worktrees, you must migrate from the
> >>    primary and migrate _all_ ref store for all worktrees at once,
> >>    into the same backend", which the design of this patch seems to
> >>    aim at, would contradict with it, no?
> >
> > The problem we have here is backwards compatibility. Right now we assume
> > that `extensions.refStorage` applies to all worktrees, so if we wanted
> > to change it like you propose then we'd have to introduce a backwards
> > incompatible change.
> 
> I don’t understand the motivation or use case for supporting different
> backends for different worktrees. But Junio would have to explain that.
> 
> Maybe the motivation is this weird (from a user’s perspective) limi-
> tation that you have to run a command from the main worktree?  Okay,
> that’s strange but you get the error and switch to wherever the main
> worktree is (that the error message hopefully helpfully provides you
> with) and run the command there.  Then you forget that weird thing five
> minutes later since this was a one-off command.
> 
> >
> > I agree though that it would've been great if we would have said from
> > the beginning that the worktree-specific configuration is allowed to
> > override the ref storage format for a worktree. If so, we could easily
> > convert any of the worktrees (including the main one) by without having
> > any impact on all the other worktrees.
> 
> As a user I don’t understand why that is a great thing to have.

I am commenting more from the developer side here. In the best case the
user wouldn't ever care what ref storage format they use. We simply pick
the best format available and the user lives happily ever after.

But from a developer standpoint it matters. If we had per-worktree ref
formats we would for example be able to make the ref migration code a
lot more robust, as we could now migrate worktrees one by one. In the
current situation we basically have to migrate all worktrees at once,
and that significantly increases the risk of the migration going wrong
at any point in time.

> >
> > But we do not live in such a world right now, and getting there would
> > require some significant reworking of how we handle per-worktree
> > references. Unfortunate, but I also don't think there's a strong enough
> > reason to change this.
> >
> >>  * If "you must do so from the primary worktree and we convert all
> >>    the worktrees attached to the same repository" is the only mode
> >>    of operation we support (which by the way I have no problem
> >>    with---the first bullet point above was asking question, not
> >>    suggesting change of design), then would it be easier for the
> >>    user to use if the command noticed that it is not in the primary
> >>    worktree and switched to it for the user, instead of complaining
> >>    and failing?
> >
> > I'm not sure. The question is whether the user recognizes that migrating
> > references in the worktree would also migrate references in the main
> > repository. It might be surprising behaviour if we did that without
> > asking.
> 
> On the contrary, as a user I think it mattering what worktree I run this
> command from sounds very weird.  (But again I can tolerate it requiring
> me to run it from the main worktree if there are technical difficulties/
> limitations.  But using different backends for different
> worktrees is very weird, again.)
> 
> If I run `git gc` I don’t want it to do different things based on what
> worktree I am. I want to operate on the repository, and the repository
> is the same no matter what worktree I am in.  The same principle applies
> to this command in my mind.

It does though :) Only very slightly so, but for example maintenance of
references is dependent on the worktree you are in. We don't maintain
references from other worktrees. So it's not really a new thing that I'm
proposing here.

In any case, I'm happy to change my stance if the majority of folks
thinks that migrating the whole repository from secondary worktrees is
fine. I mostly wanted to avoid that operations that the user perform
have a wider blast radius than they understood, but if everyone agrees
that this is a non-issue then I don't mind much. It's only going to make
the implementation simpler.

Patrick

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants