Conversation

spacefix-creator

Thanks for taking the time to contribute to Git! Please be advised that the
Git community does not use github.com for their contributions. Instead, we use
a mailing list (git@vger.kernel.org) for code submissions, code reviews, and
bug reports. Nevertheless, you can use GitGitGadget (https://gitgitgadget.github.io/)
to conveniently send your Pull Request's commits to our mailing list.

For a single-commit pull request, please leave the pull request description
empty: your commit message itself should describe your changes.

Please read the "guidelines for contributing" linked above!

gitster and others added 30 commits August 6, 2024 10:24
Running "git refs migrate master main" would fail and say "too many
arguments".  By reading that message, you cannot tell if you just
should have given a single ref and made it "git refs migrate
master", or the command refuses to take any arguments.

Instead, report that "git refs migrate" takes no arguments, which is
far easier for the user to understand.

    $ git refs migrate master main
    fatal: 'git refs migrate' takes no arguments

The other side of the coin this change covers is removing doubt in
new users' minds about what we mean by "git refs migrate": is it the
"git" command running with the two arguments "refs migrate", the
"git refs" command running with the single argument "migrate", or the
"git refs migrate" command running with no arguments?

In the same spirit, reword the existing "missing --ref-format=<format>"
message and say

    $ git refs migrate
    fatal: 'git refs migrate' needs '--ref-format=<format>'

Note that we are turning two usage() calls into die() calls.  The
former should be reserved for messages that are accompanied by the
usage help of the command.  If we are giving a fatal error message,
we should not hesitate to use die().

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running "git cat-file -e a b c d e f g" would fail and say "too many
arguments".  By reading that message, you cannot tell if the command
could have worked if you limited the list of objects to 5 items
instead of 7, or the command is prepared to take only a single item.

Let's report that "b" is an unexpected argument instead.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
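
For illustration, the new behavior would look something like this
(hypothetical transcript; the exact wording follows the description
above):

    $ git cat-file -e a b c d e f g
    fatal: unexpected argument: 'b'
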
Imagine seeing your command failing with "too many arguments" when
you run "git cmd foo bar baz".  Can you tell whether it will work if
you say "git cmd foo bar"?  Or is that trimming your command line too
much?  Too little?

Instead, if the command reports "unexpected argument: 'bar'", you'd
know that "bar" and everything after it is unwanted.

Let's make it so for "git notes".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Imagine seeing your command failing with "too many arguments" when
you run "git cmd foo bar baz".  Can you tell whether it will work if
you say "git cmd foo bar"?  Or is that trimming your command line too
much?  Too little?

Instead, if the command reports "unexpected argument: 'bar'", you'd
know that "bar" and everything after it is unwanted.

Let's make it so for a few remaining commands.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Moving the sizep computation now makes the next commit, which avoids
redundant object info lookups, easier to understand.  There is
no user-visible change here.

[ew: commit message]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid unnecessary round trips to the object store to speed
up cat-file contents retrievals.  The majority of packed objects
don't benefit from the streaming interface at all and we end up
having to load them in core anyway to satisfy our streaming
API.

This drops the runtime of
`git cat-file --batch-all-objects --unordered --batch' on
git.git from ~7.1s to ~6.1s on Jeff's machine.

[ew: commit message]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
object-file.c::loose_object_info() accepts objects whose size matches
content_limit exactly, so it follows that packfile handling should
match loose object handling and also slurp objects whose size matches
content_limit exactly.

This change is merely for consistency with the majority of existing
code and there is no user-visible change in nearly all cases.  The
only exception is the corner case where the object size matches
content_limit exactly, where users will see a speedup from avoiding
an extra lookup.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
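
A minimal sketch of the boundary change described above; the helper
name and exact condition are illustrative assumptions, not the actual
packfile.c code:

    /*
     * An object whose size equals content_limit is loaded in core,
     * matching loose object handling; only strictly larger objects
     * take the streaming path.
     */
    static int must_stream(size_t size, size_t content_limit)
    {
    	return content_limit && size > content_limit; /* was: >= */
    }
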
We need to check delta_base_cache anyway to fill in the
`whence' field in `struct object_info'.  Inlining (and getting
rid of) cache_or_unpack_entry() makes it easier to only do the
hashmap lookup once and avoid a redundant lookup later on.

This code reorganization will also make an optimization to
use the cache entry directly easier to implement in the next
commit.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
For objects already in the delta_base_cache, we can safely use
one entry at a time directly to avoid the malloc+memcpy+free
overhead.  For a 1MB delta base object, this eliminates the
speed penalty of duplicating large objects into memory and
speeds up those 1MB delta base cached content retrievals by
roughly 30%.

While only 2-7% of objects are delta bases in repos I've looked
at, this avoids up to 96MB of duplicated memory in the worst
case with the default git config.

The new delta_base_cache_lock is a simple single-threaded
assertion to ensure cat-file (and similar) is the exclusive user
of the delta_base_cache.  In other words, we cannot have diff
or similar commands using two or more entries directly from the
delta base cache.  The new lock has nothing to do with parallel
access via multiple threads at the moment.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
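
A rough sketch of such a single-threaded assertion; the names are
approximate, and BUG() stands in for Git's internal assertion helper:

    /* Not a mutex: just a guard catching a second in-flight borrow. */
    static int delta_base_cache_lock;

    static void lock_delta_base_cache(void)
    {
    	if (delta_base_cache_lock)
    		BUG("delta_base_cache entry already borrowed");
    	delta_base_cache_lock = 1;
    }

    static void unlock_delta_base_cache(void)
    {
    	delta_base_cache_lock = 0;
    }
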
For entries in the delta base cache, packed_to_object_type calls
can be omitted.  This prepares us to bypass content_limit for
non-blob types in the following commit.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Streaming is only supported for blobs, so we'd end up having to
slurp all the other object types into memory regardless.  So
slurp all the non-blob types up front when requesting content,
since we always handle them in core anyway.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the normal `--batch' mode, we can use the content_limit
round trip optimization to avoid a redundant lookup.  The only
tricky thing here is that we need to enable/disable setting the
object_info.contentp field depending on whether we hit an `info'
or `contents' command.

t1006 is updated to ensure we can switch back and forth between
`info' and `contents' commands without problems.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fwrite(3), write(2), and all of our wrappers for them use size_t,
while object size is `unsigned long', so there's no excuse to use
a potentially smaller representation.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using writev here is 20-40% faster than three write syscalls in
succession for smaller (1-10k) objects in the delta base cache.
This advantage decreases as object sizes approach pipe size (64k
on Linux).

writev reduces wakeups and syscalls on the read side as well:
each write(2) syscall may trigger one or more corresponding
read(2) syscalls in the reader.  Attempting atomicity in the
writer via writev also reduces the likelihood of non-blocking
readers failing with EAGAIN and having to call poll||select
before attempting to read again.

Unfortunately, this turns into a small (1-3%) slowdown for
gigantic objects of a megabyte or more, even after increasing the
pipe size to 1MB via the F_SETPIPE_SZ fcntl(2) op.
This slowdown is acceptable to me since the vast majority of
objects are 64K or less for projects I've looked at.

Relying on stdio buffering and fflush(3) after each response was
considered for users without --buffer, but historically cat-file
defaults to being compatible with non-blocking stdout and able
to poll(2) after hitting EAGAIN on write(2).  Using stdio on
files with the O_NONBLOCK flag is (AFAIK) unspecified and likely
subject to portability problems and thus avoided.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
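
A sketch of the technique, not the actual cat-file code; a real
caller must still handle short writes and EAGAIN:

    #include <sys/uio.h>
    #include <string.h>

    /*
     * Emit "<header><contents>\n" with one writev(2) call instead of
     * three write(2) calls, so a pipe reader wakes up once per object.
     */
    static ssize_t write_object(int fd, const char *hdr,
    			    const void *buf, size_t len)
    {
    	struct iovec iov[3];

    	iov[0].iov_base = (char *)hdr;	/* header, ends with '\n' */
    	iov[0].iov_len = strlen(hdr);
    	iov[1].iov_base = (void *)buf;	/* object contents */
    	iov[1].iov_len = len;
    	iov[2].iov_base = (char *)"\n";	/* response terminator */
    	iov[2].iov_len = 1;

    	return writev(fd, iov, 3);
    }
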
The git_user_agent_sanitized() function performs some sanitizing to
avoid special characters being sent over the line and possibly
messing up the protocol or the parsing on the other side.

Let's extract this sanitizing into a new strbuf_sanitize() function, as
we will want to reuse it in a following patch, and let's put it into
strbuf.{c,h}.

While at it, let's also make a few small improvements:
  - use 'size_t' for 'i' instead of 'int',
  - move the declaration of 'i' inside the 'for ( ... )',
  - use strbuf_detach() to explicitly detach the string contained by
    the 'sb' strbuf.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
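
The sanitizing itself is roughly the following; the exact replacement
rule here is an assumption based on the description above:

    #include <ctype.h>
    #include <stddef.h>

    /*
     * Neutralize bytes that could confuse a line- and space-delimited
     * protocol: replace non-printable characters and spaces with '.'.
     */
    static void sanitize_bytes(char *buf, size_t len)
    {
    	for (size_t i = 0; i < len; i++)
    		if (!isprint((unsigned char)buf[i]) || buf[i] == ' ')
    			buf[i] = '.';
    }
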
We often have to split strings at some specified terminator character.
The strbuf_split*() functions, that we can use for this purpose,
return substrings that include the terminator character, so we often
need to remove that character.

When it is whitespace, a newline, or a directory separator, the
terminator character can easily be removed using an existing
trimming function like strbuf_rtrim(), strbuf_trim_trailing_newline()
or strbuf_trim_trailing_dir_sep(). There is no function to remove
the terminator when it is some other character, though.

Let's introduce a new strbuf_trim_trailing_ch() function that can be
used to remove any trailing character, and let's refactor existing code
that manually removed trailing characters using this new function.

We are also going to use this new function in a following commit.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
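
A sketch of the new helper, using a simplified stand-in for git's
strbuf type so the example is self-contained:

    #include <stddef.h>

    struct strbuf {
    	size_t len;
    	char *buf;
    };

    /* Drop every trailing occurrence of 'ch', keeping NUL-termination. */
    static void strbuf_trim_trailing_ch(struct strbuf *sb, int ch)
    {
    	while (sb->len > 0 && sb->buf[sb->len - 1] == ch)
    		sb->len--;
    	sb->buf[sb->len] = '\0';
    }
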
When a server S knows that some objects from a repository are available
from a promisor remote X, S might want to suggest to a client C cloning
or fetching the repo from S that C should use X directly instead of S
for these objects.

Note that this could happen both in the case where S itself doesn't
have the objects and borrows them from X, and in the case where S has
the objects but
knows that X is better connected to the world (e.g., it is in a
$LARGEINTERNETCOMPANY datacenter with petabit/s backbone connections)
than S. Implementation of the latter case, which would require S to
omit in its response the objects available on X, is left for future
improvement though.

Then C might, or might not, want to get the objects from X, and should
let S know about this.

To allow S and C to agree and let each other know about C using X or
not, let's introduce a new "promisor-remote" capability in the
protocol v2, as well as a few new configuration variables:

  - "promisor.advertise" on the server side, and:
  - "promisor.acceptFromServer" on the client side.

By default, or if "promisor.advertise" is set to 'false', a server S will
not advertise the "promisor-remote" capability.

If S doesn't advertise the "promisor-remote" capability, then a client C
replying to S shouldn't advertise the "promisor-remote" capability
either.

If "promisor.advertise" is set to 'true', S will advertise its promisor
remotes with a string like:

  promisor-remote=<pr-info>[;<pr-info>]...

where each <pr-info> element contains information about a single
promisor remote in the form:

  name=<pr-name>[,url=<pr-url>]

where <pr-name> is the urlencoded name of a promisor remote and
<pr-url> is the urlencoded URL of the promisor remote named <pr-name>.

For now, the URL is passed in addition to the name. In the future, it
might be possible to pass other information like a filter-spec that the
client should use when cloning from S, or a token that the client should
use when retrieving objects from X.

It might also be possible in the future for "promisor.advertise" to have
other values. For example a value like "onlyName" could prevent S from
advertising URLs, which could help in case C should use a different URL
for X than the URL S is using. (The URL S is using might be an internal
one on the server side for example.)

By default, or if "promisor.acceptFromServer" is set to "None", C will
refuse to use the promisor remotes that might have been advertised
by S. In this case, C will not advertise any "promisor-remote"
capability in its reply to S.

If "promisor.acceptFromServer" is set to "All" and S advertised some
promisor remotes, then on the contrary, C will accept to use all the
promisor remotes that S advertised and C will reply with a string like:

  promisor-remote=<pr-name>[;<pr-name>]...

where the <pr-name> elements are the urlencoded names of all the
promisor remotes S advertised.

In a following commit, other values for "promisor.acceptFromServer" will
be implemented, so that C will be able to decide the promisor remotes it
accepts depending on the name and URL it received from S. So even if
that name and URL information is not used much right now, it will be
needed soon.

Helped-by: Taylor Blau <me@ttaylorr.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
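
As a concrete, hypothetical example (remote name and URL invented for
illustration), a server configured with

    $ git config promisor.advertise true

and a promisor remote "lop" pointing at https://example.com/lop.git
might advertise something like

    promisor-remote=name=lop,url=https%3A%2F%2Fexample.com%2Flop.git

and a client with "promisor.acceptFromServer" set to "All" would
reply with

    promisor-remote=lop
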
A previous commit introduced a "promisor.acceptFromServer" configuration
variable with only "None" or "All" as valid values.

Let's introduce "KnownName" and "KnownUrl" as valid values for this
configuration option to give more choice to a client about which
promisor remotes it might accept among those that the server advertised.

In case of "KnownName", the client will accept promisor remotes which
are already configured on the client and have the same name as those
advertised by the client. This could be useful in a corporate setup
where servers and clients are trusted to not switch names and URLs, but
where some kind of control is still useful.

In case of "KnownUrl", the client will accept promisor remotes which
have both the same name and the same URL configured on the client as the
name and URL advertised by the server. This is the most secure option,
so it should be used if possible.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
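
For example (hypothetical remote name and URL), a client set up like
this would accept the advertised remote only when both the name "lop"
and its URL match what the server sent:

    $ git config remote.lop.url https://example.com/lop.git
    $ git config promisor.acceptFromServer KnownUrl
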
These tests prepare the working tree & index state to have something
to be committed, and try a sequence of "test_must_fail git commit".
If, due to a bug, an earlier one did not fail, a later one would
fail for the wrong reason (namely, "nothing to commit").

Give them "--allow-empty" to make sure that they would work even
when there happens to be nothing to commit.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Sometimes people want to specify additional configuration data on a
"best effort" basis.  Maybe the commit.template configuration variable
points at a file somewhere in ~/template/, but on a particular system
the file may not exist, and the user may be OK without using the
template in such a case.

When the value given to a configuration variable whose type is
pathname wants to signal such an optional file, it can be marked by
prepending ":(optional)" in front of it.  A setting marked as
optional avoids having the command barf over a missing file, as an
optional configuration setting that names a missing or empty file is
treated as if it were not set at all.

cf. <xmqq5ywehb69.fsf@gitster.g>

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
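
For instance, with the commit.template example above (illustrative
path), the following setting would simply be ignored on machines
where the file does not exist:

    [commit]
    	template = ":(optional)~/template/commit-msg.txt"
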
In the previous step, we introduced an optional filename that can be
given to a configuration variable, making Git behave as if such a
configuration setting did not even exist when the named path is
missing or empty.

Let's do the same for command line options that name a pathname.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The server options populated via `OPT_STRING_LIST()` are never
cleared, causing a memory leak. Plug it.

This leak is exposed by t5702, but plugging it alone does not make the
whole test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The "reach" test tool doesn't bother to clean up any of its allocated
resources, causing various leaks. Plug them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In `grep_splice_or()` we search for the next `TRUE` node in our tree of
grep expressions and replace it with the given new expression. But we
don't free the old node, which causes a memory leak. Plug it.

This leak is exposed by t7810, but plugging it alone isn't sufficient to
make the test suite pass.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When executing with `--max-count=0` we'll return early from git-grep(1)
without performing any cleanup, which causes memory leaks. Plug these.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The memory allocated by `prepare_to_use_bloom_filter()` is not released
by `release_revisions()`, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In `do_diff_cache()` we initialize a new `rev_info` and then overwrite
its `diffopt` with a user-provided set of options. This can leak memory
because `repo_init_revisions()` may end up allocating memory for the
`diffopt` itself depending on the configuration. And as that field is
overwritten we won't ever free that.

Plug the memory leak by releasing the diffopts before we overwrite them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The signature check in the formatting context is never released.
Fix this to plug the resulting memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We don't clear `struct upload_pack::uri_protocols`, which causes a
memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
While we free the worktree change data, we never free its contents. Fix
this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
peijianju and others added 27 commits October 31, 2024 16:59
Since the `info` command in cat-file --batch-command prints object info
for a given object, it is natural to add another command in cat-file
--batch-command to print object info for a given object from a remote.
Add `remote-object-info` to cat-file --batch-command.

While `info` takes object IDs one at a time, this creates overhead
when making requests to a server, so `remote-object-info` can instead
take multiple object IDs at once.

cat-file --batch-command is generally implemented in the following
manner:

 - Receive and parse input from user
 - Call respective function attached to command
 - Get object info, print object info

In --buffer mode, this changes to:

 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue
    - Call respective function attached to command
    - Get object info, print object info

Notice how the getting and printing of object info is accomplished one
at a time. As described above, this creates a problem for making
requests to a server. Therefore, `remote-object-info` is implemented in
the following manner:

 - Receive and parse input from user
 If command is `remote-object-info`:
    - Get object info from remote
    - Loop through and print each object info
 Else:
    - Call respective function attached to command
    - Parse input, get object info, print object info

And finally for --buffer mode `remote-object-info`:
 - Receive and parse input from user
 - Store respective function attached to command in a queue
 - After flush, loop through commands in queue:
    If command is `remote-object-info`:
        - Get object info from remote
        - Loop through and print each object info
    Else:
        - Call respective function attached to command
        - Get object info, print object info

To summarize, `remote-object-info` gets object info from the remote and
then loops through the object info passed in, printing it.

In order for remote-object-info to avoid remote communication overhead
in the non-buffer mode, the objects are passed in as follows:

remote-object-info <remote> <oid> <oid> ... <oid>

rather than

remote-object-info <remote> <oid>
remote-object-info <remote> <oid>
...
remote-object-info <remote> <oid>

Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Eric Ju  <eric.peijian@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In anticipation of a few planned applications, introduce the most basic form
of a path-walk API. It currently assumes that there are no UNINTERESTING
objects, and does not include any complicated filters. It calls a function
pointer on groups of tree and blob objects as grouped by path. This only
includes objects the first time they are discovered, so an object that
appears at multiple paths will not be included in two batches.

These batches are collected in 'struct type_and_oid_list' objects, which
store an object type and an oid_array of objects.

The data structures are documented in 'struct path_walk_context', but in
summary the most important are:

  * 'paths_to_lists' is a strmap that connects a path to a
    type_and_oid_list for that path. To avoid conflicts in path names,
we make sure that tree paths end in "/" (except the root path,
which is an empty string) and blob paths do not end in "/".

  * 'path_stack' is a string list that is added to in an append-only
    way. This stores the stack of our depth-first search on the heap
    instead of using recursion.

  * 'path_stack_pushed' is a strmap that stores path names that were
    already added to 'path_stack', to avoid repeating paths in the
    stack. Mostly, this saves us from quadratic lookups from doing
    unsorted checks into the string_list.

The coupling of 'path_stack' and 'path_stack_pushed' is protected by the
push_to_stack() method. Call this instead of inserting into these
structures directly.

The walk_objects_by_path() method initializes these structures and
starts walking commits from the given rev_info struct. The commits are
used to find the list of root trees which populate the start of our
depth-first search.

The core of our depth-first search is in a while loop that continues
while we have not indicated an early exit and our 'path_stack' still has
entries in it. The loop body pops a path off of the stack and "visits"
the path via the walk_path() method.

The walk_path() method gets the list of OIDs from the 'path_to_lists'
strmap and executes the callback method on that list with the given path
and type. If the OIDs correspond to tree objects, then iterate over all
trees in the list and run add_children() to add the child objects to
their own lists, adding new entries to the stack if necessary.

In testing, this depth-first search approach was the one that used the
least memory while iterating over the object lists. There is still a
chance that repositories with too-wide path patterns could cause memory
pressure issues. Limiting the stack size could be done in the future by
limiting how many objects are being considered in-progress, or by
visiting blob paths earlier than trees.

There are many future adaptations that could be made, but they are left for
future updates when consumers are ready to take advantage of those features.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
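
A hypothetical consumer might look like the fragment below; the names
follow the description above, but the exact signatures are assumptions
and the setup of 'struct rev_info' is omitted:

    /* Called once per batch of same-type objects sharing a path. */
    static int show_batch(const char *path, struct oid_array *oids,
    		      enum object_type type, void *data)
    {
    	printf("%s: %zu objects\n", path, (size_t)oids->nr);
    	return 0; /* non-zero would stop the walk early */
    }

    /* ... with an already-initialized 'struct rev_info revs': */
    struct path_walk_info info = PATH_WALK_INFO_INIT;
    info.revs = &revs;
    info.path_fn = show_batch;
    info.path_fn_data = NULL;
    walk_objects_by_path(&info);
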
This test helper reduces repeated logic in t6601-path-walk.sh,
but may be helpful elsewhere, too.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add some tests based on the current behavior, doing interesting checks
for different sets of branches, ranges, and the --boundary option. This
sets a baseline for the behavior and we can extend it as new options are
introduced.

It is important to mention that the behavior of the API will change soon as
we start to handle UNINTERESTING objects differently, but these tests will
demonstrate the change in behavior.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
We add the ability to filter the object types in the path-walk API so
the callback function is called fewer times.

This adds the ability to ask for the commits in a list, as well. We
re-use the empty string for this set of objects because these are passed
directly to the callback function instead of being part of the
'path_stack'.

Future changes will add the ability to visit annotated tags.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The rev_info that is specified for a path-walk traversal may specify
visiting tag refs (both lightweight and annotated) and also may specify
indexed objects (blobs and trees). Update the path-walk API to walk
these objects as well.

When walking tags, we need to peel the annotated objects until reaching
a non-tag object. If we reach a commit, then we can add it to the
pending objects to make sure we visit in the commit walk portion. If we
reach a tree, then we will assume that it is a root tree. If we reach a
blob, then we have no good path name and so add it to a new list of
"tagged blobs".

When the rev_info includes the "--indexed-objects" flag, then the
pending set includes blobs and trees found in the cache entries and
cache-tree. The cache entries are usually blobs, though they could be
trees in the case of a sparse index. The cache-tree stores
previously-hashed tree objects but these are cleared out when staging
objects below those paths. We add tests that demonstrate this.

The indexed objects come with a non-NULL 'path' value in the pending
item. This allows us to prepopulate the 'path_to_lists' strmap with
lists for these paths.

The tricky thing about this walk is that we will want to combine the
indexed objects walk with the commit walk, especially in the future case
of walking objects during a command like 'git repack'.

Whenever possible, we want the objects from the index to be grouped with
similar objects in history. We don't want to miss any paths that appear
only in the index and not in the commit history.

Thus, we need to be careful to let the path stack be populated initially
with only the root tree path (and possibly tags and tagged blobs) and go
through the normal depth-first search. Afterwards, if there are other
paths that are remaining in the paths_to_lists strmap, we should then
iterate through the stack and visit those objects recursively.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
When the input rev_info has UNINTERESTING starting points, we want to be
sure that the UNINTERESTING flag is passed appropriately through the
objects. To match how this is done in places such as 'git pack-objects', we
use the mark_edges_uninteresting() method.

This method has an option for using the "sparse" walk, which is similar in
spirit to the path-walk API's walk. To be sure to keep it independent, add a
new 'prune_all_uninteresting' option to the path_walk_info struct.

To check how the UNINTERESTING flag is spread through our objects, extend the
'test-tool path-walk' command to output whether or not an object has that
flag. This changes our tests significantly, including the removal of some
objects that were previously visited due to the incomplete implementation.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Update the project's CodingGuidelines to discourage naming functions
with a "_1()" suffix.

* kn/arbitrary-suffixes:
  CodingGuidelines: discourage arbitrary suffixes in function names
Updates the '.clang-format' to match project conventions.

* kn/ci-clang-format-tidy:
  clang-format: align consecutive macro definitions
  clang-format: re-adjust line break penalties
Centralize documentation for repository extensions into a single place.

* cw/config-extensions:
  doc: consolidate extensions in git-config documentation
Buildfix and upgrade of Clar to a newer version.

* ps/upgrade-clar:
  cmake: set up proper dependencies for generated clar headers
  cmake: fix compilation of clar-based unit tests
  Makefile: extract script to generate clar declarations
  Makefile: adjust sed command for generating "clar-decls.h"
  t/unit-tests: update clar to 206accb
When called with '--left-right' and '--use-bitmap-index', 'rev-list'
will produce output without any left/right markers, which has been
corrected.

* jk/left-right-bitmap:
  rev-list: skip bitmap traversal for --left-right
"git cat-file --batch" and friends can optionally ask a remote
server about objects it does not have.

* ej/cat-file-remote-object-info:
  cat-file: add remote-object-info to batch-command
  cat-file: add declaration of variable i inside its for loop
  transport: add client support for object-info
  serve: advertise object-info feature
  fetch-pack: move fetch initialization
  fetch-pack: refactor packet writing
Teaches the MinGW compatibility layer to support POSIX semantics for
atomic renames when other process(es) have a file opened at the
destination path.

* ps/mingw-rename:
  compat/mingw: support POSIX semantics for atomic renames
  compat/mingw: allow deletion of most opened files
  compat/mingw: share file handles created via `CreateFileW()`
Isolates the reftable subsystem from the rest of Git's codebase by
using fewer pieces of Git's infrastructure.

* ps/reftable-detach:
  reftable/system: provide thin wrapper for lockfile subsystem
  reftable/stack: drop only use of `get_locked_file_path()`
  reftable/system: provide thin wrapper for tempfile subsystem
  reftable/stack: stop using `fsync_component()` directly
  reftable/system: stop depending on "hash.h"
  reftable: explicitly handle hash format IDs
  reftable/system: move "dir.h" to its only user
The v2 protocol learned to allow the server to advertise possible
promisor remotes, and the client to respond with what promisor
remotes it uses, so that the server side can omit objects that the
client can lazily obtain from these other promisor remotes.

Comments?  I got an impression that this is premature without
finishing the discussion on a larger picture.
cf. <ZvpZv_fed_su4w2-@pks.im>

* cc/promisor-remote-capability:
  promisor-remote: check advertised name or URL
  Add 'promisor-remote' capability to protocol v2
  strbuf: refactor strbuf_trim_trailing_ch()
  version: refactor strbuf_sanitize()
Error message clarification.

* jc/too-many-arguments:
  miscellaneous: avoid "too many arguments"
  notes: avoid "too many arguments"
  cat-file: avoid "too many arguments"
  refs: avoid "too many arguments"
"git cat-file --batch" has been optimized.

* ew/cat-file-optim:
  cat-file: use writev(2) if available
  cat-file: batch_write: use size_t for length
  cat-file: batch-command uses content_limit
  object_info: content_limit only applies to blobs
  packfile: packed_object_info avoids packed_to_object_type
  cat-file: use delta_base_cache entries directly
  packfile: inline cache_or_unpack_entry
  packfile: fix off-by-one in content_limit comparison
  packfile: allow content-limit for cat-file
  packfile: move sizep computation
Drop support for older libcURL and Perl.

* bc/drop-ancient-libcurl-and-perl:
  gitweb: make use of s///r
  Require Perl 5.26.0
  INSTALL: document requirement for libcurl 7.61.0
  git-curl-compat: remove check for curl 7.56.0
  git-curl-compat: remove check for curl 7.53.0
  git-curl-compat: remove check for curl 7.52.0
  git-curl-compat: remove check for curl 7.44.0
  git-curl-compat: remove check for curl 7.43.0
  git-curl-compat: remove check for curl 7.39.0
  git-curl-compat: remove check for curl 7.34.0
  git-curl-compat: remove check for curl 7.25.0
  git-curl-compat: remove check for curl 7.21.5
Teach configuration values of type "pathname" a new ':(optional)'
suffix.

* jc/optional-path:
  parseopt: values of pathname type can be prefixed with :(optional)
  config: values of pathname type can be prefixed with :(optional)
  t7500: make each piece more independent
More leakfixes.

* ps/leakfixes-part-9: (22 commits)
  list-objects-filter-options: work around reported leak on error
  builtin/merge: release output buffer after performing merge
  dir: fix leak when parsing "status.showUntrackedFiles"
  t/helper: fix leaking buffer in "dump-untracked-cache"
  t/helper: stop re-initialization of `the_repository`
  sparse-index: correctly free EWAH contents
  dir: release untracked cache data
  combine-diff: fix leaking lost lines
  builtin/tag: fix leaking key ID on failure to sign
  transport-helper: fix leaking import/export marks
  builtin/commit: fix leaking cleanup config
  trailer: fix leaking strbufs when formatting trailers
  trailer: fix leaking trailer values
  builtin/commit: fix leaking change data contents
  upload-pack: fix leaking URI protocols
  pretty: clear signature check
  diff-lib: fix leaking diffopts in `do_diff_cache()`
  revision: fix leaking bloom filters
  builtin/grep: fix leak with `--max-count=0`
  grep: fix leak in `grep_splice_or()`
  ...
Regression fix for 'show-index' when run outside of a repository.

* as/show-index-uninitialized-hash:
  show-index: fix uninitialized hash function
Documentation improvements to more prominently call out the use of
'--all' when creating bundles.

* kh/bundle-docs:
  Documentation/git-bundle.txt: discuss naïve backups
  Documentation/git-bundle.txt: mention --all in spec. refs
  Documentation/git-bundle.txt: mention full backup example
Introduce a new API to visit objects in batches based on a common
path, or by type.

* ds/path-walk-1:
  path-walk: mark trees and blobs as UNINTERESTING
  path-walk: visit tags and cached objects
  path-walk: allow consumer to specify object types
  t6601: add helper for testing path-walk API
  test-lib-functions: add test_cmp_sorted
  path-walk: introduce an object walk by path
Drop support for ancient environments in various CI jobs.

* bc/ancient-ci:
  Add additional CI jobs to avoid accidental breakage
  ci: remove clause for Ubuntu 16.04
  gitlab-ci: switch from Ubuntu 16.04 to 20.04

Welcome to GitGitGadget

Hi @spacefix-creator, and welcome to GitGitGadget, the GitHub App to send patch series to the Git mailing list from GitHub Pull Requests.

Please make sure that either:

  • Your Pull Request has a good description, if it consists of multiple commits, as it will be used as cover letter.
  • Your Pull Request description is empty, if it consists of a single commit, as the commit message should be descriptive enough by itself.

You can CC potential reviewers by adding a footer to the PR description with the following syntax:

CC: Revi Ewer <revi.ewer@example.com>, Ill Takalook <ill.takalook@example.net>

NOTE: DO NOT copy/paste your CC list from a previous GGG PR's description,
because it will result in a malformed CC list on the mailing list. See
example.

Also, it is a good idea to review the commit messages one last time, as the Git project expects them in a quite specific form:

  • the lines should not exceed 76 columns,
  • the first line should be like a header and typically start with a prefix like "tests:" or "revisions:" to state which subsystem the change is about,
  • the commit messages' body should describe the "why?" of the change, and
  • finally, the commit messages should end in a Signed-off-by: line matching the commits' author.

It is in general a good idea to await the automated test ("Checks") in this Pull Request before contributing the patches, e.g. to avoid trivial issues such as unportable code.

Contributing the patches

Before you can contribute the patches, your GitHub username needs to be added to the list of permitted users. Any already-permitted user can do that, by adding a comment to your PR of the form /allow. A good way to find other contributors is to locate recent pull requests where someone has been /allowed:

Both the person who commented /allow and the PR author are able to /allow you.

An alternative is the channel #git-devel on the Libera Chat IRC network:

<newcontributor> I've just created my first PR, could someone please /allow me? https://github.com/gitgitgadget/git/pull/12345
<veteran> newcontributor: it is done
<newcontributor> thanks!

Once on the list of permitted usernames, you can contribute the patches to the Git mailing list by adding a PR comment /submit.

If you want to see what email(s) would be sent for a /submit request, add a PR comment /preview to have the email(s) sent to you. You must have a public GitHub email address for this. Note that any reviewers CC'd via the list in the PR description will not actually be sent emails.

After you submit, GitGitGadget will respond with another comment that contains the link to the cover letter mail in the Git mailing list archive. Please make sure to monitor the discussion in that thread and to address comments and suggestions (while the comments and suggestions will be mirrored into the PR by GitGitGadget, you will still want to reply via mail).

If you do not want to subscribe to the Git mailing list just to be able to respond to a mail, you can download the mbox from the Git mailing list archive (click the (raw) link), then import it into your mail program. If you use GMail, you can do this via:

curl -g --user "<EMailAddress>:<Password>" \
    --url "imaps://imap.gmail.com/INBOX" -T /path/to/raw.txt

To iterate on your change, i.e. send a revised patch or patch series, you will first want to (force-)push to the same branch. You probably also want to modify your Pull Request description (or title). It is a good idea to summarize the revision by adding something like this to the cover letter (read: by editing the first comment on the PR, i.e. the PR description):

Changes since v1:
- Fixed a typo in the commit message (found by ...)
- Added a code comment to ... as suggested by ...
...

To send a new iteration, just add another PR comment with the contents: /submit.

Need help?

New contributors who want advice are encouraged to join git-mentoring@googlegroups.com, where volunteers who regularly contribute to Git are willing to answer newbie questions, give advice, or otherwise provide mentoring to interested contributors. You must join in order to post or view messages, but anyone can join.

You may also be able to find help in real time in the developer IRC channel, #git-devel on Libera Chat. Remember that IRC does not support offline messaging, so if you send someone a private message and log out, they cannot respond to you. The scrollback of #git-devel is archived, though.

The pull request has 87 commits. The max allowed is 30. Please split the patch series into multiple pull requests. Also consider squashing related commits.
