Skip to content

Conversation

t-b
Copy link

@t-b t-b commented Sep 3, 2014

The commits which are in sschuberth/git and not in here are (pasted from private mail from @sschuberth):

  • 75e415e Makefile: Do not install built-ins
  • ffb434c MinGW: Use MakeMaker to build the Perl libraries
  • 617433c Documentation: make AsciiDoc links always point to HTML files (*)
  • b543717 MSVC: fix stat definition hell (*)
  • 296f510 MinGW: Fix stat definitions to work with MinGW runtime version 4.0 (*)
  • 72631c9 mailmap: work around implementations with pure inline strcasecmp (*)
  • ab94cf6 Windows: do not redefine _WIN32_WINNT (*)
  • d892bec t0061: Work around a line endings issue with newer versions of
    cat on MSYS
  • adfa3c9 Makefile: Set htmldir to match the default HTML docs location under MSYS
  • 159ea1a t7800: Use "test_cmp_text" in all places where "echo" is used

The starred ones (*) are supposedly already upstream.
I picked the necessary ones here.

The omission of 75e415e is on purpose as that feature is not ready yet.

sschuberth and others added 4 commits September 3, 2014 22:43
This way the libraries get properly installed into the "site_perl"
directory and we just have to move them out of the "mingw" directory.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
While "echo", being a Bash built-in, produces Unix line-endings, "git" and
"cat" produce DOS line-endings on MSYS, so ignore line-ending differences.

Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
… MSYS

While msysgit uses cat 2.0, mingwGitDevEnv uses cat 5.97, currently. A
change part of the latter version [1] makes cat use the "same rules as
other programs to decide whether to use binary I/O". As a result, cat 5.97
behaves like e.g. awk with respect to which line endings are used if
stdout is redirected to a file, and those are DOS line endings. Ignore
line endings when comparing the test result to work around this.

[1] http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=commit;h=8770c00ef45e3c6c4dd3d5ce1e55a6fb1adb13dc

Signed-off-by: Thomas Braun <thomas.braun@virtuell-zuhause.de>
Signed-off-by: Sebastian Schuberth <sschuberth@gmail.com>
@dscho
Copy link
Member

dscho commented Sep 4, 2014

Looks good to me! @sschuberth would you mind if I merged?

@sschuberth
Copy link

Just a question to double check: Did you actually cherry-pick the commits that are supposed to be upstream, to verify that they really are (they should either result in a no-op or minor conflict, if upstream has a slightly reworked patch applied)?

@sschuberth
Copy link

Nevermind, I just verified that the starred commits are indeed upstream.

sschuberth added a commit that referenced this pull request Sep 4, 2014
@sschuberth sschuberth merged commit cd79f58 into git-for-windows:master Sep 4, 2014
@t-b t-b deleted the new-sdk-fixes branch September 4, 2014 15:59
@t-b
Copy link
Author

t-b commented Sep 4, 2014

Thanks for merging.
@sschuberth Got me! No I did not let cherry-pick discover that for. Will do that next time :)

t-b added a commit that referenced this pull request Oct 13, 2014
dscho pushed a commit that referenced this pull request Nov 30, 2014
The main loop in strbuf_utf8_replace() could summed up as:

  while ('src' is still valid) {
    1) advance 'src' to copy ANSI escape sequences
    2) advance 'src' to copy/replace visible characters
  }

The problem is after #1, 'src' may have reached the end of the string
(so 'src' points to NUL) and #2 will continue to copy that NUL as if
it's a normal character. Because the output is stored in a strbuf,
this NUL accounted in the 'len' field as well. Check after #1 and
break the loop if necessary.

The test does not look obvious, but the combination of %>>() should
make a call trace like this

  show_log()
  pretty_print_commit()
  format_commit_message()
  strbuf_expand()
  format_commit_item()
  format_and_pad_commit()
  strbuf_utf8_replace()

where %C(auto)%d would insert a color reset escape sequence in the end
of the string given to strbuf_utf8_replace() and show_log() uses
fwrite() to send everything to stdout (including the incorrect NUL
inserted by strbuf_utf8_replace)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Mar 6, 2015
Don't chop test_expect_success line into pieces and concatenate with
'\'.  That's so 2005.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Mar 6, 2015
* jc/t9001-modernise:
  t9001: style modernisation phase #5
  t9001: style modernisation phase #4
  t9001: style modernisation phase #3
  t9001: style modernisation phase #2
  t9001: style modernisation phase #1
dscho pushed a commit that referenced this pull request Mar 6, 2015
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Jul 30, 2015
The collect_parents() function now is responsible for

 1. parsing the commits given on the command line into a list of
    commits to be merged;

 2. filtering these parents into independent ones; and

 3. optionally calling fmt_merge_msg() via prepare_merge_message()
    to prepare an auto-generated merge log message, using fake
    contents that FETCH_HEAD would have had if these commits were
    fetched from the current repository with "git pull . $args..."

Make "git merge FETCH_HEAD" to be the same as the traditional

    git merge "$(git fmt-merge-msg <.git/FETCH_HEAD)" $commits

invocation of the command in "git pull", where $commits are the ones
that appear in FETCH_HEAD that are not marked as not-for-merge, by
making it do a bit more, specifically:

 - noticing "FETCH_HEAD" is the only "commit" on the command line
   and picking the commits that are not marked as not-for-merge as
   the list of commits to be merged (substitute for step #1 above);

 - letting the resulting list fed to step #2 above;

 - doing the step #3 above, using the contents of the FETCH_HEAD
   instead of fake contents crafted from the list of commits parsed
   in the step #1 above.

Note that this changes the semantics.  "git merge FETCH_HEAD" has
always behaved as if the first commit in the FETCH_HEAD file were
directly specified on the command line, creating a two-way merge
whose auto-generated merge log said "merge commit xyz".  With this
change, if the previous fetch was to grab multiple branches (e.g.
"git fetch $there topic-a topic-b"), the new world order is to
create an octopus, behaving as if "git pull $there topic-a topic-b"
were run.  This is a deliberate change to make that happen, and
can be seen in the changes to t3033 tests.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
rimrul referenced this pull request in rimrul/git Oct 21, 2015
When ac49f5c (rerere "remaining", 2011-02-16) split out a new
helper function check_one_conflict() out of find_conflict()
function, so that the latter will use the returned value from the
new helper to update the loop control variable that is an index into
active_cache[], the new variable incremented the index by one too
many when it found a path with only stage #1 entry at the very end
of active_cache[].

This "strange" return value does not have any effect on the loop
control of two callers of this function, as they all notice that
active_nr+2 is larger than active_nr just like active_nr+1 is, but
nevertheless it puzzles the readers when they are trying to figure
out what the function is trying to do.

In fact, there is no need to do an early return.  The code that
follows after skipping the stage #1 entry is fully prepared to
handle a case where the entry is at the very end of active_cache[].

Help future readers from unnecessary confusion by dropping an early
return.  We skip the stage #1 entry, and if there are stage #2 and
stage git-for-windows#3 entries for the same path, we diagnose the path as
THREE_STAGED (otherwise we say PUNTED), and then we skip all entries
for the same path.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
rimrul referenced this pull request in rimrul/git Oct 21, 2015
A conflicted index can have multiple stage #1 entries when dealing
with a criss-cross merge and using the "resolve" merge strategy.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
rimrul referenced this pull request in rimrul/git Oct 21, 2015
A conflicted index can have multiple stage #1 entries when dealing
with a criss-cross merge and using the "resolve" merge strategy.

Plug the leak by reading only the first one of the same stage
entries.

Strictly speaking, this fix does change the semantics, in that we
used to use the last stage #1 entry as the common ancestor when
doing the plain-vanilla three-way merge, but with the leak fix, we
will use the first stage #1 entry.  But it is not a grave backward
compatibility breakage.  Either way, we are arbitrarily picking one
of multiple stage #1 entries and using it, ignoring others, and
there is no meaning in the ordering of these stage #1 entries.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
rimrul referenced this pull request in rimrul/git Oct 21, 2015
Code clean-up and minor fixes.

* jc/rerere: (21 commits)
  rerere: un-nest merge() further
  rerere: use "struct rerere_id" instead of "char *" for conflict ID
  rerere: call conflict-ids IDs
  rerere: further clarify do_rerere_one_path()
  rerere: further de-dent do_plain_rerere()
  rerere: refactor "replay" part of do_plain_rerere()
  rerere: explain the remainder
  rerere: explain "rerere forget" codepath
  rerere: explain the primary codepath
  rerere: explain MERGE_RR management helpers
  rerere: fix benign off-by-one non-bug and clarify code
  rerere: explain the rerere I/O abstraction
  rerere: do not leak mmfile[] for a path with multiple stage #1 entries
  rerere: stop looping unnecessarily
  rerere: drop want_sp parameter from is_cmarker()
  rerere: report autoupdated paths only after actually updating them
  rerere: write out each record of MERGE_RR in one go
  rerere: lift PATH_MAX limitation
  rerere: plug conflict ID leaks
  rerere: handle conflicts with multiple stage #1 entries
  ...
dscho pushed a commit that referenced this pull request Jan 18, 2017
We generate the squash commit message incrementally running
a sed script once for each commit. It parses "This is
a combination of <N> commits" from the first line of the
existing message, adds one to <N>, and uses the result as
the number of our current message.

Since f2d1706 (i18n: rebase-interactive: mark comments of
squash for translation, 2016-06-17), the first line may be
localized, and sed uses a pretty liberal regex, looking for:

  /^#.*([0-9][0-9]*)/

The "[0-9][0-9]*" tries to match double digits, but it
doesn't quite work.  The first ".*" is greedy, so if you
have:

  This is a combination of 10 commits.

it will eat up "This is a combination of 1", leaving "0" to
match the first "[0-9]" digit, and then skipping the
optional match of "[0-9]*".

As a result, the count resets every 10 commits, and a
15-commit squash would end up as:

  # This is a combination of 5 commits.
  # This is the 1st commit message:
  ...
  # This is the commit message #2:
  ... and so on ..
  # This is the commit message #10:
  ...
  # This is the commit message #1:
  ...
  # This is the commit message #2:
  ... etc, up to 5 ...

We can fix this by making the ".*" less greedy. Instead of
depending on ".*?" working portably, we can just limit the
match to non-digit characters, which accomplishes the same
thing.

Reported-by: Brandon Tolsch <btolsch@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Jun 7, 2017
Amend my change earlier in this series ("grep: add support for the
PCRE v1 JIT API", 2017-04-11) to un-break the build on PCRE v1
versions later than 8.31 compiled without --enable-jit.

As explained in that change and a later compatibility change in this
series ("grep: un-break building with PCRE < 8.32", 2017-05-10) the
pcre_jit_exec() function is a faster path to execute the JIT.

Unfortunately there's no compatibility stub for that function compiled
into the library if pcre_config(PCRE_CONFIG_JIT, &ret) would return 0,
and no macro that can be used to check for it, so the only portable
option to support builds without --enable-jit is via a new
NO_LIBPCRE1_JIT=UnfortunatelyYes Makefile option[1].

Another option would be to make the JIT opt-in via
USE_LIBPCRE1_JIT=YesPlease, after all it's not a default option of
PCRE v1.

I think it makes more sense to make it opt-out since even though it's
not a default option, most packagers of PCRE seem to turn it on by
default, with the notable exception of the MinGW package.

Make the MinGW platform work by default by changing the build defaults
to turn on NO_LIBPCRE1_JIT=UnfortunatelyYes. It is the only platform
that turns on USE_LIBPCRE=YesPlease by default, see commit
df5218b ("config.mak.uname: support MSys2", 2016-01-13) for that
change.

1. "How do I support pcre1 JIT on all
   versions?"  (https://lists.exim.org/lurker/thread/20170601.103148.10253788.en.html)

2. https://github.com/Alexpux/MINGW-packages/blob/master/mingw-w64-pcre/PKGBUILD
   (referenced from "Re: PCRE v2 compile error, was Re: What's cooking
   in git.git (May 2017, #1; Mon, 1)";
   <alpine.DEB.2.20.1705021756530.3480@virtualbox>)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Jul 14, 2017
We turn off ASan's leak detection by default in the test
suite because it's too noisy. But we don't do so until
part-way through test-lib. This is before we've run any
tests, but after we do our initial "./git" to see if the
binary has even been built.

When built with clang, this seems to work fine. However,
using "gcc -fsanitize=address", the leak checker seems to
complain more aggressively:

  $ ./git
  ...
  ==5352==ERROR: LeakSanitizer: detected memory leaks
  Direct leak of 2 byte(s) in 1 object(s) allocated from:
      #0 0x7f120e7afcf8 in malloc (/usr/lib/x86_64-linux-gnu/libasan.so.3+0xc1cf8)
      #1 0x559fc2a3ce41 in do_xmalloc /home/peff/compile/git/wrapper.c:60
      #2 0x559fc2a3cf1a in do_xmallocz /home/peff/compile/git/wrapper.c:100
      #3 0x559fc2a3d0ad in xmallocz /home/peff/compile/git/wrapper.c:108
      #4 0x559fc2a3d0ad in xmemdupz /home/peff/compile/git/wrapper.c:124
      #5 0x559fc2a3d0ad in xstrndup /home/peff/compile/git/wrapper.c:130
      #6 0x559fc274535a in main /home/peff/compile/git/common-main.c:39
      #7 0x7f120dabd2b0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202b0)

This is a leak in the sense that we never free it, but it's
in a global that is meant to last the whole program. So it's
not really interesting or in need of fixing. And at any
rate, mentioning leaks outside of the test_expect blocks is
certainly unwelcome, as it pollutes stderr.

Let's bump the setting of ASAN_OPTIONS higher in test-lib.sh
to catch our initial "can we even run git?" test.  While
we're at it, we can add a comment to make it a bit less
inscrutable.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Dec 31, 2024
In 1b9e9be (csum-file.c: use unsafe SHA-1 implementation when
available, 2024-09-26) we have converted our `struct hashfile` to use
the unsafe SHA1 backend, which results in a significant speedup. One
needs to be careful with how to use that structure now though because
callers need to consistently use either the safe or unsafe variants of
SHA1, as otherwise one can easily trigger corruption.

As it turns out, we have one inconsistent usage in our tree because we
directly initialize `struct hashfile_checkpoint::ctx` with the safe
variant of SHA1, but end up writing to that context with the unsafe
ones. This went unnoticed so far because our CI systems do not exercise
different hash functions for these two backends, and consequently safe
and unsafe variants are equivalent. But when using SHA1DC as safe and
OpenSSL as unsafe backend this leads to a crash an t1050:

    ++ git -c core.compression=0 add large1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==1367==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x507000000db0 sp 0x7fffffff5690 T0)
    ==1367==The signal is caused by a READ memory access.
    ==1367==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x555555b9905d in deflate_blob_to_pack ../bulk-checkin.c:286:4
        #5 0x555555b98ae9 in index_blob_bulk_checkin ../bulk-checkin.c:362:15
        #6 0x555555ddab62 in index_blob_stream ../object-file.c:2756:9
        #7 0x555555dda420 in index_fd ../object-file.c:2778:9
        #8 0x555555ddad76 in index_path ../object-file.c:2796:7
        #9 0x555555e947f3 in add_to_index ../read-cache.c:771:7
        #10 0x555555e954a4 in add_file_to_index ../read-cache.c:804:9
        #11 0x5555558b5c39 in add_files ../builtin/add.c:355:7
        #12 0x5555558b412e in cmd_add ../builtin/add.c:578:18
        #13 0x555555b1f493 in run_builtin ../git.c:480:11
        #14 0x555555b1bfef in handle_builtin ../git.c:740:9
        #15 0x555555b1e6f4 in run_argv ../git.c:807:4
        #16 0x555555b1b87a in cmd_main ../git.c:947:19
        #17 0x5555561649e6 in main ../common-main.c:64:11
        #18 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #19 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #20 0x555555772c84 in _start (git+0x21ec84)

    ==1367==Register values:
    rax = 0x0000511000001080  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x0000507000000db0  rbp = 0x0000507000000db0  rsp = 0x00007fffffff5690
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b38  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==1367==ABORTING
    ./test-lib.sh: line 1023:  1367 Aborted                 git $config add large1
    error: last command exited with $?=134
    not ok 4 - add with -c core.compression=0

Fix the issue by using the unsafe variant instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Dec 31, 2024
Same as with the preceding commit, git-fast-import(1) is using the safe
variant to initialize a hashfile checkpoint. This leads to a segfault
when passing the checkpoint into the hashfile subsystem because it would
use the unsafe variants instead:

    ++ git --git-dir=R/.git fast-import --big-file-threshold=1
    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==577126==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000040 (pc 0x7ffff7a01a99 bp 0x5070000009c0 sp 0x7fffffff5b30 T0)
    ==577126==The signal is caused by a READ memory access.
    ==577126==Hint: address points to the zero page.
        #0 0x7ffff7a01a99 in EVP_MD_CTX_copy_ex (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4)
        #1 0x555555ddde56 in openssl_SHA1_Clone ../sha1/openssl.h:40:2
        #2 0x555555dce2fc in git_hash_sha1_clone_unsafe ../object-file.c:123:2
        #3 0x555555c2d5f8 in hashfile_checkpoint ../csum-file.c:211:2
        #4 0x5555559647d1 in stream_blob ../builtin/fast-import.c:1110:2
        #5 0x55555596247b in parse_and_store_blob ../builtin/fast-import.c:2031:3
        #6 0x555555967f91 in file_change_m ../builtin/fast-import.c:2408:5
        #7 0x55555595d8a2 in parse_new_commit ../builtin/fast-import.c:2768:4
        #8 0x55555595bb7a in cmd_fast_import ../builtin/fast-import.c:3614:4
        #9 0x555555b1f493 in run_builtin ../git.c:480:11
        #10 0x555555b1bfef in handle_builtin ../git.c:740:9
        #11 0x555555b1e6f4 in run_argv ../git.c:807:4
        #12 0x555555b1b87a in cmd_main ../git.c:947:19
        #13 0x5555561649e6 in main ../common-main.c:64:11
        #14 0x7ffff742a1fb in __libc_start_call_main (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #15 0x7ffff742a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/65h17wjrrlsj2rj540igylrx7fqcd6vq-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: bf320110569c8ec2425e9a0c5e4eb7e97f1fb6e4)
        #16 0x555555772c84 in _start (git+0x21ec84)

    ==577126==Register values:
    rax = 0x0000511000000cc0  rbx = 0x0000000000000000  rcx = 0x000000000000000c  rdx = 0x0000000000000000
    rdi = 0x0000000000000000  rsi = 0x00005070000009c0  rbp = 0x00005070000009c0  rsp = 0x00007fffffff5b30
     r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x00007ffff7a01a30
    r12 = 0x0000000000000000  r13 = 0x00007fffffff6b60  r14 = 0x00007ffff7ffd000  r15 = 0x00005555563b9910
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV (/nix/store/h1ydpxkw9qhjdxjpic1pdc2nirggyy6f-openssl-3.3.2/lib/libcrypto.so.3+0x201a99) (BuildId: 41746a580d39075fc85e8c8065b6c07fb34e97d4) in EVP_MD_CTX_copy_ex
    ==577126==ABORTING
    ./test-lib.sh: line 1039: 577126 Aborted                 git --git-dir=R/.git fast-import --big-file-threshold=1 < input
    error: last command exited with $?=134
    not ok 167 - R: blob bigger than threshold

The segfault is only exposed in case the unsafe and safe backends are
different from one another.

Fix the issue by initializing the context with the unsafe SHA1 variant.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Jan 1, 2025
Our CI jobs sometimes see false positive leaks like this:

        =================================================================
        ==3904583==ERROR: LeakSanitizer: detected memory leaks

        Direct leak of 32 byte(s) in 1 object(s) allocated from:
            #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
            #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
            #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
            #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
            #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
            #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
            #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
            #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This is not a leak in our code, but appears to be a race between one
thread calling exit() while another one is in LSan's stack setup code.
You can reproduce it easily by running t0003 or t5309 with --stress
(these trigger it because of the threading in git-grep and index-pack
respectively).

This may be a bug in LSan, but regardless of whether it is eventually
fixed, it is useful to work around it so that we stop seeing these false
positives.

We can recognize it by the mention of the sanitizer functions in the
DEDUP_TOKEN line. With this patch, the scripts mentioned above should
run with --stress indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Jan 20, 2025
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Jan 20, 2025
An internal customer reported a segfault when running `git
sparse-checkout set` with the `index.sparse` config enabled. I was
unable to reproduce it locally, but with their help we debugged into the
failing process and discovered the following stacktrace:

```
#0  0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125
git-for-windows#1  0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247
git-for-windows#2  0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122
git-for-windows#3  0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638
git-for-windows#4  0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255
git-for-windows#5  0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570)    at sparse-index.c:307
git-for-windows#6  0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46
git-for-windows#7  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#8  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#9  0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422
git-for-windows#10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456
git-for-windows#11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0,    search_mode=EXPAND_SPARSE) at read-cache.c:556
git-for-windows#12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566
git-for-windows#13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756
git-for-windows#14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860
git-for-windows#15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063
git-for-windows#16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548
git-for-windows#17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808
git-for-windows#18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877
git-for-windows#19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017
git-for-windows#20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 
```

The very bottom of the stack being the `rehash()` method from
`hashmap.c` as called within the `name-hash` API made me look at where
these hashmaps were being used in the sparse index logic. These were
being copied across indexes, which seems dangerous. Indeed, clearing
these hashmaps and setting them as not initialized fixes the segfault.

The second commit is a response to a test failure that happens in
`t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to
fail because the underlying `git checkout-index` process fails due to
colliding files. Passing the `-f` flag appears to work, but it's unclear
why this name-hash change causes that change in behavior.
dscho pushed a commit that referenced this pull request Feb 3, 2025
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only via Meson with `meson setup -Dsanitize=leak`:

    Direct leak of 129 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
        #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
        #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
        #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
        #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
        #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
        #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
        #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
        #9 0x555555700547 in run_builtin ../git.c:480:11
        #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
        #11 0x5555556ffee8 in run_argv ../git.c:807:4
        #12 0x5555556fee6b in cmd_main ../git.c:947:19
        #13 0x55555593f689 in main ../common-main.c:64:11
        #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #16 0x5555555ad1d4 in _start (git+0x591d4)

    DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Feb 3, 2025
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
via `meson setup -Dsanitize=leak`:

    Direct leak of 5 byte(s) in 1 object(s) allocated from:
        #0 0x5555555cfb93 in malloc (scalar+0x7bb93)
        #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
        #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
        #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
        #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
        #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
        #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
        #8 0x5555555d196b in cmd_main ../scalar.c:992:14
        #9 0x5555557c4059 in main ../common-main.c:64:11
        #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #12 0x555555592054 in _start (scalar+0x3e054)

    DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Feb 5, 2025
When trying to create a Unix socket in a path that exceeds the maximum
socket name length we try to first change the directory into the parent
folder before creating the socket to reduce the length of the name. When
this fails we error out of `unix_sockaddr_init()` with an error code,
which indicates to the caller that the context has not been initialized.
Consequently, they don't release that context.

This leads to a memory leak: when we have already populated the context
with the original directory that we need to chdir(3p) back into, but
then the chdir(3p) into the socket's parent directory fails, then we
won't release the original directory's path. The leak is exposed by
t0301, but only when running tests in a directory hierarchy whose path
is long enough to make the socket name length exceed the maximum socket
name length:

    Direct leak of 129 byte(s) in 1 object(s) allocated from:
        #0 0x5555555e85c6 in realloc.part.0 lsan_interceptors.cpp.o
        #1 0x55555590e3d6 in xrealloc ../wrapper.c:140:8
        #2 0x5555558c8fc6 in strbuf_grow ../strbuf.c:114:2
        #3 0x5555558cacab in strbuf_getcwd ../strbuf.c:605:3
        #4 0x555555923ff6 in unix_sockaddr_init ../unix-socket.c:65:7
        #5 0x555555923e42 in unix_stream_connect ../unix-socket.c:84:6
        #6 0x55555562a984 in send_request ../builtin/credential-cache.c:46:11
        #7 0x55555562a89e in do_cache ../builtin/credential-cache.c:108:6
        #8 0x55555562a655 in cmd_credential_cache ../builtin/credential-cache.c:178:3
        #9 0x555555700547 in run_builtin ../git.c:480:11
        #10 0x5555556ff0e0 in handle_builtin ../git.c:740:9
        #11 0x5555556ffee8 in run_argv ../git.c:807:4
        #12 0x5555556fee6b in cmd_main ../git.c:947:19
        #13 0x55555593f689 in main ../common-main.c:64:11
        #14 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #15 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #16 0x5555555ad1d4 in _start (git+0x591d4)

    DEDUP_TOKEN: ___interceptor_realloc.part.0--xrealloc--strbuf_grow--strbuf_getcwd--unix_sockaddr_init--unix_stream_connect--send_request--do_cache--cmd_credential_cache--run_builtin--handle_builtin--run_argv--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 129 byte(s) leaked in 1 allocation(s).

Fix this leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Feb 5, 2025
We don't free the result of `remote_default_branch()`, leading to a
memory leak. This leak is exposed by t9211, but only when run with Meson
with the `-Db_sanitize=leak` option:

    Direct leak of 5 byte(s) in 1 object(s) allocated from:
        #0 0x5555555cfb93 in malloc (scalar+0x7bb93)
        #1 0x5555556b05c2 in do_xmalloc ../wrapper.c:55:8
        #2 0x5555556b06c4 in do_xmallocz ../wrapper.c:89:8
        #3 0x5555556b0656 in xmallocz ../wrapper.c:97:9
        #4 0x5555556b0728 in xmemdupz ../wrapper.c:113:16
        #5 0x5555556b07a7 in xstrndup ../wrapper.c:119:9
        #6 0x5555555d3a4b in remote_default_branch ../scalar.c:338:14
        #7 0x5555555d20e6 in cmd_clone ../scalar.c:493:28
        #8 0x5555555d196b in cmd_main ../scalar.c:992:14
        #9 0x5555557c4059 in main ../common-main.c:64:11
        #10 0x7ffff7a2a1fb in __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fb) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #11 0x7ffff7a2a2b8 in __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b8) (BuildId: 0a855678aa0cb573cecbb2bcc73ab8239ec472d0)
        #12 0x555555592054 in _start (scalar+0x3e054)

    DEDUP_TOKEN: __interceptor_malloc--do_xmalloc--do_xmallocz--xmallocz--xmemdupz--xstrndup--remote_default_branch--cmd_clone--cmd_main--main--__libc_start_call_main--__libc_start_main@GLIBC_2.2.5--_start
    SUMMARY: LeakSanitizer: 5 byte(s) leaked in 1 allocation(s).

As the `branch` variable may contain a string constant obtained from
parsing command line arguments we cannot free the leaking variable
directly. Instead, introduce a new `branch_to_free` variable that only
ever gets assigned the allocated string and free that one to plug the
leak.

It is unclear why the leak isn't flagged when running the test via our
Makefile.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Feb 26, 2025
Submodule merges are, in general, similar to other merges based on oid
three-way-merge. When a conflict happens, however, Git has two special
cases (introduced in 68d03e4) on handling the conflict before
yielding it to the user. From the merge-ort and merge-recursive sources:

- "Case #1: a is contained in b or vice versa": both strategies try to
perform a fast-forward in the submodules if the commit referred by the
conflicted submodule is descendant of another;

- "Case #2: There are one or more merges that contain a and b in the
submodule.  If there is only one, then present it as a suggestion to the
user, but leave it marked unmerged so the user needs to confirm the
resolution."

Add a small paragraph on merge-strategies.adoc describing this behavior.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 6, 2025
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 6, 2025
An internal customer reported a segfault when running `git
sparse-checkout set` with the `index.sparse` config enabled. I was
unable to reproduce it locally, but with their help we debugged into the
failing process and discovered the following stacktrace:

```
#0  0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125
git-for-windows#1  0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247
git-for-windows#2  0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122
git-for-windows#3  0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638
git-for-windows#4  0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255
git-for-windows#5  0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570)    at sparse-index.c:307
git-for-windows#6  0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46
git-for-windows#7  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#8  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#9  0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422
git-for-windows#10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456
git-for-windows#11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0,    search_mode=EXPAND_SPARSE) at read-cache.c:556
git-for-windows#12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566
git-for-windows#13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756
git-for-windows#14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860
git-for-windows#15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063
git-for-windows#16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548
git-for-windows#17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808
git-for-windows#18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877
git-for-windows#19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017
git-for-windows#20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 
```

The very bottom of the stack being the `rehash()` method from
`hashmap.c` as called within the `name-hash` API made me look at where
these hashmaps were being used in the sparse index logic. These were
being copied across indexes, which seems dangerous. Indeed, clearing
these hashmaps and setting them as not initialized fixes the segfault.

The second commit is a response to a test failure that happens in
`t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to
fail because the underlying `git checkout-index` process fails due to
colliding files. Passing the `-f` flag appears to work, but it's unclear
why this name-hash change causes that change in behavior.
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 13, 2025
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 13, 2025
An internal customer reported a segfault when running `git
sparse-checkout set` with the `index.sparse` config enabled. I was
unable to reproduce it locally, but with their help we debugged into the
failing process and discovered the following stacktrace:

```
#0  0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125
git-for-windows#1  0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247
git-for-windows#2  0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122
git-for-windows#3  0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638
git-for-windows#4  0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255
git-for-windows#5  0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570)    at sparse-index.c:307
git-for-windows#6  0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46
git-for-windows#7  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#8  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#9  0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422
git-for-windows#10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456
git-for-windows#11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0,    search_mode=EXPAND_SPARSE) at read-cache.c:556
git-for-windows#12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566
git-for-windows#13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756
git-for-windows#14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860
git-for-windows#15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063
git-for-windows#16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548
git-for-windows#17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808
git-for-windows#18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877
git-for-windows#19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017
git-for-windows#20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 
```

The very bottom of the stack being the `rehash()` method from
`hashmap.c` as called within the `name-hash` API made me look at where
these hashmaps were being used in the sparse index logic. These were
being copied across indexes, which seems dangerous. Indeed, clearing
these hashmaps and setting them as not initialized fixes the segfault.

The second commit is a response to a test failure that happens in
`t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to
fail because the underlying `git checkout-index` process fails due to
colliding files. Passing the `-f` flag appears to work, but it's unclear
why this name-hash change causes that change in behavior.
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 26, 2025
mjcheetham pushed a commit to mjcheetham/git that referenced this pull request Mar 26, 2025
An internal customer reported a segfault when running `git
sparse-checkout set` with the `index.sparse` config enabled. I was
unable to reproduce it locally, but with their help we debugged into the
failing process and discovered the following stacktrace:

```
#0  0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125
git-for-windows#1  0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247
git-for-windows#2  0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122
git-for-windows#3  0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638
git-for-windows#4  0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255
git-for-windows#5  0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570)    at sparse-index.c:307
git-for-windows#6  0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46
git-for-windows#7  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#8  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
git-for-windows#9  0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422
git-for-windows#10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456
git-for-windows#11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0,    search_mode=EXPAND_SPARSE) at read-cache.c:556
git-for-windows#12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566
git-for-windows#13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756
git-for-windows#14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860
git-for-windows#15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063
git-for-windows#16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548
git-for-windows#17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808
git-for-windows#18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877
git-for-windows#19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017
git-for-windows#20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 
```

The very bottom of the stack being the `rehash()` method from
`hashmap.c` as called within the `name-hash` API made me look at where
these hashmaps were being used in the sparse index logic. These were
being copied across indexes, which seems dangerous. Indeed, clearing
these hashmaps and setting them as not initialized fixes the segfault.

The second commit is a response to a test failure that happens in
`t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to
fail because the underlying `git checkout-index` process fails due to
colliding files. Passing the `-f` flag appears to work, but it's unclear
why this name-hash change causes that change in behavior.
git-for-windows-ci pushed a commit that referenced this pull request Mar 28, 2025
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Apr 25, 2025
The GitHub's CI workflow uses 'actions/checkout@v4' to checkout the
repository. This action defaults to using the GitHub REST API to obtain
the repository if the `git` executable isn't available.

The step to build Git in the GitHub workflow can be summarized as:

  ...
  - uses: actions/checkout@v4 #1
  - run: ci/install-dependencies.sh #2
  ...
  - run: sudo --preserve-env --set-home --user=builder ci/run-build-and-tests.sh #3
  ...

Step #1, clones the repository, since the `git` executable isn't present
at this step, it uses GitHub's REST API to obtain a tar of the
repository.

Step #2, installs all dependencies, which includes the `git` executable.

Step #3, sets up the build, which includes setting up meson in the meson
job. At this point the `git` executable is present.

This means while the `git` executable is present, the repository doesn't
contain the '.git' folder. To keep both the CI's (GitLab and GitHub)
behavior consistent and to ensure that the build is performed on a
real-world scenario, install `git` before the repository is checked out.
This ensures that 'actions/checkout@v4' will clone the repository
instead of using a tarball. We also update the package cache while
installing `git`, this is because some distros will fail to locate the
package without updating the cache.

Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Jul 7, 2025
In 6aec8d3 (t: refactor tests depending on Perl to print data,
2025-04-03) we have changed some of the tests in t4150 to use sed(1)
instead of Perl. One of the conversions is broken though:

    sed: -e expression #1, char 41: unterminated `s' command

Curiously enough, the test itself still passes. This is caused by a
sequence of failures:

  1. The output of sed(1) is piped into git-update-ref(1), and because
     sed(1) is the upstream command we don't notice that it fails.

  2. git-update-ref(1) does not receive any input and thus won't create
     any references.

  3. We then repack the repository with the configured pseudo merges
     pattern, but as we didn't create any references the pattern doesn't
     match anything.

  4. We use `test_pseudo_merges()` to compute the list of pseudo-merges
     and write it into a file. This file is empty as there are none.

  5. The loop over the pseudo-merges becomes a no-op.

  6. The final test succeeds as well because the number of lines in an
     empty file is obviously the same as the number of unique lines,
     namely zero.

Fix the issue by adding the terminating '|' to the sed(1) command.
Furthermore, make the test a tiny bit more robust by not using it as
part of a pipe.

Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Jul 22, 2025
find_cfg_ent() allocates a struct reflog_expire_entry_option via
FLEX_ALLOC_MEM and inserts it into a linked list in the
reflog_expire_options structure. The entries in this list are never
freed, resulting in a leak in cmd_reflog_expire and the gc reflog expire
maintenance task:

Direct leak of 39 byte(s) in 1 object(s) allocated from:
    #0 0x7ff975ee6883 in calloc (/lib64/libasan.so.8+0xe6883)
    #1 0x0000010edada in xcalloc ../wrapper.c:154
    #2 0x000000df0898 in find_cfg_ent ../reflog.c:28
    #3 0x000000df0898 in reflog_expire_config ../reflog.c:70
    #4 0x00000095c451 in configset_iter ../config.c:2116
    #5 0x0000006d29e7 in git_config ../config.h:724
    #6 0x0000006d29e7 in cmd_reflog_expire ../builtin/reflog.c:205
    #7 0x0000006d504c in cmd_reflog ../builtin/reflog.c:419
    #8 0x0000007e4054 in run_builtin ../git.c:480
    #9 0x0000007e4054 in handle_builtin ../git.c:746
    #10 0x0000007e8a35 in run_argv ../git.c:813
    #11 0x0000007e8a35 in cmd_main ../git.c:953
    #12 0x000000441e8f in main ../common-main.c:9
    #13 0x7ff9754115f4 in __libc_start_call_main (/lib64/libc.so.6+0x35f4)
    #14 0x7ff9754116a7 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x36a7)
    #15 0x000000444184 in _start (/home/jekeller/libexec/git-core/git+0x444184)

Close this leak by adding a reflog_clear_expire_config() function which
iterates the linked list and frees its elements. Call it upon exit of
cmd_reflog_expire() and reflog_expire_condition().

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-for-windows-ci pushed a commit that referenced this pull request Jul 23, 2025
find_cfg_ent() allocates a struct reflog_expire_entry_option via
FLEX_ALLOC_MEM and inserts it into a linked list in the
reflog_expire_options structure. The entries in this list are never
freed, resulting in a leak in cmd_reflog_expire and the gc reflog expire
maintenance task:

Direct leak of 39 byte(s) in 1 object(s) allocated from:
    #0 0x7ff975ee6883 in calloc (/lib64/libasan.so.8+0xe6883)
    #1 0x0000010edada in xcalloc ../wrapper.c:154
    #2 0x000000df0898 in find_cfg_ent ../reflog.c:28
    #3 0x000000df0898 in reflog_expire_config ../reflog.c:70
    #4 0x00000095c451 in configset_iter ../config.c:2116
    #5 0x0000006d29e7 in git_config ../config.h:724
    #6 0x0000006d29e7 in cmd_reflog_expire ../builtin/reflog.c:205
    #7 0x0000006d504c in cmd_reflog ../builtin/reflog.c:419
    #8 0x0000007e4054 in run_builtin ../git.c:480
    #9 0x0000007e4054 in handle_builtin ../git.c:746
    #10 0x0000007e8a35 in run_argv ../git.c:813
    #11 0x0000007e8a35 in cmd_main ../git.c:953
    #12 0x000000441e8f in main ../common-main.c:9
    #13 0x7ff9754115f4 in __libc_start_call_main (/lib64/libc.so.6+0x35f4)
    #14 0x7ff9754116a7 in __libc_start_main@@GLIBC_2.34 (/lib64/libc.so.6+0x36a7)
    #15 0x000000444184 in _start (/home/jekeller/libexec/git-core/git+0x444184)

Close this leak by adding a reflog_clear_expire_config() function which
iterates the linked list and frees its elements. Call it upon exit of
cmd_reflog_expire() and reflog_expire_condition().

Add a basic test which covers this leak. While at it, cover the
functionality from commit commit 3cb22b8 (Per-ref reflog expiry
configuration, 2008-06-15). We've had this support for years, but lacked
any tests.

Co-developed-by: Jeff King <peff@peff.net>
Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
@icebear-ux icebear-ux mentioned this pull request Aug 12, 2025
1 task
dscho added a commit that referenced this pull request Aug 13, 2025
Includes these pull requests:

	#1
	#6
	#10
	#11
	#157
	#212
	#260
	#270

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
dscho added a commit that referenced this pull request Aug 13, 2025
An internal customer reported a segfault when running `git
sparse-checkout set` with the `index.sparse` config enabled. I was
unable to reproduce it locally, but with their help we debugged into the
failing process and discovered the following stacktrace:

```
#0  0x00007ff6318fb7b0 in rehash (map=0x3dfb00d0440, newsize=1048576) at hashmap.c:125
#1  0x00007ff6318fbc66 in hashmap_add (map=0x3dfb00d0440, entry=0x3dfb5c58bc8) at hashmap.c:247
#2  0x00007ff631937a70 in hash_index_entry (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:122
#3  0x00007ff631938a2f in add_name_hash (istate=0x3dfb00d0400, ce=0x3dfb5c58bc8) at name-hash.c:638
#4  0x00007ff631a064de in set_index_entry (istate=0x3dfb00d0400, nr=8291, ce=0x3dfb5c58bc8) at sparse-index.c:255
#5  0x00007ff631a06692 in add_path_to_index (oid=0x5ff130, base=0x5ff580, path=0x3dfb4b725da "<redacted>", mode=33188, context=0x5ff570)    at sparse-index.c:307
#6  0x00007ff631a3b48c in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41f60, base=0x5ff580, depth=2, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:46
#7  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41e80, base=0x5ff580, depth=1, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
#8  0x00007ff631a3b60b in read_tree_at (r=0x7ff631c026a0 <the_repo>, tree=0x3dfb5b41ac8, base=0x5ff580, depth=0, pathspec=0x5ff5a0,    fn=0x7ff631a064e5 <add_path_to_index>, context=0x5ff570) at tree.c:80
#9  0x00007ff631a06a95 in expand_index (istate=0x3dfb00d0100, pl=0x0) at sparse-index.c:422
#10 0x00007ff631a06cbd in ensure_full_index (istate=0x3dfb00d0100) at sparse-index.c:456
#11 0x00007ff631990d08 in index_name_stage_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21, stage=0,    search_mode=EXPAND_SPARSE) at read-cache.c:556
#12 0x00007ff631990d6c in index_name_pos (istate=0x3dfb00d0100, name=0x3dfb0020080 "algorithm/levenshtein", namelen=21) at read-cache.c:566
#13 0x00007ff63180dbb5 in sanitize_paths (argc=185, argv=0x3dfb0030018, prefix=0x0, skip_checks=0) at builtin/sparse-checkout.c:756
#14 0x00007ff63180de50 in sparse_checkout_set (argc=185, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:860
#15 0x00007ff63180e6c5 in cmd_sparse_checkout (argc=186, argv=0x3dfb0030018, prefix=0x0) at builtin/sparse-checkout.c:1063
#16 0x00007ff6317234cb in run_builtin (p=0x7ff631ad9b38 <commands+2808>, argc=187, argv=0x3dfb0030018) at git.c:548
#17 0x00007ff6317239c0 in handle_builtin (argc=187, argv=0x3dfb0030018) at git.c:808
#18 0x00007ff631723c7d in run_argv (argcp=0x5ffdd0, argv=0x5ffd78) at git.c:877
#19 0x00007ff6317241d1 in cmd_main (argc=187, argv=0x3dfb0030018) at git.c:1017
#20 0x00007ff631838b60 in main (argc=190, argv=0x3dfb0030000) at common-main.c:64 
```

The very bottom of the stack being the `rehash()` method from
`hashmap.c` as called within the `name-hash` API made me look at where
these hashmaps were being used in the sparse index logic. These were
being copied across indexes, which seems dangerous. Indeed, clearing
these hashmaps and setting them as not initialized fixes the segfault.

The second commit is a response to a test failure that happens in
`t1092-sparse-checkout-compatibility.sh` where `git stash pop` starts to
fail because the underlying `git checkout-index` process fails due to
colliding files. Passing the `-f` flag appears to work, but it's unclear
why this name-hash change causes that change in behavior.
git-for-windows-ci pushed a commit that referenced this pull request Sep 2, 2025
The fill_packs_from_midx() method was refactored in fcb2205 (midx:
implement support for writing incremental MIDX chains, 2024-08-06) to
allow for preferred packfiles and incremental multi-pack-indexes.
However, this led to some conditions that can cause improperly
initialized memory in the context's list of packfiles.

The conditions caring about the preferred pack name or the incremental
flag are currently necessary to load a packfile. But the context is
still being populated with pack_info structs based on the packfile array
for the existing multi-pack-index even if prepare_midx_pack() isn't
called.

Add a new test that breaks under --stress when compiled with
SANITIZE=address. The chosen number of 100 packfiles was selected to get
the --stress output to fail about 50% of the time, while 50 packfiles
could not get a failure in most --stress runs.

The test case is marked as EXPENSIVE not only because of the number of
packfiles it creates, but because some CI environments were reporting
errors during the test that I could not reproduce, specifically around
being unable to open the packfiles or their pack-indexes.

When it fails under SANITIZE=address, it provides the following error:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==3263517==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000027
==3263517==The signal is caused by a READ memory access.
==3263517==Hint: address points to the zero page.
    #0 0x562d5d82d1fb in close_pack_windows packfile.c:299
    #1 0x562d5d82d3ab in close_pack packfile.c:354
    #2 0x562d5d7bfdb4 in write_midx_internal midx-write.c:1490
    #3 0x562d5d7c7aec in midx_repack midx-write.c:1795
    #4 0x562d5d46fff6 in cmd_multi_pack_index builtin/multi-pack-index.c:305
    ...

This failure stack trace is disconnected from the real fix because the bad
pointers are accessed later when closing the packfiles from the context.

There are a few different aspects to this fix that are worth noting:

 1. We return to the previous behavior of fill_packs_from_midx to not
    rely on the incremental flag or existence of a preferred pack.

 2. The behavior to scan all layers of an incremental midx is kept, so
    this is not a full revert of the change.

 3. We skip allocating more room in the pack_info array if the pack
    fails prepare_midx_pack().

 4. The method has always returned 0 for success and 1 for failure, but
    the condition checking for error added a check for a negative result
    for failure, so that is now updated.

 5. The call to open_pack_index() is removed, but this is needed later
    in the case of a preferred pack. That call is moved to immediately
    before its result is needed (checking for the object count).

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
dscho pushed a commit that referenced this pull request Sep 8, 2025
The fill_packs_from_midx() method was refactored in fcb2205 (midx:
implement support for writing incremental MIDX chains, 2024-08-06) to
allow for preferred packfiles and incremental multi-pack-indexes.
However, this led to some conditions that can cause improperly
initialized memory in the context's list of packfiles.

The conditions caring about the preferred pack name or the incremental
flag are currently necessary to load a packfile. But the context is
still being populated with pack_info structs based on the packfile array
for the existing multi-pack-index even if prepare_midx_pack() isn't
called.

Add a new test that breaks under --stress when compiled with
SANITIZE=address. The chosen number of 100 packfiles was selected to get
the --stress output to fail about 50% of the time, while 50 packfiles
could not get a failure in most --stress runs.

The test case is marked as EXPENSIVE not only because of the number of
packfiles it creates, but because some CI environments were reporting
errors during the test that I could not reproduce, specifically around
being unable to open the packfiles or their pack-indexes.

When it fails under SANITIZE=address, it provides the following error:

AddressSanitizer:DEADLYSIGNAL
=================================================================
==3263517==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000027
==3263517==The signal is caused by a READ memory access.
==3263517==Hint: address points to the zero page.
    #0 0x562d5d82d1fb in close_pack_windows packfile.c:299
    #1 0x562d5d82d3ab in close_pack packfile.c:354
    #2 0x562d5d7bfdb4 in write_midx_internal midx-write.c:1490
    #3 0x562d5d7c7aec in midx_repack midx-write.c:1795
    #4 0x562d5d46fff6 in cmd_multi_pack_index builtin/multi-pack-index.c:305
    ...

This failure stack trace is disconnected from the real fix because the bad
pointers are accessed later when closing the packfiles from the context.

There are a few different aspects to this fix that are worth noting:

 1. We return to the previous behavior of fill_packs_from_midx to not
    rely on the incremental flag or existence of a preferred pack.

 2. The behavior to scan all layers of an incremental midx is kept, so
    this is not a full revert of the change.

 3. We skip allocating more room in the pack_info array if the pack
    fails prepare_midx_pack().

 4. The method has always returned 0 for success and 1 for failure, but
    the condition checking for error added a check for a negative result
    for failure, so that is now updated.

 5. The call to open_pack_index() is removed, but this is needed later
    in the case of a preferred pack. That call is moved to immediately
    before its result is needed (checking for the object count).

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants