Skip to content

Read comp_size, decomp_size and local_header_ofs for zip64 if extra fields exist#324

Open
larryliu0820 wants to merge 2 commits into
richgel999:masterfrom
larryliu0820:patch-1
Open

Read comp_size, decomp_size and local_header_ofs for zip64 if extra fields exist#324
larryliu0820 wants to merge 2 commits into
richgel999:masterfrom
larryliu0820:patch-1

Conversation

@larryliu0820

@larryliu0820 larryliu0820 commented Nov 7, 2024

Copy link
Copy Markdown

As titled. For a zip64 archive file, comp_size, decomp_size and local_header_ofs are not stored in p + MZ_ZIP_CDH_COMPRESSED_SIZE_OFS, instead they are being stored in pExtra_data.

Here we need to read uint64 for comp_size, decomp_size and local_header_ofs. Good that they are already of that type.

Reference: https://en.wikipedia.org/wiki/ZIP_(file_format)#ZIP64

As titled. For a zip64 archive file, `comp_size` and `decomp_size` are not stored in `p + MZ_ZIP_CDH_COMPRESSED_SIZE_OFS`, instead they are being stored in `pExtra_data`.

Here we need to read `uint64` for `comp_size` and `decomp_size`. Good that they are already of that type.
@larryliu0820 larryliu0820 changed the title Read comp_size and decomp_size for zip64 Read comp_size, decomp_size and local_header_ofs for zip64 if extra fields exist Nov 7, 2024
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
…d patch for zip64"

Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
…d patch for zip64"

Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D65586230](https://our.internmc.facebook.com/intern/diff/D65586230)

[ghstack-poisoned]
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: [D65586230](https://our.internmc.facebook.com/intern/diff/D65586230)

[ghstack-poisoned]
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan: Rely on unit test

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 4501669
Pull Request resolved: #139985
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:
Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Differential Revision: D65586230
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 7, 2024
Summary:
Pull Request resolved: #140041

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Differential Revision: D65586230
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 8, 2024
Summary:
Pull Request resolved: #140041

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Differential Revision: D65586230
larryliu0820 added a commit to pytorch/pytorch that referenced this pull request Nov 8, 2024
Summary:
Pull Request resolved: #140041

Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Reviewed By: mikaylagawarecki, angelayi

Differential Revision: D65586230
pytorchmergebot pushed a commit to pytorch/pytorch that referenced this pull request Nov 9, 2024
Summary:
Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* #79636 patches internal BUCK and bazel build
* #138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Differential Revision: D65586230

Pull Request resolved: #140041
Approved by: https://github.com/mikaylagawarecki
pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
…140041)

Summary:
Bump miniz version from 2.1.0 to 3.0.2 and apply these patches:

* pytorch#79636 patches internal BUCK and bazel build
* pytorch#138959 adds `bool compute_crc32` argument
* miniz PR: richgel999/miniz#324 to support
  zip64

Anyone bumping miniz version again, please apply these patches as well.

Test Plan:
Rely on unit test

Imported from OSS

Differential Revision: D65586230

Pull Request resolved: pytorch#140041
Approved by: https://github.com/mikaylagawarecki
@uroni

uroni commented Mar 9, 2025

Copy link
Copy Markdown
Collaborator

Why do you need the correct zip64 values at that location? As far as I can see at this point they are just used for some checks and deciding if it is a zip64 file or not.
The correct values would be read via mz_zip_file_stat_internal

@benjaminwinger

Copy link
Copy Markdown

Why do you need the correct zip64 values at that location? As far as I can see at this point they are just used for some checks and deciding if it is a zip64 file or not.

While I'm not sure of @larryliu0820's motivations (maybe just to make sure that the check fails correctly when the zip64 fields are corrupt) I've run into an issue which would be fixed by this (it might also be related to #301).

When zip64 attributes are used for archive sizes of less than 2^32 bytes there is a false positive with the following check, since MZ_READ_LE32(p + MZ_ZIP_CDH_LOCAL_HEADER_OFS) is always UINT32_MAX when the value is stored in a separate zip64 field (but the comp_size may not be large enough to also need to be stored in a separate zip64 field):

miniz/miniz_zip.c

Lines 935 to 939 in c883286

if (comp_size != MZ_UINT32_MAX)
{
if (((mz_uint64)MZ_READ_LE32(p + MZ_ZIP_CDH_LOCAL_HEADER_OFS) + MZ_ZIP_LOCAL_DIR_HEADER_SIZE + comp_size) > pZip->m_archive_size)
return mz_zip_set_error(pZip, MZ_ZIP_INVALID_HEADER_OR_CORRUPTED);
}

This only occurs when zip64 fields are used with an archive size of less than about 2^32 bytes, since the sanity check will always succeed if the UINT32_MAX flag value indicating the true value is in an extended field is more than a little smaller than the archive size.

I ran into this with cpython's zipfile implementation, which uses zip64 for values larger than 2^31-1, so zip archives between 2-4GB produced by it will exhibit this problem (though I think they are valid according to the spec since they are marked as zip64 archives, even though it's unnecessary to use the separate field to store the offset if it's smaller than 2^32-1).

I was going to submit a fix (59ca512), but this PR is a better solution (on the other hand, this PR would change the meaning of the comp_size != MZ_UINT32_MAX checks, which I think you would want to remove so you don't get false negatives if the comp_size as stored in the extended field happens to be UINT32_MAX).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants