Align non-fixed MAP_SHARED file mmap to 2 MiB #98

Open
jserv wants to merge 1 commit into main from map-shared-2m
jserv commented Jun 15, 2026

Non-fixed MAP_SHARED file-backed mmap allocations whose result landed in the middle of a 2 MiB block forced hvf_apply_file_overlay_quiesced to split the containing HVF stage-2 segment at both ends. Back-to-back memfd-style allocations consumed roughly two segments per mmap and could approach GUEST_MAX_HVF_SEGMENTS (256). The earlier multi-segment overlay fix handles correctness; this change reduces the segment-table churn.
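
To make the "roughly two per mmap" figure concrete, here is a toy model of the split arithmetic. It is only a sketch: `toy_extra_segments` and `MIB` are hypothetical names, and the real segment bookkeeping inside hvf_apply_file_overlay_quiesced is not shown in this PR description and may differ.

```c
#include <assert.h>
#include <stdint.h>

#define MIB (UINT64_C(1) << 20)

/*
 * Toy model (not the project's code): an overlay [ov_start, ov_end) is
 * carved out of one containing segment [seg_start, seg_end). The overlay
 * reuses one table entry; every leftover piece needs an extra entry.
 * Returns the number of extra entries: 0, 1, or 2.
 */
static unsigned toy_extra_segments(uint64_t seg_start, uint64_t seg_end,
                                   uint64_t ov_start, uint64_t ov_end)
{
    /* The overlay must be non-empty and lie inside the segment. */
    assert(seg_start <= ov_start && ov_start < ov_end && ov_end <= seg_end);

    unsigned extra = 0;
    if (ov_start > seg_start)
        extra++;                /* leftover piece below the overlay */
    if (ov_end < seg_end)
        extra++;                /* leftover piece above the overlay */
    return extra;
}
```

In this model, a 64 KiB overlay at 1 MiB inside a segment spanning 0 to 4 MiB costs two extra entries, while an overlay that starts on the segment boundary costs at most one. At two entries per mapping, on the order of a hundred mid-block mappings would be enough to fill a 256-entry table.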

find_free_gap and find_free_gap_inner gain a per-call alignment argument. sys_mmap passes BLOCK_2MIB when the allocation actually qualifies for the host overlay fast path (MAP_SHARED, host-page-aligned offset, writable backer) and host_page_size_cached() otherwise. The backing fd is now opened before the gap finder so overlay_fd_writable can gate the alignment decision; every failure path closes the ref.
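
The alignment decision described above might look roughly like the following. This is a sketch under stated assumptions: `toy_pick_alignment`, `TOY_MAP_SHARED`, and the `backer_writable` parameter are illustrative stand-ins (the flag value is the Linux one, 0x01), and `BLOCK_2MIB` is redefined locally; the actual sys_mmap code, overlay_fd_writable, and host_page_size_cached() are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_2MIB     (UINT64_C(2) << 20)
#define TOY_MAP_SHARED 0x01   /* assumption: Linux guest value of MAP_SHARED */

/*
 * Hypothetical helper: choose the placement alignment for a non-fixed,
 * file-backed mmap. Returns BLOCK_2MIB only when all three overlay
 * fast-path conditions from the description hold; otherwise returns the
 * host page size.
 */
static uint64_t toy_pick_alignment(int flags, uint64_t offset,
                                   bool backer_writable, uint64_t host_page)
{
    /* The host page size must be a power of two for the mask below. */
    assert(host_page != 0 && (host_page & (host_page - 1)) == 0);

    bool shared         = (flags & TOY_MAP_SHARED) != 0;
    bool offset_aligned = (offset & (host_page - 1)) == 0;

    if (shared && offset_aligned && backer_writable)
        return BLOCK_2MIB;
    return host_page;
}
```

Because writability of the backer is one of the inputs, the fd has to be resolved before the gap search, which is why the description moves the open ahead of the gap finder.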

Alignment is a best-effort placement preference, not a Linux-visible constraint:

  • The cached gap hint advances by host-page granularity so a small MAP_PRIVATE 4 KiB allocation can still occupy the trailing slack of the 2 MiB block.
  • When no 2 MiB-aligned gap is available, sys_mmap retries with host-page alignment instead of returning -ENOMEM.
  • For non-zero addr hints the path probes the exact host-page-aligned hint window first, falls back to 2 MiB alignment in the wider hint range, then to host-page alignment. A guest that pinpoints a free address still gets the address it asked for.
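
The fallback order in the list above can be sketched with a toy first-fit gap finder. Everything here is an assumption made for illustration: `toy_find_gap`, `toy_place`, `struct region`, and `NO_GAP` are hypothetical, the real find_free_gap signature is not shown in this PR description, rounding the hint up to a page boundary is a guess, and address overflow is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_2MIB (UINT64_C(2) << 20)
#define NO_GAP     UINT64_MAX

struct region { uint64_t start, end; };   /* [start, end), sorted, disjoint */

static uint64_t align_up(uint64_t x, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);   /* power of two */
    return (x + align - 1) & ~(align - 1);
}

/* Lowest align-aligned address in [lo, hi) with len free bytes, or NO_GAP. */
static uint64_t toy_find_gap(const struct region *used, size_t n,
                             uint64_t lo, uint64_t hi,
                             uint64_t len, uint64_t align)
{
    uint64_t cand = align_up(lo, align);
    for (size_t i = 0; i < n; i++) {
        if (used[i].end <= cand)
            continue;                       /* region lies below candidate */
        if (cand + len <= used[i].start)
            break;                          /* candidate fits before region */
        cand = align_up(used[i].end, align);   /* skip past the region */
    }
    return (cand + len <= hi) ? cand : NO_GAP;
}

/* Exact hint window first, then 2 MiB in the wider range, then host page. */
static uint64_t toy_place(const struct region *used, size_t n, uint64_t hint,
                          uint64_t lo, uint64_t hi, uint64_t len,
                          uint64_t host_page, int want_2mib)
{
    uint64_t addr;
    if (hint != 0) {
        uint64_t h = align_up(hint, host_page);   /* assumption: round up */
        addr = toy_find_gap(used, n, h, h + len, len, host_page);
        if (addr != NO_GAP)
            return addr;                    /* guest gets what it asked for */
    }
    if (want_2mib) {
        addr = toy_find_gap(used, n, lo, hi, len, BLOCK_2MIB);
        if (addr != NO_GAP)
            return addr;
    }
    /* Best effort only: retry at host-page alignment before giving up. */
    return toy_find_gap(used, n, lo, hi, len, host_page);
}
```

With one page already mapped at 2 MiB, a hint-less 2 MiB-preferring request in this model lands at 4 MiB, a free page-aligned hint is honoured exactly, and a search range with no free 2 MiB-aligned slot still succeeds at host-page alignment instead of failing.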

mremap stays at host-page alignment because the destination does not reinstall a file overlay.

Close #94


Summary by cubic

Aligns non-fixed MAP_SHARED file-backed mmap to 2 MiB when eligible to reduce HVF stage-2 segment fragmentation and avoid segment exhaustion. Falls back to host-page alignment when needed and preserves exact hints; resolves #94.

  • Refactors
    • Gap finder now accepts an alignment; sys_mmap uses 2 MiB for eligible MAP_SHARED allocations and retries at host-page if no 2 MiB gap exists; mremap stays host-page aligned.
    • Open the backing fd before placement to check writability and gate 2 MiB alignment; close on all failure paths.
    • Hint handling: probe the exact host-page-aligned window first, then try 2 MiB within the hint range, then host-page; gap hints still advance by host page so small MAP_PRIVATE mappings can use trailing slack. Added tests for hint fallback and back-to-back MAP_SHARED memfd mappings.

Written for commit d8b8952. Summary will update on new commits.

jserv commented Jun 15, 2026

@doanbaotrung, please validate this PR.

cubic-dev-ai[bot]

This comment was marked as resolved.

Development

Successfully merging this pull request may close these issues.

Align guest MAP_SHARED allocations to stage-2 block boundaries (2 MiB)