bugfix in parallel #82

Open
hgoelzer wants to merge 2 commits into main from hgoelzer/bugfix_parallel

Conversation

@hgoelzer
Collaborator

This is a bug discovered by Mariana. It has limited/no impact on physics, but was responsible for failing GNU debug tests with NorESM.
See NorESMhub@b31d336
M: The problem is in parallel_mpi.F90, subroutine parallel_halo_integer_3d (lines 6740–6763): all eight MPI send/receive calls used mpi_real8 instead of mpi_integer. Since mpi_integer is 4 bytes and mpi_real8 is 8 bytes, MPI was posting receives expecting twice as many bytes as the integer buffers (wrecv, erecv, srecv, nrecv) could hold. This wrote past the end of those stack-allocated buffers, corrupting the heap, which is exactly what the malloc assertion failure in glibc's _int_malloc indicates.
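
To make the size mismatch concrete, here is a minimal arithmetic sketch in Python. It is not the Fortran code from parallel_mpi.F90. The helper name `halo_overrun_bytes` and the element count of 100 are hypothetical, chosen only for illustration. The 4-byte and 8-byte sizes are the ones quoted above.

```python
# Illustrative arithmetic only; this does not call MPI.
# Sizes as quoted in the description above.
MPI_INTEGER_BYTES = 4  # mpi_integer
MPI_REAL8_BYTES = 8    # mpi_real8


def halo_overrun_bytes(count, buffer_elem_bytes, datatype_bytes):
    """Bytes that a transfer of `count` elements described with
    `datatype_bytes` per element can touch beyond a buffer that holds
    `count` elements of `buffer_elem_bytes` each."""
    capacity = count * buffer_elem_bytes   # what the buffer can hold
    described = count * datatype_bytes     # what the MPI call describes
    return max(0, described - capacity)


# Hypothetical halo buffer of 100 integers.
count = 100

# Buggy pairing: integer buffer, transfer described as mpi_real8.
print(halo_overrun_bytes(count, MPI_INTEGER_BYTES, MPI_REAL8_BYTES))    # 400

# Fixed pairing: integer buffer, transfer described as mpi_integer.
print(halo_overrun_bytes(count, MPI_INTEGER_BYTES, MPI_INTEGER_BYTES))  # 0
```

With the mismatched datatype, the described transfer is twice the buffer size, so the overrun equals the full size of the buffer. With mpi_integer the two sizes agree and nothing is written out of bounds.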
The crash surfaces during the parallel_halo(marine_connection_mask_3d, ...) call in glissade_bmlt_float.F90:1391 because that's the first integer 3D halo exchange triggered in this code path.
With the fix in place, the GNU debug test is running now.

@hgoelzer hgoelzer requested review from Katetc and whlipscomb May 16, 2026 19:11
@hgoelzer hgoelzer added the bug Something isn't working label May 16, 2026
