Skip to content

Commit

Permalink
BUG#34819861: Gtid_set::Interval_chunk mem incorrectly tracked on THD…
Browse files Browse the repository at this point in the history
…_applier_module_receiver

In a GR cluster setup with a primary and two secondaries, it was observed
that the primary was not releasing all allocated memory related to the
"memory/sql/Gtid_set::Interval_chunk" event name.

The memory used by GR associated to the the key
`memory/sql/Gtid_set::Interval_chunk` is allocated by one thread:
`thread/group_rpl/THD_applier_module_receiver`, but it is released by a
different thread: `thread/group_rpl/THD_certifier_broadcast`.
This was causing a incorrect memory usage relative to
`thread/group_rpl/THD_applier_module_receiver` since only memory
allocations were accounted, making the reported memory always
increasing.
The global memory reported on
performance_schema.memory_summary_global_by_event_name` was not
affected.

In order to avoid the incorrect memory usage, the used memory ownership
is transferred from `thread/group_rpl/THD_applier_module_receiver` when
it becomes out of scope into `thread/group_rpl/THD_certifier_broadcast`
when it becomes part of the scope.

Change-Id: Ieda9c40b52a42f75636004d590440674b2665af9
  • Loading branch information
Anibal Pinto authored and dahlerlend committed Sep 14, 2024
1 parent 4e4ef1d commit a96fd3c
Show file tree
Hide file tree
Showing 6 changed files with 206 additions and 3 deletions.
9 changes: 9 additions & 0 deletions include/prealloced_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -310,6 +310,15 @@ class Prealloced_array {
return false;
}

/**
Claim memory ownership.
@param claim take ownership of memory
*/
void claim_memory_ownership(bool claim) {
if (!using_inline_buffer()) my_claim(m_ext.m_array_ptr, claim);
}

/**
Copies an element into the back of the array.
Complexity: Constant (amortized time, reallocation may happen).
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,36 @@
include/group_replication.inc
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
[connection server1]
include/start_and_bootstrap_group_replication.inc

############################################################
# 1. Populate server with data to alloc memory using
# Gtid_set::Interval_chunk key.
CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;
BEGIN;
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
COMMIT;

############################################################
# 2. Assert values on thread and global counters before
# running garbage collect.
include/assert.inc [Global counters shall been higher than sum of applier and certifier threads.]

############################################################
# 3. Wait for execution of another garbage collection

############################################################
# 4. Assert values on thread and global counters after
# running garbage collect.
include/assert.inc [Global counters shall been higher than sum of applier and certifier threads.]

############################################################
# 5. Cleanup
DROP TABLE t1;
include/group_replication_end.inc
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
################################################################################
# Validate that primary releases all allocated memory related to
# "memory/sql/Gtid_set::Interval_chunk" event name.
#
# Test:
# 0. The test requires one server: M1.
# 1. Populate server with data to alloc memory using
# Gtid_set::Interval_chunk key.
# 2. Assert values on thread and global counters before
# running garbage collect.
# 3. Wait for execution of another garbage collection
# 4. Assert values on thread and global counters after
# running garbage collect.
# 5. Cleanup
################################################################################

--source include/have_nodebug.inc
--source include/not_valgrind.inc
--source include/not_asan.inc
--source include/big_test.inc
--source include/have_group_replication_plugin.inc
--let $rpl_skip_group_replication_start= 1
--source include/group_replication.inc

--source include/start_and_bootstrap_group_replication.inc


--echo
--echo ############################################################
--echo # 1. Populate server with data to alloc memory using
--echo # Gtid_set::Interval_chunk key.

CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;

BEGIN;
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
INSERT INTO t1 (c2) values (2);
COMMIT;

--exec $MYSQL_SLAP --create-schema=test --delimiter=";" --iterations=100 --query="INSERT INTO t1 (c2) SELECT c2 FROM t1 LIMIT 5" --concurrency=1 --silent


--echo
--echo ############################################################
--echo # 2. Assert values on thread and global counters before
--echo # running garbage collect.

--let $applier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_applier_module_receiver"`
--let $certifier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_certifier_broadcast"`
--let $global_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_global_by_event_name WHERE EVENT_NAME = "memory/sql/Gtid_set::Interval_chunk"`

--let $assert_text= Global counters shall been higher than sum of applier and certifier threads.
--let $assert_cond= [SELECT $applier_bytes_used + $certifier_bytes_used] <= $global_bytes_used
--source include/assert.inc


--echo
--echo ############################################################
--echo # 3. Wait for execution of another garbage collection

--let $wait_timeout= 150
--let $wait_condition= SELECT Count_transactions_rows_validating=5 FROM performance_schema.replication_group_member_stats WHERE member_id IN (SELECT @@server_uuid)
--source include/wait_condition.inc


--echo
--echo ############################################################
--echo # 4. Assert values on thread and global counters after
--echo # running garbage collect.

--let $applier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_applier_module_receiver"`
--let $certifier_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_by_thread_by_event_name JOIN performance_schema.threads USING(thread_id) WHERE event_name = "memory/sql/Gtid_set::Interval_chunk" AND name = "thread/group_rpl/THD_certifier_broadcast"`
--let $global_bytes_used= `SELECT CURRENT_NUMBER_OF_BYTES_USED FROM performance_schema.memory_summary_global_by_event_name WHERE EVENT_NAME = "memory/sql/Gtid_set::Interval_chunk"`

--let $assert_text= Global counters shall been higher than sum of applier and certifier threads.
--let $assert_cond= [SELECT $applier_bytes_used + $certifier_bytes_used] <= $global_bytes_used
--source include/assert.inc


--echo
--echo ############################################################
--echo # 5. Cleanup

DROP TABLE t1;

--source include/group_replication_end.inc
56 changes: 53 additions & 3 deletions plugin/group_replication/src/certifier.cc
Original file line number Diff line number Diff line change
Expand Up @@ -499,7 +499,15 @@ void Certifier::clear_certification_info() {
for (Certification_info::iterator it = certification_info.begin();
it != certification_info.end(); ++it) {
// We can only delete the last reference.
if (it->second->unlink() == 0) delete it->second;
if (it->second->unlink() == 0) {
/*
Claim Gtid_set_ref used memory to
`thread/group_rpl/THD_certifier_broadcast` thread, since this is thread
that does release the memory.
*/
it->second->claim_memory_ownership(true);
delete it->second;
}
}

certification_info.clear();
Expand Down Expand Up @@ -719,6 +727,15 @@ Certification_result Certifier::add_writeset_to_certification_info(
item_previous_sequence_number != parallel_applier_sequence_number)
transaction_last_committed = item_previous_sequence_number;
}
/*
The memory used by Gtid_set_ref is allocated by
`thread/group_rpl/THD_applier_module_receiver`, though it will be released
by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid untracked
memory release on `thread/group_rpl/THD_applier_module_receiver` we do
dissociate this used memory from this thread.
*/
snapshot_version_value->claim_memory_ownership(false);

return Certification_result::positive;
}

Expand Down Expand Up @@ -1043,7 +1060,15 @@ bool Certifier::add_item(const char *item, Gtid_set_ref *snapshot_version,
*item_previous_sequence_number =
it->second->get_parallel_applier_sequence_number();

if (it->second->unlink() == 0) delete it->second;
if (it->second->unlink() == 0) {
/*
Claim Gtid_set_ref used memory to
`thread/group_rpl/THD_certifier_broadcast` thread, since this is thread
that does release the memory.
*/
it->second->claim_memory_ownership(true);
delete it->second;
}

it->second = snapshot_version;
error = false;
Expand Down Expand Up @@ -1232,7 +1257,15 @@ void Certifier::garbage_collect_internal(Gtid_set *executed_gtid_set,

while (it != certification_info.end()) {
if (it->second->is_subset_not_equals(stable_gtid_set)) {
if (it->second->unlink() == 0) delete it->second;
if (it->second->unlink() == 0) {
/*
Claim Gtid_set_ref used memory to
`thread/group_rpl/THD_certifier_broadcast` thread, since this is
thread that does release the memory.
*/
it->second->claim_memory_ownership(true);
delete it->second;
}
certification_info.erase(it++);
} else
++it;
Expand Down Expand Up @@ -1750,6 +1783,15 @@ bool Certifier::set_certification_info_part(
value->link();
certification_info.insert(
std::pair<std::string, Gtid_set_ref *>(key, value));
/*
The memory used by Gtid_set_ref is allocated by
`thread/group_rpl/THD_applier_module_receiver`, though it will be
released by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid
untracked memory release on
`thread/group_rpl/THD_applier_module_receiver` we do dissociate this
used memory from this thread.
*/
value->claim_memory_ownership(false);
}

return false;
Expand Down Expand Up @@ -1999,6 +2041,14 @@ int Certifier::set_certification_info(
value->link();
certification_info.insert(
std::pair<std::string, Gtid_set_ref *>(key, value));
/*
The memory used by Gtid_set_ref is allocated by
`thread/group_rpl/THD_applier_module_receiver`, though it will be released
by `thread/group_rpl/THD_certifier_broadcast` thread. To avoid untracked
memory release on `thread/group_rpl/THD_applier_module_receiver` we do
dissociate this used memory from this thread.
*/
value->claim_memory_ownership(false);
}

if (initialize_server_gtid_set()) {
Expand Down
8 changes: 8 additions & 0 deletions sql/rpl_gtid.h
Original file line number Diff line number Diff line change
Expand Up @@ -1593,6 +1593,14 @@ class Gtid_set {
public:
/// Destroy this Gtid_set.
~Gtid_set();

/**
Claim ownership of memory.
@param claim claim ownership of memory.
*/
void claim_memory_ownership(bool claim);

/**
Removes all gtids from this Gtid_set.
Expand Down
11 changes: 11 additions & 0 deletions sql/rpl_gtid_set.cc
Original file line number Diff line number Diff line change
Expand Up @@ -101,6 +101,17 @@ Gtid_set::Gtid_set(Tsid_map *_tsid_map, const char *text,
*status = add_gtid_text(text);
}

void Gtid_set::claim_memory_ownership(bool claim) {
m_intervals.claim_memory_ownership(claim);

Interval_chunk *chunk = chunks;
while (chunk != nullptr) {
Interval_chunk *next_chunk = chunk->next;
my_claim(chunk, claim);
chunk = next_chunk;
}
}

void Gtid_set::init() {
DBUG_TRACE;
has_cached_string_length = false;
Expand Down

0 comments on commit a96fd3c

Please sign in to comment.