Tags: MironAtHome/yugabyte-db
[BACKPORT 2025.1][PLAT-18057]: Add YSQL migration file based validation for upgrades and backup-restore
Summary: Added YSQL migration file based validation for upgrades and backup-restore. During an upgrade, YBA validates that the target DB version has all the YSQL migration files of the previous version. In case of a restore, the restore universe must have all the files that were present on the universe during backup. If the universe is in the pre-finalize state, we capture the YSQL migration files from the older version; likewise, during a restore, if the universe is in the pre-finalize state, we use the older DB version's YSQL migration files for comparison.
Original diff/commit: fff877d/D45263
Test Plan: Tested manually and added unit test
Reviewers: svarshney, hsunder
Reviewed By: svarshney
Subscribers: yugaware
Differential Revision: https://phorge.dev.yugabyte.com/D45634
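The validation described in PLAT-18057 amounts to a set-containment check. The following is a minimal Python sketch of that idea, not YBA's actual code: the function names, the dictionary keys, and the migration file names are all hypothetical.

```python
def missing_migration_files(required, available):
    """Return the files in `required` that are absent from `available`, sorted."""
    return sorted(set(required) - set(available))


def files_for_comparison(universe):
    """Pick the migration file list used for comparison.

    Assumption: `universe` is a plain dict with hypothetical keys. In the
    pre-finalize state the older DB version's files are used.
    """
    if universe["pre_finalize"]:
        return universe["old_version_files"]
    return universe["current_version_files"]


def validate(required, available, what):
    """Raise if `available` does not contain every file in `required`."""
    missing = missing_migration_files(required, available)
    if missing:
        raise ValueError(f"{what} is missing YSQL migration files: {missing}")


# Upgrade: the target version must contain all files of the current version.
current = ["V1__init.sql", "V2__views.sql"]
target = ["V1__init.sql", "V2__views.sql", "V3__stats.sql"]
validate(current, target, "target version")

# Restore: the restore universe must contain all files captured at backup time.
restore_universe = {
    "pre_finalize": True,
    "old_version_files": ["V1__init.sql"],
    "current_version_files": ["V1__init.sql", "V2__views.sql"],
}
backup_files = ["V1__init.sql", "V2__views.sql"]
restore_gap = missing_migration_files(backup_files, files_for_comparison(restore_universe))
```

Here `restore_gap` is non-empty because the pre-finalize universe is compared using its older version's files, so a restore of this backup would be rejected.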
[BACKPORT 2025.1][PLAT-18174] Break out of infinite loop for Ybc upgrades by removing deleted universes
Summary: Original commit: 75df770 / D45595
On dev-portal we noticed high CPU usage by the YbcUpgrade thread. This happened because universes to be upgraded are added to `ybcUpgradeUniverseSet` and never removed if they are deleted. In addition, the check was a `while` loop rather than a simple `if` check.
## Why don't we need the `while` loop anymore?
We are already doing a `waitForYbc` call, which has 20 retries, so we don't need to explicitly retry `pollUpgradeTaskResult` again. This is the same as the K8s universes case.
## Fix
- Modify the `while` loop to an `if` conditional.
- Remove the universe from `ybcUpgradeUniverseSet` once it's deleted.
Test Plan: No tests
Reviewers: nsingh, vbansal
Reviewed By: nsingh
Differential Revision: https://phorge.dev.yugabyte.com/D45624
[BACKPORT 2024.2][PLAT-18174] Break out of infinite loop for Ybc upgrades by removing deleted universes
Summary: Original commit: 75df770 / D45595
On dev-portal we noticed high CPU usage by the YbcUpgrade thread. This happened because universes to be upgraded are added to `ybcUpgradeUniverseSet` and never removed if they are deleted. In addition, the check was a `while` loop rather than a simple `if` check.
## Why don't we need the `while` loop anymore?
We are already doing a `waitForYbc` call, which has 20 retries, so we don't need to explicitly retry `pollUpgradeTaskResult` again. This is the same as the K8s universes case.
## Fix
- Modify the `while` loop to an `if` conditional.
- Remove the universe from `ybcUpgradeUniverseSet` once it's deleted.
Test Plan: No tests
Reviewers: nsingh, vbansal
Reviewed By: nsingh
Differential Revision: https://phorge.dev.yugabyte.com/D45625
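The PLAT-18174 fix can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not the YBA implementation: `poll_pending_upgrades`, `universe_exists`, and `poll_result` are hypothetical names, and removing a universe after a successful poll is an assumption the commit message does not spell out.

```python
def poll_pending_upgrades(pending, universe_exists, poll_result):
    """Run one scheduler pass over the set of universes awaiting a Ybc upgrade.

    `pending` plays the role of `ybcUpgradeUniverseSet` and is mutated in place.
    """
    for uuid in list(pending):  # iterate over a copy so we can mutate the set
        if not universe_exists(uuid):
            # Deleted universes are dropped; otherwise they would be
            # revisited on every pass forever.
            pending.discard(uuid)
            continue
        # A single `if` check instead of a `while` loop: retries are assumed
        # to happen elsewhere (the commit mentions `waitForYbc` retries).
        if poll_result(uuid):
            pending.discard(uuid)  # assumption: done universes leave the set


pending = {"u-active", "u-deleted", "u-done"}
poll_pending_upgrades(
    pending,
    universe_exists=lambda uuid: uuid != "u-deleted",
    poll_result=lambda uuid: uuid == "u-done",
)
```

After one pass only the universe that still exists and has not finished remains in `pending`, and the pass itself always terminates.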
[yugabyte#27344] docdb: Add metrics for fastpath object locking
Summary: This revision adds two metrics:
- object_locking_lock_acquires - the total number of object lock acquires (from both Lock() calls and fastpath lock requests consumed).
- object_locking_fastpath_acquires - the total number of fastpath object lock acquires consumed.
Jira: DB-16853
Test Plan: Jenkins. Verified metrics on local cluster with enable_object_lock_fastpath on/off.
Reviewers: bkolagani
Reviewed By: bkolagani
Subscribers: ybase
Differential Revision: https://phorge.dev.yugabyte.com/D45525
[yugabyte#28044][yugabyte#28040][yugabyte#28000] YSQL: Fix SIGTERM in table cache invalidation on abort
Summary: When Transactional DDL is enabled, a transaction rollback can crash with SIGTERM due to many retries. This can only happen if the transaction contained an `ALTER TABLE` statement.
The crash happens when we receive a `kConflict` error while invalidating the table cache of the tables that have been altered in the transaction block. The invalidation logic was introduced in D42574/1b8fe192d2ac0d2253cb45bb1d1b28dd821ab05f. The invalidation is necessary to avoid schema mismatch errors, since the rollback of an alter table statement can also increment the schema version. More details are in the linked revision/commit.
Invalidation simply involves deletion from the `table_cache_` map stored in `PgSession`. To do this, we remember the `relid` of all tables that have been altered during the transaction (`YbTrackAlteredTableId`). During abort, we extract the `database_oid` and `relfilenode_id` from the `relid` and then construct the `PgObjectId` from them, which acts as the key for the `table_cache_`.
This extraction can lead to a read operation on the relcache and hence on `pg_class`. As a result, it is prone to receiving `kConflict` errors due to concurrent DDLs. This is easily reproducible in the `PgDdlAtomicityStressTest.Main` unit test, which does concurrent DDLs.
This revision sidesteps the issue by directly remembering the `database_oid` and `relfilenode_id` during the alter table execution. As a result, the invalidation during the abort path doesn't require any read operations on the catalog and we can directly delete from the map.
This revision also enables transactional DDL in ddl_atomicity tests.
Jira: DB-17665, DB-17663, DB-17619
Test Plan: All DDL atomicity tests.
`./yb_build.sh --cxx-test pgwrapper_pg_ddl_atomicity_stress-test --gtest_filter PgDdlAtomicityStressTest.Main/colocated`
Reviewers: fizaa, kramanathan, #db-approvers
Reviewed By: fizaa, #db-approvers
Subscribers: svc_phabricator, yql
Differential Revision: https://phorge.dev.yugabyte.com/D45453
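The approach in the table cache invalidation fix, remembering the cache key at alter time so that abort needs no catalog read, can be sketched in Python. This is an illustrative model only: the real code is C/C++ inside YSQL, and the `Session` class and its method names here are hypothetical.

```python
class Session:
    """Toy stand-in for a session holding a table cache (hypothetical)."""

    def __init__(self):
        # Maps (database_oid, relfilenode_id) -> cached table description.
        self.table_cache = {}
        # Cache keys of tables altered in the current transaction.
        self.altered_keys = set()

    def track_altered_table(self, database_oid, relfilenode_id):
        # Record the full cache key while the ALTER TABLE executes, so the
        # abort path never has to look anything up.
        self.altered_keys.add((database_oid, relfilenode_id))

    def abort(self):
        # Pure in-memory deletes: no reads that could hit a conflict.
        for key in self.altered_keys:
            self.table_cache.pop(key, None)
        self.altered_keys.clear()


session = Session()
session.table_cache[(1, 100)] = "t1 schema v3"
session.table_cache[(1, 200)] = "t2 schema v1"
session.track_altered_table(1, 100)
session.abort()
```

Only the entry for the altered table is evicted; unrelated entries stay cached.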
[yugabyte#27825] YSQL: Send AuthOK message received from Auth Backend immediately
Summary: When we connect to an authentication backend, connection manager relays messages back and forth between the server and the external client, acting transparently. This is not, however, the case when an `AuthenticationOK` message arrives. Postgres sends it at the end of the auth flow to indicate that authentication has succeeded. Instead of sending it back to the external client directly, we ignore it and then send it after `yb_auth_via_auth_backend` returns.
There doesn't seem to be any reason for the current implementation. This diff immediately relays the AuthOK message back to the client.
This is an innocuous change for now -- it was working earlier and will work now (of course this seems easier to understand). However, when we decide to not buffer the ParameterStatus packets from Postgres and instead relay them immediately to the external client, this becomes a problem. The Postgres driver expects the AuthOK message before any ParameterStatus messages. When we buffer AuthOK and relay ParameterStatus, this confuses the driver, which errors out.
Jira: DB-17419
Test Plan: Verified that the change works locally. Also ran a few Connection Manager Java tests locally to verify that auth still works.
Jenkins: all tests
Reviewers: skumar, asrinivasan, rbarigidad, mkumar, vikram.damle
Reviewed By: rbarigidad, mkumar
Subscribers: svc_phabricator, yql
Differential Revision: https://phorge.dev.yugabyte.com/D45052
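The ordering problem described in the AuthOK commit can be modelled with a short Python sketch. It assumes the future scenario from the summary, in which ParameterStatus messages are relayed immediately; the `relay` function and the string message labels are hypothetical stand-ins, not connection manager code.

```python
def relay(backend_messages, defer_auth_ok):
    """Return the order in which the client would see the backend's messages.

    With `defer_auth_ok=True`, AuthenticationOk is held back until the end of
    the auth flow (the old behaviour); everything else is relayed immediately.
    """
    relayed, deferred = [], []
    for msg in backend_messages:
        if defer_auth_ok and msg == "AuthenticationOk":
            deferred.append(msg)
        else:
            relayed.append(msg)
    return relayed + deferred


from_backend = ["AuthenticationOk", "ParameterStatus", "ParameterStatus"]
old_order = relay(from_backend, defer_auth_ok=True)
new_order = relay(from_backend, defer_auth_ok=False)
```

With deferral, `old_order` ends with `AuthenticationOk`, which is the sequence the summary says confuses the driver; relaying immediately preserves the backend's order.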
[BACKPORT 2025.1][yugabyte#28084] YSQL: [pg15 upgrade] Use RPC bind address on master
Summary: Use the RPC bind address on master, just like we do on tserver. This is needed for Kubernetes-like deployments where the node_name and bind addresses are different, and the cert name is not the hostname.
Jira: DB-17714
Original commit: eaca126 / D45608
Test Plan: jenkins: urgent
Reviewers: fizaa, smishra, anijhawan
Reviewed By: anijhawan
Subscribers: ybase
Differential Revision: https://phorge.dev.yugabyte.com/D45618
[BACKPORT 2025.1.0][PLAT-18093] Extract azcopy/gutil in case they are present
Summary: We recently started packaging the Node Agent on a per-architecture basis. As part of this effort, we now filter out x86-based dependencies when building for ARM. Previously, we were installing the AMD64 version of azcopy on ARM machines, which was not appropriate. Since these utilities are no longer used on the master for backup, it is safe to extract them only if they are present.
Test Plan: Manually created an ARM-based universe.
Reviewers: vbansal, anijhawan, nsingh, anabaria
Reviewed By: anabaria
Differential Revision: https://phorge.dev.yugabyte.com/D45630
[yugabyte#27604] YSQL: Fix DROP CASCADE involving stored generated columns
Summary: Currently, DROP CASCADE on the source column of a generated column does not drop the generated column from docdb (it only drops it from the YSQL catalog). Consequently,
```
CREATE TABLE table1(id INT, c1 INT, stored_col INT GENERATED ALWAYS AS (c1 * 2) STORED);
ALTER TABLE table1 DROP COLUMN c1 CASCADE;
ALTER TABLE table1 ADD COLUMN stored_col INT;
```
throws `ERROR: The column already exists: stored_col`.
This revision fixes this issue by dropping the generated column from the docdb as well.
Jira: DB-17187
Test Plan: ./yb_build.sh --java-test 'org.yb.pgsql.TestPgRegressMisc#testPgRegressMiscIndependent'
Reviewers: fizaa, myang
Reviewed By: fizaa
Subscribers: yql
Differential Revision: https://phorge.dev.yugabyte.com/D44769
[PLAT-18057]: Add YSQL migration file based validation for upgrades and backup-restore
Summary: Added YSQL migration file based validation for upgrades and backup-restore. During an upgrade, YBA validates that the target DB version has all the YSQL migration files of the previous version. In case of a restore, the restore universe must have all the files that were present on the universe during backup. If the universe is in the pre-finalize state, we capture the YSQL migration files from the older version; likewise, during a restore, if the universe is in the pre-finalize state, we use the older DB version's YSQL migration files for comparison.
Test Plan: Tested manually and added unit test
Reviewers: svarshney, hsunder
Reviewed By: svarshney
Subscribers: yugaware
Differential Revision: https://phorge.dev.yugabyte.com/D45263