While a standard Apache Kafka cluster can be stretched across three full datacenters to provide high availability, many organizations are restricted to operating only two primary datacenters for their production data, often for regulatory or compliance reasons. This creates a challenge for implementing robust disaster recovery.
This Proof of Concept (PoC) demonstrates a solution to this specific problem: the 2.5 Datacenter (DC) architecture.
This pattern pairs a largely standard Kafka control plane with a data plane designed for resilience, and it does so by separating the two planes across locations. The data plane (brokers hosting replicas and observers) exists only in the two primary DCs, so production data never leaves them. The control plane's resilience is provided by placing a single KRaft controller in a third, lightweight location. This "tie-breaker" controller does not host any brokers or data; its sole purpose is to participate in the quorum and prevent a split-brain scenario if one of the primary DCs is lost.
When a primary DC fails, the cluster maintains control plane consensus and then uses the data plane's Automatic Observer Promotion (AOP) feature within the surviving primary DC to keep data available. The architecture is particularly useful for organizations that must keep their data strictly within two primary datacenters but still require automated, near-instantaneous failover.
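The tie-breaker argument comes down to majority arithmetic. The following is a minimal sketch, assuming one KRaft controller per location (three voters in total); the function names are illustrative and not part of the PoC.

```shell
#!/bin/sh
# Votes needed for a majority in a quorum of N voters: floor(N/2) + 1.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Exit status 0 if a quorum of $1 voters keeps a majority after losing $2 of them.
quorum_survives() {
  [ $(( $1 - $2 )) -ge "$(majority "$1")" ]
}

# Assumed layout: one controller in each primary DC plus the tie-breaker.
if quorum_survives 3 1; then
  echo "3 controllers, 1 DC lost: quorum holds"
fi

# Same cluster without the tie-breaker: one controller per primary DC.
if ! quorum_survives 2 1; then
  echo "2 controllers, 1 DC lost: quorum lost"
fi
```

With only two voters, losing either one leaves a single vote out of a required two, which is why the third location is needed even though it holds no data.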
The primary purpose of this PoC is to showcase how this specific architecture achieves:
- Recovery Point Objective (RPO) of 0: Guaranteeing zero data loss.
- Recovery Time Objective (RTO) of near-zero: Ensuring the system recovers automatically in seconds.
For more technical details on the pattern, see the Confluent blog post: Automatic Observer Promotion for Safe Multi-Datacenter Failover.
- Automated Provisioning: End-to-end, hands-off deployment of a complete Confluent cluster on AWS using Terraform for infrastructure and Ansible for software configuration.
- Multi-Datacenter Architecture: Deployment of Kafka brokers across multiple failure domains (simulating different datacenters or availability zones) with a topic created using constraint-based replica placement.
- Automatic Observer Promotion: Leverages Confluent Platform's under-min-isr promotion policy to automatically promote an observer to a full replica when a partition's in-sync replica set falls below min.insync.replicas, ensuring cluster availability.
- High Availability & Durability: Demonstrates a configuration (min.insync.replicas=3) that guarantees data is replicated across datacenters before being acknowledged to the producer.
- Disaster Simulation: Includes scripts to simulate a total DC failure, allowing for a clear demonstration of the platform's resilience and automatic recovery capabilities.
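To see how min.insync.replicas=3 and observer promotion interact, here is a small arithmetic sketch. The per-DC replica and observer counts are assumptions chosen for illustration, and the loop is a simplified model of the under-min-isr policy rather than the broker's actual logic.

```shell
#!/bin/sh
# Hypothetical per-partition counts; the PoC's replicas.json may differ.
MIN_ISR=3
REPLICAS_PER_DC=2    # full replicas in each primary DC
OBSERVERS_PER_DC=1   # observers in each primary DC

# Print the in-sync replica count in the surviving DC after the other DC is lost,
# promoting observers only while the ISR is below min.insync.replicas.
isr_after_dc_loss() {
  isr=$REPLICAS_PER_DC
  spare=$OBSERVERS_PER_DC
  while [ "$isr" -lt "$MIN_ISR" ] && [ "$spare" -gt 0 ]; do
    isr=$(( isr + 1 ))
    spare=$(( spare - 1 ))
  done
  echo "$isr"
}

isr=$(isr_after_dc_loss)
if [ "$isr" -ge "$MIN_ISR" ]; then
  echo "ISR=$isr: acks=all writes continue"
else
  echo "ISR=$isr: acks=all writes rejected"
fi
```

Under these assumed counts, four full replicas with a minimum ISR of three means no write is acknowledged until at least one replica in each DC has it, and one observer per DC is enough to restore the minimum after a DC loss.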
The environment is built on AWS and managed entirely through code.
- Cloud Provider: AWS
- Infrastructure as Code: Terraform
- Configuration Management: Ansible (via the confluent.platform collection)
- Streaming Platform: Confluent Platform
The physical layout of the cluster is defined by assigning a rack ID (Kafka's broker.rack setting) to each Kafka broker upon startup. These racks are distributed across the primary and disaster recovery datacenters. This PoC uses the following rack assignments for its brokers:
- prod.0: A rack within the primary datacenter.
- dr.0: A rack within the disaster recovery datacenter.
- prod.1: A second rack within the primary datacenter.
- dr.1: A second rack within the disaster recovery datacenter.
The placement of data replicas on these brokers is then governed by the rules in replicas.json. This file defines how many replicas of a partition should be normal replicas versus observer replicas (a specific type of replica in Confluent MRC). This ensures that data is safely distributed across the racks and datacenters to guarantee availability during a failure.
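For orientation, a placement file of this kind could look like the sketch below. The rack names come from this PoC, but the counts, the file name replicas.example.json, the topic name, the partition count, and the bootstrap address are all assumptions; the repository's own replicas.json is authoritative.

```shell
#!/bin/sh
# Hypothetical placement file; counts are illustrative, not copied from the PoC.
PLACEMENT_FILE="${PLACEMENT_FILE:-replicas.example.json}"

cat > "$PLACEMENT_FILE" <<'EOF'
{
  "version": 2,
  "replicas": [
    {"count": 2, "constraints": {"rack": "prod.0"}},
    {"count": 2, "constraints": {"rack": "dr.0"}}
  ],
  "observers": [
    {"count": 1, "constraints": {"rack": "prod.1"}},
    {"count": 1, "constraints": {"rack": "dr.1"}}
  ],
  "observerPromotionPolicy": "under-min-isr"
}
EOF

# Create a topic with this placement. This only defines the function; call it on
# a host that has the Confluent Platform CLI tools and can reach the brokers.
# Topic name, partition count, and bootstrap address are assumed values.
create_topic() {
  kafka-topics --bootstrap-server "${BOOTSTRAP:-localhost:9092}" \
    --create --topic "${TOPIC:-demo-topic}" --partitions 6 \
    --replica-placement "$PLACEMENT_FILE" \
    --config min.insync.replicas=3
}
```

Each count must not exceed the number of brokers carrying that rack label, and a topic created with --replica-placement takes its replication layout from the file rather than from --replication-factor.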
- An AWS account with programmatic credentials configured for your environment.
- Terraform CLI installed.
- Clone the repository (if you haven't already).
- (Optional) Review and adjust any Terraform variables in variables.tf to suit your needs (e.g., AWS region).
- Execute the master setup script:
  ./setup.sh
  Note: This script will run terraform apply, which will automatically generate an SSH private key (confluent-key.pem) and save it in the project root. This key is used for SSH access to the provisioned EC2 instances.
This single command will perform all necessary actions:
- Initialize Terraform (terraform init).
- Provision the AWS infrastructure, including networking and EC2 instances (terraform apply).
- Copy the necessary scripts to the bastion host.
- Run the Ansible playbook from the bastion to deploy and configure the Confluent Platform on all server nodes.
The process will take several minutes. Once it completes, your multi-DC Confluent Platform will be running and ready for the demonstration.
All demonstration commands should be run from your local machine. They use the ssh_bastion.sh script to execute commands on the correct remote host.
1. Start the Producer Workload
This script creates a durable topic and starts a performance test that sends 100 million messages to the cluster.
./ssh_bastion.sh ./start-perf.sh
2. Simulate a Datacenter Failure
While the producer is running, open a new terminal and run the kill script. This will force-kill the Kafka brokers in one of the datacenters.
./ssh_bastion.sh ./kill_123.sh
3. Observe the Result
Watch the output of the start-perf.sh script. You may see a brief pause or a few retries, but the producer should keep running without failing and without data loss as the observer is promoted and a new leader is elected. This demonstrates the near-zero RTO.
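Beyond the producer output, the promotion can also be checked from the topic metadata. The helpers below are a sketch that is not part of the PoC scripts; the topic name and bootstrap address are assumed, and the functions need a host with the Confluent Platform CLI tools and network access to the brokers.

```shell
#!/bin/sh
# Assumed values; adjust to the topic and listener used in your deployment.
BOOTSTRAP="${BOOTSTRAP:-localhost:9092}"
TOPIC="${TOPIC:-demo-topic}"

# Show leader, replicas, and ISR for each partition of the topic.
describe_topic() {
  kafka-topics --bootstrap-server "$BOOTSTRAP" --describe --topic "$TOPIC"
}

# List partitions whose ISR is currently below min.insync.replicas.
under_min_isr() {
  kafka-topics --bootstrap-server "$BOOTSTRAP" --describe --under-min-isr-partitions
}
```

Running describe_topic before and after the kill script should show the ISR shrinking to the surviving DC's brokers and then growing again once an observer joins it; under_min_isr should return no partitions once promotion has completed.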
4. Recover the "Failed" Datacenter
You can bring the failed brokers back online with the following command:
./ssh_bastion.sh ./start_123.sh
The recovered nodes will rejoin the cluster and begin replicating data again.
Below is a visual representation of the 2.5 DC architecture:
+------------------------+
| User's Local Machine |
| |
| [ Terraform CLI ] |
+-----------+------------+
|
| 1. Provisions (AWS API)
| 2. Executes (SSH)
v
+-----------+------------+
| [ Bastion Host ] |
+------------------------+
|
| Manages
|
+-----------v-----------------------------------------------------------------------------------+
| AWS Cloud |
| |
| +-----------------------------+ +-----------------------------+ +-------------------------+ |
| | DC 1 (Prod Region) | | DC 2 (DR Region) | | DC 3 (Tie-breaker) | |
| | | | | | | |
| | +---------------------+ | | +---------------------+ | | +-------------------+ | |
| | | Control Plane |<--|--|-->| Control Plane | | | | Control Plane | | |
| | | | | | | | | | | | | |
| | | [KRaft Controller] |---.------| [KRaft Controller] |---.---->| [KRaft Controller]| | |
| | +---------------------+ | | +---------------------+ | | | (Tie-breaker) | | |
| | | | | | +---------^---------+ | |
| | +---------------------+ | | +---------------------+ | | | | |
| | | Data Plane | | | | Data Plane | | | | Quorum | |
| | | | | | | | | | | Vote | |
| | | [Broker rack=prod.0]| | | | [Broker rack=dr.0] | | | | | |
| | | -> Full Replicas | | | | -> Full Replicas | | | | | |
| | | [Broker rack=prod.1]| | | | [Broker rack=dr.1] | | | | | |
| | | -> Obs. Replicas | | | | -> Obs. Replicas | | | | | |
| | +---------------------+ | | +---------------------+ | | | | |
| | | | | | | | |
| +-----------------------------+ +-----------------------------+ +-------------------------+ |
| |
+-----------------------------------------------------------------------------------------------+
It is crucial to dispose of the AWS resources provisioned by this PoC when you are finished, to avoid incurring unnecessary cloud costs.
You can destroy all resources created by Terraform by running the following command from your local machine, in the root of this project:
terraform destroy -auto-approve
Because of the -auto-approve flag, this command does not prompt for confirmation; it starts destroying resources immediately. Terraform will de-provision all the AWS infrastructure (EC2 instances, VPC, subnets, security groups, etc.) that was set up for this demonstration.