Why We Conducted This Comparison
Cross-region replication of Kafka data is essential for disaster recovery, migrations, multi-region deployments, and data locality.
When replicating between Amazon MSK clusters across AWS regions, organizations need solutions that deliver low latency, high throughput, and operational reliability, without unnecessary complexity.
Apache Kafka MirrorMaker2 (MM2) has been the industry-standard tool for Kafka replication for years. However, as detailed in The Hidden Complexity of MirrorMaker 2, MM2 introduces significant operational overhead, particularly around offset management. The distinction between offset replication and offset translation, combined with the complexities of IdentityReplicationPolicy vs DefaultReplicationPolicy, can lead to unexpected behavior during failover scenarios, potentially causing message loss, duplication, or consumers reprocessing days of data. AWS offers MM2 as a managed service, MSK Replicator, which inherits these MirrorMaker2 challenges and adds some of its own.
Lenses K2K (Kafka-to-Kafka) was designed to address these operational challenges while maintaining or improving performance. K2K eliminates the complexity that plagues MM2 deployments, providing straightforward configuration and reliable failover behavior.
This comprehensive performance comparison was conducted to answer a critical question: Can K2K match or exceed MM2’s performance while delivering the operational simplicity that production teams need?
Note: The K2K tests used the standalone version of the product. Lenses also offers a premium developer experience for managing K2K in its DevX solution, covering everything from offset management to monitoring.
Key Findings at a Glance
Our testing across three distinct scenarios reveals that both solutions achieve 100% reliability across all test conditions. However, K2K consistently delivers:
- 14–32% lower latency across all test scenarios (143ms vs 166ms baseline, 142ms vs 209ms in EOS scenarios)
- 51–78% faster producer writes (26ms vs 79ms baseline, 17ms vs 78ms EOS) critical for write-heavy workloads
- 16% higher throughput for the same available resources (7.81K vs 6.75K rd/s high throughput)
- 5x better batching efficiency with EOS (419KB vs 85KB average batch size)
This analysis provides actionable insights for teams choosing between MM2’s ecosystem integration and K2K’s performance alongside its operational simplicity.
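The headline percentages follow from plain relative-difference arithmetic. As a minimal sketch (the helper names `pct_lower`, `pct_higher`, and `ratio` are illustrative and not part of any benchmark tooling), the figures quoted in the list can be recomputed like this:

```python
def pct_lower(candidate: float, baseline: float) -> float:
    """Percentage by which `candidate` is lower than `baseline`."""
    return (baseline - candidate) / baseline * 100.0


def pct_higher(candidate: float, baseline: float) -> float:
    """Percentage by which `candidate` is higher than `baseline`."""
    return (candidate - baseline) / baseline * 100.0


def ratio(candidate: float, baseline: float) -> float:
    """How many times larger `candidate` is than `baseline`."""
    return candidate / baseline


# Arguments are (K2K value, MM2 value) as quoted in the findings list.
print(f"E2E latency, baseline: {pct_lower(143, 166):.0f}% lower")
print(f"E2E latency, EOS:      {pct_lower(142, 209):.0f}% lower")
print(f"Producer write, base:  {pct_lower(26, 79):.0f}% faster")
print(f"Producer write, EOS:   {pct_lower(17, 78):.0f}% faster")
print(f"Throughput (K rd/s):   {pct_higher(7.81, 6.75):.0f}% higher")
print(f"Avg batch size (KB):   {ratio(419, 85):.1f}x larger")
```

For the time-based metrics, "lower" and "faster" are both expressed relative to the MM2 value, which is why the same helper is used for latency and producer write time.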
TL;DR
Choose K2K if:
- You need lower latency and higher throughput from the same resources
- You want better-performing exactly-once replication
- You want simpler operations
- You’re setting up new replication infrastructure
Choose MM2 if:
- You’re already heavily invested in the Kafka Connect ecosystem
- You have deep MM2 operational expertise
- You can accept the operational complexity for ecosystem benefits
Both solutions are production-ready with 100% reliability across all test scenarios.
Table of Contents
- Why We Conducted This Comparison
- Key Findings at a Glance
- Test Methodology and Environment
- Test Scenarios Overview
- Test 1: At least once (6 MB/s)
- Test 2: Exactly Once (6 MB/s)
- Test 3: Resource Limited Exactly Once (6 MB/s)
- Performance Summary Matrix
- Recommendations
- Conclusion
- Test Methodology & References
Test Methodology and Environment
Infrastructure Setup
All tests replicated data from AWS MSK Express in us-east-2 to AWS MSK in eu-west-1, representing a typical cross-region disaster recovery scenario. Both solutions were tested with identical parallelism (2 replicas/tasks) to ensure a fair comparison and eliminate configuration bias. MM2 scales through tasks, as defined by the Kafka Connect framework, while K2K scales through Kubernetes replicas.
Infrastructure setup
Test Scenarios Overview
We conducted three test scenarios to evaluate performance across workload patterns commonly encountered in production environments:
- **Test 1: At Least Once (6 MB/s)**: baseline performance without EOS, 2 replicas/tasks, 1KB messages
- **Test 2: Exactly Once (6 MB/s)**: EOS enabled with 2 replicas/tasks, optimized for exactly-once semantics
- **Test 3: Resource Limited Exactly Once (6 MB/s)**: EOS enabled with a single replica/task, resource-constrained scenario
Each scenario tests different aspects of cross-region replication: latency sensitivity, throughput capacity, exactly-once semantic guarantees, and resource efficiency.
Test 1: At Least Once (6 MB/s)
This baseline test evaluated both solutions without exactly-once semantics (EOS) enabled, representing typical production workloads where latency and throughput are prioritized over strict delivery guarantees.
Test Configuration
Both solutions were configured with 2 replicas/tasks and identical consumer/producer settings:
- Consumer: max.poll.records=100,000, fetch.max.bytes=10 MB
- Producer: acks=1, batch.size=50 MB, max.in.flight.requests.per.connection=5
- Load: ~6,000 records/sec (1KB messages)
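Kafka client size settings are expressed in bytes, so the values in the list need converting before they go into a properties file. The sketch below writes the settings out as plain Python dictionaries keyed by the standard Kafka client property names (`max.in.flight.requests.per.connection` is the full producer key for the in-flight setting). It assumes "MB" means mebibytes and "1KB" means 1,000 bytes. How these keys are wired into the K2K or MM2 configuration files is not shown in this article, so treat the structure as illustrative only.

```python
MIB = 1024 * 1024  # assumption: "MB" in the settings list means mebibytes

# Consumer settings from the Test 1 configuration list.
consumer_settings = {
    "max.poll.records": 100_000,
    "fetch.max.bytes": 10 * MIB,
}

# Producer settings from the Test 1 configuration list.
producer_settings = {
    "acks": "1",
    "batch.size": 50 * MIB,
    "max.in.flight.requests.per.connection": 5,
}


def load_mb_per_sec(records_per_sec: int, record_bytes: int) -> float:
    """Approximate produce rate in decimal megabytes per second."""
    return records_per_sec * record_bytes / 1_000_000


# ~6,000 records/sec at 1KB each (1KB taken as 1,000 bytes here).
print(load_mb_per_sec(6_000, 1_000))
```

The last line shows where the "6 MB/s" label on the test comes from: record rate multiplied by record size.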
Performance Results
Figures: Throughput (6 MB/s), E2E Latency (6 MB/s), Replication Lag (6 MB/s)
Key Findings
K2K Advantages:
- 14% lower E2E latency (143ms vs 166ms) — critical for real-time applications
- 67% faster producer writes (26ms vs 79ms) — significantly more efficient write path
MM2 Advantages:
- 21% better consumer fetch efficiency (94ms vs 119ms) — more efficient read path
Both Solutions:
- 100% reliability with zero errors and zero retries
- Near-zero lag throughout the test duration
- Production-ready performance characteristics
Conclusion
K2K delivers lower latency for medium-throughput workloads, with 14% lower E2E latency and 67% faster producer writes, while MM2 keeps an edge on consumer fetch time. Both solutions are production-ready with perfect reliability, making K2K the preferred choice when latency is the priority.
Test 2: Exactly Once (6 MB/s)
This test evaluated both solutions with exactly-once semantics (EOS) enabled using 2 replicas/tasks, representing production workloads that require strict delivery guarantees with no duplicates.
Test Configuration
Both solutions used identical EOS settings for fair comparison:
- EOS: Enabled with exactlyOnce: enabled (K2K) / exactly.once.support: required (MM2)
- Producer: acks=all, enable.idempotence=true, max.in.flight.requests.per.connection=1, batch.size=512 KB, compression.type=lz4
- Consumer: isolation.level=read_committed, max.poll.records=2,000
- Parallelism: 2 replicas (K2K) / 2 tasks (MM2)
- Load: ~6 MB/s (~6,000 records/sec, 1KB messages)
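The producer settings in this list line up with the preconditions the Kafka client documents for an idempotent producer: `acks=all` and at most 5 in-flight requests per connection. A minimal sketch of a pre-flight check follows. The function name `idempotence_violations` is hypothetical, the dictionaries simply restate the lists in this article, and the check covers only producer-side idempotence, not the transactional machinery either tool sets up for EOS.

```python
from typing import Dict, List


def idempotence_violations(producer: Dict[str, object]) -> List[str]:
    """Return the reasons a producer config cannot run idempotently."""
    problems = []
    if producer.get("enable.idempotence") is not True:
        problems.append("enable.idempotence must be true")
    if str(producer.get("acks")) not in ("all", "-1"):
        problems.append("acks must be 'all'")
    in_flight = int(producer.get("max.in.flight.requests.per.connection", 5))
    if in_flight > 5:
        problems.append("max.in.flight.requests.per.connection must be <= 5")
    return problems


# Test 2 (EOS) producer settings; sizes converted to bytes.
eos_producer = {
    "acks": "all",
    "enable.idempotence": True,
    "max.in.flight.requests.per.connection": 1,
    "batch.size": 512 * 1024,
    "compression.type": "lz4",
}

# Test 1 (at-least-once) producer settings, for contrast.
at_least_once_producer = {
    "acks": "1",
    "max.in.flight.requests.per.connection": 5,
}

print(idempotence_violations(eos_producer))
print(idempotence_violations(at_least_once_producer))
```

Running the same check against the Test 1 settings makes the difference between the two configurations explicit: the baseline run neither enables idempotence nor waits for all in-sync replicas.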
Performance Results
Figures: Throughput (6 MB/s, EOS), Replication Lag (6 MB/s, EOS), E2E Latency (6 MB/s, EOS)
Key Findings
K2K Advantages:
- 32% lower E2E latency (142ms vs 209ms): dramatically better for EOS workloads
- 78% faster producer writes (17ms vs 78ms): 4.6x more efficient
- 5x larger batches (419KB vs 85KB): significantly better batching efficiency
- 63% more records per request (433 vs 265): better throughput per operation
MM2 Advantages:
- Marginally better lag metrics (39 max vs 62 max)
Conclusion
K2K delivers significantly better EOS performance with 32% lower latency, 78% faster producer writes, and 5x better batching efficiency. Both solutions achieve 100% reliability with zero errors, making K2K the clear choice for EOS workloads requiring optimal latency and efficiency.
Test 3: Resource Limited Exactly Once (6 MB/s)
This test evaluated both solutions with EOS enabled using a single replica/task, representing resource-constrained environments or cost-optimized deployments.
Test Configuration
Both solutions used identical EOS settings with single instance:
- EOS: Enabled with exactlyOnce: enabled (K2K) / exactly.once.support: required (MM2)
- Producer: acks=all, enable.idempotence=true, max.in.flight.requests.per.connection=1, batch.size=512 KB, compression.type=lz4
- Consumer: isolation.level=read_committed, max.poll.records=2,000
- Parallelism: 1 replica (K2K) / 1 task (MM2)
- Load: ~6 MB/s (~6,000 records/sec, 1KB messages)
Performance Results
Figures: Replication Lag (6 MB/s, EOS, Limited Resources), E2E Latency (6 MB/s, EOS, Limited Resources)
Key Findings
K2K Advantages:
- 21% lower E2E latency (632ms vs 802ms): better single-instance efficiency
- 75% faster producer writes (22ms vs 87ms): 4x more efficient
- 17% higher throughput (5.7K vs 4.9K rd/s): better resource utilization
- 48% larger batches (413KB vs 278KB): more efficient batching
- Better efficiency for the same amount of compute resources
Single Instance Overhead:
- Both solutions show significantly higher latency than 2-replica configurations
- Recommendation: Use 2+ replicas/tasks for production EOS workloads to keep E2E latency near or below 200ms (142ms for K2K and 209ms for MM2 in Test 2)
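To put a number on the single-instance overhead, the E2E latencies from Test 2 (2 replicas/tasks) and Test 3 (1 replica/task) can be compared directly. This is a rough sketch using only the figures reported in this article. The `slowdown` helper is illustrative, and the comparison assumes the two tests differ only in parallelism.

```python
# E2E latency in milliseconds, taken from Test 2 and Test 3 above.
latency_ms = {
    "K2K": {"two_replicas": 142, "single": 632},
    "MM2": {"two_replicas": 209, "single": 802},
}


def slowdown(single_ms: float, scaled_ms: float) -> float:
    """Factor by which latency grows when dropping to a single instance."""
    return single_ms / scaled_ms


for tool, values in latency_ms.items():
    factor = slowdown(values["single"], values["two_replicas"])
    print(f"{tool}: {factor:.1f}x higher E2E latency with a single instance")
```

Both tools take a severalfold latency penalty when the same 6 MB/s load is pushed through one instance, which is the basis for the 2+ replicas/tasks recommendation.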
Conclusion
K2K delivers significantly better single-instance EOS performance, with 21% lower latency and 75% faster producer writes. However, both solutions require 2+ replicas/tasks for production-grade latency targets. K2K’s superior single-instance efficiency makes it the preferred choice for resource-constrained EOS deployments.
Overall Performance Comparison
Key Performance Metrics
The Verdict
For cross-region MSK replication scenarios, K2K delivers superior performance with significantly simpler operations. K2K consistently outperforms MM2 in latency, throughput, and producer efficiency while eliminating the complexity that complicates MM2 deployments.
MM2 remains a solid choice for organizations that are less experienced in running workloads on Kubernetes and prefer Kafka Connect for data integration and replication. However, teams should be aware of the operational overhead, particularly around offset management and failover scenarios, as detailed in The Hidden Complexity of MirrorMaker 2.
Both solutions achieve 100% reliability across all test scenarios, making either choice production-ready. The decision ultimately depends on your priorities: **performance and operational simplicity (K2K)** or ecosystem integration with operational complexity (MM2).
Choose K2K When:
- Latency is critical
- Efficiency is important: fewer resources required
- EOS workloads are required — lower latency
- Operational simplicity — Single configuration
- Enterprise-grade support
- You prefer to manage the replicator through a UI or through an AI copilot, with governance
Choose MM2 When:
- Tighter lag control — Better lag management in moderate throughput scenarios
- Existing investment — Already heavily invested in Kafka Connect and MM2
Frequently Asked Questions (FAQ)
Q: Do both solutions achieve 100% reliability?
A: Yes. Both K2K and MM2 achieved 100% reliability (zero errors) across all test scenarios, making either solution production-ready.
Q: What is the main operational difference between K2K and MM2?
A: MM2 requires a Kafka Connect setup and comes with the complexities described in The Hidden Complexity of MirrorMaker 2. K2K is a Kubernetes-native tool that enables simple operations and scaling. However, the true benefit of K2K is that it can be managed through the Lenses DevX solution for self-service deployment, monitoring, and governance of the replicator.
Q: Which solution has better latency?
A: K2K delivers 14–32% lower E2E latency across scenarios: 143ms vs 166ms in baseline (14% lower), 142ms vs 209ms with EOS (32% lower), and comparable latency at high throughput (2.37s vs 2.35s).
Q: Which solution has better throughput?
A: K2K delivers higher throughput for the same amount of resources used in exactly once scenarios.
Q: How does K2K perform with exactly-once semantics?
A: K2K excels with EOS, delivering 32% lower latency (142ms vs 209ms), 78% faster producer writes (17ms vs 78ms), and 5x better batching efficiency (419KB vs 85KB) compared to MM2.
Q: Should I migrate from MM2 to K2K?
A: If you’re experiencing operational complexity with MM2 or need better latency and throughput, K2K is an excellent alternative, especially if you manage the replicator through Lenses DevX. If you’re heavily invested in the Kafka Connect ecosystem and already use MM2, MM2 may still be appropriate.
Related Resources
- The Hidden Complexity of MirrorMaker 2: Understanding MM2’s operational challenges
- Lenses K2K Documentation: K2K configuration and deployment guides
- Apache Kafka MirrorMaker2 Documentation: Official MM2 documentation