Multi-Master Replication: Conflict Resolution, CRDTs, Galera, and BDR
Multi-master replication allows writes to multiple database nodes simultaneously. Unlike primary-replica setups, there is no single point for writes. The trade-off is complexity, particularly around conflict resolution.
Why Multi-Master?
The primary reasons to consider multi-master replication are:
* **Multi-region writes**: Users in the US and Europe both write with local latency.
* **Zero downtime upgrades**: Any node can be taken offline without losing write availability.
* **Read scalability with local writes**: Each node can serve both reads and writes with low latency.
Conflict Resolution Strategies
When two nodes concurrently modify the same row, a conflict occurs. Resolution strategies vary by system:
Last-Write-Wins (LWW)
Each written cell carries a timestamp. The write with the latest timestamp wins. Simple but lossy:
-- Cassandra default: every written cell is tagged with a timestamp
UPDATE users SET email = 'new@example.com' WHERE id = 1;
-- On concurrent updates to the same cell, the highest timestamp wins
**Pros**: Simple, always converges. **Cons**: Loses data silently.
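The LWW rule itself is easy to model in application code. A minimal sketch (the timestamps and replica names here are illustrative; real systems use synchronized or hybrid logical clocks):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LWWRegister:
    """A last-write-wins register: value plus ordering metadata."""
    value: str
    ts: float      # wall-clock or hybrid logical timestamp
    replica: str   # deterministic tie-breaker when timestamps collide

def merge(a: LWWRegister, b: LWWRegister) -> LWWRegister:
    # Higher timestamp wins; replica id breaks exact ties deterministically.
    return max(a, b, key=lambda r: (r.ts, r.replica))

# Two replicas write concurrently; the later timestamp silently wins.
us = LWWRegister("old@example.com", ts=100.0, replica="us-east")
eu = LWWRegister("new@example.com", ts=100.5, replica="eu-west")
print(merge(us, eu).value)  # new@example.com -- the us-east write is lost
```

Note that the merge is commutative: both replicas converge to the same value regardless of the order in which they see each other's writes, which is exactly why LWW "always converges" and also exactly how it loses data.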
Application-Mediated Conflict Resolution
The database detects conflicts and presents them to the application for resolution. BDR (Bi-Directional Replication) for PostgreSQL supports this:
-- Create a conflict handler function (a trigger-style sketch;
-- BDR's actual handler signature varies by version)
CREATE OR REPLACE FUNCTION resolve_conflict()
RETURNS trigger AS $$
BEGIN
    -- Custom logic: keep the row with the higher version
    IF NEW.version >= OLD.version THEN
        RETURN NEW;
    END IF;
    RETURN OLD;
END;
$$ LANGUAGE plpgsql;
-- Register the handler for the table (registration API names also vary by BDR version)
SELECT bdr.conflict_handler('orders', 'resolve_conflict');
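The keep-the-higher-version rule can also be modeled outside the database, which is useful for unit-testing resolution logic. A sketch, assuming an application-maintained `version` column as in the handler above:

```python
def resolve_conflict(local: dict, remote: dict) -> dict:
    """Keep whichever row carries the higher application version.

    Mirrors the plpgsql handler above: the incoming (remote) row
    wins ties, matching `NEW.version >= OLD.version`.
    """
    return remote if remote["version"] >= local["version"] else local

local = {"id": 1, "status": "shipped", "version": 7}
remote = {"id": 1, "status": "cancelled", "version": 6}
print(resolve_conflict(local, remote)["status"])  # shipped
```

Because the rule compares versions rather than timestamps, it is immune to clock skew between nodes, but the application must increment `version` on every write for it to be meaningful.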
CRDT-Based Reconciliation
Conflict-free Replicated Data Types (CRDTs) mathematically guarantee convergence without coordination. Instead of storing a scalar value, you store a data structure where concurrent operations commute or combine:
-- G-Counter: one row per (page, replica) pair instead of a single scalar
CREATE TABLE page_views (
    page_id    INTEGER NOT NULL,
    replica_id TEXT    NOT NULL,
    counter    INTEGER NOT NULL DEFAULT 0,
    PRIMARY KEY (page_id, replica_id)
);
-- Each replica increments only its own slot, so concurrent writes never conflict
UPDATE page_views SET counter = counter + 1
WHERE page_id = 42 AND replica_id = 'node1';
-- The logical value is the SUM of all slots; the merge rule during
-- reconciliation is the per-slot MAX (grow-only G-Counter).
-- A PN-Counter tracks increments and decrements in separate slots.
SELECT SUM(counter) FROM page_views WHERE page_id = 42;
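The per-replica bookkeeping behind these counters can be sketched directly (replica names are illustrative):

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica; value = sum of slots."""
    def __init__(self):
        self.slots: dict[str, int] = {}

    def increment(self, replica: str, n: int = 1) -> None:
        # A replica only ever increments its own slot.
        self.slots[replica] = self.slots.get(replica, 0) + n

    def value(self) -> int:
        return sum(self.slots.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise max: commutative, associative, and idempotent,
        # so merges converge in any order without coordination.
        merged = GCounter()
        for r in self.slots.keys() | other.slots.keys():
            merged.slots[r] = max(self.slots.get(r, 0), other.slots.get(r, 0))
        return merged

a, b = GCounter(), GCounter()
a.increment("node1", 3)    # 3 views recorded on node1
b.increment("node2", 2)    # 2 views recorded on node2
print(a.merge(b).value())  # 5 -- no increments lost, in any merge order
```

A PN-Counter is simply two G-Counters, one for increments and one for decrements, with `value()` returning their difference.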
More sophisticated CRDTs include:
* **G-Set** (grow-only set): Elements can only be added. Union merges converge.
* **LWW-Register**: A single-value register resolved by last-write-wins, ordered by wall-clock or logical timestamps.
* **OR-Set** (observed-remove set): Supports both add and remove without losing concurrent adds.
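The OR-Set's key trick is tagging each add with a unique id; a remove tombstones only the tags it has observed, so a concurrent add survives. A state-based sketch:

```python
import uuid

class ORSet:
    """Observed-remove set CRDT (state-based, with tombstones)."""
    def __init__(self):
        self.adds: dict = {}        # element -> set of unique add-tags
        self.removed: set = set()   # tombstoned tags

    def add(self, elem) -> None:
        self.adds.setdefault(elem, set()).add(uuid.uuid4())

    def remove(self, elem) -> None:
        # Tombstone only the tags observed locally at call time.
        self.removed |= self.adds.get(elem, set())

    def contains(self, elem) -> bool:
        return bool(self.adds.get(elem, set()) - self.removed)

    def merge(self, other: "ORSet") -> "ORSet":
        merged = ORSet()
        for e in self.adds.keys() | other.adds.keys():
            merged.adds[e] = self.adds.get(e, set()) | other.adds.get(e, set())
        merged.removed = self.removed | other.removed
        return merged

a = ORSet()
a.add("x")
b = ORSet().merge(a)   # replicate a's state to b
a.remove("x")          # A removes the tag it observed
b.add("x")             # B concurrently re-adds with a fresh tag
print(a.merge(b).contains("x"))  # True -- the concurrent add is preserved
```

B's fresh tag was never observed by A's remove, so it escapes the tombstone set; this is precisely the "add wins over concurrent remove" semantics the bullet describes.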
Galera Cluster (MySQL / MariaDB)
Galera Cluster implements virtually synchronous multi-master replication for MySQL and MariaDB: a transaction's writeset must pass certification on every node before it commits anywhere:
-- All nodes are writable
-- On each node:
CREATE TABLE users (
    -- Galera offsets AUTO_INCREMENT values per node to avoid collisions
    id INT PRIMARY KEY AUTO_INCREMENT,
    email VARCHAR(255) NOT NULL
);
INSERT INTO users (email) VALUES ('alice@example.com');
-- This INSERT is certified on all nodes before returning
Galera Key Properties
* **Virtually synchronous replication**: Writesets reach every node before commit; each node then applies them locally.
* **Certification-based**: Every node runs the same deterministic certification test on each writeset, so no commit-time voting is needed.
* **Automatic node provisioning**: New nodes join via State Snapshot Transfer (SST) or Incremental State Transfer (IST).
* **Near-zero replication lag**: Committed writesets are guaranteed to be queued on every node at commit time; `wsrep_sync_wait` can enforce causal reads.
Galera Limitations
* Cluster size penalty: three nodes is the practical minimum for quorum; each additional node adds certification traffic and conflict probability.
* Explicit table locks (`LOCK TABLES`) are unsupported, and hot-row `SELECT ... FOR UPDATE` workloads trigger frequent certification failures, surfaced to clients as deadlock errors.
* Network latency between nodes directly impacts write latency (the "slowest node" problem).
* `REPEATABLE READ` is the default isolation level; `SERIALIZABLE` is only honored within a single node, not cluster-wide.
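Because certification failures surface to clients as deadlock errors (MySQL error 1213), Galera applications conventionally wrap writes in a retry loop. A sketch of that pattern; the `flaky_txn` callable stands in for a real transaction against a driver connection:

```python
import random
import time

class DeadlockError(Exception):
    """Stand-in for a driver error with errno 1213 (ER_LOCK_DEADLOCK)."""
    errno = 1213

def with_retries(run_txn, attempts: int = 5):
    """Retry a transaction on certification failure with jittered backoff."""
    for attempt in range(attempts):
        try:
            return run_txn()
        except DeadlockError:
            if attempt == attempts - 1:
                raise
            # Exponential backoff with jitter to de-synchronize competing writers.
            time.sleep(random.uniform(0, 0.05 * 2 ** attempt))

# Illustrative: a transaction that fails certification twice, then commits.
calls = {"n": 0}
def flaky_txn():
    calls["n"] += 1
    if calls["n"] < 3:
        raise DeadlockError()
    return "committed"

print(with_retries(flaky_txn))  # committed
```

The jitter matters: if all contending clients retry on the same schedule, they simply collide again at certification.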
PostgreSQL BDR (Bi-Directional Replication)
BDR extends PostgreSQL's logical replication for multi-master scenarios. It operates at row level with conflict resolution:
-- Configure the BDR group (BDR 3.x style; exact parameter names vary by version)
-- On node1:
SELECT bdr.create_node(
    node_name := 'node1',
    local_dsn := 'host=node1.example.com port=5432 dbname=appdb'
);
SELECT bdr.create_node_group(node_group_name := 'global_app');

-- On node2:
SELECT bdr.create_node(
    node_name := 'node2',
    local_dsn := 'host=node2.example.com port=5432 dbname=appdb'
);
SELECT bdr.join_node_group(
    join_target_dsn := 'host=node1.example.com port=5432 dbname=appdb',
    node_group_name := 'global_app'
);
BDR supports multiple conflict resolution modes (names vary between BDR and pglogical versions):
* `last_update_wins` (latest commit timestamp wins)
* `first_update_wins` (earliest commit timestamp wins)
* `apply_remote` (the incoming remote change always wins)
* `keep_local` (the local row is kept; the remote change is discarded)
CAP Theorem Implications
Multi-master systems live in the CAP trade-off space:
* **Galera**: CP (Consistency + Partition tolerance). Sacrifices availability during network partitions because writes must certify on all nodes.
* **Cassandra-style LWW**: AP (Availability + Partition tolerance). Sacrifices immediate consistency but always accepts writes.
* **BDR with custom handlers**: Tunable. Configure per-table conflict strategies.
When to Avoid Multi-Master
Multi-master replication is not the default for good reasons:
* **Single-region apps**: A single PostgreSQL primary with streaming replicas handles the vast majority of workloads.
* **Apps with hot rows**: If 90% of writes target 10 rows (e.g., a counter), conflicts are guaranteed.
* **Teams without operational maturity**: Multi-master requires monitoring replication lag, conflict rates, and node health.
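Hot-row contention can often be designed away before reaching for multi-master. The classic fix is a sharded counter: spread one logical counter across N rows so concurrent writers rarely collide, then sum on read. A sketch using an in-memory dict to stand in for the table (the shard count and table layout are illustrative):

```python
import random

N_SHARDS = 16

# One logical counter stored as N_SHARDS independent rows, e.g.:
#   CREATE TABLE counter_shards (name TEXT, shard INT, n BIGINT,
#                                PRIMARY KEY (name, shard));
shards = {("page_views", s): 0 for s in range(N_SHARDS)}

def increment(name: str) -> None:
    # Each writer touches a random shard, so writers rarely contend.
    shards[(name, random.randrange(N_SHARDS))] += 1

def read(name: str) -> int:
    # Equivalent to: SELECT SUM(n) FROM counter_shards WHERE name = %s
    return sum(v for (n, _), v in shards.items() if n == name)

for _ in range(1000):
    increment("page_views")
print(read("page_views"))  # 1000
```

With 16 shards, two concurrent increments collide on the same row only about 6% of the time instead of always, which is usually enough to keep a single-primary setup viable.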
Monitoring Conflicts
Whichever system you choose, monitor conflict rates:
-- BDR conflict stats
SELECT * FROM bdr.conflict_history;
-- Galera cluster status
SHOW STATUS LIKE 'wsrep_cluster_size';
SHOW STATUS LIKE 'wsrep_local_cert_failures';
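The Galera counters above are cumulative, so a monitor should convert them into a rate. A minimal sketch; the sample values and alert threshold are illustrative, and a real poller would read `wsrep_local_cert_failures` via a MySQL client:

```python
def conflict_rate(samples):
    """Conflicts per second from cumulative cert-failure samples.

    Each sample is (unix_time, cumulative_cert_failures).
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    return (c1 - c0) / (t1 - t0)

# Illustrative samples: 12 certification failures over 60 seconds.
samples = [(1000.0, 500), (1060.0, 512)]
rate = conflict_rate(samples)
print(f"{rate:.2f} conflicts/sec")  # 0.20 conflicts/sec
if rate > 0.1:  # threshold is workload-specific
    print("warning: rising conflict rate; check for hot rows")
```

Trending the rate rather than the raw counter makes regressions visible: a deploy that introduces a hot row shows up as a step change.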
A rising conflict rate indicates application-level contention that might be better solved by sharding the data model rather than relying on replication to resolve conflicts.
Multi-master replication is a powerful but complex tool. Start with single-master and add multi-master only when your availability or latency requirements genuinely demand it.