Cassandra CDC
Enable Cassandra CDC
To enable Cassandra CDC, you need to enable both table-level CDC and node-level CDC.
To enable table-level CDC, set the table option cdc:
ALTER TABLE foo WITH cdc=true;
To enable node-level CDC, set the following option in the configuration file cassandra.yaml:
cdc_enabled: true
Other CDC options:
cdc_raw_directory
cdc_total_space
cdc_free_space_check_interval
WARNING: If CDC is enabled and the CDC data on disk reaches the cdc_total_space limit, writes to CDC-enabled tables will be rejected. A consumer must therefore process and delete the files in cdc_raw_directory.
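As a rough illustration of the warning above, a consumer or monitoring job could watch how full the CDC directory is. The following is a minimal sketch, not Cassandra's own accounting: the helper names directory_size_bytes and cdc_space_exhausted are hypothetical, and the directory path and limit are passed in by the caller rather than read from cassandra.yaml.

```python
import os
import tempfile


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def cdc_space_exhausted(cdc_raw_directory, limit_mb):
    """Hypothetical check: has the directory reached limit_mb mebibytes?"""
    return directory_size_bytes(cdc_raw_directory) >= limit_mb * 1024 * 1024


if __name__ == "__main__":
    # Demo against a temporary directory standing in for cdc_raw_directory.
    with tempfile.TemporaryDirectory() as cdc_dir:
        with open(os.path.join(cdc_dir, "segment.log"), "wb") as f:
            f.write(b"\0" * (2 * 1024 * 1024))  # a 2 MiB placeholder file
        print(cdc_space_exhausted(cdc_dir, 1))
        print(cdc_space_exhausted(cdc_dir, 4))
```

A check like this only approximates the limit from the outside; it is meant to show why an idle consumer eventually causes rejected writes.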
Cassandra 3.11.X CDC Implementation
A memtable is per table (or column family).
The commitlog is global and consists of commitlog segments.
Each memtable maintains two commitlog positions:
- commitlogLowerBound: the low commitlog position;
- commitlogUpperBound: the high commitlog position;
Each commitlog segment maintains two hash maps:
- cfDirty: per table (or column family), the unflushed data interval;
- cfClean: per table (or column family), the flushed data interval;
When a keyspace applies a new mutation:
- the mutation is appended to the commitlog
- the commitlog segment updates cfDirty for the table
- the partition update is added to the memtable
- the memtable updates commitlogUpperBound
When a memtable is flushed:
- iterate over the active commitlog segments and mark them clean using commitlogLowerBound and commitlogUpperBound
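The write and flush steps above can be sketched as a small toy model. This is an illustration under simplifying assumptions, not Cassandra's implementation: positions are plain integers within a single segment, and the names Segment, Memtable, apply_mutation, flush and covers are hypothetical.

```python
def covers(intervals, start, end):
    """True if the union of the closed intervals covers [start, end]."""
    reach = start
    for lo, hi in sorted(intervals):
        if hi < reach:
            continue              # lies entirely before the point we still need
        if lo > reach:
            return False          # gap: some positions were never flushed
        reach = hi
        if reach >= end:
            return True
    return False


class Segment:
    """Toy commitlog segment with per-table dirty and clean intervals."""

    def __init__(self):
        self.position = 0
        self.cf_dirty = {}        # table -> (first, last) written position
        self.cf_clean = {}        # table -> list of flushed (lower, upper)

    def append(self, table):
        self.position += 1
        first, _last = self.cf_dirty.get(table, (self.position, self.position))
        self.cf_dirty[table] = (first, self.position)
        return self.position

    def mark_clean(self, table, lower, upper):
        self.cf_clean.setdefault(table, []).append((lower, upper))

    def is_unused(self):
        """A segment is reclaimable once every dirty interval is flushed."""
        return all(covers(self.cf_clean.get(t, []), first, last)
                   for t, (first, last) in self.cf_dirty.items())


class Memtable:
    def __init__(self, table, lower_bound):
        self.table = table
        self.commitlog_lower_bound = lower_bound
        self.commitlog_upper_bound = lower_bound
        self.updates = []


def apply_mutation(segment, memtable, update):
    position = segment.append(memtable.table)      # commitlog first
    memtable.updates.append(update)                # then the memtable
    memtable.commitlog_upper_bound = position


def flush(memtable, active_segments):
    for segment in active_segments:
        segment.mark_clean(memtable.table,
                           memtable.commitlog_lower_bound,
                           memtable.commitlog_upper_bound)
    # The replacement memtable starts where the flushed one ended.
    return Memtable(memtable.table, memtable.commitlog_upper_bound)
```

In this model a segment stays in use until every table that wrote to it has flushed the corresponding memtable, which is why one rarely flushed table can keep a segment alive.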
In Cassandra 3.11.X, if a commitlog segment contains a mutation for a CDC-enabled table, the segment is moved from commitlog_directory to cdc_raw_directory once it is discarded, that is, after all of its data has been flushed.
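The move described above can be sketched with ordinary file operations. This is a simplified illustration, not Cassandra's code: discard_segment is a hypothetical helper, the caller supplies the flag saying whether the segment holds CDC data, and deleting segments without CDC data is an assumption of this sketch.

```python
import os
import shutil
import tempfile


def discard_segment(segment_path, contains_cdc_mutation, cdc_raw_directory):
    """Move a finished segment to cdc_raw_directory if it holds CDC data,
    otherwise delete it. Returns the new path, or None if deleted."""
    if contains_cdc_mutation:
        os.makedirs(cdc_raw_directory, exist_ok=True)
        # Moving into an existing directory keeps the original file name.
        return shutil.move(segment_path, cdc_raw_directory)
    os.remove(segment_path)
    return None


if __name__ == "__main__":
    # Demo with temporary directories standing in for the real ones.
    with tempfile.TemporaryDirectory() as base:
        commitlog_dir = os.path.join(base, "commitlog")
        cdc_raw_dir = os.path.join(base, "cdc_raw")
        os.makedirs(commitlog_dir)
        segment = os.path.join(commitlog_dir, "segment-1.log")
        with open(segment, "wb") as f:
            f.write(b"mutation bytes")
        print(discard_segment(segment, True, cdc_raw_dir))
```

Because the whole segment is moved, the files in cdc_raw_directory can also contain mutations for tables that do not have CDC enabled; a consumer has to filter for the tables it cares about.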
Cassandra 4.X CDC Improvement
TBD
Debezium Cassandra Connector
TBD