This section describes the end-to-end flow for establishing and using Change Data Capture (CDC). It assumes that you create a new table and dataset, although you can also use an existing table that already contains data.
- Create the changelog stream with the maprcli stream create command and its -ischangelog parameter. See maprcli stream create, or use the Control System.
- To keep existing table data from being propagated, set the -propagateexistingdata parameter to false. The default is true.
- To pause the changelog relationship, set the -pause parameter to true. The change data records are stored in a bucket until you resume the changelog relationship; at that point, the stored change data records are propagated to the stream topic. See table changelog resume for more information.

| Scenario | Setup Task | Notes |
|---|---|---|
| You want a CDC stream topic to contain all of the table data as changed data records. | Set up CDC in the following manner before performing operations on the source table and consuming the change data records. | In this case, all table data is propagated to the stream topic as change data records, and the operation type is identified on each individual data record. |
| You want a CDC stream topic to contain all of the existing table data and changed data records. | Set up CDC in the following manner before performing operations on the source table and consuming the change data records. | In this case, the existing table data is propagated to the stream topic, and that data's operation type is identified as a SET operation. Subsequent operations on the source table are propagated as changed data records, and the operation type is identified on each individual data record. You can consume data at any time; however, there may be a delay before all of the existing table data is completely propagated, especially if you have a large dataset. Be sure to check the |
| You want a CDC stream topic to contain none of the original table data and to capture only subsequent changed data records. | Set up CDC in the following manner before performing operations on the source table and consuming the change data records. | In this case, the existing table data is not propagated to the stream topic, and the operation type is identified on each individual data record. |
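
The scenarios above differ mainly in the parameters passed when the changelog relationship is created. The following is a minimal dry-run sketch that only prints candidate commands and does not execute them. The paths `/apps/src_table` and `/apps/cdc_stream` and the topic `cdc_topic` are hypothetical placeholders. The `table changelog add` subcommand and the `-path` and `-changelog` parameters are assumptions taken from the general maprcli command layout, so verify them against the maprcli reference for your release before running anything.

```shell
#!/bin/sh
# Dry-run sketch: prints maprcli commands instead of executing them.
# TABLE, STREAM, and TOPIC are hypothetical placeholders.
TABLE=/apps/src_table
STREAM=/apps/cdc_stream
TOPIC=cdc_topic

# Print the command that creates the changelog stream.
create_stream_cmd() {
  echo "maprcli stream create -path ${STREAM} -ischangelog true"
}

# Print the changelog-add command for one scenario:
#   all       propagate existing table data (the documented default)
#   new-only  skip existing data and capture only subsequent changes
#   paused    store change data records until the relationship is resumed
# NOTE: "table changelog add", -path, and -changelog are assumptions.
changelog_add_cmd() {
  base="maprcli table changelog add -path ${TABLE} -changelog ${STREAM}:${TOPIC}"
  case "$1" in
    all)      echo "${base} -propagateexistingdata true" ;;
    new-only) echo "${base} -propagateexistingdata false" ;;
    paused)   echo "${base} -pause true" ;;
    *)        echo "unknown scenario: $1" >&2; return 1 ;;
  esac
}

# Print the command that resumes a paused changelog relationship.
resume_cmd() {
  echo "maprcli table changelog resume -path ${TABLE} -changelog ${STREAM}:${TOPIC}"
}

# Example: commands for capturing only subsequent changes.
create_stream_cmd
changelog_add_cmd new-only
```

Printing the commands first makes it possible to review the parameter choices for a scenario before applying them to a cluster; for the paused case, the output of `resume_cmd` would be run once consumers are ready.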