Some databases can automatically publish the changes made to their data, so no additional tables are needed to keep track of them. This pattern is called Change Data Capture (CDC). It uses the database transaction log, or a similar mechanism, to track change events: the log is externalized and its events are forwarded to other consumers. This allows the application state to be externalized and synchronized with external stores.
A typical implementation consists of an external process that reads the transaction log of a database and forwards the changes as messages. As the diagram below shows, this process can be implemented in several ways. The CDC connector can be a separate component that scans the database for changes and pushes those events to a message broker, from where other systems can process them.

Alternatively, the CDC connector is embedded in a client service that processes the events. The service can persist events directly to a data store, send them to a message broker, or do both.

CDC captures the changes on a database in real time. It operates in three phases: detecting changes, capturing changes, and forwarding changes. Several options exist for detecting changes, and some modern databases provide mechanisms that detect them automatically. The diagram below gives an overview of CDC change detection.
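
To make the three phases more concrete, here is a minimal sketch in Python. It assumes a simulated, in-memory transaction log represented as a list of dictionaries; the `ChangeEvent` class, the field names, and the `detect`, `capture` and `run_cdc` functions are hypothetical names chosen for illustration and do not correspond to any particular database or CDC tool.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical structured form of a single database change."""
    table: str
    operation: str  # "insert", "update" or "delete"
    key: int
    data: dict


def detect(log: Iterable[dict], last_position: int) -> Iterator[dict]:
    """Phase 1: yield only log entries newer than the last processed position."""
    for entry in log:
        if entry["position"] > last_position:
            yield entry


def capture(entry: dict) -> ChangeEvent:
    """Phase 2: turn a raw log entry into a structured change event."""
    return ChangeEvent(entry["table"], entry["op"], entry["key"], entry["row"])


def run_cdc(log: Iterable[dict], last_position: int,
            forward: Callable[[ChangeEvent], None]) -> int:
    """Phase 3: forward every captured event and return the new log position."""
    for entry in detect(log, last_position):
        forward(capture(entry))
        last_position = entry["position"]
    return last_position


# Simulated transaction log (an assumption; real logs are database-specific).
log = [
    {"position": 1, "table": "orders", "op": "insert", "key": 10, "row": {"total": 5}},
    {"position": 2, "table": "orders", "op": "update", "key": 10, "row": {"total": 7}},
    {"position": 3, "table": "orders", "op": "delete", "key": 10, "row": {}},
]

forwarded = []  # stands in for a message broker or downstream consumer
position = run_cdc(log, last_position=1, forward=forwarded.append)
```

Because the connector remembers the last position it processed, a restart only picks up entries that have not been forwarded yet. In this run, the entry at position 1 is skipped and the remaining two are forwarded.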

CDC can also be implemented on any kind of database. If the chosen database does not support change detection natively, we need to implement it ourselves using one of the following methods:
- monitoring the transaction log of a database. Since most databases log their transactions, we can implement log scanners that identify any changes
- periodically polling data from tables. This usually requires us to query the database and compare timestamp and version fields
- using database triggers. This approach requires us to define triggers on the database. It affects overall system performance, because changes must be transformed into events and records are duplicated, and it adds maintenance work.
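
The polling approach from the list above could look roughly like the following sketch, which uses Python's built-in `sqlite3` module. The `customers` table, its `version` column and the `poll_changes` helper are assumptions made for the example. Each poll asks only for rows whose version is higher than the last one seen.

```python
import sqlite3


def poll_changes(conn: sqlite3.Connection, last_version: int):
    """Return rows changed since last_version and the new high-water mark."""
    rows = conn.execute(
        "SELECT id, name, version FROM customers "
        "WHERE version > ? ORDER BY version",
        (last_version,),
    ).fetchall()
    new_version = rows[-1][2] if rows else last_version
    return rows, new_version


# Hypothetical table with a version column that the application
# increments on every write.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", 1), (2, "Grace", 2)],
)

# First poll sees both rows; the second sees only the updated row.
first_batch, watermark = poll_changes(conn, 0)
conn.execute("UPDATE customers SET name = 'Ada L.', version = 3 WHERE id = 1")
second_batch, watermark = poll_changes(conn, watermark)
```

One limitation of this kind of polling is that a deleted row simply disappears from the query result, so deletes go unnoticed unless they are recorded some other way, for example with a soft-delete flag.
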
In cloud-native architectures, triggers are often used in combination with event streaming. When a database change produces a log entry, that change is sent to the CDC process as part of a stream. During event streaming, additional operations such as aggregations and transformations can be applied, and the data is processed in real time. There are also tools that capture the changes on a database and push them to other components or systems; Debezium is one of them.
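
As a rough illustration of triggers combined with a simple streaming aggregation, the sketch below again uses Python's built-in `sqlite3` module. The `orders` and `order_changes` tables, the trigger names and the `stream_changes` helper are all hypothetical. This is not how Debezium or any specific tool works internally.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);

-- Change table filled by the triggers below (the duplicated records).
CREATE TABLE order_changes (
    seq      INTEGER PRIMARY KEY AUTOINCREMENT,
    op       TEXT NOT NULL,
    order_id INTEGER NOT NULL,
    status   TEXT
);

CREATE TRIGGER orders_ai AFTER INSERT ON orders BEGIN
    INSERT INTO order_changes (op, order_id, status)
    VALUES ('insert', NEW.id, NEW.status);
END;

CREATE TRIGGER orders_au AFTER UPDATE ON orders BEGIN
    INSERT INTO order_changes (op, order_id, status)
    VALUES ('update', NEW.id, NEW.status);
END;

CREATE TRIGGER orders_ad AFTER DELETE ON orders BEGIN
    INSERT INTO order_changes (op, order_id, status)
    VALUES ('delete', OLD.id, NULL);
END;
""")

# Ordinary writes; the triggers record each one as a change row.
conn.execute("INSERT INTO orders VALUES (1, 'new')")
conn.execute("INSERT INTO orders VALUES (2, 'new')")
conn.execute("UPDATE orders SET status = 'paid' WHERE id = 1")
conn.execute("DELETE FROM orders WHERE id = 2")


def stream_changes(conn: sqlite3.Connection, after_seq: int = 0):
    """Yield recorded changes as events, in the order they happened."""
    query = (
        "SELECT seq, op, order_id, status FROM order_changes "
        "WHERE seq > ? ORDER BY seq"
    )
    for seq, op, order_id, status in conn.execute(query, (after_seq,)):
        yield {"seq": seq, "op": op, "order_id": order_id, "status": status}


# A simple aggregation applied while consuming the stream.
ops_count = Counter(event["op"] for event in stream_changes(conn))
```

Every write to `orders` now causes a second write to `order_changes`, which is the performance and maintenance cost that the trigger-based approach carries.
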
There are two options for migrating captured data to a data warehouse: direct migration and migration with transformation. In the first case, data is sent to other systems as is, while in the second case, additional transformations are applied. Data can be sent directly to the service that stores it in the data warehouse, or event processing tools can be used. The most common option is to use a message broker such as Apache Kafka, Apache Pulsar or RabbitMQ.
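
The difference between the two migration options can be sketched as follows. The example uses an in-memory `queue.Queue` as a stand-in for a broker topic, which is an assumption made to keep it self-contained. The `migrate` and `to_warehouse_row` functions and the event layout are hypothetical as well.

```python
import json
import queue
from typing import Callable, Optional

# In-memory stand-in for a message broker topic (assumption for the example).
broker: "queue.Queue[str]" = queue.Queue()


def migrate(event: dict,
            transform: Optional[Callable[[dict], dict]] = None) -> None:
    """Publish an event as is, or after applying an optional transformation."""
    payload = transform(event) if transform else event
    broker.put(json.dumps(payload, sort_keys=True))


def to_warehouse_row(event: dict) -> dict:
    """Hypothetical transformation into the shape the warehouse expects."""
    return {
        "customer_id": event["key"],
        "full_name": event["row"]["name"].title(),
        "op": event["op"],
    }


event = {"op": "update", "key": 7, "row": {"name": "ada lovelace"}}

migrate(event)                              # direct migration
migrate(event, transform=to_warehouse_row)  # migration with transformation

direct = json.loads(broker.get())
transformed = json.loads(broker.get())
```

In the direct case the consumer receives the change exactly as it was captured, while in the second case it receives a record already reshaped for the warehouse.
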
CDC offers many benefits, such as keeping the transaction log as the source of truth, efficient real-time data retrieval, real-time analysis and consistent data synchronization.