DMX-h Use Case Accelerator: File CDC
Syncsort provides a set of use case accelerators (UCAs) that show users how to implement DMX-h ETL solutions in a Hadoop MapReduce framework.
The File CDC and File CDC MultiTargets use case accelerators demonstrate how to perform Change Data Capture (CDC), a process in which the previous and current versions of a large data set are compared to determine what changed between the two versions. Both examples join two large files and differ only in the output they produce:
File CDC produces a single output file that contains all changed records, with a flag added to each record indicating whether it is an insert, delete, or update.
File CDC MultiTargets produces three separate targets, one each for inserted, deleted, and updated records.
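The comparison logic behind both variants can be illustrated with a minimal Python sketch. This is not the DMX-h implementation, which runs as a join in MapReduce; it is an in-memory approximation that assumes each record is a tuple whose first field is a unique key. The function names, the flag values `I`, `D`, and `U`, and the sample records are all hypothetical and chosen only for illustration.

```python
def file_cdc(previous, current, key_index=0):
    """Compare two versions of a data set on a key field.

    Returns a list of (flag, record) tuples ordered by key, where the
    flag is 'I' (insert), 'D' (delete), or 'U' (update). Unchanged
    records are omitted. Flag values are an assumption of this sketch.
    """
    prev_by_key = {rec[key_index]: rec for rec in previous}
    curr_by_key = {rec[key_index]: rec for rec in current}

    changes = []
    # Walk the union of keys, which mirrors a full outer join of the two files.
    for key in sorted(prev_by_key.keys() | curr_by_key.keys()):
        old = prev_by_key.get(key)
        new = curr_by_key.get(key)
        if old is None:
            changes.append(("I", new))   # key only in the current version
        elif new is None:
            changes.append(("D", old))   # key only in the previous version
        elif old != new:
            changes.append(("U", new))   # key in both, contents differ
    return changes


def file_cdc_multi_targets(previous, current, key_index=0):
    """Same comparison, but split into three targets instead of one."""
    targets = {"I": [], "D": [], "U": []}
    for flag, rec in file_cdc(previous, current, key_index):
        targets[flag].append(rec)
    return targets["I"], targets["D"], targets["U"]


# Hypothetical sample data: (id, name, location)
previous = [("1", "Ann", "NY"), ("2", "Bob", "LA"), ("3", "Cy", "SF")]
current = [("1", "Ann", "NY"), ("2", "Bob", "TX"), ("4", "Dee", "DC")]

changes = file_cdc(previous, current)
inserted, deleted, updated = file_cdc_multi_targets(previous, current)
```

With this sample data, record 2 is reported as an update, record 3 as a delete, and record 4 as an insert; record 1 is unchanged and appears in no output. Holding both versions in dictionaries only works for small inputs; the large files the UCAs target are the reason the comparison is expressed as a distributed join.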
The following attachments are available for understanding and running these UCAs: