Change Data Capture Operation for Mover Detection

Mover events are specific situations, such as an employee changing departments, that require close monitoring of data changes. Because they are not generic, they require a customized set of operations and workflows to meet varying requirements. The Change Data Capture mechanism detects such changes and triggers workflows based on what it captures.

The instructions in this article walk you through creating a sample change data capture operation for mover detection. You can modify the scripts to suit your requirements and build any change data capture operation you need.

In this article, we will:

  • Understand the scenario.

  • Download the sample script files.

  • Understand the flow of the script.

  • Learn to verify the results.

  • Learn about the stored procedure and data transfer mode.

Developers have the flexibility to enhance the Change Data Capture (CDC) capabilities by adding additional tables and columns to the configuration file (config.json). However, it is essential to customize the scripts to align with your specific use case for downstream actions. This article aims to give you a clear understanding of the script's functionality and internal workings so that you are equipped to make the necessary modifications.

Change Data Capture Engine for Mover Detection

Scenario

If the value of the [Department] column in the [Person] table changes for a person who already had a department, a Mover event is triggered. The event is recorded by adding an entry to the [BusinessRequestFlowEventInbox] table. After the record is inserted, workflows are triggered to notify the business of the mover event and take other actions as required.

Sample Script Files

Please download the script files from the repository. You will be asked to authenticate before you can access and download them.

  • master.ps1: Runs a loop that executes change_streaming_script.ps1 at 5-minute intervals. Download: master.ps1

  • change_streaming_script.ps1: The core PowerShell script, executed every 5 minutes. It retrieves JSON information, executes stored procedures, transmits updates to Log Analytics, runs Kusto queries, and inserts data into the [BusinessRequestFlowEventInbox] table. Download: change_streaming_script.ps1

  • change_streaming_sql_script.sql: Contains the table and stored procedure scripts, which are executed if the corresponding objects are not already present in the database. Download: change_streaming_sql_script.sql

  • config.json: Specifies the tables and their respective columns for which Change Data Capture (CDC) is to be enabled; a sample appears below. Download: config.json

 
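
For reference, the configuration might look like the following minimal example. The property names here are illustrative assumptions, not the shipped schema; check the downloaded config.json for the exact structure.

    {
      "Tables": [
        {
          "Schema": "dbo",
          "Table": "Person",
          "Columns": [ "PersonID", "Department" ]
        }
      ]
    }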

Understanding the Flow of the Script

The primary file for detecting mover activity via change data capture is change_streaming_script.ps1; it calls the other required files. The script executes the following steps in sequence to enable the operation. Condensed sketches of the flow appear after the list.

As discussed in the overview section, the PowerShell script is deployed as a container in an Azure Kubernetes cluster. The master script runs an infinite WHILE loop with a 5-minute pause, triggering the Mover Detection script every 5 minutes, along the lines of the sketch below.
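
A minimal sketch of the master loop, assuming both scripts sit in the same directory (the downloaded master.ps1 is the authoritative version):

    # master.ps1 (sketch): run the Mover Detection script every 5 minutes, indefinitely
    while ($true) {
        # One detection pass
        & "$PSScriptRoot\change_streaming_script.ps1"

        # Wait 5 minutes before the next pass
        Start-Sleep -Seconds 300
    }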

  1. Installing or Loading Necessary PowerShell Modules:

    • The script begins by loading essential PowerShell modules, including SQLServer, Az.Accounts, and Az.OperationalInsights.

  2. Assigning Data to Variables:

    • CustomerId, SharedKey, TenantId, ClientId, ClientSecret, and SubscriptionId details are fetched from Container Secrets and assigned to their respective variables.

  3. Checking Whether CDC is Enabled on the Database Level:

    • The script checks if Change Data Capture (CDC) is enabled at the database level. If CDC is enabled, the script continues; otherwise, it stops execution.

  4. Reading Data from config.json File:

    • The script reads the configuration file (config.json) to obtain the necessary table and column information. This information is crucial for automatically enabling CDC on specific tables.

  5. Executing [ChangeStreaming] Procedure:

    • In this step, the script executes the stored procedure [ChangeStreaming]. This procedure reads the CDC table for captured changes and converts the data into a user-readable JSON format.

  6. Pushing Data to Azure Log Analytics Workspace:

    • The JSON data generated in the previous step is pushed to the Azure Log Analytics Workspace. The data is automatically converted and stored in a tabular format within the workspace.

  7. Executing Kusto Query on Log Analytics Workspace:

    • A Kusto Query is executed in the Log Analytics Workspace to retrieve data that meets predefined conditions specified by the requirements. The provided query operates within the Mover Detection Engine.

  8. Inserting Data into [BusinessRequestFlowEventInbox] Table:

    • The output of the Kusto Query is inserted into the [BusinessRequestFlowEventInbox] table under the 'Person Mover' type. The Initiator of this record is 'EmpowerIDSystem.' Subsequently, the job picks up the record and triggers the Mover event based on the corresponding PersonID.
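
A condensed sketch of steps 1 through 5. The environment-variable names and the stored procedure's schema are assumptions; the real script reads its values from container secrets and the downloaded config.json:

    # Steps 1-2: load modules and pull connection details from container secrets
    Import-Module SqlServer, Az.Accounts, Az.OperationalInsights

    $CustomerId     = $env:CUSTOMER_ID        # Log Analytics workspace ID
    $SharedKey      = $env:SHARED_KEY         # Log Analytics shared key
    $TenantId       = $env:TENANT_ID
    $ClientId       = $env:CLIENT_ID
    $ClientSecret   = $env:CLIENT_SECRET
    $SubscriptionId = $env:SUBSCRIPTION_ID
    $Server         = $env:SQL_SERVER
    $Database       = $env:SQL_DATABASE

    # Step 3: stop unless CDC is enabled at the database level
    $db = Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "SELECT is_cdc_enabled FROM sys.databases WHERE name = DB_NAME();"
    if (-not $db.is_cdc_enabled) { throw "CDC is not enabled on $Database." }

    # Step 4: read the tables and columns to track from config.json
    # (the real script uses this to enable CDC per table via sys.sp_cdc_enable_table)
    $config = Get-Content "$PSScriptRoot\config.json" -Raw | ConvertFrom-Json

    # Step 5: run the stored procedure that reads the CDC tables and emits
    # the captured changes as JSON (procedure schema assumed)
    $changes = Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "EXEC dbo.ChangeStreaming;"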
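
And a sketch of steps 6 through 8, continuing from the variables above. The push uses the Log Analytics HTTP Data Collector API; the Kusto query, the custom log table name, the stored procedure's output column, and the inbox column names other than EventJsonData are placeholders, not the engine's actual ones:

    # Sign in with the service principal from step 2 (needed for the Kusto query)
    $cred = New-Object pscredential ($ClientId,
        (ConvertTo-SecureString $ClientSecret -AsPlainText -Force))
    Connect-AzAccount -ServicePrincipal -Credential $cred -Tenant $TenantId `
        -Subscription $SubscriptionId | Out-Null

    # Step 6: push the JSON from step 5 to Log Analytics (Data Collector API)
    $body   = [Text.Encoding]::UTF8.GetBytes($changes.JsonOutput)   # output column assumed
    $date   = [DateTime]::UtcNow.ToString("r")
    $toSign = "POST`n$($body.Length)`napplication/json`nx-ms-date:$date`n/api/logs"
    $hmac   = New-Object System.Security.Cryptography.HMACSHA256
    $hmac.Key = [Convert]::FromBase64String($SharedKey)
    $sig    = [Convert]::ToBase64String($hmac.ComputeHash([Text.Encoding]::UTF8.GetBytes($toSign)))
    Invoke-RestMethod -Method Post -ContentType "application/json" -Body $body `
        -Uri "https://$CustomerId.ods.opinsights.azure.com/api/logs?api-version=2016-04-01" `
        -Headers @{ Authorization = "SharedKey ${CustomerId}:$sig"
                    "Log-Type"    = "ChangeStreaming"   # custom log table name, assumed
                    "x-ms-date"   = $date }

    # Step 7: run the mover-detection Kusto query (placeholder shown here)
    $result = Invoke-AzOperationalInsightsQuery -WorkspaceId $CustomerId -Query `
        'ChangeStreaming_CL | where ColumnName_s == "Department" and isnotempty(OldValue_s)'

    # Step 8: record each match as a 'Person Mover' event in the inbox table
    foreach ($row in $result.Results) {
        $json = ($row | ConvertTo-Json -Compress).Replace("'", "''")
        Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
            "INSERT INTO dbo.BusinessRequestFlowEventInbox (EventType, Initiator, EventJsonData)
             VALUES ('Person Mover', 'EmpowerIDSystem', '$json');"   # column names assumed
    }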

Verify Results

To verify that Change Data Capture (CDC) has worked as expected, you can use a SQL query to fetch the records it captured. The query below shows the entries added to the [BusinessRequestFlowEventInbox] table by the Mover Detection Engine.

SELECT * FROM dbo.BusinessRequestFlowEventInbox WHERE EventJsonData LIKE '%TimeGenerated%' ORDER BY 1 DESC
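
If you would rather run the check from PowerShell (for example, from the same container the scripts run in), a minimal sketch, assuming $Server and $Database point at your EmpowerID database:

    # List the most recent inbox entries written by the Mover Detection Engine
    Import-Module SqlServer
    Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "SELECT * FROM dbo.BusinessRequestFlowEventInbox
         WHERE EventJsonData LIKE '%TimeGenerated%' ORDER BY 1 DESC" |
        Format-Table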

 

Stored Procedure and Data Transfer Mode

When writing data from SQL to Log Analytics, there is a risk of losing data due to interruptions on either end. To prevent this, the PowerShell script writes a sequence value to the database, which allows it to resume processing from where it last left off after an interruption. The stored procedure dbo.ChangeStreaming_Changes, which transforms the results pushed into Log Analytics, uses this sequence value along with the transfer mode input. This ensures that no data is overlooked or duplicated.

The stored procedure dbo.ChangeStreaming_Changes generates queries dynamically for any table that has CDC enabled: given a table name as an input parameter, it extracts data changes from the original table and its associated CDC history table. There are three ways to transfer data from Azure SQL to the Log Analytics workspace. The transfer mode is a parameter passed into the stored procedure, which transforms the data into a readable format and pushes it to the workspace. A sketch of invoking each mode follows the list.

  • Transfer Mode 1 (Automatic): Mode 1, also known as Automatic transfer mode, requires the runmode parameter to be set to 1. When using this mode, the system identifies the last sequence value sent to Log Analytics and only sends data generated since that point. This ensures that redundant data is not reprocessed.

  • Transfer Mode 2 (Viewing): Mode 2, also known as Viewing transfer mode, requires the runmode parameter to be set to 2. This mode is used for viewing purposes only. It allows users to inspect the data that is currently being processed.

  • Transfer Mode 3 (Custom Range): Mode 3, also known as Custom Range transfer mode, requires the runmode parameter to be set to 3. This mode allows users to manually specify a starting and ending sequence value to send a custom range of data to Log Analytics. This is helpful for selective data transfer.
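
A sketch of invoking the procedure in each mode from PowerShell. The parameter names (@TableName, @runmode, @StartSequence, @EndSequence) and the sequence values are assumptions based on the description above; check change_streaming_sql_script.sql for the actual signature:

    # Mode 1 (Automatic): send only what was generated after the last recorded sequence value
    Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "EXEC dbo.ChangeStreaming_Changes @TableName = 'Person', @runmode = 1;"

    # Mode 2 (Viewing): inspect the data currently being processed without sending it
    Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "EXEC dbo.ChangeStreaming_Changes @TableName = 'Person', @runmode = 2;"

    # Mode 3 (Custom Range): send an explicit range of sequence values (values illustrative)
    Invoke-Sqlcmd -ServerInstance $Server -Database $Database -Query `
        "EXEC dbo.ChangeStreaming_Changes @TableName = 'Person', @runmode = 3,
         @StartSequence = 1000, @EndSequence = 2000;"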

 
