Table-Internal Synchronization Rules
Published 06 January 2020
Table-Internal rule partitions rows in a table by specified column values (join columns), so that the column values (sync columns) in the same group will be synchronized into the same value.
Generally table-internal rules should be executed after a masking rule has been applied, which maintains the data consistency instead of generating new values.
Example
Given a table that stores customer transactions, adding a table-internal rule to mask the LastName
column joined on CustomerID
column would result in something like:
ID | CustomerID | LastName (BEFORE) | LastName (AFTER) |
---|---|---|---|
1 | 1 | Smith | Smith |
2 | 2 | Jones | Jones |
3 | 1 | Green | Smith |
4 | 1 | Red | Smith |
5 | 3 | Doe | Doe |
6 | 2 | Johnson | Jones |
7 | 2 | Russell | Jones |
8 | 3 | Williams | Doe |
In the above example - several different lastnames (Smith, Green, Red) appear in the LastName columns in multiple rows with the same value of CustomerID
1. These rows will have the same LastName
value after table-internal rule finishes, with the value (Smith) selected from the first row of the group of possible values.
Core concepts
Join columns
The columns whose values are used to partition the rows into groups.
Sync columns
The columns whose values are to be updated so that the rows within the same group will get the same value of the first row.
Advanced concepts
Bias against Nulls
When NULL values appear in a group, do not choose NULL to update the entire groups of values.
Increasing performance
On large tables it is advisable to use a Command Rule to add a temporary index over the join columns if one is not present. The temporary index can be dropped again by another Command Rule which executes after the Table-Internal Synchronization rule completes.