About Table-Internal Synchronization Rules
Published 23 March 2018
An Example of a Table-Internal Synchronization Requirement
In the example above, the name Robert Smith appears in the FIRST_NAME and LAST_NAME columns of multiple rows. In other words, the table is denormalized: the same data items are repeated across several rows. If masking changes the name Robert Smith to Albert Wilson, then every other row that refers to the same Robert Smith must also change to Albert Wilson. This requirement preserves the relationships between the data rows and is called Table-Internal Synchronization. There are two other types of synchronization, Row-Internal and Table-To-Table, and the three differ considerably in function.
A Table-Internal Synchronization rule updates the chosen columns in each group of related rows within a table so that they contain identical values. In the example, every row that refers to the same Robert Smith will contain Albert Wilson once the rule has run.
Table-Internal Synchronization is a reasonably common requirement, and it is easy to achieve with the Data Masker software by implementing the specialized Table-Internal Synchronization rule. Configuring the rule requires two pieces of information: the columns that must be synchronized in each row, and a join condition that indicates how the rows are associated with each other. In the example above, the EMP_NO, FIRST_NAME and LAST_NAME columns require synchronization in each row. The PERSON_ID column provides a suitable join condition, as each distinct PERSON_ID value identifies a group of related rows.
The rest of the Table-Internal Synchronization rule configuration is simple. The target table is chosen, and the rule is configured with the list of synchronization columns and a suitable join condition. The Data Masker software does the rest of the work: after the rule executes, every row in a group contains the same synchronized values. The help file for the New Table-Internal Synchronization rule form provides more details on the mechanics of the rule configuration process.
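The effect of such a rule can be illustrated with a minimal Python sketch. This is not the Data Masker implementation. The function name synchronize and the sample rows are hypothetical, and the sketch assumes that each group takes its values from the first row encountered for that PERSON_ID.

```python
def synchronize(rows, join_col, sync_cols):
    """Copy sync_cols from the first row of each join_col group to the other rows."""
    first_seen = {}  # join value -> values taken from the first row of that group
    for row in rows:
        key = row[join_col]
        if key not in first_seen:
            # First row of the group: remember its values, leave it unchanged.
            first_seen[key] = {col: row[col] for col in sync_cols}
        else:
            # Later row of the same group: overwrite with the remembered values.
            row.update(first_seen[key])
    return rows


# Hypothetical table contents after masking: the rows for PERSON_ID 1
# were masked independently and no longer agree with each other.
rows = [
    {"PERSON_ID": 1, "EMP_NO": "E900", "FIRST_NAME": "Albert", "LAST_NAME": "Wilson"},
    {"PERSON_ID": 1, "EMP_NO": "E417", "FIRST_NAME": "Carl", "LAST_NAME": "Young"},
    {"PERSON_ID": 2, "EMP_NO": "E555", "FIRST_NAME": "Dana", "LAST_NAME": "Price"},
    {"PERSON_ID": 1, "EMP_NO": "E102", "FIRST_NAME": "Eve", "LAST_NAME": "Stone"},
]
synchronize(rows, "PERSON_ID", ["EMP_NO", "FIRST_NAME", "LAST_NAME"])
```

After the call, all three rows with PERSON_ID 1 hold E900, Albert, Wilson, while the single row for PERSON_ID 2 is unchanged.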
The values written to each group of rows are the values from the first row of that group processed by the Data Masker software. A Table-Internal Synchronization rule is always applied after the masking has taken place, since there is no point synchronizing before all of the synchronization columns are masked. If the join condition columns are indexed, the rule runs significantly faster. The difference is large enough that, on large tables, it is advisable to use a Command Rule to add a temporary index over the join condition columns if one is not already present. The temporary index can be dropped again by another Command Rule which executes after the Table-Internal Synchronization rule completes.
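The sequence of temporary index, synchronization, and index removal can be sketched with Python's built-in sqlite3 module. SQLite is only a stand-in for the real target database. The table name employees, the index name tmp_sync_person_id, and the UPDATE statement are assumptions made for illustration, and they are not the SQL that Data Masker generates. "First row" is approximated here by the lowest rowid in each group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees "
    "(person_id INTEGER, emp_no TEXT, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?, ?)",
    [
        (1, "E900", "Albert", "Wilson"),
        (1, "E417", "Carl", "Young"),
        (2, "E555", "Dana", "Price"),
        (1, "E102", "Eve", "Stone"),
    ],
)

# Step 1 (the role of the first Command Rule): temporary index on the join column.
conn.execute("CREATE INDEX tmp_sync_person_id ON employees (person_id)")

# Step 2 (the role of the synchronization rule): copy the values of the
# lowest-rowid row in each person_id group to every row in that group.
conn.execute(
    """
    UPDATE employees
    SET emp_no = (SELECT e.emp_no FROM employees AS e
                  WHERE e.person_id = employees.person_id
                  ORDER BY e.rowid LIMIT 1),
        first_name = (SELECT e.first_name FROM employees AS e
                      WHERE e.person_id = employees.person_id
                      ORDER BY e.rowid LIMIT 1),
        last_name = (SELECT e.last_name FROM employees AS e
                     WHERE e.person_id = employees.person_id
                     ORDER BY e.rowid LIMIT 1)
    """
)

# Step 3 (the role of the second Command Rule): drop the temporary index.
conn.execute("DROP INDEX tmp_sync_person_id")
conn.commit()

result = conn.execute(
    "SELECT person_id, emp_no, first_name, last_name FROM employees ORDER BY rowid"
).fetchall()
remaining_indexes = conn.execute("PRAGMA index_list('employees')").fetchall()
```

Each correlated subquery looks up rows by person_id, which is the lookup the temporary index is meant to speed up on a large table.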
Table-Internal Synchronization rules can use a Where Clause option to operate on a subset of the rows in a table. Usually this feature is not required and most Table-Internal Synchronization rules operate on the full set of rows in the table.
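As a rough illustration of restricting synchronization to a subset of rows, the Python sketch below applies a filter before grouping. The function name synchronize_subset, the STATUS column, and the sample rows are hypothetical. The sketch also assumes that filtered-out rows neither receive nor supply values, which is an assumption about the Where Clause option and not something this article states.

```python
def synchronize_subset(rows, join_col, sync_cols, where):
    """Synchronize only the rows for which where(row) is true."""
    first_seen = {}  # join value -> values from the first matching row of the group
    for row in rows:
        if not where(row):
            continue  # rows outside the subset are left untouched
        key = row[join_col]
        if key not in first_seen:
            first_seen[key] = {col: row[col] for col in sync_cols}
        else:
            row.update(first_seen[key])
    return rows


# Hypothetical data: only ACTIVE rows should be synchronized.
staff = [
    {"PERSON_ID": 1, "STATUS": "ARCHIVED", "FIRST_NAME": "Robert", "LAST_NAME": "Smith"},
    {"PERSON_ID": 1, "STATUS": "ACTIVE", "FIRST_NAME": "Albert", "LAST_NAME": "Wilson"},
    {"PERSON_ID": 1, "STATUS": "ACTIVE", "FIRST_NAME": "Carl", "LAST_NAME": "Young"},
]
synchronize_subset(
    staff,
    "PERSON_ID",
    ["FIRST_NAME", "LAST_NAME"],
    lambda row: row["STATUS"] == "ACTIVE",
)
```

In this run the ARCHIVED row keeps its original name, and both ACTIVE rows end up as Albert Wilson.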
Table-Internal Synchronization rules are created by launching the New Table-Internal Synchronization rule form with the New Rule button, located at the bottom of the Rules in Set tab.
Adding a Table-Internal Synchronization Rule