Data Masker

Table-Internal rule partitions rows in a table by specified column values (join columns), so that the column values (sync columns) in the same group will be synchronized into the same value.

Generally table-internal rules should be executed after a masking rule has been applied, which maintains the data consistency instead of generating new values.

Example

Given a table that stores customer transactions, adding a table-internal rule to mask the LastName column joined on CustomerID column would result in something like:

ID	CustomerID	LastName (BEFORE)	LastName (AFTER)
1	1	Smith	Smith
2	2	Jones	Jones
3	1	Green	Smith
4	1	Red	Smith
5	3	Doe	Doe
6	2	Johnson	Jones
7	2	Russell	Jones
8	3	Williams	Doe

In the above example - several different lastnames (Smith, Green, Red) appear in the LastName columns in multiple rows with the same value of CustomerID 1. These rows will have the same LastName value after table-internal rule finishes, with the value (Smith) selected from the first row of the group of possible values.

Core concepts

Join columns

The columns whose values are used to partition the rows into groups.

Sync columns

The columns whose values are to be updated so that the rows within the same group will get the same value of the first row.

Advanced concepts

Bias against Nulls

When NULL values appear in a group, do not choose NULL to update the entire groups of values.

Increasing performance

On large tables it is advisable to use a Command Rule to add a temporary index over the join columns if one is not present. The temporary index can be dropped again by another Command Rule which executes after the Table-Internal Synchronization rule completes.