Row-Internal Synchronization Rules
Published 06 January 2020
Row-Internal Synchronization Rules apply SQL fragments that can use existing data in the target table and datasets to generate the replacement values for the target column.
Example
Given a table that contains the FirstName
, Surname
, FullName
and EmailAddress
of a person and that have used a Substitution Rule to replace the FirstName
and Surname
columns.
The requirement is that the FullName
and the e-mail address look more realistic.
To update the FullName
column this create a Row-Internal Rule with the SQL fragment:
FirstName + ' ' + Surname
Then to update the EmailAddress
column create a second Row-Internal Rule:
LOWER(FirstName) + '.' + LOWER(Surname) + '@test-domain.com'
Assuming that the FirstName
and Surname
columns have been suitably masked the Row-Internal Synchronization Rules will make the following changes:
ALREADY MASKED | BEFORE | AFTER | ||||
---|---|---|---|---|---|---|
ID | FirstName | Surname | FullName | EmailAddress | FullName | EmailAddress |
1 | Robert | Smith | Alan Jones | ajones@abc.com | Robert Smith | robert.smith@test-domain.com |
2 | Sue | Jones | Hannah Robertson | hannah2213@gmail.com | Sue Jones | hannah.jones@test-domain.com |
3 | Thomas | Wells | Michael Rogers | mrogers@yahoo.com | Thomas Wells | thomas.wells@test-domain.com |
Core concepts
Using datasets
Datasets can be used to insert other random data in Row-Internal Rules. Each dataset is given a unique name that can be placed into the SQL Template and will be substituted with a value from the dataset for each row when the rule is run.
Data Masker's predefined datasets serve a number of common concepts but it is possible to define your own dataset if you need something more specific.
By default, the datasets are located in a directory named Datasets immediately below the Data Masker installation directory and can be changed on the Misc. Setup Tab.
Example
If we wanted to expand the above example by generating random domain names, we could set the DMSPARAM1
parameter to be the dataset Text, Dictionary Words and change the SQL Template to be:
LOWER(FirstName) + '.' + LOWER(Surname) + '@' + LOWER(DMSPARAM1) + '.com'
Which would result in:
ALREADY MASKED | BEFORE | AFTER | ||||
---|---|---|---|---|---|---|
ID | FirstName | Surname | FullName | EmailAddress | FullName | EmailAddress |
1 | Robert | Smith | Alan Jones | ajones@abc.com | Robert Smith | robert.smith@yeomen.com |
2 | Sue | Jones | Hannah Robertson | hannah2213@gmail.com | Sue Jones | sue.jones@visage.com |
3 | Thomas | Wells | Michael Rogers | mrogers@yahoo.com | Thomas Wells | thomas.wells@mythic.com |
Preserving null/empty values
Row-Internal Rules provide the ability (on the Where clause or Sampling options tab of the substitution form) to ignore NULL and/or empty values. The default for substitution rules is to ignore NULL
/empty values
Oracle treats empty strings as NULL
the preserve empty values option is not provided for substitution rules in Data Masker for Oracle.
Adding a WHERE clause
If only certain rows in the table need masking specify a where on the Where clause or Sampling options tab of the substitution form).
Fast Single Pass
The Fast Single Pass option on the Options tab can be used to improve the performance of your Row-Internal Rule. It does this by creating a single query to update the entire table, rather than taking a row-by-row approach.
NOTE: You cannot use datasets when you are using the "Fast Single Pass" option, as the datasets need to generate a new value for each row.
Advanced Concepts
Synchronizing denormalized data
If you have a denormalized structure, e.g. the employee name exists in several tables, you would want to mask the data identically in all places you can convert a Row-Internal Rule into a Sync. Manager Rule:
- Ensure that the "No where clause" is set on the Where clause or Sampling options tab.
- On the Change Managers tab select the "Convert to Sync. Manager Rule" option.
- See About Sync. Manager Rules for more details of how to connect to the other tables.
Parallelizing substitution on large tables
If a Row-Internal Rule is taking too long against a single table, you can parallelize the work using a Range Splitter Rule that divides the table into smaller chunks that can be masked in parallel. To convert to a Range Splitter Rule by selecting the "Convert to Sync. Manager Rule" option from the Change Managers tab.
Where clauses referencing other tables
The custom where clause allows you to specify a JOIN
clause (the target table will be aliased as DM1
to help create the join condition).
For example, if the above table contains basic details about a person and we only want to mask the e-mail addresses of those who are not employees, we could write:
LEFT OUTER JOIN Employees e ON e.PersonID = DM1.ID WHERE e.PersonID IS NULL
Assuming that the same where clause was applied to the substitution rule that masked the FirstName and Surname columns then you would get:
ALREADY MASKED | BEFORE | AFTER | ||||
---|---|---|---|---|---|---|
ID | FirstName | Surname | FullName | EmailAddress | FullName | EmailAddress |
1 | Robert | Smith | Alan Jones | ajones@abc.com | Robert Smith | robert.smith@test-domain.com |
2 | Sue | Jones | Sue Jones | sue.jones@your-company.com | Sue Jones | sue.jones@your-company.com |
3 | Thomas | Wells | Michael Rogers | mrogers@yahoo.com | Thomas Wells | thomas.wells@test-domain.com |