Data Masker

Row-Internal Synchronization Rules

Row-Internal Synchronization Rules apply SQL fragments that can use existing data in the target table and datasets to generate the replacement values for the target column. 

Example

Given a table that contains the FirstName, Surname, FullName and EmailAddress of a person and that have used a Substitution Rule to replace the FirstName and Surname columns. 

The requirement is that the FullName and the e-mail address look more realistic.

To update the FullName column this create a Row-Internal Rule with the SQL fragment:

FirstName + ' ' + Surname

Then to update the EmailAddress column create a second Row-Internal Rule:

LOWER(FirstName) + '.' + LOWER(Surname) + '@test-domain.com'

Assuming that the FirstName and Surname columns have been suitably masked the Row-Internal Synchronization Rules will make the following changes:


ALREADY MASKEDBEFOREAFTER
IDFirstNameSurnameFullNameEmailAddressFullNameEmailAddress
1RobertSmithAlan Jonesajones@abc.comRobert Smithrobert.smith@test-domain.com
2SueJonesHannah Robertsonhannah2213@gmail.comSue Joneshannah.jones@test-domain.com
3ThomasWellsMichael Rogersmrogers@yahoo.comThomas Wellsthomas.wells@test-domain.com

  

Core concepts

Using datasets

Datasets can be used to insert other random data in Row-Internal Rules. Each dataset is given a unique name that can be placed into the SQL Template and will be substituted with a value from the dataset for each row when the rule is run.

Data Masker's predefined datasets serve a number of common concepts but it is possible to define your own dataset if you need something more specific.

By default, the datasets are located in a directory named Datasets immediately below the Data Masker installation directory and can be changed on the Misc. Setup Tab.

Example

If we wanted to expand the above example by generating random domain names, we could set the DMSPARAM1 parameter to be the dataset Text, Dictionary Words and change the SQL Template to be:

LOWER(FirstName) + '.' + LOWER(Surname) + '@' + LOWER(DMSPARAM1) + '.com'

Which would result in:


ALREADY MASKEDBEFOREAFTER
IDFirstNameSurnameFullNameEmailAddressFullNameEmailAddress
1RobertSmithAlan Jonesajones@abc.comRobert Smithrobert.smith@yeomen.com
2SueJonesHannah Robertsonhannah2213@gmail.comSue Jonessue.jones@visage.com
3ThomasWellsMichael Rogersmrogers@yahoo.comThomas Wellsthomas.wells@mythic.com

  

Preserving null/empty values

Row-Internal Rules provide the ability (on the Where clause or Sampling options tab of the substitution form) to ignore NULL and/or empty values. The default for substitution rules is to ignore NULL/empty values

Oracle treats empty strings as NULL the preserve empty values option is not provided for substitution rules in Data Masker for Oracle.

Adding a WHERE clause

If only certain rows in the table need masking specify a where on the Where clause or Sampling options tab of the substitution form).

Fast Single Pass

The Fast Single Pass option on the Options tab can be used to improve the performance of your Row-Internal Rule. It does this by creating a single query to update the entire table, rather than taking a row-by-row approach. 

NOTE: You cannot use datasets when you are using the "Fast Single Pass" option, as the datasets need to generate a new value for each row.

Advanced Concepts

Synchronizing denormalized data

If you have a denormalized structure, e.g. the employee name exists in several tables, you would want to mask the data identically in all places you can convert a Row-Internal Rule into a Sync. Manager Rule:

Parallelizing substitution on large tables

If a Row-Internal Rule is taking too long against a single table, you can parallelize the work using a Range Splitter Rule that divides the table into smaller chunks that can be masked in parallel. To convert to a Range Splitter Rule by selecting the "Convert to Sync. Manager Rule" option from the Change Managers tab.

Where clauses referencing other tables

The custom where clause allows you to specify a JOIN clause (the target table will be aliased as DM1 to help create the join condition).

For example, if the above table contains basic details about a person and we only want to mask the e-mail addresses of those who are not employees, we could write:

LEFT OUTER JOIN Employees e ON e.PersonID = DM1.ID
WHERE e.PersonID IS NULL

Assuming that the same where clause was applied to the substitution rule that masked the FirstName and Surname columns then you would get:


ALREADY MASKEDBEFOREAFTER
IDFirstNameSurnameFullNameEmailAddressFullNameEmailAddress
1RobertSmithAlan Jonesajones@abc.comRobert Smithrobert.smith@test-domain.com
2SueJonesSue Jonessue.jones@your-company.comSue Jonessue.jones@your-company.com
3ThomasWellsMichael Rogersmrogers@yahoo.comThomas Wellsthomas.wells@test-domain.com

  




Didn't find what you were looking for?