Published 06 January 2020
Insertion Rules generate new rows and insert them into the target table. Just as with Substitution Rules, the contents of each column will be generated through the configured dataset; any column that is not associated with a dataset will be assigned a
Given a table with the schema:
ID INTEGER NOT NULL
FirstName TEXT NOT NULL
LastName TEXT NOT NULL
EmailAddress TEXT NOT NULL
DateOfBirth DATE NULL
You could create an Insertion Rule that would insert 10,000 rows where:
- ID has the dataset Numbers, Integer, Sequential
- FirstName has the dataset Names, First Names, Male + Female
- LastName has the dataset Names, Surnames, Random (Short List)
- EmailAddress has the dataset E-Mail Addresses (Random)
- DateOfBirth has the dataset Dates, Random from 01/01/1940 to 01/01/2010
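To make the column-to-dataset mapping concrete, here is a minimal Python sketch of the kind of rows such a rule would produce. It is not how Data Masker works internally: the name lists, the `generate_rows` function and the `user…@example.com` address pattern are hypothetical stand-ins for the product's datasets, and only the column names, the row count and the date range come from the example above.

```python
import random
from datetime import date, timedelta

# Hypothetical stand-ins for the tool's datasets (the real lists ship with the product).
FIRST_NAMES = ["James", "Mary", "Robert", "Linda", "David", "Susan"]
SURNAMES = ["Smith", "Jones", "Brown", "Taylor", "Wilson"]
DOB_START = date(1940, 1, 1)
DOB_END = date(2010, 1, 1)


def generate_rows(count, seed=0):
    """Build `count` rows, filling each column from its own generator."""
    rng = random.Random(seed)
    span_days = (DOB_END - DOB_START).days
    rows = []
    for row_id in range(1, count + 1):  # sequential integers for ID
        rows.append({
            "ID": row_id,
            "FirstName": rng.choice(FIRST_NAMES),   # mixed male and female names
            "LastName": rng.choice(SURNAMES),       # random surname from a short list
            "EmailAddress": f"user{rng.randrange(10**8):08d}@example.com",
            "DateOfBirth": DOB_START + timedelta(days=rng.randint(0, span_days)),
        })
    return rows


rows = generate_rows(10_000)
```

Each column is filled independently of the others, which is why a generated e-mail address has no relationship to the generated names in the same row.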
Each dataset provides a set of options that are specific to that dataset to allow for "fine-tuning" of the data it produces. Only datasets that are appropriate for the type of the column are offered (for example, a numeric value can be put into a text-based field, but not every text value will fit into a numeric field).
By default, the datasets are located in a directory named Datasets immediately below the Data Masker installation directory; this location can be changed on the Misc. Setup Tab.
Creating complex distributions of test data
Insertion Rules may not provide the exact distribution of data that you require. We recommend that you insert the rows using the most common type of data for the table, then use other rules to update the data to the distribution you need.
For example, based on the example above, you need test data that matches:
- FirstName should be evenly split between Male and Female customers
- DateOfBirth should be NULL for 10% of customers
- EmailAddress should be of the format: FirstName.LastName@testdomain.com
Then you could:
- Alter the Insertion Rule to set FirstName to Names, First Names, Female
- Create a Substitution Rule that samples 50% of the data and sets FirstName to Names, First Names, Male
- Create a Substitution Rule that samples 10% of the data and sets DateOfBirth to NULL Values
- Create a Row-Internal Rule that sets EmailAddress to FirstName + '.' + LastName + '@testdomain.com'
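The effect of layering these rules can be sketched in plain Python. This only illustrates the "insert the common case, then adjust samples" idea and is not Data Masker's implementation: the name lists, the fixed surname and the placeholder values are hypothetical, and the sketch assumes each sample hits exactly its stated percentage of rows, which the real sampling may only approximate.

```python
import random

rng = random.Random(42)

# Hypothetical sample names standing in for the Female and Male first-name datasets.
FEMALE = ["Mary", "Linda", "Susan", "Karen"]
MALE = ["James", "Robert", "David", "John"]

# Step 1: insert every row in the most common shape (female name, date of birth set).
rows = [
    {"ID": i, "FirstName": rng.choice(FEMALE), "LastName": "Smith",
     "EmailAddress": "placeholder@example.com", "DateOfBirth": "1980-01-01"}
    for i in range(1, 1001)
]

# Step 2: a substitution over a 50% sample switches those rows to male names.
for row in rng.sample(rows, k=len(rows) // 2):
    row["FirstName"] = rng.choice(MALE)

# Step 3: a substitution over a 10% sample sets DateOfBirth to NULL (None here).
for row in rng.sample(rows, k=len(rows) // 10):
    row["DateOfBirth"] = None

# Step 4: a row-internal step derives EmailAddress from columns in the same row.
for row in rows:
    row["EmailAddress"] = f"{row['FirstName']}.{row['LastName']}@testdomain.com"
```

In this sketch the e-mail step runs last so that the address reflects the final FirstName; running it before the name substitution would leave half the addresses out of step with the names.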