Data Masker

Insertion Rules

Insertion Rules generate new rows and insert them into the target table. As with Substitution Rules, the contents of each column are generated from the configured dataset. Any column that is not associated with a dataset is assigned a NULL value.

Example

Given a table with the schema:

  • ID INTEGER NOT NULL
  • FirstName TEXT NOT NULL
  • LastName TEXT NOT NULL
  • EmailAddress TEXT NOT NULL
  • DateOfBirth DATE NULL

You could create an Insertion Rule that would insert 10,000 rows where:

  • ID as dataset Numbers, Integer, Sequential
  • FirstName as dataset Names, First Names, Male + Female
  • LastName as dataset Names, Surnames, Random (Short List)
  • EmailAddress as dataset E-Mail Addresses (Random)
  • DateOfBirth as dataset Dates, Random from 01/01/1940 to 01/01/2010
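To make the effect of such a rule concrete, the following is a minimal Python sketch that produces the same kind of rows in an in-memory SQLite table. It is an illustration only, not how Data Masker works internally: the table name Customers, the short name lists, the e-mail format and the helper functions are all assumptions made for the example.

```python
import random
import sqlite3
from datetime import date, timedelta

# Hypothetical stand-ins for the datasets named in the example above.
FIRST_NAMES = ["James", "Mary", "Robert", "Patricia", "John", "Jennifer"]
SURNAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson"]
DOB_START, DOB_END = date(1940, 1, 1), date(2010, 1, 1)


def random_date(rng, start, end):
    """Return a uniformly random date between start and end, inclusive."""
    return start + timedelta(days=rng.randint(0, (end - start).days))


def generate_rows(count, rng):
    """Yield one tuple per row, with one generated value per column."""
    for row_id in range(1, count + 1):
        yield (
            row_id,                                          # sequential integer
            rng.choice(FIRST_NAMES),                         # male + female first names
            rng.choice(SURNAMES),                            # random surname
            f"user{rng.randrange(10**8):08d}@example.com",   # random e-mail address
            random_date(rng, DOB_START, DOB_END).isoformat(),
        )


def build_customers(count=10_000, seed=0):
    """Create an in-memory table with the example schema and fill it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE Customers (
               ID INTEGER NOT NULL,
               FirstName TEXT NOT NULL,
               LastName TEXT NOT NULL,
               EmailAddress TEXT NOT NULL,
               DateOfBirth DATE NULL)"""
    )
    conn.executemany(
        "INSERT INTO Customers VALUES (?, ?, ?, ?, ?)",
        generate_rows(count, random.Random(seed)),
    )
    conn.commit()
    return conn
```

Calling build_customers() returns a connection whose Customers table holds 10,000 generated rows. Passing a different seed gives a different but repeatable set of values.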

Core concepts

Datasets

Each dataset provides its own set of options, which allow the data used in substitutions to be fine-tuned. Only datasets that are appropriate for the column's type are offered (for example, a numeric value can be put into a text-based field, but not every text value fits into a numeric field).

Data Masker provides predefined datasets that cover a number of common concepts, but you can define your own dataset if you need something more specific.

By default, the datasets are located in a directory named Datasets immediately below the Data Masker installation directory. This location can be changed on the Misc. Setup Tab.

Advanced concepts

Creating complex distributions of test data

Insertion Rules may not give you the exact distribution of data that you require. We recommend that you insert the rows using the most common type of data for the table, then use other rules to update the data to the required distribution.

For example, building on the table above, suppose you need test data that meets these requirements:

  • FirstName should be evenly split between Male and Female customers
  • DateOfBirth is NULL for 10% of customers
  • EmailAddress should be of the format: FirstName.LastName@testdomain.com

Then you could:

  • Alter the Insertion Rule to set FirstName to be Names, First Names, Female
  • Create a Substitution Rule that samples 50% of the rows and sets FirstName to Names, First Names, Male
  • Create a Substitution Rule that samples 10% of the rows and sets DateOfBirth to NULL Values
  • Create a Row-Internal Rule that sets the EmailAddress to FirstName + '.' + LastName + '@testdomain.com'
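The combined effect of these four rules can be sketched in Python as follows. This is a simulation under stated assumptions, not Data Masker's implementation: the name lists, the fixed placeholder values and the sample_rows helper are hypothetical, and sampling is assumed to pick an exact fraction of the rows at random.

```python
import random

# Hypothetical stand-ins for the name datasets; the two first-name lists are disjoint.
FEMALE_NAMES = ["Mary", "Patricia", "Jennifer", "Linda"]
MALE_NAMES = ["James", "Robert", "John", "Michael"]
SURNAMES = ["Smith", "Jones", "Taylor", "Brown", "Wilson"]


def sample_rows(rows, fraction, rng):
    """Pick an exact fraction of the rows at random (an assumed sampling model)."""
    return rng.sample(rows, round(len(rows) * fraction))


def build_test_data(count=10_000, seed=0):
    rng = random.Random(seed)

    # Insertion step: every row starts with a female first name and a date of birth.
    rows = [
        {
            "ID": i,
            "FirstName": rng.choice(FEMALE_NAMES),
            "LastName": rng.choice(SURNAMES),
            "EmailAddress": "placeholder@example.com",  # overwritten in the last step
            "DateOfBirth": "1970-01-01",                # placeholder date
        }
        for i in range(1, count + 1)
    ]

    # Substitution step: give 50% of the rows a male first name.
    for row in sample_rows(rows, 0.5, rng):
        row["FirstName"] = rng.choice(MALE_NAMES)

    # Substitution step: set DateOfBirth to NULL (None) for 10% of the rows.
    for row in sample_rows(rows, 0.1, rng):
        row["DateOfBirth"] = None

    # Row-internal step: build the e-mail address from the final names.
    for row in rows:
        row["EmailAddress"] = f'{row["FirstName"]}.{row["LastName"]}@testdomain.com'

    return rows
```

In this sketch the e-mail step runs last, so the address reflects the first name each row ends up with after the male-name substitution. The same ordering concern would apply to any rule whose output depends on columns that other rules modify.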
