Search Replace Rules
Published 06 January 2020
Search Replace rules are designed to find and substitute sensitive information within free text. The core concept in search and replace rule is a replacement operation, which defines what strategy is used to search for the text segment that needs to be masked, and what strategy is used to replace the matching text segment.
Given a table that stores customer details like that you can achieve such replacements:
|strategies:||Simple X out||Random replace|
|ID||telephone (before)||telephone (after)||email (before)||email (after)|
|strategies||Dictionary replace||Regex replace (with: "(?<=IPV)\d+")|
|ID||notes (before)||notes (after)||notes (before)||notes (after)|
|1||Jane lives with John||Hadya lives with Florianne||Contacted with IPV131212 with success.||Contacted with IPV302031 with success.|
|2||jane lives with John||Maybell lives with Leigh||No response from IPV231.||No response from IPV145382.|
|3||Jane lives with john||Danielle lives with Yasmeen||Need to retry catching up with IPV213321.||Need to retry catching up with IPV038233.|
|4||Jane is alone||Chastity is alone||ToDo: IPV122121, IPV662121, IPV322121||ToDo: IPV406035, IPV809887, IPV898648|
There exist four different strategies how to mask your data in search replace rule. These operations can be combined and run in a single Search-Replace rule. They will be run in order from top to bottom
Simple X Out
Replace alphabetic or numeric characters with a single masked character, which is defined by a user. This option can be used to preserve the basic "shape" of data such as telephone number and email address without including any personal information.
Similar to "Simple X Out", but replaces each character with a random character from the same character class (lowercase letters, uppercase letters, or digits).
Uses a list of datasets to find data that should be considered sensitive and replace with data from another dataset. List of datasets to search is limited to datasets which contain a list of data - all names and user-defined datasets (To check how to create customized data set go to User Defined Dataset).
When using this method, be aware that substrings have to match an element in the lookup list exactly. In the example of the names, names not included in Data Masker's list would not be masked.
Provide a regular expression and replace all matches of that regular expression with a value from a dataset.
Adding a WHERE clause
If only certain rows in the table need masking specify a where on the Where clause or Sampling options tab of the search and replace form).
WHERE clause can be added here, e.g. for the above example, if we wanted to exclude all employees in the table where the
Id is under 1000 we could write:
WHERE Id >= 1000
You can choose to manually exclude
NULL/Empty values within your custom where clause or select the "with inline Null skips" option.