Redgate Test Data Manager

Pre-masking and post-masking transforms

There are sometimes cases where the data type of a column does not match that supported by a dataset. This means that the dataset cannot be applied to the column.

For example, the USSocialSecurityNumbers dataset generates US Social Security numbers in the format 989-11-9389. If the database column used to store this data is defined as a number, it cannot then store the data generated by the dataset when masking takes place.

To get around this problem, pre-masking and post-masking transforms can be used to manipulate the data into different formats.

Pre-masking transforms

Pre-masking transforms run after the data is loaded from the database, but before it is masked. This enables the original data to be transformed into a different form/type suitable for the applied dataset.

Post-masking transforms

Post-masking transforms run after the data is masked, but before it is written back to the database. This enables the masked data to be transformed into a different form/type suitable for the column being masked.

Configuring transforms

On a column

Pre-masking and post-masking transforms can be configured on a column in the masking.json file. Both are optional, allowing a pre-masking transform or post-masking transform to be defined on their own, or both can be defined.

{
  "tables": [
    {
      "schema": "Person",
      "name": "Address",
      "columns": [
        {
          "name": "SSN",
          "dataset": "USSocialSecurityNumbers",
          "transforms": {
            "pre": "...",
            "post": "...",
          }
        }
      ]
    }
  ]
}

Transforms take the form of an expression that is used to manipulate the data, similar to how it is done within a cell in Microsoft Excel.

The special placeholder ${value} is used to represent the input to the transform. For pre-masking transforms, this represents the value from the database column. For post-masking transforms, this represents the output of the dataset after masking.

Using a transform rule

Transform rules can also be defined in the masking.json file or the options.json file. These apply a similar syntax to classification rules, and allow conditions to be applied to determine whether to apply the transform to a given column.

{
  "transformRules": [
    {
      "condition": "Table.Name = 'Customers' AND Column.Name = 'PostalCode'",
      "pre": "removeWhitespace(${value})",
      "post": "upper(${value})"
    }
  ]
}

Transform rules will apply the specified pre-masking and post-masking transforms to any columns that match the specified condition.

Both pre-masking and post-masking transforms are again optional, allowing a pre-masking transform or post-masking transform to be defined on their own, or both can be defined.

In the above example, the transforms will be applied to any columns called "PostalCode" in any table called "Customers".


In addition to the conditions used by classification rules, transform rules can also be about:

  • a column's classification - Column.Classification = 'GivenNames' (only relevant when running the map step)
  • a column's dataset - Column.Dataset = 'GivenNames' (only relevant when running the mask step)

Data types

Expressions are type-safe, meaning that different data types cannot be interchanged. The supported data types are:

Data typeDescriptionExample
characterA single character. e.g. a letter, a numerical digit, a space, a punctuation mark. 'x'
stringA series of characters."hello"
decimalA floating point numeric. e.g. 12.345, 0.0012123.45
integerAn integral numeric. e.g. 123, 1000012345
date

A date. e.g. 22 June 2012

Dates need to be defined as YYYY/MM/DD.

DATE(2012/06/22)
NULLA null value.NULL

If you do need to interchange data types, use one of the conversions functions listed below.

Characters are quoted using single quotes. e.g. 'a'

Strings are quoted using double quotes. e.g. "hello"

Supported Functions

String functions

FunctionDescriptionInput(s)Return typeExample
upperConverts a string to uppercase.stringstringupper("hello") → HELLO
lowerConverts a string to lowercase.stringstringlower("HELLO") → hello
removeWhitespace

Removes all whitespace characters from a string.

stringstringremoveWhitespace("some thing") → something
trimRemoves all whitespace characters from the start and end of a string.stringstringtrim("  hello  ") → hello
trimStartRemoves all leading whitespace* characters from the start of a string.stringstringtrimStart("  hello") → hello
trimEndRemoves all trailing whitespace* characters from the end of a string.stringstringtrimEnd("hello  ") → hello
padLeftPads the start of a string with the specified character to the specified length.string, integer, characterstringpadLeft("hello", 3, 'x') → xxxhello
padRightPads the end of a string with the specified character to the specified length.string, integer, characterstring

padRight("hello", 3, 'x') → helloxxx

replaceReplaces all occurrences of a specified string with another.string, string, stringstringreplace("hello", "ll", "xx") → hexxo
substring

Returns a substring, starting at the specified starting position.

If a length is specified, will stop after that number of characters.

string, integer

string, integer, integer

string

substring("hello", 1) → ello

substring("hello", 1, 3) → ell

leftReturns the specified number of characters from the start of a string.string, integerstringleft("hello", 2) → he
rightReturns the specified number of characters from the end of a string.string, integerstringright("hello", 2) → lo

* Whitespace characters include spaces, tabs (\t), new lines (\n) and carriage returns (\r).

Conversion functions

FunctionDescriptionInput(s)Return typeExample
toStringConverts a value to a string.

(string | decimal | integer | date), string*

stringtoString(123.45, "0.0") → "123.4"
toDecimalConverts a value to a decimal.stringdecimaltoDecimal("123.45") → 123.45
toIntegerConverts a value to an integer.stringintegertoInteger("123") → 123
toDateConverts a value to a date.string, string*datetoDate("23012025", "ddMMyyyy") → 23/1/2025

* A format string

Format strings

The format strings passed to toString and toDate determine the format of the converted value. These use the standard format patterns for numbers and dates.

Date functions

FunctionDescriptionInput(s)Return typeExample
dayReturns the day part of a date.dateintegerday(date(23/1/2025)) → 23
monthReturns the month part of a date.dateintegermonth(date(23/1/2025)) → 1
yearReturns the year part of a date.dateintegeryear(date(23/1/2025)) → 2025

Null functions

FunctionDescriptionInput(s)Return typeExample
ifNullReturns a value if the input value is NULL, otherwise returns the input value.any type, any typeany typeifNull(NULL, "hello") → "hello"
nullIfReturns NULL is the input value matches the specified value, otherwise returns the input value.any type, any typeany typenullIf("hello", "hello") → NULL
nullIfEmptyReturns NULL if the input string contains only whitespace characters (include spaces, tabs (\t), new lines (\n) and carriage returns (\r)).stringstring or NULLnullIfEmpty("  ") → NULL

Syntax errors

When working with transforms, it is possible to get a syntax error. This can be due to a number of reasons:

CauseHow to resolve
invalid syntaxUse the correct syntax, taking care to ensure that all opening parentheses are closed.
incompatible data typeEnsure that the expected data type is being passed to a function, using a conversion function if necessary.

When using strings in an expression in a masking.json file, it is necessary to escape the double quotes surrounding the string:

{
  "name": "SSN",
  "dataset": "USSocialSecurityNumbers",
  "transforms": {
    "pre": "replace(${value}, \"-\", \"\")"
  }
}

Example expressions

ExpressionDescriptionData conversion
toDecimal(trim(${value}))Removes whitespace from both ends of the value, then converts it to a decimal."  20.013   " → 20.013
replace(removeWhitespace(${value}), "-", "")Removes all whitespace and hyphens from a string."555 123-4567" → "5551234567"




Didn't find what you were looking for?