There are sometimes cases where the data type of a column does not match that supported by a dataset. This means that the dataset cannot be applied to the column.

For example, the USSocialSecurityNumbers dataset generates US Social Security numbers in the format 989-11-9389. If the database column used to store this data is defined as a number, it cannot then store the data generated by the dataset when masking takes place.

To get around this problem, pre-masking and post-masking transforms can be used to manipulate the data into different formats.

Pre-masking transforms

Pre-masking transforms run after the data is loaded from the database, but before it is masked. This enables the original data to be transformed into a different form/type suitable for the applied dataset.

Post-masking transforms

Post-masking transforms run after the data is masked, but before it is written back to the database. This enables the masked data to be transformed into a different form/type suitable for the column being masked.

Configuring transforms

On a column

Pre-masking and post-masking transforms can be configured on a column in the masking.json file. Both are optional, allowing a pre-masking transform or post-masking transform to be defined on their own, or both can be defined.

{
  "tables": [
    {
      "schema": "Person",
      "name": "Address",
      "columns": [
        {
          "name": "SSN",
          "dataset": "USSocialSecurityNumbers",
          "transforms": {
            "pre": "...",
            "post": "...",
          }
        }
      ]
    }
  ]
}

Transforms take the form of an expression that is used to manipulate the data, similar to how it is done within a cell in Microsoft Excel.

The special placeholder ${value} is used to represent the input to the transform. For pre-masking transforms, this represents the value from the database column. For post-masking transforms, this represents the output of the dataset after masking.

Using a transform rule

Transform rules can also be defined in the masking.json file or the options.json file. These apply a similar syntax to classification rules, and allow conditions to be applied to determine whether to apply the transform to a given column.

{
  "transformRules": [
    {
      "condition": "Table.Name = 'Customers' AND Column.Name = 'PostalCode'",
      "pre": "removeWhitespace(${value})",
      "post": "upper(${value})"
    }
  ]
}

Transform rules will apply the specified pre-masking and post-masking transforms to any columns that match the specified condition.

Both pre-masking and post-masking transforms are again optional, allowing a pre-masking transform or post-masking transform to be defined on their own, or both can be defined.

In the above example, the transforms will be applied to any columns called "PostalCode" in any table called "Customers".

In addition to the conditions used by classification rules, transform rules can also be about:

a column's classification - Column.Classification = 'GivenNames' (only relevant when running the map step)
a column's dataset - Column.Dataset = 'GivenNames' (only relevant when running the mask step)

Data types

Expressions are type-safe, meaning that different data types cannot be interchanged. The supported data types are:

Data type	Description	Example
character	A single character. e.g. a letter, a numerical digit, a space, a punctuation mark.	`'x'`
string	A series of characters.	`"hello"`
decimal	A floating point numeric. e.g. 12.345, 0.0012	`123.45`
integer	An integral numeric. e.g. 123, 10000	`12345`
date	A date. e.g. 22 June 2012 Dates need to be defined as `YYYY/MM/DD`.	`DATE(2012/06/22)`
boolean	`true` or `false`	`true`
NULL	A null value.	`NULL`

If you do need to interchange data types, use one of the conversions functions listed below.

Characters are quoted using single quotes. e.g. 'a'

Strings are quoted using double quotes. e.g. "hello"

Supported Functions

String functions

Function	Description	Input(s)	Return type	Example
`upper`	Converts a string to uppercase.	string	string	`upper("hello") → HELLO`
`lower`	Converts a string to lowercase.	string	string	`lower("HELLO") → hello`
`removeWhitespace`	Removes all whitespace characters from a string.	string	string	`removeWhitespace("some thing") → something`
`trim`	Removes all whitespace characters from the start and end of a string.	string	string	`trim(" hello ") → hello`
`trimStart`	Removes all leading whitespace* characters from the start of a string.	string	string	`trimStart(" hello") → hello`
`trimEnd`	Removes all trailing whitespace* characters from the end of a string.	string	string	`trimEnd("hello ") → hello`
`padLeft`	Pads the start of a string with the specified character to the specified length.	string, integer, character	string	`padLeft("hello", 3, 'x') → xxxhello`
`padRight`	Pads the end of a string with the specified character to the specified length.	string, integer, character	string	`padRight("hello", 3, 'x') → helloxxx`
`replace`	Replaces all occurrences of a specified string with another.	string, string, string	string	`replace("hello", "ll", "xx") → hexxo`
`substring`	Returns a substring, starting at the specified starting position. If a length is specified, will stop after that number of characters.	string, integer string, integer, integer	string	`substring("hello", 1) → ello` `substring("hello", 1, 3) → ell`
`left`	Returns the specified number of characters from the start of a string.	string, integer	string	`left("hello", 2) → he`
`right`	Returns the specified number of characters from the end of a string.	string, integer	string	`right("hello", 2) → lo`
`concat`	Concatenates multiple string values together.	string, string	string	`concat("hel", "lo") → hello`

* Whitespace characters include spaces, tabs (\t), new lines (\n) and carriage returns (\r).

Conversion functions

Function	Description	Input(s)	Return type	Example
`toString`	Converts a value to a string.	(string \| integer \| decimal \| date), string*	string	`toString(123.45, "0.0") → "123.4"`
`toDecimal`	Converts a value to a decimal.	string	decimal	`toDecimal("123.45") → 123.45`
`toInteger`	Converts a value to an integer.	string	integer	`toInteger("123") → 123`
`toDate`	Converts a value to a date.	string, string*	date	`toDate("23012025", "ddMMyyyy") → 23/1/2025`

* A format string

Format strings

The format strings passed to toString and toDate determine the format of the converted value. These use the standard format patterns for numbers and dates.

Date functions

Function	Description	Input(s)	Return type	Example
`day`	Returns the day part of a date.	date	integer	`day(date(23/1/2025)) → 23`
`month`	Returns the month part of a date.	date	integer	`month(date(23/1/2025)) → 1`
`year`	Returns the year part of a date.	date	integer	`year(date(23/1/2025)) → 2025`

Boolean functions

Function	Description	Input(s)	Return type	Example
`between`	Returns `true` if the first value is between the second and third values, otherwise `false`.	(string \| integer \| decimal \| date), (string \| integer \| decimal \| date), (string \| integer \| decimal \| date)	boolean	`between(7, 5, 10) → true`
`contains`	Returns `true` if the first string contains the second string, otherwise `false`.	string, string	boolean	`contains(` `"hello", "ell") → true`
`startsWith`	Returns `true` if the first string starts with the second string, otherwise `false`.	string, string	boolean	`startsWith("hello", "he") → true`
`endsWith`	Returns `true` if the first string ends with the second string, otherwise `false`.	string, string	boolean	`endsWith("hello", "lo") → true`
`isNull`	Returns `true` if the value is `NULL`, otherwise `false`.	any type	boolean	`isNull(NULL) → true`
`oneOf`	Returns `true` if the first value is one of the other values, otherwise `false`.	any type	boolean	`oneOf(1,` `[1, 2, 3]) → true`

Null functions

Function	Description	Input(s)	Return type	Example
`ifNull`	Returns a value if the input value is NULL, otherwise returns the input value.	any type, any type	any type	`ifNull(NULL, "hello") → "hello"`
`nullIf`	Returns NULL is the input value matches the specified value, otherwise returns the input value.	any type, any type	any type	`nullIf("hello", "hello") → NULL`
`nullIfEmpty`	Returns NULL if the input string contains only whitespace characters (include spaces, tabs (`\t`), new lines (`\n`) and carriage returns (`\r`)).	string	string or NULL	`nullIfEmpty(" ") → NULL`

Syntax errors

When working with transforms, it is possible to get a syntax error. This can be due to a number of reasons:

Cause	How to resolve
invalid syntax	Use the correct syntax, taking care to ensure that all opening parentheses are closed.
incompatible data type	Ensure that the expected data type is being passed to a function, using a conversion function if necessary.

When using strings in an expression in a masking.json file, it is necessary to escape the double quotes surrounding the string:

{
  "name": "SSN",
  "dataset": "USSocialSecurityNumbers",
  "transforms": {
    "pre": "replace(${value}, \"-\", \"\")"
  }
}

Example expressions

Expression	Description	Data conversion
`toDecimal(trim(${value}))`	Removes whitespace from both ends of the value, then converts it to a decimal.	" 20.013 " → 20.013
`replace(removeWhitespace(${value}), "-", "")`	Removes all whitespace and hyphens from a string.	"555 123-4567" → "5551234567"

Redgate Test Data Manager

Pre-masking and post-masking transforms