Data Masker

The Data Masker Row-Internal Rule and JSON Data

Version 6 of Data Masker for SQL Server includes native support for masking JSON data and is recommenced for most JSON masking operations. See About JSON Masker Rules.

This document only applies to Data Masker for SQL Server as it relies on the JSON_MODIFY function.

The Row-Internal rule of the Data Masker application can be used to mask JSON data. This tech note provides some advice and assistance on this process.

Prerequisites

Assume there is a table named JSONArrayObjectArray containing complex JSON data with which has the following structure:

DROP TABLE [dbo].[JSONArrayObjectArray]
CREATE TABLE [dbo].[JSONArrayObjectArray]
(
[idcol] [int] NOT NULL,
[dataVal] [varchar](50) NULL,
[jsoncol] [nvarchar](max) NULL
CONSTRAINT [PK_JSONArrayObjectArray] PRIMARY KEY CLUSTERED
(
[idcol] ASC
) ON [PRIMARY]
) ON [PRIMARY];

truncate table JSONArrayObjectArray

The table could be populated with statements like:

insert into JSONArrayObjectArray (idcol, dataval, jsoncol) values (1, 'a', '{"idval":"0001_A",
"age":31, "cars": [{ "idval":"Ford_A", "models":[ "Fiesta_A", "Focus_A", "Mustang_A" ] },{
"idval":"BMW_A", "models":[ "320_A", "X3_A", "X5_A" ] },{ "idval":"Fiat_A", "models":[
"500_A", "Panda_A" ] }]} ');

insert into JSONArrayObjectArray (idcol, dataval, jsoncol) values (2, 'b', '{"idval":"0002_B",
"age":32, "cars": [{ "idval":"Ford_B", "models":[ "Fiesta_B", "Focus_B", "Mustang_B" ] },{
"idval":"BMW_B", "models":[ "320_B", "X3_B", "X5_B" ] },{ "idval":"Fiat_B", "models":[
"500_B", "Panda_B" ] }]} ');

insert into JSONArrayObjectArray (idcol, dataval, jsoncol) values (3, 'c', '{"idval":"0001_C",
"age":33, "cars": [{ "idval":"Ford_C", "models":[ "Fiesta_C", "Focus_C", "Mustang_C" ] },{
"idval":"BMW_C", "models":[ "320_C", "X3_C", "X5_C" ] },{ "idval":"Fiat_C", "models":[
"500_C", "Panda_C" ] }]} ');

insert into JSONArrayObjectArray (idcol, dataval, jsoncol) values (4, 'd', '{"idval":"0001_D",
"age":34, "cars": [{ "idval":"Ford_D", "models":[ "Fiesta_D", "Focus_D", "Mustang_D" ] },{
"idval":"BMW_D", "models":[ "320_D", "X3_D", "X5_D" ] },{ "idval":"Fiat_D", "models":[
"500_D", "Panda_D" ] }]} ');

insert into JSONArrayObjectArray (idcol, dataval, jsoncol) values (5, 'e', '{"idval":"0001_E",
"age":35, "cars": [{ "idval":"Ford_E", "models":[ "Fiesta_E", "Focus_E", "Mustang_E" ] },{
"idval":"BMW_E", "models":[ "320_E", "X3_E", "X5_E" ] },{ "idval":"Fiat_E", "models":[
"500_E", "Panda_E" ] }]} ');

The content of the table could be viewed with a statement like:

select idcol, dataVal, JSON_QUERY(jsoncol) as j
from JSONArrayObjectArray

Simple Masking of JSON Content

Using standard SQL, the value of the age property of the JSON object in every row could be updated to the same value with a statement like:

Update JSONArrayObjectArray set jsoncol = JSON_MODIFY(jsoncol, '$.age', 'XX');

In the above SQL the text '$.age' is the path to the data of JSON property to update and 'XX' is the value to use as the replacement.

We can use a very similar SQL clause in a Data Masker Row-Internal rule to perform the same operation – except that the 'XX' constant above can be replaced in each row with a unique value from a specified dataset.

The Masking Procedure

  1. Set up a standard Row-Internal rule on the JSONArrrayObjectArray table and ensure it references the jsoncol column.

  2. Use a Replacement Clause like the one below. Note there are no single quotes around the DMSPARAM1 value. 

    JSON_MODIFY(jsoncol, '$.age', DMSPARAM1);
  3. Use a dataset of your choice for the parameter. In the example below, we will use the Numbers, Integer, Random, As Text dataset.

  4. Run the rule.

After running the Row-Internal rule you should be able to see that the JSON age values are masked with random integer values.

JSON With Complex Paths

JSON data can get quite complex. However, the masking of a complex path is straight forward. An SQL statement like the one below will update the value of the first model of the first cars object:

Update JSONArrayObjectArray
set jsoncol = JSON_MODIFY(jsoncol, '$.cars[0].models[0]', '66')

In the above SQL the text '$.cars[0].models[0]' is the path to the data of JSON object to update and '66' is the value to use as the replacement.

A Data Masker Row Internal Rule could just use a Replacement Clause of…

JSON_MODIFY(jsoncol, '$.cars[0].models[0]', DMSPARAM1)

… to update that data item.

It must be noted that a rule using the Replacement Clause above would only ever update the first model of the first car. If you also needed to update the second model of the first car and the third model of the second car then you would need two additional Row-Internal rules. This requirement is an artifact of the specific nature of the JSON path – although it should be noted that the forthcoming JSONMasker rule, (currently under development), hopes to provide mechanisms to remove this requirement.

Simple JSON Synchronization

A Row-Internal rule can use data in other columns in the same row as replacement content for an updated JSON value. The concept is identical to that which would be used in a normal Row-Internal rule operating on any text based field.

For example, if the dataVal column has a value which should be substituted inside the JSON object then it can simply be incorporated into the Replacement Clause. The Replacement Clause below will update the idval property of the JSON with the content of the external dataVal field in that row.

JSON_MODIFY(jsoncol, '$.idval', dataVal)

A screenshot of the relevant part of the Row-Internal rule is shown below:

If the dataVal column were populated (even temporarily) via other standard Data Masker methods, the Row-Internal rule can then act as a simple synchronization mechanism.


Didn't find what you were looking for?