The Run Statistics Tab
Published 19 March 2018
The Data Masker Run Statistics Tab
The primary function of the Run Statistics tab is to indicate the progress of the currently executing masking rules and also display aggregate statistics for the operational masking set. The current status of each running rule is presented in the top part of the page and statistics for the set as a whole are displayed towards the bottom.
The display at the top of the Run Statistics tab lists the rules currently executing in the masking set. It is possible to execute up to eight rules simultaneously in independent processes called worker threads. Each worker thread is provided with a row in the display. Note that only information on actively executing rules will be present in this display. When a masking rule stops executing (whether through completion or by error) its information will be removed from the display. Statistics and status information on each rule, whether active or not, can be found on the Rule Statistics Tab.
Note that double-clicking on any worker thread actively running a rule will launch a form which displays that rule's configuration.
There are seven vertical columns in the masking rule display. These columns provide specific information on various components of each worker thread and the rule it is running.
What the columns in the Worker Thread Panel mean
This column indicates the ID number of the worker running the rule. It is not significant other than as an identifier for diagnostics in the log file - all workers are identical in functionality.
A text field which indicates if the worker is currently operational. Data Masker will only run rules simultaneously if they are in the same rule block. It is possible to temporarily see workers listed as Idle if there are no rules remaining in the current rule block that are available to run.
A summary of the rule block, rule ID and the rule type.
Rows to Process
This column indicates the number of rows which will be processed by the rule. This value will be blank or zero until the rule has begun to execute and, for larger tables, can be seen to increment as the Data Masker software determines which rows are eligible for masking according to the rule's configuration. Be aware that Where Clause or Sampling options can restrict the number of masked rows to fewer than the total number of rows in the target table. In such cases, only the number of rows to be masked will be counted in the Rows to Process column. Some rules, such as Command rules, cannot return information regarding the number of rows they will process and will display a blank value in this field.
Rows Processed
This column indicates the number of rows processed by the rule. If a rule is currently executing, this display will be updated periodically. When the masking rule has completed, this figure will indicate the total number of rows processed and it will be equal to the value in the Rows to Process column. Some rules, such as Command rules, cannot return information regarding their progress. In these cases, this field will be blank.
Run Time (sec)
This is the total run time in seconds for the rule. This field will update continuously while a rule is executing.
Rows per Second
Masking rules will execute at a variety of rates depending on the rule type, database configuration and a large number of other factors. This field provides an indication of how many rows each rule is processing per second. The value is derived from the other statistics via the formula: (Rows Processed)/(Time in Seconds).
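The Rows per Second formula above can be illustrated with a minimal Python sketch. This is not Data Masker code: the function name and its handling of blank values and zero elapsed time are assumptions made for the example.

```python
from typing import Optional


def rows_per_second(rows_processed: Optional[int], run_time_sec: float) -> Optional[float]:
    """Return (Rows Processed) / (Time in Seconds), or None when it cannot be computed."""
    # Rules such as Command rules report no row count, so there is nothing to divide.
    if rows_processed is None:
        return None
    # Assumption: a rule with no elapsed time yet has no meaningful rate.
    if run_time_sec <= 0:
        return None
    return rows_processed / run_time_sec


print(rows_per_second(120_000, 48.0))  # 2500.0
print(rows_per_second(None, 12.0))     # None
```

Because the value is derived from two figures that both change while a rule runs, it can fluctuate during execution and only settles once the rule has completed.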
The Number of Rule Worker Threads Setting
The Number of Rule Worker Threads combo-box offers the option of increasing or decreasing the number of workers available for simultaneous masking operations. This value can be adjusted (up or down) while the set is running and the workers will enable or disable as required. The number of worker threads specified is saved with the masking set and will be restored the next time the masking set is loaded.
The most effective number of workers to use depends on a number of factors: the speed and memory of the PC on which the Data Masker software is running, the speed of the network connection to the remote SQL Server database, the speed of the SQL Server database and the number of CPUs on the SQL Server database platform. As a general rule, it is inefficient to set the number of workers too high, since bottlenecks will occur that can cause the total execution time to be longer than if fewer workers were running. Since the correct setting depends on a number of variables, finding the most effective setting is something of a trial and error process. A typical starting point is to set the number of worker threads equal to the number of CPUs on the SQL Server database server.
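As a rough mental model of how the worker thread setting interacts with rule blocks, the following Python sketch runs rule blocks one after another and shares a fixed pool of workers among the rules inside each block. It is only an illustration under stated assumptions: the rule IDs, `run_rule` and `run_masking_set` are hypothetical, and the sketch does not reflect how Data Masker is actually implemented. Unlike the real setting, the worker count here cannot be changed while the set is running.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

# Hypothetical (rule_block, rule_id) pairs; not read from a real masking set.
RULES = [(1, "01-0001"), (1, "01-0002"), (1, "01-0003"), (2, "02-0001")]


def run_rule(rule):
    """Stand-in for executing one masking rule."""
    block, rule_id = rule
    return f"block {block}: rule {rule_id} done"


def run_masking_set(rules, worker_threads=4):
    """Run blocks in order; rules inside a block share the worker pool."""
    results = []
    for _block, block_rules in groupby(sorted(rules), key=lambda r: r[0]):
        # The pool is closed before the next block starts, so every rule in a
        # block finishes first and some workers may sit idle while it drains.
        with ThreadPoolExecutor(max_workers=worker_threads) as pool:
            results.extend(pool.map(run_rule, block_rules))
    return results


for line in run_masking_set(RULES, worker_threads=2):
    print(line)
```

In this model, raising `worker_threads` beyond the number of rules in a block brings no benefit, which mirrors the advice above that more workers do not always mean a shorter run.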
The Masking Set Statistics Panel
Various aggregate statistics for the currently running masking set are displayed at the bottom of the Run Statistics tab.
Total run time
The total time the masking set has been operational.
The total number of enabled masking rules eligible to be executed.
Total rules complete
The total number of enabled masking rules which have completed execution.
Total masked rows
The number of rows processed by all masking rules. Some rules, such as Command rules, cannot return information regarding their progress. The actions of such rules will not contribute to this value.
Average masked rows per sec
This field indicates the average number of rows processed per second by the masking set. If rules are currently executing, this display can be seen to update periodically. Some rules, such as Command rules, cannot return information regarding their progress. In these cases this field will be blank.
Average masked cols per sec
Some types of masking rule (such as Substitution rules) have the ability to operate on multiple columns within the same rule. This provides a dramatic speed improvement. The Average masked cols per sec information is the sum of the Rows/Sec value for each rule multiplied by the number of columns the rule is masking divided by the total number of seconds the masking set has been operational.
Masking set status
A text field indicating the current execution state of the masking set.
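The Total masked rows figure described above can be pictured as a sum over per-rule counts that skips rules reporting nothing. The Python sketch below is a hypothetical illustration of that idea, not the product's code; the function name and the sample counts are invented for the example, with `None` standing for a Command rule.

```python
from typing import Optional, Sequence


def total_masked_rows(rows_processed_per_rule: Sequence[Optional[int]]) -> int:
    """Sum rows over all rules, skipping rules that report no row count."""
    return sum(rows for rows in rows_processed_per_rule if rows is not None)


# Hypothetical per-rule counts; None stands for a Command rule.
counts = [50_000, None, 12_500, 0]
print(total_masked_rows(counts))  # 62500
```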
The Pink Rhino
The pink rhino graphic was a bit of artwork placed on the Run the Masking Set button on an early version of the Data Masker software. We were going to leave it out of this version, but our customers complained - so we put it back in and animated it. Now it serves as a slightly surreal visual indicator that the masking set is running. If you do not wish to see the Pink Rhino graphic it can be removed using the option on the Misc. Setup Tab.