Redgate Test Data Manager

5. Optional - Manual CLI Tutorial

Why this tutorial exists

The Manual CLI Tutorial is designed for users who want a more granular, flexible approach to using Test Data Manager’s CLI tools outside the full Autopilot PowerShell experience. It is aimed at:

  • Security-conscious environments where running a full PowerShell script isn’t allowed or feasible

  • Teams wanting to understand and troubleshoot each individual step of the subsetting and anonymization process

  • Users interested in porting or running scripts on Linux, as the tutorial includes equivalent Bash scripts

  • DevOps teams aiming to integrate CLI commands individually into CI/CD pipelines

Repository structure overview

The scripts and configuration files are hosted here:

https://github.com/red-gate/TDM-AutoPilot/tree/main/Steps/CLI_Tutorials

  • masking-options.json and subset-options.json — configuration files used to define behavior for subsetting and masking

  • Windows/ — PowerShell scripts implementing the tutorial steps for Windows environments

  • Linux/ — Bash scripts implementing the same tutorial steps for Linux environments

Tutorial step scripts explained

Each script performs one focused task within the data provisioning and masking workflow:

| Script | Purpose |
| --- | --- |
| 00_rgsubset_explain | Runs rgsubset explain to show what subsetting operations will occur; good for troubleshooting and validating options. |
| 01_rgsubset_run | Executes the actual subset operation using the configuration in subset-options.json. |
| 02_rganonymize_classify | Runs rganonymize classify on the target database, generating a classification JSON file describing sensitive columns. |
| 03_rganonymize_map | Uses the classification JSON to generate a masking options JSON file (masking.json). |
| 04_rganonymize_mask | Applies the masking rules in the generated masking.json to anonymize sensitive data in the target database. |
| 05_RunAll | Helper script to run all the above steps sequentially, pausing after each to allow inspection. |

Important script variables to update

Before running these scripts, review and update the following variables to match your environment:

Subsetting scripts (00_rgsubset_explain, 01_rgsubset_run):

| Variable | Description | Example Value |
| --- | --- | --- |
| $DB_ENGINE | The database engine (e.g., SqlServer) | "SqlServer" |
| $SOURCE_CONN_STRING | Connection string for the source (restored production) database | "Server=localhost;Database=AutopilotProd_FullRestore;Trusted_Connection=true;Trust Server Certificate=true;" |
| $TARGET_CONN_STRING | Connection string for the target (test playground) database | "Server=localhost;Database=Autopilot_Test;Trusted_Connection=true;Trust Server Certificate=true;" |
| $OPTIONS_FILE | Path to the JSON file with subsetting or masking options | "..\subset-options.json" or "..\masking-options.json" |
| $OUTPUT_FILE | Path for output logs or generated JSON files | "..\subset_log.json" |
| $LOG_LEVEL | Logging verbosity level (Debug, Info, Error) | "Debug" |

Classification script (02_rganonymize_classify):

| Variable | Description | Example Value |
| --- | --- | --- |
| $DB_ENGINE | The database engine | "SqlServer" |
| $TARGET_CONN_STRING | Connection string for the target (test playground) database | "Server=localhost;Database=Autopilot_Test;Trusted_Connection=true;" |
| $OUTPUT_FILE | Path where the classification JSON will be saved | "..\classification.json" |
| $LOG_LEVEL | Logging verbosity | "Debug" |

Mapping script (03_rganonymize_map):

| Variable | Description | Example Value |
| --- | --- | --- |
| $CLASSIFICATION_FILE | Path to the classification JSON generated in the previous step | "..\classification.json" |
| $OUTPUT_FILE | Path where the masking options JSON will be saved | "..\masking-options.json" |
| $LOG_LEVEL | Logging verbosity | "Debug" |

Masking script (04_rganonymize_mask):

| Variable | Description | Example Value |
| --- | --- | --- |
| $MASKING_FILE | Path to the masking options JSON | "..\masking-options.json" |
| $TARGET_CONN_STRING | Connection string for the target database | "Server=localhost;Database=Autopilot_Test;Trusted_Connection=true;" |
| $LOG_LEVEL | Logging verbosity | "Debug" |
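
Because each script depends on several variables and input files, a quick preflight check can catch a missing value before any CLI tool is invoked. The following is a minimal shell sketch of that idea, not part of the repository: the helper names require_vars and require_files are hypothetical, and the variable names are assumed to match the ones listed above without the PowerShell $ prefix.

```shell
#!/usr/bin/env bash
# Preflight sketch: confirm that required variables are set and that input
# files exist before running a tutorial step.
# require_vars and require_files are hypothetical helpers, not Redgate tools.

# Report (on stderr) every named variable that is unset or empty.
# Returns non-zero if at least one is missing.
require_vars() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup: read the value of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Report (on stderr) every path that is not an existing regular file.
# Returns non-zero if at least one is missing.
require_files() {
  local missing=0 path
  for path in "$@"; do
    if [ ! -f "$path" ]; then
      echo "File not found: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A masking script could, for example, begin with `require_vars DB_ENGINE TARGET_CONN_STRING MASKING_FILE || exit 1` followed by `require_files "$MASKING_FILE" || exit 1`, so that a typo in a path stops the run early with a readable message.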

Example snippet from 00_rgsubset_explain.ps1

# Subset data using rgsubset - explain mode
# Shows planned subset operations without executing

$DB_ENGINE = "SqlServer"
$SOURCE_CONN_STRING = "Server=localhost;Database=AutopilotProd_FullRestore;Trusted_Connection=true;Trust Server Certificate=true;"
$OPTIONS_FILE = "..\subset-options.json"
$OUTPUT_FILE = "..\subset_log.json"
$LOG_LEVEL = "Debug"

Write-Host "Running subset explain for database engine: $DB_ENGINE"

rgsubset explain `
  --database-engine $DB_ENGINE `
  --source-connection-string "$SOURCE_CONN_STRING" `
  --options-file "$OPTIONS_FILE" `
  --log-level $LOG_LEVEL `
  --output-file $OUTPUT_FILE
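
For Linux users, the same call can be sketched in Bash. This is an illustrative translation of the PowerShell snippet above, not a copy of the script in the Linux/ folder: the flags are taken from that snippet, the forward-slash paths and the DRY_RUN switch are assumptions, and the connection string is reused unchanged even though a Linux host may need different authentication settings.

```shell
#!/usr/bin/env bash
# Sketch of a Bash counterpart to 00_rgsubset_explain.ps1.
# Flags mirror the PowerShell snippet; every value can be overridden from the
# environment. DRY_RUN is a hypothetical convenience for inspecting the command.

DB_ENGINE="${DB_ENGINE:-SqlServer}"
SOURCE_CONN_STRING="${SOURCE_CONN_STRING:-Server=localhost;Database=AutopilotProd_FullRestore;Trusted_Connection=true;Trust Server Certificate=true;}"
OPTIONS_FILE="${OPTIONS_FILE:-../subset-options.json}"   # assumed Linux-style path
OUTPUT_FILE="${OUTPUT_FILE:-../subset_log.json}"         # assumed Linux-style path
LOG_LEVEL="${LOG_LEVEL:-Debug}"

# Build the explain command once, then either print it (DRY_RUN=1) or run it.
rgsubset_explain() {
  # Positional parameters hold the command so that values containing spaces
  # (such as the connection string) stay intact as single arguments.
  set -- rgsubset explain \
    --database-engine "$DB_ENGINE" \
    --source-connection-string "$SOURCE_CONN_STRING" \
    --options-file "$OPTIONS_FILE" \
    --log-level "$LOG_LEVEL" \
    --output-file "$OUTPUT_FILE"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'Would run:'
    printf ' %s' "$@"
    printf '\n'
  else
    echo "Running subset explain for database engine: $DB_ENGINE"
    "$@"
  fi
}
```

Calling `DRY_RUN=1 rgsubset_explain` prints the assembled command without touching the database, which is a convenient way to check the variable values before calling `rgsubset_explain` for real.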

How to use these scripts

  1. Customize connection strings and paths in each script to reflect your environment

  2. Adjust the JSON option files (subset-options.json, masking-options.json) as needed for your desired subset size or masking rules

  3. Run the scripts in sequence to follow the workflow:

    • Preview subset with explain (optional, but recommended)

    • Run the subset operation

    • Classify sensitive columns

    • Generate masking options

    • Apply masking

  4. Use the 05_RunAll helper script to automate the sequence, or pick individual steps to integrate into your own pipelines or processes
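
The sequence above can be sketched as a small shell runner that executes each step in order, stops at the first failure, and pauses between steps for inspection. This is a hedged illustration of what a run-all helper might look like, not the contents of 05_RunAll: the .sh file names are assumed from the script table, and run_all and PAUSE are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch of a run-all helper for the tutorial steps.
# Assumes each step exists as <name>.sh in one directory; check the Linux/
# folder in the repository for the actual file names.

STEPS="00_rgsubset_explain 01_rgsubset_run 02_rganonymize_classify 03_rganonymize_map 04_rganonymize_mask"

# Run every step script found in the given directory, in order.
# Stops at the first failing step; pauses after each step unless PAUSE=0.
run_all() {
  local script_dir="$1" step reply
  for step in $STEPS; do
    echo "=== $step ==="
    if ! bash "$script_dir/$step.sh"; then
      echo "Step failed: $step" >&2
      return 1
    fi
    if [ "${PAUSE:-1}" = "1" ]; then
      printf 'Press Enter to continue...'
      read -r reply || true
    fi
  done
}
```

Running `run_all ./Linux` interactively gives a chance to inspect the target database after each stage, while `PAUSE=0 run_all ./Linux` suits unattended use such as a pipeline job. Stopping on the first failure keeps a later step from masking a database that was never subset correctly.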


Benefits of this approach

  • Full visibility and control over every step of your data provisioning process

  • Ability to adapt scripts easily for different environments (Windows/Linux)

  • Modular approach fits well with CI/CD pipelines, allowing automation and monitoring at each stage

  • Useful for troubleshooting or validating changes without running the full Autopilot experience

