5. Optional - Manual CLI Tutorial
Published 02 July 2025
Why this tutorial exists
The Manual CLI Tutorial is for users who want a more granular, flexible way to use Test Data Manager’s CLI tools outside the full Autopilot PowerShell experience. It is particularly suited to:
- Security-conscious environments where running a full PowerShell script isn’t allowed or feasible
- Teams that want to understand and troubleshoot each individual step of the subsetting and anonymization process
- Users interested in porting or running scripts on Linux, as the tutorial includes equivalent Bash scripts
- DevOps teams aiming to integrate individual CLI commands into CI/CD pipelines
Repository structure overview
The scripts and configuration files are hosted here:
https://github.com/red-gate/TDM-AutoPilot/tree/main/Steps/CLI_Tutorials
- masking-options.json and subset-options.json: configuration files that define the behavior of subsetting and masking
- Windows/: PowerShell scripts implementing the tutorial steps for Windows environments
- Linux/: Bash scripts implementing the same tutorial steps for Linux environments
Tutorial step scripts explained
Each script performs one focused task within the data provisioning and masking workflow:
| Script | Purpose |
|---|---|
| 00_rgsubset_explain | Runs rgsubset explain to show what subsetting operations will occur; good for troubleshooting and validating options. |
| 01_rgsubset_run | Executes the actual subset operation using the configuration in subset-options.json. |
| 02_rganonymize_classify | Runs rganonymize classify on the target database, generating a classification JSON file describing sensitive columns. |
| 03_rganonymize_map | Uses the classification JSON to generate a masking options JSON file (masking.json). |
| 04_rganonymize_mask | Applies the masking rules in the generated masking.json to anonymize sensitive data in the target database. |
| 05_RunAll | Helper script that runs all the above steps sequentially, pausing after each to allow inspection. |
Important script variables to update
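Before running these scripts, review and update the variables each script defines to match your environment. In the example below, these are $DB_ENGINE, $SOURCE_CONN_STRING, $OPTIONS_FILE, $OUTPUT_FILE, and $LOG_LEVEL.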
Example snippet from 00_rgsubset_explain.ps1
```powershell
# Subset data using rgsubset - explain mode
# Shows planned subset operations without executing

$DB_ENGINE = "SqlServer"
$SOURCE_CONN_STRING = "Server=localhost;Database=AutopilotProd_FullRestore;Trusted_Connection=true;Trust Server Certificate=true;"
$OPTIONS_FILE = "..\subset-options.json"
$OUTPUT_FILE = "..\subset_log.json"
$LOG_LEVEL = "Debug"

Write-Host "Running subset explain for database engine: $DB_ENGINE"

rgsubset explain `
    --database-engine $DB_ENGINE `
    --source-connection-string "$SOURCE_CONN_STRING" `
    --options-file "$OPTIONS_FILE" `
    --log-level $LOG_LEVEL `
    --output-file $OUTPUT_FILE
```
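For Linux, the same step might look roughly like the Bash sketch below. It is an illustration only, not the script shipped in the Linux/ folder, which may differ. It reuses the variables and the rgsubset explain options from the PowerShell snippet; the `explain_step` helper and the fallback that prints the command when rgsubset is not on the PATH are assumptions added for this example. The connection string is copied unchanged, so its authentication settings will probably need adapting on a Linux host.

```shell
#!/usr/bin/env bash
# Sketch of a Bash counterpart to 00_rgsubset_explain.ps1.
# Assumption: illustrative only; the script shipped in Linux/ may differ.
set -eu

DB_ENGINE="SqlServer"
# Same value as the PowerShell example; adapt server, database and authentication to your environment.
SOURCE_CONN_STRING="Server=localhost;Database=AutopilotProd_FullRestore;Trusted_Connection=true;Trust Server Certificate=true;"
OPTIONS_FILE="../subset-options.json"
OUTPUT_FILE="../subset_log.json"
LOG_LEVEL="Debug"

# explain_step (hypothetical helper) appends the explain arguments
# to whatever command prefix is passed in "$@".
explain_step() {
  "$@" explain \
    --database-engine "$DB_ENGINE" \
    --source-connection-string "$SOURCE_CONN_STRING" \
    --options-file "$OPTIONS_FILE" \
    --log-level "$LOG_LEVEL" \
    --output-file "$OUTPUT_FILE"
}

echo "Running subset explain for database engine: $DB_ENGINE"

if command -v rgsubset >/dev/null 2>&1; then
  # Real run: invoke the CLI with the arguments above.
  explain_step rgsubset
else
  # CLI not installed: print the command instead of failing.
  echo "rgsubset not found on PATH; the command would be:" >&2
  explain_step echo rgsubset
fi
```

Passing the command prefix as an argument keeps the option list in one place, so `explain_step echo rgsubset` can serve as a dry run when reviewing changes to the variables.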
How to use these scripts
- Customize connection strings and paths in each script to reflect your environment
- Adjust the JSON option files (subset-options.json, masking-options.json) as needed for your desired subset size or masking rules
- Run the scripts in order to follow the workflow:
  1. Preview the subset with explain (optional, but recommended)
  2. Run the subset operation
  3. Classify sensitive columns
  4. Generate masking options
  5. Apply masking
- Use the 05_RunAll helper script to automate the sequence, or pick individual steps to integrate into your own pipelines or processes
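The run-in-order pattern above can be sketched in Bash as follows. This is a minimal illustration, not the repository's 05_RunAll script. The step names come from the table of tutorial scripts, while the `.sh` extension, the `run_all` function, its `nopause` switch, and the choice to stop at the first failing step are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Sketch of a "run all steps" helper, assuming each step exists as <name>.sh.
set -eu

# Step names taken from the tutorial's script table, in workflow order.
STEPS="00_rgsubset_explain 01_rgsubset_run 02_rganonymize_classify 03_rganonymize_map 04_rganonymize_mask"

# run_all DIR [pause|nopause]
#   DIR: folder containing the step scripts (hypothetical layout)
#   pause (default) waits for Enter after each step; nopause suits CI jobs
run_all() {
  local dir="$1"
  local mode="${2:-pause}"
  local step
  for step in $STEPS; do
    echo "=== Running $step ==="
    # Stop at the first failing step so later steps never run on bad input.
    bash "$dir/$step.sh" || { echo "Step $step failed; stopping." >&2; return 1; }
    if [ "$mode" = "pause" ]; then
      printf 'Inspect the results, then press Enter to continue... '
      read -r _ || true
    fi
  done
  echo "All steps completed."
}

# Example call (commented out so that the block only defines the helper):
# run_all ./Linux
```

In a CI/CD pipeline the interactive pause is usually unwanted, so a call such as `run_all ./Linux nopause` would run the steps back to back and return a non-zero status as soon as one fails.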
Benefits of this approach
- Full visibility and control over every step of your data provisioning process
- Ability to adapt scripts easily for different environments (Windows/Linux)
- Modular approach fits well with CI/CD pipelines, allowing automation and monitoring at each stage
- Useful for troubleshooting or validating changes without running the full Autopilot experience