Redgate Test Data Manager

Getting started with the Subset CLI

This guide walks you through creating your first subset using the command line.

Prerequisites

Before you start, ensure you have:

  • Subsetting CLI installed
  • A source database to subset from
  • A target database with the same schema as your source

Why Foreign Keys Matter

Subsetting maintains referential integrity by following foreign key relationships. Without foreign keys, the tool cannot determine which related records to include, which may result in orphaned records or incomplete data sets.

Before your first subset, verify your database has foreign key constraints:

→ Verifying foreign keys for subsetting 

If your database lacks foreign key constraints, you can define manual relationships in the options file to tell the subsetter how tables are related.

See manualRelationships in the subsetting options file reference.

Prepare Your Environment

Source Database

  • Use a representative or test database for your subsetting source
  • This could be a development or test environment with data patterns similar to production
  • Ensure you have sufficient access privileges
  • For safety, never subset directly from production databases

See subsetter connection requirements for connection string formats and required privileges.

Target Database

You need a target database with the same schema structure as your source.

You can create this by:

  • Restoring a backup of your source database
  • Using your database's schema export/import tools
  • Running database creation scripts that include only the schema objects

Note: The target database doesn't need to be empty. We'll use overwrite mode to automatically clear existing data before subsetting.

See create an empty target database for more details.

Run Your First Subset

Step 1: Authenticate


rgsubset auth login --i-agree-to-the-eula

Or start a trial if you don't have a license:

rgsubset auth login --i-agree-to-the-eula --start-trial

Step 2: Run a Basic Subset

rgsubset run `
  --database-engine=sqlserver `
  --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-database-write-mode Overwrite

Note: The --target-database-write-mode Overwrite flag automatically clears existing data from the target database before subsetting. This means you don't need to manually empty the target database first.

Step 3: Review the Results

Check your target database to verify the subset was created successfully.

Default behavior: When running without specifying a desired size or starting tables, the subsetter defaults to creating a subset that is 10% of the original database size (up to a maximum of 1GB).

Refine Your Subset

Once you've run your first subset, you'll often want to control size or exclude problematic tables.

Control Subset Size

Specify how large you want your subset using an options file.

Create a file called subset-options.json:

{
  "desiredSize": "500MB"
}

You can specify size as a percentage (10%) or explicit size (5GB, 500MB).

Then run subsetting with the options file:

rgsubset run `
  --database-engine=sqlserver `
  --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-database-write-mode Overwrite `
  --options-file subset-options.json

Exclude Problem Tables

If certain tables cause issues (unsupported data types, permissions problems, etc.), you can exclude them using an options file.

Create a file called subset-options.json:

{
  "excludedTables": [
    {
      "schema": "dbo",
      "name": "AuditLog"
    },
    {
      "schema": "dbo", 
      "name": "ErrorLog"
    }
  ]
}

Then run subsetting with the options file:

rgsubset run `
  --database-engine=sqlserver `
  --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" `
  --target-database-write-mode Overwrite `
  --options-file subset-options.json

Note: Tables that have foreign keys referencing an excluded table will also be excluded automatically.

See subsetting options file for more advanced configuration options.

Next Steps

Extract Specific Data

Use starting tables to control exactly which data is included (e.g., all customers from a specific region and their related orders):

See startingTables in the subsetting options file reference.

Advanced Configuration

For more complex scenarios, you can use an options file to specify:

  • Multiple starting tables
  • Static data tables
  • Manual relationships

See subsetting options file for full details.

Troubleshooting

If you encounter issues, check the subsetting troubleshooting guide for common problems and solutions.



Didn't find what you were looking for?