Getting started with the Subset CLI
Published 16 October 2025
This guide walks you through creating your first subset using the command line.
Prerequisites
Before you start, ensure you have:
- Subsetting CLI installed
- A source database to subset from
- A target database with the same schema as your source
Why Foreign Keys Matter
Subsetting maintains referential integrity by following foreign key relationships. Without foreign keys, the tool cannot determine which related records to include, which may result in orphaned records or incomplete data sets.
Before your first subset, verify your database has foreign key constraints:
→ Verifying foreign keys for subsetting
If your database lacks foreign key constraints, you can define manual relationships in the options file to tell the subsetter how tables are related.
See manualRelationships in the subsetting options file reference.
Prepare Your Environment
Source Database
- Use a representative or test database for your subsetting source
- This could be a development or test environment with data patterns similar to production
- Ensure you have sufficient access privileges
- For safety, never subset directly from production databases
See subsetter connection requirements for connection string formats and required privileges.
Target Database
You need a target database with the same schema structure as your source.
You can create this by:
- Restoring a backup of your source database
- Using your database's schema export/import tools
- Running database creation scripts that include only the schema objects
Note: The target database doesn't need to be empty. We'll use overwrite mode to automatically clear existing data before subsetting.
See create an empty target database for more details.
Run Your First Subset
Step 1: Authenticate
rgsubset auth login --i-agree-to-the-eula
Or start a trial if you don't have a license:
rgsubset auth login --i-agree-to-the-eula --start-trial
Step 2: Run a Basic Subset
rgsubset run ` --database-engine=sqlserver ` --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-database-write-mode Overwrite
Note: The --target-database-write-mode Overwrite flag automatically clears existing data from the target database before subsetting. This means you don't need to manually empty the target database first.
Step 3: Review the Results
Check your target database to verify the subset was created successfully.
Default behavior: When running without specifying a desired size or starting tables, the subsetter defaults to creating a subset that is 10% of the original database size (up to a maximum of 1GB).
Refine Your Subset
Once you've run your first subset, you'll often want to control size or exclude problematic tables.
Control Subset Size
Specify how large you want your subset using an options file.
Create a file called subset-options.json:
{
"desiredSize": "500MB"
}You can specify size as a percentage (10%) or explicit size (5GB, 500MB).
Then run subsetting with the options file:
rgsubset run ` --database-engine=sqlserver ` --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-database-write-mode Overwrite ` --options-file subset-options.json
Exclude Problem Tables
If certain tables cause issues (unsupported data types, permissions problems, etc.), you can exclude them using an options file.
Create a file called subset-options.json:
{
"excludedTables": [
{
"schema": "dbo",
"name": "AuditLog"
},
{
"schema": "dbo",
"name": "ErrorLog"
}
]
}Then run subsetting with the options file:
rgsubset run ` --database-engine=sqlserver ` --source-connection-string="Server=sourceServer;Database=sourceDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-connection-string="Server=targetServer;Database=targetDB;User Id=username;Password=password;TrustServerCertificate=yes;" ` --target-database-write-mode Overwrite ` --options-file subset-options.json
Note: Tables that have foreign keys referencing an excluded table will also be excluded automatically.
See subsetting options file for more advanced configuration options.
Next Steps
Extract Specific Data
Use starting tables to control exactly which data is included (e.g., all customers from a specific region and their related orders):
See startingTables in the subsetting options file reference.
Advanced Configuration
For more complex scenarios, you can use an options file to specify:
- Multiple starting tables
- Static data tables
- Manual relationships
See subsetting options file for full details.
Troubleshooting
If you encounter issues, check the subsetting troubleshooting guide for common problems and solutions.