Subsetting command-line reference
Published 16 October 2023
This section provides a reference sheet for all the commands provided by the subsetter
CLI tool.
Contents
Running the CLI
CLI binary
- Follow the subsetting installation guide.
- Call the
subsetter
CLI with either command line parameters and/or a YAML/JSON config file.
Docker image
- Follow the subsetting installation.
- Use docker to run the subsetting image: as you would run a CLI binary (i.e. with the same arguments). The docker image name varies depending on the version of the subsetter that you are using. Note that
subsetter-0.1.0.0
below will need to be replaced with the subsetter filename from the version of the subsetter that you are using. For example version 0.2.0 has filenamesubsetter-0.2.0.0
and so on.Command line parameters - If you want to use command line parameters to specify the configuration, then you call it as you would run a CLI binary (i.e. with the same arguments). Note that
subsetter-0.1.0.0
below will need to be replaced with the subsetter filename from the version of the subsetter that you are using. For example version 0.2.0 has filenamesubsetter-0.2.0.0
and so on.# Running a subsetting operation against a locally hosted SQL server instance docker run -it --network=host subsetter-0.1.0.0 --database-engine=sqlserver --source-connection-string="server=localhost;database=SomeSourceDatabase;trusted_connection=yes;TrustServerCertificate=yes" --target-connection-string="server=localhost;database=SomeTargetDatabase;trusted_connection=yes;TrustServerCertificate=yes" --starting-table="SomeSqlTable" --filter-clause="SomeFilter"
Configuration file -If you want to use a subsetting YAML/JSON configuration, you'll need to mount the configuration directory as in (Note: this is not needed if the config can be reached via network):
# Use a host mounted configuration file with local host access: # <PATH_TO_SUBSET_CONFIG_FOLDER> - A folder in your host machine # <PATH_TO_CONFIG_FILE_RELATIVE_CONFIG_FOLDER> - A filename or subfolder within <PATH_TO_SUBSET_CONFIG_FOLDER>. docker run -it --network=host --mount type=bind,source="<PATH_TO_SUBSET_CONFIG_FOLDER>",target=/app/configs subsetter-0.1.0.0 --config-file "<PATH_TO_CONFIG_FILE_RELATIVE_CONFIG_FOLDER>" --source-connection-string="server=localhost;database=SomeSourceDatabase;trusted_connection=yes;TrustServerCertificate=yes" --target-connection-string="server=localhost;database=SomeTargetDatabase;trusted_connection=yes;TrustServerCertificate=yes"
Commands
Running a subset
You can provide the configuration file to use for subsetting either through the command line and/or a combination of a YAML/JSON config file and/or environment variables. If multiple sources are provided for the same parameter, values in the command line always take precedence over the ones stored in the configuration or environment.
Please check the installation requirements and known limitations listed in our troubleshooting guide.
Global flags and parameters
While every command has a set of dedicated key value parameters and flags that drive its behavior, subsetter
also has a set of global flags and parameters that can be applied to all commands (or subset of commands).
Flag/Parameter | Description | Example |
---|---|---|
--version | Show version information | subsetter --version |
-?, -h, --help | Show help and usage information | subsetter --help |
--console-log-level <Debug|Error|Fatal|Information|None|Verbose|Warning> | Minimum log level for the terminal's standard output and error. [default: Information] When running on Docker it could be useful to increase the console log level so that debug messages go to the console. | subsetter --console-log-level=Warning |
--file-log-level <Debug|Error|Fatal|Information|None|Verbose|Warning> | Minimum log level for the subsetter's log file. [default: Debug] | subsetter --file-log-level=Information |
--log-folder <log-folder> | Optional directory path to use when storing log files. If not specified, the default location varies per host Operating System (OS) running the tool. | subsetter --log-folder=C:\Users\***\Documents |
Via the command line parameters
The following parameters configure the subsetting:
Flag/Parameter | Description | Required |
---|---|---|
--config-file <config-file> | The path and filename to the config file: a YAML or JSON file containing parameters that control the subsetting. If the path is omitted, the config file must be in the current working directory. The config-file parameter must be provided if any required parameters are not provided as options to this command. | NO |
--database-engine <MySql|Oracle|PostgreSql|SqlServer> | The type of the database. The subsetter currently supports 4 database engines: MySql, Oracle, PostgreSql and SqlServer. Must be present if not in the config file (takes precedence). | YES |
--source-connection-string <source-connection-string> | Connection string for the source database. Must be present if not in the environment variable REDGATE_SUBSET_SOURCE_CONNECTION_STRING (takes precedence). See also the requirements for the source database on the installation requirements page. | YES |
--target-connection-string <target-connection-string> | Connection string for the target database. Must be present if not in the environment variable REDGATE_SUBSET_TARGET_CONNECTION_STRING (takes precedence). See also the requirements for the target database on the installation requirements page. | YES |
--target-size <target-size> | The approximate size to try and make the target database, in the form | NO |
--starting-table <starting-table> | The database table to apply the filter clause to, for use in the subsetting operation (case-sensitive). Either this or the target size must be specified but not both. Can be specified here or in the config file. | NO |
--filter-clause <filter-clause> | SQL query WHERE clause to be applied to the filter table. This is optional. Command line value takes precedence over config one. This option can only be used if --filter-table is provided. It can't be used in conjunction with --target-size. Note that the filter clause should not contain the The filter supports complex queries as shown in the list of example filters. |
|
--target-database-write-mode <Overwrite|Strict> | Execution mode for the subsetter. Supported values are Strict and Overwrite.
If omitted it defaults to Strict. Overwrite mode should be used with care and only when running the tool manually, as it can lead to data loss if pointed at the wrong target database. In either mode the subsetter only modifies tables that are part of the subset. This consideration can be important when using filter tables, as often these subsets don't include all tables in the database.
|
|
--output-file <file-path> | Specify a file path to put the results of a subset in JSON format. Currently it outputs:
The folder structure in the file path will be created if missing. |
|
For example:
# Using Windows authentication against local SQL Server database .\subsetter --database-engine=sqlserver --source-connection-string="server=SomeSqlServer;database=SomeSourceDatabase;trusted_connection=yes;TrustServerCertificate=true" --target-connection-string="server=SomeSqlServer;database=SomeTargetDatabase;trusted_connection=yes;TrustServerCertificate=true" --starting-table="dbo.Users" --filter-clause="FirstName LIKE 'Bob%'" # Using Sql authentication against local SQL Server database .\subsetter --database-engine=sqlserver --source-connection-string="server=SomeSqlServer;database=SomeSourceDatabase;Uid=sa;Pwd=123;TrustServerCertificate=true" --target-connection-string="server=SomeSqlServer;database=SomeTargetDatabase;Uid=sa;Pwd=123;TrustServerCertificate=true" --starting-table="dbo.Users" --filter-clause="FirstName LIKE 'Bob%'" # Subsetting two PostgreSQL local databases using SQL authentication .\subsetter --database-engine=postgresql --source-connection-string="Server=127.0.0.1;Port=5432;Database=SomeSourceDatabase;User Id=SomeUser;Password=SomePassword;" --target-connection-string="Server=127.0.0.1;Port=5432;Database=SomeTargetDatabase;User Id=SomeUser;Password=SomePassword;" --starting-table="dbo.Users" --filter-clause="Division = 'Engineering'"
This is not an exhaustive list of the command line possibilities. Please run subsetter --help
to discover some of our optional configuration parameters.
Via a YAML/JSON configuration file
Having a YAML/JSON configuration file already setup, makes the call to the CLI much simpler (but you still need to provide the connection strings or use the corresponding environment variable):
# Using a JSON configuration file... .\subsetter --config-file="<pathToConfigFile>/config.json" --source-connection-string="SomeConnectionString" --target-connection-string="SomeConnectionString" # ... or a YAML one .\subsetter --config-file="<pathToConfigFile>/config.yaml" --source-connection-string="SomeConnectionString" --target-connection-string="SomeConnectionString"
You still need the same mandatory parameters, i.e. source and target database connection strings, as when running via the command line parameters.
Exit codes
The subsetter
CLI will return the following exit codes when running commands.
Anything other than SUCCESS (0) will indicate a negative outcome.
Exit code | Designation | Outcome status | When |
---|---|---|---|
0 | Success |
| Used when a subsetter command completed successfully. |
1 | GenericFailure | FAILURE | Catch all for general errors that don't have special meaning. |
2 | UnhandledException | FAILURE | Used when no other error code is suitable. It's the default error code. |
3 | FailedInitialization | FAILURE | Used when failing initialization of services and tools. |
4 | CliInvokedIncorrectly | FAILURE | Used when provided with missing or mismatched command line parameters or configuration files. |
5 | InvalidConfiguration | FAILURE | Used when the provided configuration does not meet the subsetting requirements. |
6 | FailedTableExtraction | FAILURE | Used when the SQL tables to subset could not be retrieved from the source database. |
7 | FailedTablePreprocessing | FAILURE | Used when the data contained in the SQL tables to subset (in the source database) could not be prepared for subsetting usage. |
8 | FailedUpdatingTargetDatabase | FAILURE | Used when the subsetter failed to migrate the data from the source to the target database. |
Please check our Subsetting troubleshooting and known limitations section for some known error use cases and workarounds.
CLI logging
In terms of observability, the subsetter
CLI has two main types of logging:
Logging type | Location | Default Minimum Log Level | Purpose |
---|---|---|---|
File | Default location varies per OS:
The default log folder can be overridden in the command line by using the CLI parameter | DEBUG | Log client-side application-level information about subsetting execution and outcome. This will include, by default, additional contextual debug execution information. Minimum log level when running commands can be changed with the global --file-log-level parameter (see global flags and parameters). |
Console (standard out) | stdout | INFORMATION | Displays non-error client-side application-level information about subsetting execution and outcome. This excludes Error and Fatal log events which will be added to the standard error stream instead.Minimum log level when running commands can be changed with the global --console-log-level parameter (see global flags and parameters). |
Console (standard error) | stderror | ERROR (cannot be changed) | Displays all Error and above client-side application-level messages for issues during execution of the subsetter . This will lead to a non-successful exit code. |
Log format
The filename of the log file will use following date sortable format: subsetter-yyyy_MM_dd-HH_mm_ss.json
(e.g. Subsetter-2023_05_03-12_39_55.json
).
The files are written in Compact Log Event Format, a JSON-based format. This means each log message has structured data associated with it. The open source Compact Log Viewer tool is a convenient way to view the logs.
(there are other tools for working with Compact Log Event Format files
: see the list on the CLEF webpage).
There is no automatic purging of old log files built-in yet, so you may want to keep an eye on log files and delete older files from time to time.