Redgate Test Data Manager

Classification

Classification is the first step in anonymizing your database with Anonymize. The classify operation scans your database to identify potentially sensitive information, often referred to as Personally Identifiable Information (PII).

It then outputs a classification JSON file that describes which tables and columns contain PII.

Default Classifications and Datasets

Anonymize comes with a predefined set of classification types and datasets designed to cover NIST's definition of linked information. Masking these types of data reduces the chance that the remaining parts of a record can be traced back to an individual.

If you need to assign classification types or datasets that aren't included in our defaults, check out the custom configuration page.

Classification File Structure

When you run the classify command, it outputs a JSON file that outlines the tables and columns in your database. The file contains information about the schema, table names, column names, and the classified data types.
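As a rough illustration of working with such a file, the sketch below loads a classification document and lists each classified column. The JSON layout, the key names (tables, schema, name, columns, type), and the type labels shown here are assumptions made for this example; check a file generated by your own classify run for the actual structure.

```python
import json

# Hypothetical sample only: the key names and type labels are assumptions,
# not the documented output format of the classify command.
SAMPLE = """
{
  "tables": [
    {
      "schema": "dbo",
      "name": "Customers",
      "columns": [
        {"name": "FirstName", "type": "GivenNames"},
        {"name": "Email", "type": "EmailAddresses"}
      ]
    },
    {
      "schema": "dbo",
      "name": "Orders",
      "columns": [
        {"name": "ShippingAddress", "type": "StreetAddresses"}
      ]
    }
  ]
}
"""


def list_classified_columns(classification):
    """Return (schema, table, column, type) tuples for every listed column."""
    rows = []
    for table in classification.get("tables", []):
        for column in table.get("columns", []):
            rows.append(
                (
                    table.get("schema"),
                    table.get("name"),
                    column.get("name"),
                    column.get("type"),
                )
            )
    return rows


classification = json.loads(SAMPLE)
for schema, table, column, kind in list_classified_columns(classification):
    print(f"{schema}.{table}.{column}: {kind}")
```

A read-only summary like this can help when reviewing what classify found, without editing the generated file by hand.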

In most cases, you won't need to edit the classification file before using it as input for the map command. If you do need to change the results, it's usually better to provide an options file than to edit the generated file by hand:

  • The classification and masking files are generated from scratch each time you run the classify or map commands, so any manual changes made to these files will be lost (unless you use version control).
  • The options file allows you to store your anonymization configuration separately, making it easier to manage and maintain your settings across multiple runs.

You can find more details on the custom configuration page.
