Static data
Published 31 July 2019
One thing to consider when version controlling your database is how to maintain static/lookup data within your project. This can include data such as zip/postal codes, country lists or even application settings.
SQL Change Automation offers two distinct methods of handling static data, online and offline.
Online method
The online method is where changes are made to the development database, for example, with SQL Server Management Studio. This is the simplest way to edit static data.
Tables can be flagged for data tracking in SQL Change Automation. At this point, the process will be similar to making schema changes. Edits can be made to the data directly in your development database and then synchronized with your project.
The changes can be imported into your project to generate INSERT/UPDATE/DELETE statements within a new migration.
Primary Key requirement
In order to track the data within your tables, the table must include a primary key. If a primary key is not present in a given table, the data within the table will be ignored (not imported).
Offline method
The offline method involves preparing a script that describes changes to data before actually deploying the changes to a database. This method has a higher learning curve than the online method but offers more flexibility.
The advantage of working offline is that you can store the data in a single file, rather than generating a new file with each change, allowing you to branch and merge the data in source control.
In the offline approach the MERGE statement is used to manage the deployment of our static data. What makes MERGE so useful is not only its ability to insert, update or delete data in one succinct, atomic operation but also the fact that the statements are re-runnable. Using MERGE enables the data to be edited and re-used for deployment to all environments. It also gives you the ability to include SQLCMD variables in place of literal values, allowing data from the deployment system to be passed in and stored within your tables (for example, environment-specific variables).
This data can be put in a Post Deployment script in your project. This means you could use your deployment system to centralize the storage of configuration settings, giving you a systematic and repeatable process for propagating updates to configuration data.
This example post deployment script customizes static data based on a variable at deployment time.
Limitations
There are a few draw-backs of the offline method to consider prior to selecting this approach:
- Non-determinism of the MERGE statement: before actually running the deployment against your target environment, it can be difficult to know what changes will be applied (if any). Worst case scenario, you could hit one of the documented issues in MERGE
- The workflow isn't necessarily the most natural way to edit data, as it requires running the utility proc and copying+pasting the output back into the original file. Editing the file directly is an alternative, but isn't the most user-friendly experience especially with large amounts of reference data
- Coordinating changes to both the schema and data within the reference table can be quite difficult, given that your schema changes will be performed in a separate part of your deployment (multiuse scripts run only after all pending migrations have been executed). For example, if you need to update an ID value in the lookup table prior to adding a foreign key to the schema, then that update would need to be hand-coded and included in your project separately to your MERGE statement
If you foresee that these limitations can be problematic for your deployments, then it can be worth considering the online approach to version controlling your static data instead.
Choosing between online and offline methods
You can choose to use either the online or offline editing approaches, or a combination of both.
For example, you can use the online method on tables that contain consistent data between all environments, and use the offline method on tables where the data changes between environments.
Working with larger data sets
It is important to bear in mind that SQL Change Automation is optimized for working with smaller sets of static data. This is to ensure that your projects always build quickly, for larger data sets, we recommend using the data seed method with BULK INSERT instead.