Use a SQL Server backup to baseline a database
Published 19 December 2024
PREVIEW
This feature is currently in Preview. If you need help or have any feedback on this feature, please email the development team.
Introduction
A baseline is a representation of what has already been deployed to production at a point in time. The current Flyway workflow is to create a SQL script called a baseline script from a production environment that contains all the CREATE DDL to rebuild all the objects in the database from scratch in a new environment. The baseline script is used to provision new database environments, like development, shadow, or spinning up environments from scratch in a CI/CD pipeline. Baseline scripts for large databases can be millions of lines long and difficult to debug if something goes wrong (e.g., invalid objects or references to linked servers that aren't accessible). Using a backup file as a baseline is an alternative way to provision databases and avoids the baseline script issues. Restoring a schema-only backup is usually faster than running a large baseline script, so performance when generating migrations or spinning up new databases is also improved by using a database backup as the baseline.
In this document, we will outline the steps to create a schema-only backup and use it to provision a shadow database.
Note: An enterprise license is required to use backups as baselines.
Getting a schema-only backup
Create a schema-only clone
The first step is to take a schema-only backup of the production database, which in this walkthrough will be called AdventureWorks2022
. In order to achieve this, we must first create a schema-only copy of our database, which can be done using the DBCC CLONEDATABASE command:
DBCC CLONEDATABASE(AdventureWorks2022, AdventureWorks2022_Copy) with VERIFY_CLONEDB;
After running the statement above, we will have another database alongside AdventureWorks2022
called AdventureWorks2022_Copy
. The new AdventureWorks2022_Copy
database will have the same objects (e.g. schemas, tables, procedures, ...) as the AdventureWorks2022
database, but without any of the data.
Optionally include data from the flyway_schema_history table
If your production database is already baselined, then you may wish to include a copy of the flyway_schema_history
table data in the backup, as the backup will then retain the same version. That is, if your production database is at migration version V500, then restoring the backup to the shadow database will version it to V500 as well. If the flyway_schema_history
table is not present in the backup then you will need to specify a backupVersion
that the backup represents (covered later).
In order to copy the data from the flyway_schema_history
table, first make the cloned database writable and then copy the records from the original flyway_schema_history
to the copy in the cloned database:
ALTER DATABASE [AdventureWorks2022_Copy] SET READ_WRITE WITH ROLLBACK IMMEDIATE INSERT INTO [AdventureWorks2022_Copy].dbo.flyway_schema_history SELECT * FROM [AdventureWorks2022].dbo.flyway_schema_history
Create the backup file
The schema-only backup can now be created using the AdventureWorks2022_Copy
database with the BACKUP command:
BACKUP DATABASE [AdventureWorks2022_Copy] TO DISK = N'AdventureWorks2022_Copy.bak';
An AdventureWorks2022_Copy.bak
file will be created in SQL Server's default backup directory. You can optionally specify the full file path for the backup file, please see Backup Devices for the full list of options.
Now that we have the backup, the AdventureWorks2022_Copy
database can be deleted as it's no longer needed. This can be done with the standard DROP DATABASE command:
DROP DATABASE [AdventureWorks2022_Copy];
Using the backup to provision a shadow database
The next step is to configure the shadow environment to use the backup file. By provision, we mean that Flyway will restore the backup to the shadow database when necessary.
One important aspect to consider is the location of the backup file. If the backup will be restored onto the same SQL Server instance from where it was taken, then the backup file doesn't need to be moved, as it will be accessible to the SQL Server process. If the shadow environment will be on a different SQL Server instance, then we must ensure the backup file is accessible to that instance. For example, it may be necessary to:
- Copy the backup file to the machine running the SQL Server instance which will host the Shadow database.
- Expose the backup file on a network share accessible to the machine running the SQL Server instance which will host the Shadow database.
In this example, the backup file will be restored onto the same SQL Server instance so don't need to move it.
Using Flyway Desktop
If a Shadow database is not configured, then you will be prompted to to configure a Shadow database when you visit the Generate migrations page. The Shadow database can also be configured through the blue settings cog in the upper right for migrations-based projects.
- Visit the Generate migrations page and click Set up shadow database.
- In the Connect to shadow environment dialog, click Backup on the left, configure your Backup file, Backup version, and server and authentication details. Check Ok to erase and click Test and save.
Using Flyway CLI
You can configure your shadow database to use a backup to provision this environment in the Flyway TOML file:
[environments.shadow] url = "jdbc:sqlserver://localhost:1433;databaseName=AdventureWorks2022_Shadow;trustServerCertificate=true" user = "sa" password = "..." provisioner = "backup" [environments.shadow.resolvers.backup] backupFilePath = "AdventureWorks2022_Copy.bak" backupVersion = "001" sqlserver.generateWithMove = true
The [environments.shadow]
section defines an environment, which represents a database and the details needed to connect to that database. See Environments settings for more details on how to setup an environment. The provisioner = "backup"
parameter specifies that the shadow environment should use the backup provisioner.
The [environments.shadow.resolvers.backup]
section configures the backup provisioner for the shadow environment. Below we take a look at each of the fields defined above:
backupFilePath
- The file path to the backup file relative to the SQL Server instance where the backup will be restored. In our example, theAdventureWorks2022_Copy.bak
backup file is stored in SQL Server's default backup location, and as we're restoring the backup to the same SQL Server instance we can simply specify the backup filename. We could alternatively have specified the full file path:C:\Program Files\Microsoft SQL Server\MSSQL16.MSSQLSERVER\MSSQL\Backup\AdventureWorks2022_Copy.bak
.backupVersion
- The version of the database that the backup represents. When creating a baseline for a new Flyway project it's standard to use version 001, as the backup represents the initial state of the database before any subsequent migrations have been applied.sqlserver.generateWithMove
- Backup files have data and log file paths embedded within them and the restore process will attempt to re-use those file paths. However, this can lead to issues when the file path formats contained within the backup don't match the SQL Server instance where the backup will be restored. For example, a backup file could contain Unix file paths and attempting to restore this on a SQL Server instance running on Windows will fail by default, or the embedded backup file paths may reference a non-existent drive on a different machine running SQL Server. Therefore, whensqlserver.generateWithMove=true
is set, the backup provisioner will generate unique file paths for the database being restored, by using the default locations for data and log files on the SQL Server instance restoring the backup. This ensures the restore won't fail due to invalid embedded file paths in the backup file.
Note as well that databaseName=AdventureWorks2022_Shadow
in the URL for the shadow environment. AdventureWorks2022_Shadow
is the name of the shadow database, and the database we wish to restore the backup to. This database doesn't have to exist at this point, as the backup provisioner will create it if necessary (assuming the user configured has permission to do so).
With the configuration above we can now trigger the creation of the AdventureWorks2022_Shadow
database and the backup to be restored to that database. Any interaction with the database through flyway should trigger the backup to be restored (if necessary). For example, the following info command triggers the backup to be restored and we see that our shadow environment is now baselined to version 001:
$ flyway info -environment=shadow ... Creating Schema History table [AdventureWorks2022_Shadow].[dbo].[flyway_schema_history] with baseline ... Successfully baselined schema with version: 001 ... Schema version: 001 +----------+---------+----------------------+----------+---------------------+----------+----------+ | Category | Version | Description | Type | Installed On | State | Undoable | +----------+---------+----------------------+----------+---------------------+----------+----------+ | | 001 | Baseline from backup | BASELINE | 2024-12-19 16:00:55 | Baseline | No | +----------+---------+----------------------+----------+---------------------+----------+----------+
Running the info command again would not trigger the backup to be restored a second time, as Flyway would identify that it wasn't necessary. However, if the shadow database was Cleaned, then the backup would be restored when Flyway next interacted with the shadow database.
Note
- The backup file must be accessible to the SQL Server that it is being restored to. You may need to:
- Copy the backup file to the machine running the SQL Server instance which will host the shadow database.
- Expose the backup file on a network share accessible to the machine running the SQL Server instance which will host the shadow database.
- The shadow user will need the necessary permissions to restore the database backup.
- DBCC CLONEDATABASE has some limitations around System-Versioned Temporal Tables, Ledger Tables, and some system catalogs related to service broker or other system-level objects. If you need these in your provisioned database after the backup is restored, then script them in an afterBaseline callback.