Redgate Flyway

For PostgreSQL users - where are your DBs hosted?

Baselines

When starting to use Flyway for the first time against an existing database, the baseline state is the state before Flyway has first been run against it.

Baselining is a concept which applies to deployments using migrations, as it marks the starting state before the first migration has been run and before deployment history has been captured in the Flyway schema history table. It is not relevant if using the schema model only as no history is being tracked.

If you are working on a completely greenfield database and there are no populated downstream databases, you do not need to worry about baselining.

Example


In this example, TableA and TableB have already been deployed to our downstream environments (Test, Staging, and Production).

We want our first migration to create Table C in the downstream environments.

If we want to test that our new migration creating Table C deploys successfully without potentially compromising a live database, we need to be able to either spin up a test database matching one of the downstream environments or capture the existing state, in this case Table A and Table B, in a baseline migration which can be executed against an database.

In this case, Test, Staging, and Production are in sync, so any of these environments can be used for the baseline source. 

Baselining a downstream environment

Before running migrations for the first time, you need to run the baseline command (or run the migrate command with the -baselineOnMigrate flag). This will create the Flyway schema history table in the target database and acknowledge that this is the expected starting state upon which to begin using Flyway.

If you have existing migration scripts you will need to specify a baselineVersion. Note that this is automatically written into the configuration file if generating a baseline migration through Flyway Desktop.

Handling downstream environments not being in sync

It's okay for your development environment to be ahead of where your downstream environments are but you want your downstream environments to be in sync before getting started with Flyway migrations.

If your downstream environments are not in sync, then you have 2 options:

  1. Reset them so that they are in sync. This might mean waiting until after your next deployment.
  2. Create multiple baselines.

Capturing the baseline state for use in a shadow or build environment

Whenever you want to use a shadow or build environment, you need to ensure that that environment can be restored to its baseline state before migrations are applied to it. This can either be done by provisioning the shadow database or creating a baseline script to capture all of the existing state of the production database which can be run against an empty shadow.

Capturing baseline state in a script

This approach is simple in that it doesn't require any infrastructure, but capturing the entire contents of a mature production-grade database in a single deployable SQL script can sometimes be tricky and require a little iteration to get right.

For many scenarios this will work out of the box, but the following scenarios need handling:

If the target database contains invalid objects

For some database types it is possible to end up with invalid objects in the database - objects that were valid on creation but no longer usable following subsequent changes. These objects are innately not deployable and the only approach to getting a working baseline script is to filter them out (or remove them from the target database and start again).

Flyway Desktop helps identify and prompts to automatically filter out these objects.

If the target database contains objects such as SQL Server file groups which cannot be handled in migrations

The baseline script approach still works, but these need to be explicitly created on the shadow or build environment either outside the automated Flyway process or via a callback.

If the database contains cross database dependencies or linked databases

If the referenced databases are accessible from the shadow or build database then this will work out of the box, though this only works as an approach if they are not being actively worked on simultaneously.

If the objects making the references are not critical to track and can be filtered out without impacting deployment, then filtering is an option.

Otherwise, the only guaranteed way of solving this scenario is to use provisioning.

If the database is very large

If the database schema is large, then the baseline script could be very long and take a long time to run. If the database schema is very large (e.g. hundreds of thousands of objects) it may be impossible to take this approach at all and provisioning will be the only option.

Provisioning the shadow database

Provisioning generally requires some up front infrastructure or process effort, but solves all of the issues of capturing the baseline state in baseline scripts and makes for a very powerful workflow.

Provisioning options:

Rebaselining

When you have many migrations, it might be desirable to reset your baseline migration. This will allow you to reduce the overhead of dealing with lots of scripts, many of which might be old and irrelevant.

This can be done by creating a new baseline migration. See this tutorial for how to go about this.

If you are using provisioning, it is possible to set up a workflow where you continuously reprovision the development and shadow databases from production and continuously rebaseline.

Note that we do not currently have streamlined support for rebaselining in the Flyway Desktop UI.



Didn't find what you were looking for?