Come Fly With Me: Flyway usage patterns part I

by Joris KuipersAugust 17, 2018

Flyway is a tool for managing your relational database schema as it evolves while developing a project. It’s not restricted to just performing schema migrations: it can execute any type of SQL script, or even JDBC code, that you’d like to run before or at startup as part of deploying a new version of your application.
An older blog provides a nice overview of the framework, and the project provides comprehensive documentation.
I’ve been using Flyway as a database schema management tool in a handful of projects over the last few years, to great success. However, what I’ve noticed is that it isn’t always clear to newcomers how to best apply it to a project. Also, it depends on a number of factors on how you can in fact best apply Flyway. In this new blog post series I’ll describe some common usage patterns that I’ve encountered or developed myself.

To kick off, let’s talk about some patterns for using version numbers for your migration scripts.

Migration version numbering

The basic idea behind version numbers for Flyway migrations is pretty simple: you provide versions in the form of one or more numeric parts separated by dots or underscores, and then Flyway executes the migrations in order from lowest to highest. Leading zeros are stripped, which can help to make alphabetical sorting match numeric sorting.
By default you make this part of the filename of a SQL migration, or of a class name in the case of a Java migration. You could have a list of migrations like this, which have been ordered in the same way that Flyway would:

  • V001__Initial_schema.sql
  • V002__Initial_reference_data.sql
  • V003__Create_relation_tables.sql
  • V004__Create_order_tables.sql

In this example, we’ve used a simple sequential versioning scheme where most versions are just a single number, padded out with zeroes so that (at least the first thousand) migrations will be ordered correctly when sorted alphabetically.

This is an easy way to get started, but the approach has some limitations.

Dealing with maintenance releases

A common approach to development, especially when developing products with predictable release cycles, is to have regular feature releases combined with more frequent maintenance/bugfix releases. That means that you could have a product with version 2.1 running in production with a 2.2 release candidate being tested in a user acceptance environment, while development teams are busy working on the upcoming 3.0 release. This usually means that you have two active branches in your version control system (if you’re working on a product that’s hosted on-premise it might be even more, depending on how many versions you need to support in parallel).

In this situation, maintenance work performed on a version that has already been released is typically merged back to the main development branch to ensure it’s contained in the new major release as well.

Let’s say that the the version of our product running in production (the maintenance version) is currently using version 3 of the DB schema (meaning it has some migration with version 3 applied as the last migration). Now as part of implementing some maintenance fix we want to add a Flyway migration. On the maintenance branch V003__some_migration.sql is the latest migration, so we must add a file using a higher version number. However, the master branch might very well already contain a V004__some_other_migration.sql. Adding something like a V004__fix_column_lengths.sql to the maintenance branch won’t work: you could release that to production just fine initially, but when merging that migration to the development branch you now end up with two V004 migrations. This is obviously a problem: Flyway will throw an error as it requires unique migrations for every version. Also, we need to ensure that the new maintenance migration will be logically ordered before any new migrations contained only on the master branch, since that’s the order that migrations will need to be applied with as you’re first releasing one or more maintenance releases followed later by a new major release.

The thing is that even though you might have multiple branches, your DB schema can only evolve linearly.

An easy fix for this is to say that after creating a maintenance branch, no Flyway migrations should be created on that branch with new major version number. Instead, use a new minor version number. In our example, you could simply add a V003_01__fix_column_lengths.sql instead and merge that to the development branch as well.
The effect of this is that your maintenance branch will contain migrations numbered 1, 2, 3 and 3.1, while your development branch will additionally contain 4 and maybe higher as well.

You can release as many maintenance versions as required, and when the time comes to release a new major version of your product it will simply run the migrations that were only present in the master branch because they’re ordered after the maintenance migrations.
This will also make it obvious just by looking at the version number if a migration was developed as part of regular develop work on the main branch, or as part of a maintenance release on the maintenance branch.

This approach works well if your development process follows this pattern where you’re deploying maintenance releases into production while in the meantime working on a new upcoming major release. You’ll still have strictly incrementing version numbers on any given branch, so Flyway can simply enforce a strict ordering and run any new migrations that it finds.

Developing with feature branches

A popular approach to development nowadays is to use feature branches, especially since distributed source control systems like Git make them so easy and fast to use. In this approach, features are developed on a separate branch and merged back to the main branch only when they’re sort-of done.
There is a lot of debate at the moment about the pros and cons of this approach versus mainline development with feature toggles, but that’s for another blog: it’s simply a given that many teams develop using feature branches.

In that approach, using simple incrementing version numbers for migrations like 1,2, 3, etc. will quickly become a problem: let’s say that version 3 is the latest migration on master, and two new feature branches are created. Both teams working on those features decide that they need a database migration, so both teams might decide that they’ll create a version 4 migration. Now the first team to merge back their feature branch wins, but the other team will have to change the version number(s) of their migration(s) after merging and (hopefully) realizing that they created a conflicting migration.
This might be OK when not using feature branches a lot and only having a small team of people, but this approach causes lots of problems when scaling up.

The approach that I’m following for projects heavily using feature branches is to forego Flyway’s default guarantee that migrations will always be applied strictly in incrementing order and to adopt version numbers that are basically timestamps.

What this means is that a migration file created on July 8 at 11:10 AM will look like this:
V2017_07_08_11_10__fix_column_lengths.sql.

This will prevent version number conflicts from happening (I’m assuming including hours and minutes suffices for this) but creates another problem: versions will not be guaranteed to be merged back to master in order.
That means that Flyway could run against a database to apply a migration 2017.7.7.12.10 like above, but then in a later release finds that there’s now a new migration 2017.7.6.12.12 that hasn’t been applied yet but has a version that’s lower than the latest migration that has been applied already.

By default Flyway will consider this an error and your migration will fail. However, you can simply tell Flyway that this is all expected and that it should simply execute all migrations that haven’t been applied yet by setting its outOfOrder property to true.

The only requirement for this to work is that migrations created on separate branches can be run independently from each other, i.e. that there’s no dependency in the order in which they are applied. In my experience, this is something that’s typically the case already and really doesn’t need any additional work or processes to be put in place.

That means that this approach works really well when using feature branches.

Naming conventions for migrations

Apart from choosing a version strategy, you’ll also need to decide on the description part of your migration file names. Flyway will use those parts to create a textual description of the migration (replacing underscores with spaces) and stores in a column of its schema_version table.
In general these description should be, well, descriptive in indicating what they do. A migration called V034__Oops_fixing_a_mistake.sql won’t help in understanding what it contains in terms of SQL. What I often find to work well is to adopt a convention that the description should start with the issue number that the migration is related to. Let’s say you’re using Jira and your project code there is PROJ, a migration might be called something like V2017_07_07_12_30__PROJ-1234_add_indices_for_relations.sql. Sure, your version control will allow you to see this as well (assuming you’re including issue numbers in your commit messages), but if someone is reading the migrations as stored by Flyway in its schema_version table they probably don’t a cloned repository available to quickly check the context of the issue requiring a particular migration.

A nice thing about conventions like this is that people will start following them automatically once you have a couple of migrations: it’s just a matter of copying what you’re seeing, rather than having to document and then enforce some process.

Learn To Fly: What’s coming up

This concludes the first part of this Flyway blog series. Some topics that I’m planning to cover in upcoming entries are:

  • introducing Flyway when you have an existing database already;
  • how to use Flyway while you’re in the process of splitting up your monolithic application into multiple services that might not yet have their own database;
  • how to use dynamic migration locations to deal with different databases or different types of migrations like schema changes vs. data fixes;
  • how to periodically collapse your ever-growing list of database migration files so that you don’t actually hit that V999…

If you have suggestions for Flyway usage patterns that you like to see covered, then please let us know!