How to manage Database Migrations with Flyway?

by Joris Kuipers, December 5, 2017

Joris Kuipers, CTO at Trifork, presented a webinar on some usage patterns for Flyway. You can find the recording on our Trifork YouTube channel.

Tools like Flyway address a common concern for many people, which quickly leads to questions on how to pick a tool and then apply it in the best manner for one’s particular situation.

In this blog post, Joris has summarized the Q&A session – he provides the readers with his answers and ideas for managing database migrations.

1. How does Flyway compare to Liquibase?

When I was choosing a DB schema migration tool 4 or 5 years ago, I’ve looked at both Liquibase and Flyway. In general, I’d say that Flyway is a bit more light-weight: Liquibase addresses some requirements that Flyway explicitly chooses not to support. These include support to define migrations declaratively (e.g. in XML) and then generate the correct SQL DDL statements for your particular DBMS and the support to generate migration rollbacks (countering operations).

Flyway did suffice for my needs, so I’ve chosen that and have been very happy with the results. I recommend that you have a look at both to see which one fits your particular requirements best, and definitely choosing either one will be a huge improvement over not having any DB versioning tool!

2. Do repeatable scripts get included in the transaction spanning in all migrations or it is just versioned scripts?

I didn’t cover repeatable scripts in the presentation, but a repeatable migration is a migration whose name starts with an ‘R’ instead of a ‘V’. It doesn’t have a version number, but is applied whenever its checksum has changed (hence the ‘repeatable’). They will be applied after your versioned migrations, but AFAIK that will still be in a single transaction if you’ve chosen to enable that feature.

3. Flyway has a checksum for each upgrade script. Then how to handle the case when a developer finds an error in the previous script in a development environment? Do you have to write a new script to overwrite the old one?

If the script has only been applied to a development environment you don’t necessarily need to write an additional script to fix the error. However, changing a scripts that was applied already will cause a checksum mismatch, obviously, if you don’t regenerate your dev DB schema from scratch. In that case it’s often easy to ensure manually that the fix is applied to the dev DB and then updated the checksum of the applied migration. You can do this yourself, or you can use Flyway’s ‘Repair’ command.

4. What is the size of largest database you can handle with Flyway?

Flyway isn’t restricted by the DB size, but applying certain migrations to a large DB table might take a long time (also depends on your DBMS). For those cases, ensure that you haven’t configured a connection timeout that would be too short and understand that if you embed Flyway in your application, starting the application might take a long time when long-running migrations are included.

5. If a change gets applied outside of Flyway for whatever reason, I could create a migration script after the effect that describes the change, but how can I mark this as ‘applied’ in the Flyway version table? Or is there another way I should be addressing changes that happen outside the process?

Ideally you’d prevent this from happening, of course: all changes should be applied through Flyway, so that you can always reliably recreate your DB schema. That said, if something like this does happen, there’s a number of ways to deal with it. In your migration you could use SQL that won’t fail if the change has been applied already, e.g. by using IF NOT EXISTS. In that scenario the migration will simply succeed. Another approach I’ve used in these situations with MySQL is that I let the migration fail and then simply mark it as successful in Flyway’s SCHEMA_VERSION table myself and then restart. If your database supports transactional DDL statements that won’t work, because Flyway won’t record a failed migration then: it will simply roll back the entire transaction, so there’ll be no entry in the SCHEMA_VERSION table at all. In that case, you could copy the checksum from another environment and insert the row yourself to indicate that Flyway should treat the migration as successfully applied.

6. How is the performance? Like over time, when you have gathered a lot of upgrade scripts, and you deploy to, as an example, a test environment with an initial DB, all these updated scripts are run. What is your experience? How do you prevent it from taking up too much time?

Creating the schema is usually quite fast, even with a lot of migrations. Of course this fully depends on your DBMS setup and what you’re doing in your migrations. That said, I do find it cumbersome to have hundreds of migrations lying around: for that reason I developed the squashing pattern that I covered in the webinar. Squashing will also improve the speed of applying all migrations against an empty schema, as you don’t need to alter things all the time. If you use a tool like mysqldump it typically disables constraint checking in the beginning of the script to re-enable it at the end, which might speed things up even further.

7. Would you recommend having the first schema version table name with the date from the offset to allow squashing in future?

No, you can choose to apply the squashing pattern with a new schema version table name in any case, you don’t need to start with a custom name from the get-go.

8. What about running automated integration tests, do you rebuild the DB then as well using the scripts or are there other patterns advised?

It depends – one thing that influences this decision is if you’re using the same DBMS for integration tests as you do for normal environments, or if you’re running against an in-memory DB. For the former case I’d definitely use Flyway, for the latter it depends on if you’ve written Flyway migrations to define the schema for your in-memory DB or if you’re relying on something else like your JPA provider then to set up the schema. In general, having your tests running against the same DBMS helps to catch DBMS-specific issues, and nowadays with tools like TestContainers this is easier than it used to be.

9. How does squashing work when you also have Java classes defining migrations?

Depending on what the Java migrations do, they can either simply be removed (if they do something that’s already captured in the new squashed migrations) or be squashed themselves to a single new Java migration.

10. What’s the best practice for managing repeatable objects such as VIEWS and stored procedures that are subject to removal?

I haven’t used repeatable migrations myself yet, but I’d say that initially you could simply replace the contents of the repeatable migration with a comment and then later when you apply the squashing pattern you can just remove the migration completely, without Flyway complaining about a missing migration.