Friday, June 29, 2012

Using Maven to Integrate Flyway Migrations in an Existing Database- Part 1

I was recently asked by a client to investigate database versioning / migration tools. Having had previous experience with Rails db Rake tasks and Liquibase, both of which require a DSL, I looked for a more Java-friendly tool and found Flyway.

One of the pre-requisites was to be able to integrate the tool with an existing database and SQL scripts to reduce the learning curve for application development personnel. We were also looking for a tool we could run with either Ant or Maven as the client has multiple projects using both technologies. Flyway offered that with its support of native SQL migrations including support for stored procedures and triggers as well as a Maven plugin and an Ant task. The Flyway command line tool (CLT) and Java migrations are an added bonus.

Disclaimer: This is not a testimonial for Flyway versus any other tool, just an accounting of what it took to get it working for an existing database. Although the Flyway wiki and documentation is adequate, I had a difficult time getting all the wheels turning in the same direction. In fact, I found myself doing things that were not really explained nor even touched on in the Flyway wiki pages. I thought, “What better way to communicate that than through a blog?” BUT, YMMV- use this article at your own risk.

Java migrations are Java classes that invoke the Flyway Java core classes to perform database versioning and migration. These can be useful for “on-the-fly” migrations, i.e., at application startup, to automagically bring the target database up to speed with respect to its infrastructure, etc., as opposed to requiring a DBA or whomever to execute a series of steps prior to starting the application in a production environment. There is a sample available on the Flyway Downloads page that illustrates this feature.

For this exercise, I concentrated on getting the Maven plugin to work, but this was not without its trials. What follows is an outline of the steps I took to get Flyway working using the Maven plugin. Although we are using Maven 3 for this project, I don’t think there is anything in this article that is Maven 3-specific. That being said, you may see different results if you are not using Maven 3.

I selected a sample Maven project from the client that had several modules. In addition to the domain module, the project also has a web application as well as XML-centric tasks. The database for this sample project is a SQLServer database as are most of this client’s databases. The DBM should not matter either unless you are using an unsupported database. Use caution if your DBM is not one of those listed as a Flyway-supported database.

Installing Flyway…

First I installed Flyway according to the “Installation” section of Flyway’s home page. You should install the tool to support your specific needs. I added the Flyway dependency and the Maven plugin definition to the respective sections of the parent POM of my sample project as described in the installation instructions.

<dependencies>
    <dependency>
        <groupId>com.googlecode.flyway</groupId>
        <artifactId>flyway-core</artifactId>
        <version>1.6.1</version>
    </dependency>
...
</dependencies>
...
<build>
    ...
    <plugins>
        <plugin>
            <groupId>com.googlecode.flyway</groupId>
            <artifactId>flyway-maven-plugin</artifactId>
            <version>1.6.1</version>
            <configuration>
                ....
            </configuration>
        </plugin>
        ...
    </plugins>
    ...
</build>

That was easy enough… or was it?

To test the installation, I figured I’d start simple and just execute the flyway:status goal. So I looked at the Maven Plugin Status goal page on the Flyway wiki and found I nedeed to tell Flyway about our database. You do that via configuration parameters.

Configuring Flyway…

The basic configuration of Flyway is simple. You need to tell Flyway which DBM you are using via the JDBC driver, the JDBC URL to the database, the database user to be used in Flyway operations, the password for that user, and the database schema to be acted upon. There are other configuration parameters that are not used by the flyway:status goal and we will detail them later when we use them.

For now, I added the following configuration to the plugin definition in the POM file:

<configuration>
    <driver>com.microsoft.sqlserver.jdbc.SQLServerDriver</driver>
    <url>jdbc:sqlserver://localhost:1433;databaseName=flyway_db</url>
    <user>SA</user>
    <password>thePwd</password>
    <schemas>dbo</schemas>
</configuration>

Then I ran mvn flyway:status which resulted in my first error:

[INFO] --- flyway-maven-plugin:1.6.1:status (default-cli) @ parent-pom ---
[ERROR] com.googlecode.flyway.core.exception.FlywayException: Unable to
        instantiate jdbc driver: com.microsoft.sqlserver.jdbc.SQLServerDriver

Although the SQL Server driver dependency was already configured in the Maven project, the error above stated that it couldn't find the driver. Why didn't this work?

A word about clear text passwords in configuration files-

It’s a no-no. Ok, that was three (or four) words. But for simplicity’s sake, we show it here (it’s not real anyway). Maven provides a facility to encrypt passwords included in settings files (see Password Encryption in the Maven Guide). I’d recommend the password value be encrypted and in this example be replaced by a property, e.g., ${flyway.password}, which is set in a profile in a settings file. In fact, use of Maven profiles is a topic in the next blog in this series.

Including Flyway’s Dependencies…

After referring to the Maven POM Reference, I remembered plugin dependencies should be declared with the plugin definition itself. So I included the SQL Server Driver as a dependency as well as the dependency to Flyway Core since it was the only component using it anyway. The Flyway wiki doesn’t touch on this, so lesson learned.

So now the Flyway plugin definition looked like this:

<plugin>
    <groupId>com.googlecode.flyway</groupId>
    <artifactId>flyway-maven-plugin</artifactId>
    <version>1.6.1</version>
    <dependencies>
        <dependency>
            <groupId>com.microsoft.jdbc</groupId>
            <artifactId>sqljdbc4</artifactId>
            <version>3.0</version>
        </dependency>
        <dependency>
            <groupId>com.googlecode.flyway</groupId>
            <artifactId>flyway-core</artifactId>
            <version>1.6.1</version>
        </dependency>
    </dependencies>
    <configuration>
      <driver>com.microsoft.sqlserver.jdbc.SQLServerDriver</driver>
       <url>jdbc:sqlserver://localhost:1433;databaseName=flyway_dev</url>
       <user>SA</user>
       <password>thePwd</password>
       <schemas>dbo</schemas>
    </configuration>
</plugin>

Success… or was it? The Maven goal ran and produced the expected output, but it ran only in the build for the parent POM. So I moved the plugin definition to the POM for the module in which I wanted it to run, i.e., where the db migrations are actually defined, and ran mvn flyway:status again. This time Maven failed with a message indicating the plugin could not be found. That made sense- I was trying to run a plugin goal from the top level but the plugin is not defined in the top level POM.

Plugin, Plugin, Who’s Got the Plugin?

So, after referring to the Maven POM Reference again, I copied the plugin definition to the <pluginManagement> section of the <build> section of the parent POM. So, at this point I had the plugin defined in the <pluginManagement> section in the parent POM and in the <plugins> section of the <build> section of the child POM. However, I removed the <dependencies> from the plugin definition in the child POM since they are inherited from the <pluginManagement> definition in the parent POM.

I again ran mvn flyway:status and it executed the plugin in the parent POM as well as in each of the child POMs. Hmmm… that’s not what I wanted. I wanted the plugin to execute only in the build for the domain child POM in which it is defined. Since the requirements for executing the migration goal of the plugin, e.g., directories, etc., are only present in the domain project built by the domain child POM, running that goal of the plugin in the parent and other child projects will result in an error.

The trouble with plugin definition location in parent and child projects…

I struggled with this for a while and finally came to the conclusion (since it is not definitively described in the Maven POM Reference) that the <pluginManagement> section is a convenient way to define plugins in a parent POM to be inherited in ALL child POMs unlike I was lead to believe by these statements in the Maven documentation under <pluginManagement>:

“However, this only configures plugins that are actually referenced within the <plugins> element in the children.”

and

“However, if we apply them under the <pluginManagement> element, then this POM and all inheriting POMs that add the maven-jar-plugin to the build will get the pre-process-classes execution as well.”

Including the plugin in the child POM is NOT an indication that it is to be run in the build for that POM, but rather a mechanism to override whatever was specified for the plugin configuration in the parent <pluginManagement> section. Plugins are inherited from the <pluginManagement> section and will be invoked in all child POMs. Apparently enough people complained about this since there is an issue and work being done for Maven 3.1 to support being able to “skip” execution of a plugin in a child module (see http://jira.codehaus.org/browse/MNG-3102).

Although there isn’t a hint about running the Maven plugin in a multi-module environment in the Flyway documentation, there is a Flyway issue (here) addressing this behavior. Unfortunately, the suggested workaround (wrapping the Flyway execution in a Maven profile) did not work for my scenario.

I actually ended up modifying the Flyway Maven plugin to accept the <skip> tag printing a message that the execution is being skipped. I then modified the <pluginManagement> plugin definition to include the <skip>true</skip> tag so it is inherited by all of the child modules. I also modified the POM of the child module for the build in which I do want the plugin to execute adding <skip>false</skip> to the configuration.

Now I had the behavior I wanted- I ran mvn flyway:status from the top level and it only executed in the child module desired.

[:flyway-test ]$ mvn flyway:status  
[INFO] Scanning for projects...
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Build Order:
[INFO] MyApp :: Parent
[INFO] MyApp :: XSD :: XMLBeans Generated
[INFO] MyApp :: Domain Layer
[INFO] MyApp :: Web Application
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Parent 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:status (default-cli) @ myapp-parent ---
[INFO] <skip>true</skip> detected.
[WARNING] Skipping execution.
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: XSD :: XMLBeans Generated 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:status (default-cli) @ myapp-xmlbeans-generated ---
[INFO] <skip>true</skip> detected.
[WARNING] Skipping execution.
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Domain Layer 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:status (default-cli) @ myapp-domain ---
[INFO] +-------------+------------------------+---------------------+---------+
[INFO] | Version     | Description            | Installed on        | State   |
[INFO] +-------------+------------------------+---------------------+---------+
[INFO] | No migrations applied yet                                            |
[INFO] +-------------+------------------------+---------------------+---------+
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Web Application 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:status (default-cli) @ myapp-web ---
[INFO] <skip>true</skip> detected.
[WARNING] Skipping execution.
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] MyApp :: Parent ..................................... SUCCESS [0.846s]
[INFO] MyApp :: XSD :: XMLBeans Generated .................. SUCCESS [0.058s]
[INFO] MyApp :: Domain Layer ............................... SUCCESS [0.974s]
[INFO] MyApp :: Web Application ............................ SUCCESS [0.171s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2.762s
[INFO] Finished at: Wed Jun 27 08:30:08 EDT 2012
[INFO] Final Memory: 10M/554M
[INFO] ------------------------------------------------------------------------
[:flyway-test ]$ 

Let’s be a little more adventurous. How about actually running a migration?

Finally! Running migrations…

Before we do that, we first have to follow the roadmap for implementing Flyway in an existing database provided by the Flyway Existing Database Setup wiki page. As suggested, I dumped the database I was using and created a script. But being the skeptic I am, I decided to create another database to experiment with as opposed to mucking up a real live database that I use for productive work.

For those of you trying this yourselves, don’t forget to change the plugin configuration to point to the new database especially if you originally configured it to point to the database you just dumped.

Since the roadmap instructs you to clean the existing database, as opposed to dropping it and recreating it, it was obvious the roadmap is starting with a clean database. So I created an empty test database and ran mvn flyway:init as suggested in the roadmap. Perfect- Flyway created the schema_version table with the default entries.

[INFO]  ------------------------------------------------------------------------
[INFO] Building MyApp :: Domain Layer 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:init (default-cli) @ myapp-domain ---
[INFO] Metadata table created: schema_version (Schema: dbo)
[INFO] Schema initialized with version: 0

I named the script I created from dumping the database schema to follow the suggested naming conventions for migrations, which I don’t like (but that’s a subject for a future blog). I also didn’t like Flyway’s default name for the directory to hold the database migrations, db/migration. I like to be a little more explicit, so I created a resources directory named database/migrations. I added the <baseDir> configuration parameter specifying the directory:

<configuration>
   <driver>com.microsoft.sqlserver.jdbc.SQLServerDriver</driver>
   <url>jdbc:sqlserver://localhost:1433;databaseName=flyway_db</url>
   <user>SA</user>
   <password>thePwd</password>
   <schemas>dbo</schemas>
   <baseDir>database/migrations</basedir>
</configuration>

I then copied the script file to this directory and gave it a shot – mvn flyway:migrate. Hmmm…. -

"Unable to find path for sql migrations: database/migrations"

Why couldn’t it find the migrations? The directory I created, src/main/resources/database/migrations wasn’t exactly named as suggested on the Flyway SQL Migrations page, but I specified it in the configuration and it was there.

Again, a struggle- I tried moving the directory, renaming it, changing the configuration to include the full path to the directory, all to no avail. Then I searched the Flyway issues and found one in which the reply was to run mvn compile flyway:migrate.

Although there is a warning at the top of the migrate goal plugin page,

Important: To make sure new migrations are taken into account, mvn compile must be invoked first!

I really didn’t pay attention to it- my bad. And I must have missed this statement on the SQL Migrations page:

New sql migrations are discovered automatically through classpath scanning at runtime.

The keyword here is runtime. It’s the runtime classpath that is being scanned, which includes the Maven output (target) directory, not the source directory. So just defining the migration file in the source directory is only half the job. You need to force Maven to copy the resources to the target directory via compile, package, etc., so they can be discovered by Flyway. As part of the build, Maven copies the contents of src/main/resources, including the database/migrations directory and its contents, to the target/class/database/migrations directory.

That did the trick. The plugin ran and updated the database using the initial load script defined in the src/main/resources/database/migrations directory. It also updated the schema_version table with the version and description of the migration file.

[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Domain Layer 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:resources (default-resources) @ myapp-domain ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 136 resources
[INFO] Copying 3 resources
[INFO] Copying 7 resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ myapp-domain ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:migrate (default-cli) @ myapp-domain ---
[INFO] Validated 0 migrations (mode: ALL) (execution time 00:00.004s)
[INFO] Current schema version: 0
[INFO] Migrating to version 1.0
[INFO] Successfully applied 1 migration (execution time 00:03.040s).

Adding another migration…

Next I created another script to add a table and some reference data. I again ran mvn compile flyway:migrate and Flyway updated the database schema and the schema_version table with the appropriate information.

[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Domain Layer 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:resources (default-resources) @ myapp-domain ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 136 resources
[INFO] Copying 3 resources
[INFO] Copying 7 resources
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ myapp-domain ---
[INFO] Nothing to compile - all classes are up to date
[INFO] 
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:migrate (default-cli) @ myapp-domain ---
[INFO] Validated 1 migration (mode: ALL) (execution time 00:00.003s)
[INFO] Current schema version: 1.0
[INFO] Migrating to version 1.1
[INFO] Successfully applied 1 migration (execution time 00:00.115s).

I then ran each of the remaining Maven Flyway plugin goals to insure the plugin was fully functional. Here’s the output from running mvn flyway:history.

[INFO]                                                                         
[INFO] ------------------------------------------------------------------------
[INFO] Building MyApp :: Domain Layer 3.15.BUILD-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO] 
[INFO] --- flyway-maven-plugin:1.6.2-SNAPSHOT:history (default-cli) @ myapp-domain ---
[INFO] +-------------+------------------------+---------------------+---------+
[INFO] | Version     | Description            | Installed on        | State   |
[INFO] +-------------+------------------------+---------------------+---------+
[INFO] | 0           | << Flyway Init >>      | 2012-06-27 15:40:44 | SUCCESS |
[INFO] | 1.0         | load-flyway test-schem | 2012-06-27 15:41:08 | SUCCESS |
[INFO] | 1.1         | add-flyway test-table  | 2012-06-27 15:41:08 | SUCCESS |
[INFO] +-------------+------------------------+---------------------+---------+

What I learned…

So, the bottom line is Flyway is a good tool for database versioning and migration. It does what it is advertised to do. And the documentation is decent. After tweaking the Maven configuration I was able to get Flyway running in a matter of hours. I didn’t count the time it took to dump the database and tweak the resulting SQL script or write the second script to add a table and reference data. And I also didn’t include the time it took me to make changes to the scripts to get them to run successfully (I’m not an SQL expert).

As a side note, I also did not include the fact that I had to modify the Flyway SQL Migration ClassPathResource class to handle UTF-8 encoded SQL scripts in a friendlier manner. And the reason I did not include that in this article is because the issue has been accepted by the Flyway support team and will be included in a near future release of Flyway.

Post Script:

Although Flyway has proven to be a good tool for managing database versioning and migrations, there is no “built-in” facility to manage multiple databases, i.e., development, QA, staging, and production. Although the “Existing Database Setup” page implies that migrations are automagically applied to all databases with this statement:

As soon as you add a new migration, it will be applied identically to all databases.

Flyway’s <url> configuration specification determines which database is being acted upon. It’s up to you to devise a methodology to “promote” migrations from one database to the next. And that’s one of the topics in the next blog in this series, “Using Maven to Integrate Flyway Migrations in an Existing Database- Part 2”.