Major version upgrades in Drupal are hard. Drupal's open-source development process puts everything on the table with every major release, including fundamental APIs and features. This allows Drupal to evolve as a platform: fields moved into core in Drupal 7, and Drupal 8 will adopt many components from Symfony2. But it also means that every major version upgrade requires significant changes to your data structure.
Part of the challenge is that Drupal installations are never just core, but a large constellation of contributed modules with their own database tables. In other words, on any site more complicated than a basic install, data is handled not just by Drupal core but also by a handful of other modules, each with varying levels of complexity and maintainability.
The typical upgrade procedure (for now) on Drupal.org is to use Drupal's built-in update mechanism, update.php. While update.php does a great job of keeping your database schema up to date across minor point releases (e.g. D7.10 to D7.12), it is severely limited in handling a major upgrade of anything beyond a stock Drupal installation. Update.php, for example, can upgrade a contributed module's data only if the maintainer of that module has provided a new version and has written the requisite upgrade code. Established and well-maintained modules are usually good at this, but even the largest modules can go away. As the Drupal ecosystem evolves, better solutions to common use cases emerge.
For example, the D7 References module, which includes the heavily used Node and User Reference fields, will likely be deprecated in Drupal 8 in favor of Entity Reference. This means that any site that uses User and Node reference fields will need to be upgraded to Entity Reference fields after D7. This kind of data shift between two different modules is only one of the challenges of major upgrades. Update.php is also incapable of jumping Drupal versions. To upgrade from D5 to D7 using update.php, you'd have to first run D6's update.php, then D7's.
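As a rough illustration of that kind of data shift, here is a small sketch in Python (not Drupal's PHP, and not code from either module). It assumes that References stores node and user reference values under `nid` and `uid` keys and that Entity Reference stores a `target_id`; the function name, the `LEGACY_KEYS` table, and the sample values are hypothetical. A real upgrade would also have to convert the field definitions and settings, which this sketch ignores.

```python
# Assumed mapping from a legacy value key to the entity type it points at.
LEGACY_KEYS = {"nid": "node", "uid": "user"}


def to_entity_reference(legacy_key, legacy_items):
    """Convert legacy reference items into (target_type, entity reference items).

    legacy_key   -- "nid" for node references, "uid" for user references
    legacy_items -- list of dicts such as [{"nid": 12}, {"nid": None}]
    """
    if legacy_key not in LEGACY_KEYS:
        raise ValueError("unknown legacy reference key: %r" % legacy_key)

    target_type = LEGACY_KEYS[legacy_key]
    # Drop empty references and normalize ids to integers.
    items = [
        {"target_id": int(item[legacy_key])}
        for item in legacy_items
        if item.get(legacy_key) is not None
    ]
    return target_type, items


if __name__ == "__main__":
    old_values = [{"nid": "12"}, {"nid": None}, {"nid": 40}]
    print(to_entity_reference("nid", old_values))
```

The point of the sketch is only that the same information survives the move while its storage shape changes, which is exactly the step update.php cannot perform unless someone writes it.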
With all this data shifting, it's also common to have leftover database tables. These tables can bloat your database, hurting performance and wasting storage.
The solution is the Migrate module. Migrate uses an object-oriented process to take data from a source (a database, CSV, JSON, etc.) and insert it into a Drupal database. Each migration is essentially a subclass of Migrate's base Migration class that defines which fields to import and any other logic that is needed. At the moment, Migrate requires users to write these PHP classes themselves, since Migrate was originally targeted at importing data from non-Drupal sources. But a new project called migrate_d2d by mikeryan, lead maintainer of Migrate, aims to use Migrate for major version Drupal upgrades.
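To make the pattern concrete, here is a minimal, language-neutral sketch written in Python rather than Drupal's PHP. It is not Migrate's actual API: the class names (`Migration`, `ArticleMigration`), the `field_mappings` attribute, the `prepare_row` hook, and the sample CSV are all hypothetical, chosen only to show how a subclass can declare which source fields feed which destination fields while a base class runs the import.

```python
import csv
import io


class Migration:
    """Hypothetical base class: reads source rows and writes mapped records."""

    # Subclasses declare {destination_field: source_field}.
    field_mappings = {}

    def __init__(self, source_rows):
        self.source_rows = source_rows
        self.destination = []  # stands in for the new site's database
        self.id_map = {}       # source id -> destination id, tracks imported rows

    def prepare_row(self, row):
        """Hook for extra per-row logic; return False to skip the row."""
        return True

    def run(self):
        for row in self.source_rows:
            if row["id"] in self.id_map:
                continue  # already imported on an earlier run
            if not self.prepare_row(row):
                continue
            record = {dest: row[src] for dest, src in self.field_mappings.items()}
            self.destination.append(record)
            self.id_map[row["id"]] = len(self.destination)
        return self.destination


class ArticleMigration(Migration):
    """Imports only the fields the new site needs, under their new names."""

    field_mappings = {"title": "headline", "body": "text"}

    def prepare_row(self, row):
        # Skip rows that were never published on the old site.
        return row["published"] == "1"


LEGACY_CSV = """id,headline,text,published
1,Hello,First post,1
2,Draft,Not ready,0
3,News,Second post,1
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(csv_text)))


if __name__ == "__main__":
    migration = ArticleMigration(load_rows(LEGACY_CSV))
    for record in migration.run():
        print(record)
```

Because the destination starts empty and only mapped fields are written, nothing the new site does not ask for comes along, and the id map lets a second run skip rows that were already imported.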
At DrupalCon Denver 2012, mikeryan mentioned that he is looking at replacing update.php with migrate_d2d for all upgrades to D8. This is awesome. Migrate_d2d effectively takes a clean install of Drupal and brings in all your old data. Only the data you need is imported, so no old, unused tables come along. And since Migrate can take data from one module and use another module's hooks to import it, moving data from one module to another becomes possible. This process also opens up the ability to jump versions, so a supported upgrade path from D6 to D8, for example, could become a reality. If migrate_d2d matures enough to become the default major upgrade mechanism, it will help alleviate many of the upgrade frustrations in Drupal. This is particularly important for organizations that can't afford to rebuild their site after every major version release. With the added flexibility of migrate_d2d, the Migrate module will further secure Drupal as the CMS of choice for organizations large and small.
For more on Migrate, check out this DrupalCon Denver presentation by drewish, and if you wait till the Q&A, mikeryan talks a little about migrate_d2d.