We do a lot of Drupal 8 migrations here at Aten. From older versions of Drupal and WordPress, to custom SQL Server databases, to XML and JSON export files: it feels like we’ve imported content from just about every data source imaginable. Fortunately for us, the migration system in Drupal 8 is extremely powerful. It’s also complicated. Here’s a quick-start guide for your next migration to Drupal 8.
First, a caveat: we rarely perform simple one-to-one upgrades of existing websites. If that’s all you need, skip this article and check out this handbook on Drupal.org instead: Upgrading from Drupal 6 or 7 to Drupal 8.
It’s Worth the Steep Learning Curve
Depending on what you’re trying to do, using the migrate system might seem more difficult than necessary. You might be considering the Feeds module, or writing something custom. My advice is virtually always the same: learn the migrate system and use it anyway. Whether you’re importing hundreds of thousands of nodes and dozens of content types or just pulling in a collection of blog posts, migrate provides powerful features that will save you a bunch of time in the long run. Often in the short run, for that matter.
Use the Drupal.org Migrate API Handbooks
There’s a ton of great information on Drupal.org in the Migrate API Handbooks. Be prepared to reference them often – especially the source, process, and destination plugin handbooks.
Basic Steps
Here’s a much simplified overview of the high-level steps you’ll use to set up your custom Drupal 8 migration:
All Migrations
- Enable the migrate module (duh).
- Install Migrate Tools to enable Drush migration commands.
- Install Migrate Plus as well. It provides a bunch of extensions, examples and plugins for migrations. I’d just assume you need it.
- Create a custom module for your migration.
- Use YAML configuration files to map fields from the appropriate source to the destination, specifying process plugins for any necessary transformations. The configuration files should live in “my_migration_module/config/install/”.
(Pro tip: you’ll probably do a lot of uninstalling and reinstalling your module to update the configuration as you build out your migrations. Use “enforced dependencies” so your YAML configurations are automatically removed from the system when your module is uninstalled, allowing them to be recreated, without conflicts, when you re-enable the module.)
Enforced dependencies in your YAML file will look something like this:
    dependencies:
      enforced:
        module:
          - my_migration_module
See this issue on Drupal.org for more details on enforced dependencies, or refer to the Configuration Management Handbooks.
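To put the steps above together, here’s a minimal sketch of what one of those configuration files might look like. The file name follows the Migrate Plus convention (migrate_plus.migration.<id>.yml). The migration ID, label, source plugin, and source property names (example_articles, your_plugin, headline, article_text) are placeholders I made up for illustration, so swap in whatever matches your own source.

```yaml
# my_migration_module/config/install/migrate_plus.migration.example_articles.yml
# Hypothetical example: the ID, plugin name, and source properties are placeholders.
id: example_articles
label: 'Example articles'
source:
  # Replace with the source plugin that matches your data (SQL, CSV, JSON, ...).
  plugin: your_plugin
process:
  # Copy source properties straight into destination fields.
  title: headline
  body/value: article_text
  # Give every imported node the same content type.
  type:
    plugin: default_value
    default_value: article
destination:
  plugin: 'entity:node'
# Remove this configuration when the module is uninstalled.
dependencies:
  enforced:
    module:
      - my_migration_module
```

With the module enabled and Migrate Tools installed, a migration like this would be run with something like `drush migrate-import example_articles`.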
Drupal-to-Drupal Migrations
- If you’re running a Drupal-to-Drupal migration, run the “migrate-upgrade” Drush command with the “--configure-only” flag to generate stub YAML configurations. Refer to this handbook for details: Upgrade Using Drush.
- Copy the generated YAML files for each desired migration into your custom module’s config/install directory, renaming them appropriately and editing as necessary. As stated above, add enforced dependencies to your YAML files to make sure they are removed if your module is uninstalled.
Process Plugins
Process plugins are responsible for transforming source data into the appropriate format for destination fields. From correctly parsing images from text blobs, to importing content behind HTTP authentication, to merging sources into a single value, to all kinds of other transformations: process plugins are incredibly powerful. Further, you can chain process plugins together, making endless possibilities for manipulating data during migration. Process plugins are one of the most important elements of Drupal 8 migrations.
Here are a few process plugin resources:
- Migrate Process Plugins overview and resources from Drupal.org
- List of Core Migrate Process Plugins quick reference on Drupal.org
- Writing a Process Plugin guide to creating your own process plugin on Drupal.org (Pro tip: do a Google search first; the thing you’re trying to create likely already exists.)
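As a rough sketch of the basic mechanics, here’s what a process section can look like when it mixes a straight copy, the explicit get plugin, and a short chain. The plugins used (get and callback) ship with Drupal core; the source property names (headline, subhead, teaser) and destination fields are hypothetical.

```yaml
process:
  # Shorthand: copy the source property straight into the destination field.
  title: headline
  # The same kind of copy, written out with core's get plugin.
  field_subtitle:
    plugin: get
    source: subhead
  # A chain: each plugin receives the output of the one before it.
  field_summary:
    -
      # Strip HTML tags from the source value.
      plugin: callback
      callable: strip_tags
      source: teaser
    -
      # Trim leading and trailing whitespace from the stripped text.
      plugin: callback
      callable: trim
```

Only the first plugin in a chain needs a source; the rest work on whatever value is handed down the pipeline.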
Continuously Migrate Directly from a Pantheon-Hosted Database
Most of our projects are hosted on Pantheon. Storing credentials for the source production database (for example, a D7 website) in our destination website (D8) code base – in settings.php or any other file – is not secure. Don’t do that. Usually, the preferred alternative is to manually download a copy of the production database and then migrate from that. There are plenty of times, though, where we want to perform continuous, automated migrations from a production source database. Often, complex migrations require weeks or months to complete. Running daily, incremental migrations is really valuable. For those cases, use the Terminus secrets plugin to safely store source database credentials. Here’s a great how-to from Pantheon: Running Drupal 8 Data Migrations on Pantheon Through Drush.
A Few More Things I Wish I’d Known
Here are a few more things I wish I had known about back when I first started helping clients migrate to Drupal 8:
Text with inline images can be migrated without manually copying image directories.
It’s very common to migrate from sources that have inline images. I found a really handy process plugin that helped with this. In my case, I needed to first do a string replace to make image paths absolute. Once that was done, I ran it through the inline_images plugin. This plugin will copy the images over during the migration.
    body/value:
      -
        plugin: str_replace
        source: article_text
        search: /assets/images/
        replace: 'https://www.example.com/assets/images/'
      -
        plugin: inline_images
        base: 'public://inline-images'
Process plugins can be chained.
Process plugins can be chained together to accomplish some pretty crazy stuff. Sometimes I felt like I was programming in YAML. This example shows how to create taxonomy terms on the fly. The static_map plugin maps old values to new ones. In this case, a value that doesn’t match gets the default of null, and skip_on_empty then skips it. Finally, the entity_generate plugin creates the new taxonomy term.
    field_webinar_track:
      -
        plugin: static_map
        source: webinar_track
        map:
          old_tag_1: 'New Tag One'
          old_tag_2: 'New Tag One'
        default_value: null
      -
        plugin: skip_on_empty
        method: process
      -
        plugin: entity_generate
        bundle_key: vid
        bundle: webinar_track
Dates can be migrated without losing your mind.
Dates can be challenging. Drupal core has the format_date plugin, which lets you specify the format you are migrating from and the format you are migrating to. You can optionally specify the source and destination time zones as well. In this example, we were migrating to a date range field. Date range is a single field with two values representing the start and end time. As you can see below, we target each value with a ‘/’-delimited path: field_date/value and field_date/end_value.
    field_date/value:
      plugin: format_date
      from_timezone: America/Los_Angeles
      from_format: 'Y-m-d H:i:s'
      to_format: 'Y-m-d\TH:i:s'
      source: start_date
    field_date/end_value:
      plugin: format_date
      from_timezone: America/Los_Angeles
      from_format: 'Y-m-d H:i:s'
      to_format: 'Y-m-d\TH:i:s'
      source: end_date
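The example above only sets the source time zone. If you also want to convert the value during import, format_date accepts a to_timezone key too. Here’s a small sketch, assuming the source stores Pacific time and you want the stored value in UTC; whether you need the conversion depends on how your source data and destination field handle time zones.

```yaml
field_date/value:
  plugin: format_date
  source: start_date
  from_format: 'Y-m-d H:i:s'
  to_format: 'Y-m-d\TH:i:s'
  # Interpret the incoming value as Pacific time...
  from_timezone: America/Los_Angeles
  # ...and convert it to UTC before it is saved (an assumption for this sketch).
  to_timezone: UTC
```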
Files behind HTTP auth can be copied too.
One migration required copying PDF files as the migration ran. The download plugin allows passing in Guzzle options for handling things like basic auth. This allowed the files to be copied from an HTTP-authenticated directory without needing to have the files on the local file system first.
    plugin: download
    source:
      - '@_remote_filename'
      - '@_destination_filename'
    file_exists: replace
    guzzle_options:
      auth:
        - username
        - password
Constants & temporary fields can keep things organized.
Constants are essentially variables you can use elsewhere in your YAML file. In this example, base_path and file_destination needed to be defined. Temporary fields were also used to create the exact paths needed to get the correct remote filename and destination filename. My examples use an underscore to prefix the temporary field, but that isn’t required.
    source:
      plugin: your_plugin
      constants:
        base_path: 'https://www.somedomain.com/members/pdf/'
        file_destination: 'private://newsletters/'

    _remote_filename:
      plugin: concat
      source:
        - constants/base_path
        - filename
    _destination_filename:
      plugin: concat
      source:
        - constants/file_destination
        - filename

    plugin: download
    source:
      - '@_remote_filename'
      - '@_destination_filename'
    file_exists: replace
    guzzle_options:
      auth:
        - username
        - password
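The fragments above live in different parts of the migration file, so here’s a sketch of how they might fit together. I’m assuming a file migration where the download plugin feeds the uri property of a file entity. That destination, and the uri mapping, are my assumptions for illustration; adjust them to wherever your downloaded file actually needs to go.

```yaml
source:
  plugin: your_plugin
  constants:
    base_path: 'https://www.somedomain.com/members/pdf/'
    file_destination: 'private://newsletters/'
process:
  # Temporary fields: build the full remote URL and the local destination path.
  _remote_filename:
    plugin: concat
    source:
      - constants/base_path
      - filename
  _destination_filename:
    plugin: concat
    source:
      - constants/file_destination
      - filename
  # Assumption: the downloaded file's path is stored as the file entity's uri.
  uri:
    plugin: download
    source:
      - '@_remote_filename'
      - '@_destination_filename'
    file_exists: replace
    guzzle_options:
      auth:
        - username
        - password
destination:
  # Assumption: files are imported as file entities.
  plugin: 'entity:file'
```

The ‘@’ prefix is what tells migrate to read a value from an earlier process result rather than from the source row, which is why the temporary fields have to be defined before the download step.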
This list of tips and tricks on Drupal Migrate just scratches the surface of what’s possible. Drupalize.me has some good free and paid content on the subject. Also, check out the Migrate API overview on Drupal.org.
Further Reading
Like I said earlier, we spend a lot of time on migrations. Here are a few more articles from the Aten blog about various aspects of running Drupal 8 migrations. Happy reading!