Back in 2013, Documentum announced the Enterprise Migration Appliance while pushing to get clients to upgrade to 7.x. Since renamed the Documentum Enterprise Migration Appliance (DEMA), the tool counts a focus on speed among its selling points. Architected for the “fastest migration possible”, DEMA was constructed to bypass the Documentum API and go directly to the underlying database. This post discusses how much raw speed really matters and compares the approaches of different migration tools.
Migration Speed – Why does it matter?
The logic for a fast migration typically follows the approach of:
- I have to migrate 1,000,000 documents.
- I would like to complete the migration as fast as possible.
- If tool A takes 1 second per document and tool B takes 0.5 seconds per document in my test environment, tool B will be twice as fast (roughly 277 hours versus 138 hours).
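The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration of the calculation, assuming a single-threaded run at a constant per-document rate; the function name `migration_hours` is made up for this example.

```python
def migration_hours(documents: int, seconds_per_document: float) -> float:
    """Idealized single-threaded run time in hours at a constant per-document rate."""
    return documents * seconds_per_document / 3600


tool_a = migration_hours(1_000_000, 1.0)
tool_b = migration_hours(1_000_000, 0.5)
print(f"Tool A: {tool_a:.1f} hours, Tool B: {tool_b:.1f} hours")
```

The exact values are a little under 278 and 139 hours; the figures quoted in this post are rounded down.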
While the scenario above makes sense, it misses a couple of key factors:
- Test Environment – rarely is a test environment set up the same way as a production environment.
- Multi-Threading – Migrations do not have to be single threaded. For the majority of tools, running multiple threads reduces the migration time roughly in proportion to the number of threads, up to the limit of the available system resources.
- System Downtime – the above scenario assumes that the migration will be run sequentially and the system will be down for the entire time the migration is taking place.
The focus on speed should really be a focus on reducing the system downtime. The remainder of this post will talk about multi-threading to improve the speed of migration as well as a Delta Migration approach to reduce downtime.
Multi-Threading to speed migration
Multi-threading allows the migration program to run multiple instances of itself and take advantage of parallel processing to reduce the migration time and increase throughput. In the example above, if I run 10 instances of the migration tool, the time is reduced by roughly a factor of 10, system resources permitting. In our example, if Tool A can be multi-threaded and Tool B cannot, the 277 hours would be reduced to 27.7 hours for Tool A versus 138 hours for Tool B. Some key points to keep in mind when considering multi-threading:
- Eventually system resources will reduce the efficiency of multi-threading. For example, 100 threads are not 10 times better than 10 threads, and are sometimes worse. Typically in our Documentum or Alfresco migrations, we see the target (or write) repository as the limiting factor. Clients will work to tune the number of threads so as not to overload the write repository.
- Thread communication – to be truly multi-threaded, the migration tool has to make sure that the work is divided correctly among the threads to avoid conflicts.
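As a rough illustration of the two points above, the sketch below splits a list of document IDs into disjoint batches, one per worker, so that no two threads ever handle the same document, and exposes the worker count as a tunable setting. It is a minimal sketch using Python's standard thread pool, not how any particular migration tool is implemented; `migrate_document` is a hypothetical placeholder for the real read-from-source and write-to-target step.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(doc_ids, workers):
    """Split doc_ids into `workers` disjoint slices so no document is handled twice."""
    return [doc_ids[i::workers] for i in range(workers)]


def migrate_document(doc_id):
    """Hypothetical placeholder: a real tool would copy the document here."""
    return doc_id


def migrate_batch(batch):
    """Migrate one worker's share of the documents sequentially."""
    return [migrate_document(doc_id) for doc_id in batch]


def run_migration(doc_ids, workers=10):
    """Run one batch per worker thread and collect the migrated IDs."""
    batches = partition(doc_ids, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(migrate_batch, batches))
    return [doc_id for batch in results for doc_id in batch]


migrated = run_migration(list(range(1000)), workers=10)
print(len(migrated), "documents migrated")
```

Because `workers` is a single parameter, it can be raised or lowered between test runs to find the point at which the target repository becomes the bottleneck.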
Delta Migration to reduce system downtime
We have discussed Delta migrations before and continue to recommend that clients consider a Delta migration over a Big Bang approach. With a Delta migration, as much content as possible is migrated while the old system is still up. On cut-over weekend, the delta (new or changed) documents are migrated over before the new system is turned on. A typical example:
- I need to migrate 1,000,000 documents.
- The migration program begins migrating documents created before X/X/X to the new environment while the old environment is still active.
- On cut-over weekend, the migration program moves every document created or modified after X/X/X to the new repository.
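The selection logic in this example can be sketched as follows. This is only an illustration under stated assumptions: the `Document` record, its field names, and the sample cutoff date are hypothetical, and a document modified exactly at the cutoff is treated as part of the delta.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Document:
    doc_id: str
    created: datetime
    modified: datetime


def bulk_set(documents, cutoff):
    """Documents created before the cutoff: migrated while the old system is still live."""
    return [d for d in documents if d.created < cutoff]


def delta_set(documents, cutoff):
    """Documents created or modified at or after the cutoff: migrated on cut-over weekend."""
    return [d for d in documents if d.created >= cutoff or d.modified >= cutoff]


cutoff = datetime(2020, 1, 1)  # hypothetical stand-in for the X/X/X date
documents = [
    Document("a", datetime(2019, 5, 1), datetime(2019, 5, 1)),  # old and unchanged
    Document("b", datetime(2019, 6, 1), datetime(2020, 2, 1)),  # old, edited after cutoff
    Document("c", datetime(2020, 3, 1), datetime(2020, 3, 1)),  # created after cutoff
]

print([d.doc_id for d in bulk_set(documents, cutoff)])
print([d.doc_id for d in delta_set(documents, cutoff)])
```

Note that document "b" lands in both sets: it is moved during the bulk phase and then moved again during cut-over so that the later edit is carried over.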
As we mentioned back in 2013, the Documentum Enterprise Migration Appliance works at the database level and, as of the date of this posting, cannot execute a delta migration. To migrate 1,000,000 documents, DEMA requires the system to be down for the entire time the migration is running and offers limited ability to test the results while the migration is under way. With a Delta approach, the system only needs to be down while the last “delta” content is being moved.
A Delta approach reduces risk compared with the traditional Big Bang approach in two main ways:
- The majority of documents can be moved while the existing repository is still active, which shrinks the downtime window.
- The migration to the new environment can be proven by testing against a large volume of already-migrated documents.
Summary – Speed of Tool not as important as reducing system downtime.
While the speed of the migration tool is important, the ability to add threads and execute delta migrations has the greatest impact on reducing system downtime. Clients should not stop at a vendor's claim that its tool is the “fastest” and should instead consider the overall migration process.
For additional information on migration, see our recent OpenMigrate webinar recording with Alfresco, which demonstrates Big Bang, Delta, Gradual and Rolling Migrations.