
ECM 2.0 – One-Step vs. Two-Step Migrations


November 19, 2019

As clients prepare for 2020, TSG has seen an uptick in requests for OpenMigrate support for migrating from legacy repositories (FileNet, ImagePlus, Mobius, CMOD, etc.) to a modern ECM 2.0 repository.  TSG typically recommends a “one-step” migration, where OpenMigrate both retrieves documents from the legacy repository and stores the documents and metadata in the new repository. This approach provides many advantages over a “two-step” approach, where documents and metadata are first dumped to a file system to be uploaded later.  This post discusses the benefits of the one-step approach for ensuring a smooth transition to a modern ECM 2.0 repository.

Migration Overview – Migration Infrastructure

Clients often try to use existing export processes from the legacy system together with bulk import tools from the new repository vendor, with content and metadata placed on a file system between the export and import activities. Concerns with this approach include:

  • Limitations of file systems
  • Issue resolution responsibility between the two different processes
  • Issues with the “dump and load” approach versus delta migrations
  • The impact of multiple file stores and retrieval on migration speed and accuracy

Rather than planning for just a one-time migration, TSG, based on its work with clients on migrations and ongoing ingestion, recommends looking for a migration infrastructure like OpenMigrate that can own responsibility for both the extract and the import. Good migration infrastructure tools should also include:

  • Ability to repeat the process for ongoing migration needs
  • Ability to apply business logic throughout the migration process
  • Ability to configure many different migrations quickly
  • Ability to quickly address and retry documents/data that failed to migrate correctly
  • Ability to repeat the process for different data sources
  • Ability to provide accurate counts of documents migrated/failed for final decommissioning reports of the legacy system
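As a hedged illustration of the “apply business logic” and “configure quickly” points, a mapping-driven transform might look like the following sketch (the field names and the `apply_mappings` helper are hypothetical, not OpenMigrate configuration):

```python
def apply_mappings(doc, mappings):
    """Hypothetical in-flight mapping step: rename source attributes and run
    business-logic transforms as documents move, instead of patching a dump."""
    migrated = {}
    for source_field, (target_field, transform) in mappings.items():
        if source_field in doc:
            migrated[target_field] = transform(doc[source_field])
    return migrated

# One mapping configuration per migration; standing up a new migration
# for a different data source only swaps this dictionary.
claim_mappings = {
    "DOC_TITLE": ("title", str.strip),
    "DOC_TYPE": ("documentType", str.lower),
}
```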

While OpenMigrate can play a role in both one-step and two-step migrations, there are substantial advantages to the one-step approach. The remainder of this post focuses on how a one-step migration differs from a two-step migration for the considerations above.

Limitations of File Systems

With a two-step approach, a bulk download tool exports batches of files to a file system, where all metadata about the documents is stored either in the file names or in a separate data file format such as CSV. From those batches, OpenMigrate can read the files and load the documents and metadata into the new system. With an OpenMigrate one-step approach, the metadata, versions, renditions, and lifecycle values are read and mapped directly from the legacy repository to the new repository without any file system hand-off. Specific issues with using the file system as a stopping point between export and import include:

  • Bulk download tools have limited ability to pull in external data from other systems and can only export data from the legacy repository, limiting the ability to transform data or store it in a new format.
  • Storing every attribute in the correct place in the file system, or encoding it correctly in the document name, can be very difficult. File system limitations on naming and special characters can make the export problematic. One-step migrations do not have to worry about an intermediate file format, as documents are stored directly in the new ECM 2.0 repository.
  • Versions, renditions, lifecycles, and custom attributes also need a place to be stored so that OpenMigrate can populate these values correctly in the new target repository. CSV and other data formats can become very complex and inflexible when unexpected data issues arise. One-step migrations do not require this complicated version/lifecycle metadata mapping, as the detail is stored directly in the ECM 2.0 repository.
  • Disk space needs to be procured for the dump itself. For large repository migrations, procuring this temporary space can be expensive and difficult to manage, since deleting documents from an export must be coordinated with import success. One-step migrations do not require large amounts of temporary space.
  • Migration speeds are slower with a two-step approach, as documents must be both written to and read from the file system. While OpenMigrate provides robust multi-threading and new high-speed ingestion for certain repositories, the legacy repository itself can be a limiting factor, and waiting for files to be written further slows the migration.
  • Performance issues often arise when using export utilities to dump large amounts of content and metadata from legacy systems. Export tools are typically designed for smaller volumes and often can’t handle the large batch sizes required for bulk migrations.
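The special-character problem above can be seen in a tiny sketch: any sanitizer that makes repository titles legal as file names is necessarily lossy (the `safe_filename` helper is hypothetical, illustrating the limitation rather than any particular tool):

```python
import re

def safe_filename(title, max_len=255):
    """Hypothetical sanitizer: replace characters most file systems reject.
    Distinct titles can collide after sanitizing, so metadata is lost."""
    return re.sub(r'[<>:"/\\|?*]', "_", title)[:max_len]
```

Because two different titles can map to the same sanitized name, a dump that encodes metadata in file names cannot be reliably round-tripped; a one-step migration never has to encode metadata this way.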

Issue Resolution Responsibility during Two-Step Migrations

One major concern with a two-step approach is problem solving and responsibility during a migration run.  Regardless of sample testing, large migrations will often encounter unexpected document and metadata issues and anomalies due to the size and age of the legacy repository.  In a two-step approach, a document export can fail without the import job ever knowing, since the exported files simply don’t exist and export jobs typically don’t do a thorough job of reporting exceptions.  Responsibility for correcting issues, particularly if the export job itself fails, can be problematic, as fixes might require a code change in either the dump process or the import process, delaying the migration.

Clients also struggle to coordinate document counts between the export and import activities. The export tool can often have trouble reporting the counts of documents included in each export, and if errors occur during exports, reconciling those counts against the separate reports from the import process is problematic.
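Reconciling two tools’ reports amounts to a set comparison of document IDs. Even this simplified, hypothetical `reconcile` sketch shows the weakness: discrepancies between separate export and import processes can only be detected after the fact, not prevented:

```python
def reconcile(exported_ids, imported_ids):
    """Hypothetical post-hoc reconciliation between separate export and
    import reports: find documents one side saw and the other did not."""
    exported, imported = set(exported_ids), set(imported_ids)
    return {
        "missing_from_import": sorted(exported - imported),
        "unexpected_in_import": sorted(imported - exported),
    }
```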

When run in one-step mode, OpenMigrate provides a complete error log in which all failed documents are recorded throughout the extraction and import process. OpenMigrate supports re-running the migration for only those documents in the error log.  In this manner, issues can be quickly addressed by taking any or all of the following actions:

  • Making a change to the document/metadata in the source system.
  • Modifying the OpenMigrate mappings to correct the data issue for the failed documents.
  • Re-running a small job to migrate only the failed documents again, allowing the bulk of the other documents to continue migrating.
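The error-log-driven retry described above can be sketched as follows (the `migrate` function and log format are hypothetical, illustrating the pattern rather than OpenMigrate’s actual implementation):

```python
def migrate(docs, load, retry_ids=None):
    """Hypothetical migration pass: attempt every document (or only the IDs
    from a previous run's error log), record failures, and report counts."""
    migrated, errors = 0, []
    for doc in docs:
        if retry_ids is not None and doc["id"] not in retry_ids:
            continue
        try:
            load(doc)
            migrated += 1
        except Exception as exc:
            errors.append({"id": doc["id"], "error": str(exc)})
    return {"migrated": migrated, "errors": errors}
```

A first pass migrates everything and yields the error log; after the source data or mappings are fixed, a second pass re-runs only the failed IDs, and the final counts feed the decommissioning report.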

Summary

Migrating from legacy repositories to new ECM 2.0 repositories can be a daunting task. Leveraging legacy export tools to dump content can seem like a simple way to begin, but issues with file mapping, file space, issue resolution and responsibility, and the complexity of the migration itself can make a dump-and-load approach more complex than it appears.

TSG recommends a one-step migration where content is moved directly from the legacy repository to the new repository. Advantages of this approach include:

  • Faster migrations by not relying on a file download and metadata mapping.
  • No need for temporary file space for extracted documents and metadata.
  • Simplicity by not having to manage versions, renditions, and other document relationships in a dumped file format.
  • Better-documented migrations by having one tool, approach, and responsibility for both document and metadata extraction and storage.
  • Improved issue resolution by having one tool (and one team) responsible for extraction and storage.
  • Faster performance by going directly to the source system via the underlying DB and moving the content in one shot rather than through a separate export/import process.

Let us know your thoughts below.

Filed Under: Alfresco, CMOD, Documentum, DynamoDB, ECM Landscape, HBase, Migrations, Open Text


