Technology Services Group

FileNet Migrations – Best Practices for Large Migrations


January 21, 2020

TSG recently completed the bulk of the work to migrate over 4 billion documents for a large health insurance company.  It was one of our most complex migrations to date, with content split between legacy FileNet Image Services and FileNet P8.  In this post we share our FileNet best practices and lessons learned.

FileNet Migration Background

Back in 2016 we wrote a post on best practices for a FileNet migration.  At the time we were focused on migrating to Alfresco for a very old FileNet customer (on optical platters with a jukebox).  Since 2016 we have seen more interest from clients in migrating from FileNet, as many have struggled with their relationship with IBM.  That struggle was reflected in 2017, when Gartner removed IBM from its Leaders quadrant for customer service and vision reasons.  For this latest client we leveraged our NoSQL alternative to migrate to HBase/Hadoop, although Alfresco, AWS/DynamoDB, Google or Azure would all have been appropriate modern targets as well.  Components of a FileNet migration can include:

  • FileNet Image Services/COLD – In 2016, the FileNet migration included a very old OSAR jukebox and optical discs.  By 2020, most FileNet customers have moved to all-magnetic storage, removing the physical loading (and performance issues) associated with optical platters.  While the content has been moved to magnetic storage, the format and other issues with DAT files still exist.
  • FileNet P8 – P8 is a more modern platform that many Image Services clients have adopted, although many have also left Image Services in place.

Specific details for this migration of more than 4 billion documents included:

  • Over 3.5 billion documents in Image Services DAT files
  • Over 500 million documents in P8
  • Unknown number of Annotations that were not migrated to the new system
  • Conversion of FileNet COLD and TIFF formats to PDF

FileNet Image Services Migration – Best Practices

Although content is no longer stored on optical platters, FileNet Image Services still keeps metadata in a database and content in DAT files.  Each DAT file contains multiple pages and documents, consistent with how content was previously written to optical platters.  Best practices for migration include two distinct steps.

  • Metadata migration – moving all the document metadata from FileNet to the new repository.
  • Content migration – moving the documents from DAT files to the new repository.  Content migration can also involve converting FileNet TIFF and COLD to PDF.  Note that the FileNet DAT file format can differ between installations.

TSG’s past FileNet migrations relied on calling the FileNet API from the command line to retrieve images from storage. This worked well for lower volume clients and clients that utilized optical disk jukebox storage devices (OSAR) that required constant swapping of disks. For higher volume customers utilizing magnetic storage (MSAR), the FileNet API proved to be too slow to extract all of the documents in a reasonable amount of time. Because of this, TSG developed an adapter for OpenMigrate to be able to extract content directly from the storage device, bypassing the FileNet API.

FileNet stores content by mashing hundreds of files together into a single blob file known as a DAT file. OpenMigrate’s adapter is able to deconstruct the DAT file into its individual file components and match those files with the metadata stored in FileNet’s database at very high speeds.
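The idea of splitting a blob into its documents and matching them to metadata can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the real DAT layout is proprietary and varies between installations, so the `IndexEntry` rows (document id, byte offset, length) and both function names are hypothetical stand-ins, not OpenMigrate's adapter.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index row: where one document sits inside a blob."""
    doc_id: str
    offset: int
    length: int


def split_blob(blob: bytes, index: Iterable[IndexEntry]) -> Iterator[Tuple[str, bytes]]:
    """Yield (doc_id, content) for each index entry, checking bounds first."""
    for entry in index:
        end = entry.offset + entry.length
        if entry.offset < 0 or entry.length < 0 or end > len(blob):
            raise ValueError(f"entry {entry.doc_id} falls outside the blob")
        yield entry.doc_id, blob[entry.offset:end]


def join_with_metadata(
    parts: Iterable[Tuple[str, bytes]],
    metadata: Dict[str, dict],
) -> Tuple[Dict[str, Tuple[dict, bytes]], List[str]]:
    """Pair each extracted document with its metadata row.

    Returns the matched documents plus the ids that had no metadata,
    so unmatched content can be reported rather than silently dropped.
    """
    matched: Dict[str, Tuple[dict, bytes]] = {}
    orphans: List[str] = []
    for doc_id, content in parts:
        if doc_id in metadata:
            matched[doc_id] = (metadata[doc_id], content)
        else:
            orphans.append(doc_id)
    return matched, orphans
```

Keeping the orphan list separate reflects the two-step approach described above: metadata and content arrive independently, so any mismatch between them needs to be visible.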

For this large migration, TSG set up separate OpenMigrate jobs to move metadata and to move content.  Originally we proposed a rolling migration, as detailed in this post from last year.  For this large client we eventually switched to a move-everything approach, as most of the content was archival.

We posted in detail about our FileNet Image Services updates, specifically around the COLD (Computer Output to Laser Disc) approach.  Again, rather than leveraging the FileNet API, TSG built a COLD adapter that has the following benefits over traditional conversion approaches.

  • Native File Access – The COLD adapter does not require FileNet to access and convert COLD files.  Native file access covers both the scenario where FileNet is still running and contention might cause performance issues for users, and the scenario where FileNet is to be retired as part of a rolling migration.
  • Performance – By working at the file level rather than using the traditional “launch a viewer and print to PDF” approach, the TSG transformation utility can transform files much more quickly, which better supports both rolling and bulk migrations. This strategy brought migration speeds from an estimated 100 documents/hour to 100,000 documents/hour.
  • Multi-Threaded Infrastructure – Rather than requiring multiple client machines or other complicated infrastructure, the adapter can take advantage of OpenMigrate’s multi-threaded capabilities to quickly process multiple COLD files at the same time from a single machine.
  • Modern Format – FileNet’s COLD format is highly compressed. While PDF is a great universal standard for viewing documents, it also inflates the size of documents.  TSG utilized JBIG2 compression to drastically reduce the amount of document size inflation.
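The single-machine, multi-threaded pattern from the list above might look roughly like the following sketch. It uses Python's standard thread pool purely to illustrate the shape of the approach; it is not OpenMigrate's implementation, and `convert_all` and the `convert` callback are hypothetical names. The actual COLD-to-PDF conversion is deliberately left as a caller-supplied function.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def convert_all(
    paths: Iterable[str],
    convert: Callable[[str], bytes],
    workers: int = 8,
) -> Tuple[Dict[str, bytes], Dict[str, Exception]]:
    """Run `convert` over many files concurrently from one machine.

    `convert` is a placeholder for whatever turns one source file into
    PDF bytes. Failures are collected per file instead of aborting the
    whole batch, so one bad file does not stop the run.
    """
    results: Dict[str, bytes] = {}
    failures: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its input path so results can be keyed.
        futures = {pool.submit(convert, path): path for path in paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # record the failure and keep going
                failures[path] = exc
    return results, failures
```

In CPython, threads mainly help when the work is I/O-bound (reading files, writing to the target repository); a CPU-heavy conversion step would more likely call for a process pool instead.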

See a short video that shows off the COLD adapter

FileNet P8 Migration – Best Practices

We approached the P8 migration in the same way as Image Services, looking for an underlying way to convert the content without leveraging the P8 API.  While we got close to cracking the P8 code, given the timing of the project we decided to leverage the API for this component of the migration. Given that P8 is a more modern platform than Image Services, the speed at which we were able to extract content through the P8 API was adequate for the client’s needs.

Since we were leveraging the API, content and metadata were moved at the same time.  One key finding was that, in this client’s scenario, common explanation of benefits documents had been stored multiple times in the repository as duplicates.  Rather than storing every copy, we were able to take advantage of deduplication to reduce both migration time and document storage.

Specific lessons learned for deduplication included:

  • Subsets of the system took advantage of FileNet P8’s deduplication functionality, which allows the system to store only one copy of a document, even if that exact document is uploaded by multiple users. TSG took advantage of the data stored in FileNet to preserve deduplication during the migration.
  • When a deduplicated document has already been migrated, TSG can speed up the migration by skipping the calls to P8.
  • During migration batches with high deduplication rates, migration speeds increased to 100 documents/second, compared to 50 documents/second using the API.
  • Preserving deduplication reduced the number of pieces of content to migrate from over 530 million to 320 million.
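The skip-if-already-migrated logic in the lessons above can be sketched as follows. This is a minimal illustration, assuming each source record carries some key that identifies its shared content; the `content_key` field and the `fetch` and `store` callbacks are hypothetical and do not represent the P8 API or OpenMigrate.

```python
from typing import Callable, Dict, Iterable, Tuple


def migrate_with_dedup(
    records: Iterable[Tuple[str, str]],
    fetch: Callable[[str], bytes],
    store: Callable[[bytes], str],
) -> Tuple[Dict[str, str], int]:
    """Migrate (doc_id, content_key) records, fetching each content once.

    `fetch` stands in for the expensive call to the source system and
    `store` for writing to the target; both are supplied by the caller.
    Returns a map of doc_id to target location and the number of fetches.
    """
    migrated: Dict[str, str] = {}   # content_key -> location in the target
    placement: Dict[str, str] = {}  # doc_id -> location in the target
    fetches = 0
    for doc_id, content_key in records:
        if content_key not in migrated:
            # First time this content is seen: pay for the source call.
            content = fetch(content_key)
            fetches += 1
            migrated[content_key] = store(content)
        # Duplicates only record a pointer to the content already stored.
        placement[doc_id] = migrated[content_key]
    return placement, fetches
```

Every document still receives its own entry in the target, but duplicates cost only a dictionary lookup rather than a round trip to the source system, which is where the speed-up in high-duplication batches would come from.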

Overall FileNet Migration Lessons Learned

  • FileNet resource availability – FileNet, being a legacy system, will have fewer individuals over time that can support the system.  Migration from FileNet becomes increasingly difficult if resources with knowledge of the system have limited availability or have moved on.
  • Test Data – Because a FileNet installation has likely been in use for decades, it is important to have representative test data that spans the life of the installation. If test data is incomplete or out of date, test with production data whenever possible.
  • Migration planning and timing – As with any migration, understanding the data to be migrated is paramount. It is essential to use this understanding when planning migration activities, especially where FileNet usage and support are concerned. For example, if FileNet is going out of support, migration activities that use the FileNet API should be prioritized while the system is still under support.
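One way to get test data that spans the life of an installation is to sample a fixed number of documents from every year rather than sampling at random across the whole repository. The sketch below shows that idea only; `sample_by_year` and the (document id, year) input shape are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def sample_by_year(
    docs: Iterable[Tuple[str, int]],
    per_year: int,
    seed: int = 0,
) -> Dict[int, List[str]]:
    """Pick up to `per_year` document ids from each year present.

    `docs` is an iterable of (doc_id, year) pairs. A fixed seed keeps the
    sample repeatable between test runs.
    """
    by_year: Dict[int, List[str]] = defaultdict(list)
    for doc_id, year in docs:
        by_year[year].append(doc_id)

    rng = random.Random(seed)
    return {
        year: rng.sample(ids, min(per_year, len(ids)))
        for year, ids in sorted(by_year.items())
    }
```

Sampling per year guarantees that older, low-volume periods are represented, which a purely random sample over billions of recent documents would tend to miss.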

Summary

Legacy ECM systems like FileNet typically have vast amounts of data spanning many years, which can make them difficult to move. Delaying migrating from these aging systems puts core operating systems at increasing risk over time. Migrating off of a legacy ECM solution like FileNet to a modern repository that will scale to accommodate millions or billions of documents needs to be prioritized.

TSG leveraged both the FileNet API and native file access, as well as file conversion to an open standard (PDF) within OpenMigrate, to extract documents from a legacy ECM into a faster, modern solution.

Let us know your thoughts below.

Filed Under: FileNet, HBase, Migrations, OpenMigrate

Copyright © 2023 · Technology Services Group, Inc.

This website uses cookies to improve your experience. Please accept this site's cookies, but you can opt-out if you wish. Privacy Policy ACCEPT | Cookie settings
Privacy & Cookies Policy

Privacy Overview

This website uses cookies to improve your experience while you navigate through the website. Out of these cookies, the cookies that are categorized as necessary are stored on your browser as they are essential for the working of basic functionalities of the website. We also use third-party cookies that help us analyze and understand how you use this website. These cookies will be stored in your browser only with your consent. You also have the option to opt-out of these cookies. But opting out of some of these cookies may have an effect on your browsing experience.
Necessary
Always Enabled
Necessary cookies are absolutely essential for the website to function properly. This category only includes cookies that ensures basic functionalities and security features of the website. These cookies do not store any personal information.
Non-necessary
Any cookies that may not be particularly necessary for the website to function and is used specifically to collect user personal data via analytics, ads, other embedded contents are termed as non-necessary cookies. It is mandatory to procure user consent prior to running these cookies on your website.
SAVE & ACCEPT