Whether it is a trend, a coincidence, or the fading of IBM’s influence in content management, several clients have recently asked us to migrate FileNet images and documents to Alfresco as part of a long-term plan to remove FileNet/IBM solutions from their infrastructure. This post shares our experience and thoughts around migration with OpenMigrate.
FileNet was founded in 1982 with the goal of commercializing optical disk technology, and it shipped the first image management system in 1985. In the 1980s, the hardware needed for image processing included what were then state-of-the-art image processing boards in PCs, proprietary servers, scanners, and optical disk technologies. In 1992 FileNet moved away from proprietary hardware to focus on software, but it still relied on a core component called an “optical disk jukebox” capable of holding multiple laser disks.
With the purchase of Saros in the mid-90s, the Panagon release in 1998, and the P8 release in 2003, FileNet evolved its solution away from hardware components toward a software model that leverages the improved storage and display technologies available as commodities today. Panagon brought the beginning of document management (rather than just image management), and P8 extended that capability. FileNet was purchased by IBM in 2006.
FileNet Migration to Alfresco
For our client, the history of FileNet is particularly relevant. See our initial thoughts on the FileNet migration in a previous post. For various reasons, previous consolidations had been delayed, leaving the client with multiple repositories, including:
- FileNet Image Services, complete with OSAR jukeboxes and optical disks. The client had approximately 1 million documents stored and was continuing to add new ones. Images were stored as single-page TIFF documents and HTML files.
- FileNet Document Services/Panagon – 100,000 digital content files (Microsoft Word, PDF, etc…)
- FileNet P8 – 2.75 million documents stored including multiple page TIFF and HTML files.
In migrating to Alfresco (in the Amazon Cloud), the client wanted to consolidate all of the FileNet instances into one Alfresco instance.
FileNet Image Services Migration to Alfresco
The most difficult aspect of migrating from FileNet Image Services was the Optical Storage and Retrieval (OSAR) jukebox. Beyond the jukebox itself, our client’s environment came with several complications and requirements:
- The collection had grown past the 200 platters that the jukebox could hold to more than 2,000, so optical disks were stored in the jukebox but also had to be loaded manually.
- Each library had four drives, but only two drives per library were working.
- Care had to be taken to migrate documents in platter order to minimize disk swapping.
- Content needed to be converted from single-page TIFF to multi-page PDF. PDF/A was also available as a configuration option for archiving.
- The migration had to be balanced against ongoing production storage and retrieval.
- Metadata, stored in the FileNet database, had to be migrated as well.
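To illustrate the platter-order requirement above, here is a minimal Python sketch of how page records might be grouped by platter and split into jukebox loads. It is not OpenMigrate code; the `(platter_id, doc_id, page_no)` record shape, the `plan_batches` name, and the `platters_per_load` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def plan_batches(pages, platters_per_load=100):
    """Group page records by platter and split the platters into jukebox loads.

    `pages` is an iterable of (platter_id, doc_id, page_no) tuples. These
    field names are hypothetical and are not the FileNet IS schema.
    """
    by_platter = defaultdict(list)
    for platter_id, doc_id, page_no in pages:
        by_platter[platter_id].append((doc_id, page_no))

    ordered = sorted(by_platter)  # each platter is visited exactly once
    batches = []
    for start in range(0, len(ordered), platters_per_load):
        load = ordered[start:start + platters_per_load]
        # Within a platter, sort by document and page so the single-page
        # TIFFs of one document are read together, in page order.
        batches.append([(p, sorted(by_platter[p])) for p in load])
    return batches
```

Each returned batch is one jukebox load. Reading a batch from start to finish touches every platter once, and the pages of a document arrive in the order needed to assemble a multi-page PDF.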
Working with the client, we were able to migrate the bulk of content from FileNet to a file system in preparation for the move to Alfresco. We dedicated one of the three jukeboxes to the migration, loading it with 100 platters at a time and running the migration against that batch, which left the other two jukeboxes available for production needs. Some issues that arose included:
- Periodic failures during FileNet IS Migration – whether the cause was FileNet itself or something else, we leveraged OpenMigrate to automatically retry failed documents. This was critical, as the FileNet system was often unreliable during our OpenMigrate runs.
- Command line requirement – unlike the later FileNet instances, FileNet IS did not have a clean API to access images from the OSAR jukebox. We built an IS source connector for OpenMigrate that put a wrapper around that command line interface.
- Multi-Threading – while the source (jukebox) needed to be single-threaded (one platter at a time), the target Alfresco instance could accept multiple threads, which sped up the eventual storing of content in Alfresco.
- Additional Metadata – much of the metadata for the target Alfresco attributes was stored in a variety of other systems. OpenMigrate had to access those systems so that the metadata could be stored with the associated images.
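The retry and threading points above can be sketched in a few lines of Python. This is only an illustration of the pattern, not how OpenMigrate implements it: the `fetch` and `store` callables stand in for a source connector and an Alfresco target, and every name and parameter here is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def with_retry(fetch, doc_id, attempts=3, delay_s=0.0):
    """Call fetch(doc_id), retrying on any exception up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch(doc_id)
        except Exception:
            if attempt == attempts:
                raise  # out of retries: let the caller record the failure
            time.sleep(delay_s)


def migrate(doc_ids, fetch, store, writers=4, attempts=3):
    """Read documents one at a time and store them on a pool of threads.

    Returns (stored_ids, failed_ids).
    """
    failed = []
    futures = []
    with ThreadPoolExecutor(max_workers=writers) as pool:
        for doc_id in doc_ids:  # source side stays strictly sequential
            try:
                content = with_retry(fetch, doc_id, attempts)
            except Exception:
                failed.append(doc_id)  # note the failure, keep the run going
                continue
            futures.append((doc_id, pool.submit(store, doc_id, content)))
    # Leaving the `with` block waits for every store call to finish.
    stored = []
    for doc_id, future in futures:
        if future.exception() is None:
            stored.append(doc_id)
        else:
            failed.append(doc_id)
    return stored, failed
```

The reading loop never runs in parallel, which matches a source that can only serve one platter at a time, while the writes overlap on the target side. A real run would also want to cap how many fetched documents wait in memory and to add a delay between retries.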
FileNet Document Services
FileNet Document Services (DS) migration was much easier than Image Services. Factors that simplified it included:
- The DS database contained direct pointers to file locations (on magnetic rather than optical disk).
- File content was not single-page TIFF, so documents could be migrated without any conversion.
OpenMigrate for DS could utilize the existing database and metadata to load content from the File System. By leveraging OpenMigrate’s ability to connect to any database, we did not require a custom connector for DS.
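As a rough illustration of that database-driven approach, the Python sketch below reads rows from a table and loads each file from the path stored in the row. SQLite stands in for the real DS database, and the `documents` table and its `doc_id`, `title`, and `file_path` columns are hypothetical names chosen for the example, not the actual DS schema.

```python
import sqlite3
from pathlib import Path


def iter_documents(conn, root):
    """Yield (metadata, content_bytes) for each row of a hypothetical
    `documents` table whose `file_path` column points at the stored file."""
    query = "SELECT doc_id, title, file_path FROM documents ORDER BY doc_id"
    for doc_id, title, file_path in conn.execute(query):
        path = Path(root) / file_path  # file_path is relative to `root`
        metadata = {"doc_id": doc_id, "title": title}
        yield metadata, path.read_bytes()
```

Because the database row already says where the file lives, the loop needs no repository-specific API: swapping the connection and query for another database would leave the rest unchanged.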
FileNet P8
The P8 migration was a combination of the DS and IS migrations in that:
- P8, like IS, stored TIFF images (multi-page in P8’s case) that were converted to PDF or PDF/A.
- File locations were not easily determined from the data in P8, so we needed to call a native Java API to extract content.
Other similarities included the multi-threaded Alfresco target options and the extraction of metadata from external Oracle databases.
All three instances of FileNet had their own quirks. Combined with metadata stored in different databases, this required an extensive understanding of the existing environments, as well as attribute mapping to match the destination repository.
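To make the attribute-mapping step concrete, here is a small Python sketch that merges metadata from several source systems and renames the fields to target property names. The system names, source fields, and the `acme:` properties are invented for the example; only the general idea of a per-source mapping table comes from the migration described above.

```python
def map_attributes(source_records, mapping):
    """Merge records from several source systems into target properties.

    `source_records` maps a system name to that system's metadata dict.
    `mapping` maps (system, source_field) to a target property name.
    Returns (target_properties, missing_target_names).
    """
    target = {}
    for (system, field), prop in mapping.items():
        value = source_records.get(system, {}).get(field)
        if value is not None:
            target[prop] = value
    # Report unmapped targets instead of silently skipping them.
    missing = sorted(set(mapping.values()) - set(target))
    return target, missing


# Hypothetical mapping: two source systems feeding one target document.
MAPPING = {
    ("filenet", "doc_number"): "acme:legacyId",
    ("filenet", "title"): "cm:title",
    ("oracle", "customer_name"): "acme:customerName",
}
```

Returning the list of missing properties alongside the mapped ones makes gaps visible per document, which helps when metadata for one document is spread across several databases.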
If you have any thoughts or questions, please share with us below.