One of the hardest parts of migrating off FileNet affects clients that have relied on the FileNet COLD format to store computer-generated documents from internal systems. Built in the 1990s, when conserving network bandwidth and storage was paramount, the format is unlike any other, poorly documented online, and difficult to convert into non-proprietary, standard formats like PDF. This post presents how TSG cracked the format to build our own FileNet COLD to PDF conversion utility as part of our OpenMigrate product.
FileNet COLD background
COLD stands for “Computer Output to Laser Disc” and was leading-edge technology when the cost of spinning magnetic disk made it prohibitive for storing documents or images. Systems like FileNet evolved to store all documents on optical platters, complete with jukeboxes that moved platters to drives when images were requested. When a jukebox filled up, the software provided ways to manage platters offline: when an image on an offline platter was requested, the user would wait until an operator fed the platter into the jukebox.
The COLD process involved two components: a routine called by the client’s output system to create and store the document in the COLD format, and a viewer that could retrieve and display it. The FileNet COLD format includes:
- Outer Header
  - Compression and document size Information
- Content Header – Print Information
  - Background Image (template) Id
    - The template/background for the document, as the COLD file itself only includes the text that’s actually different from document to document.
    - Finding template ID: https://www-01.ibm.com/support/docview.wss?uid=swg21957983
  - Display Resolution, scale & orientation
  - Font Information
- Content
  - Text Data
Most of the high-level information about the format could be found here: ftp://public.dhe.ibm.com/software/data/cm/filenet/docs/isdoc/412x/COLD.pdf
As the document linked above shows, however, specific details of the COLD format were harder to come by, particularly for decompression and the Background Image Id, and most methods required tools from IBM.
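To make the layered layout above more concrete, here is a minimal Java sketch of reading a fixed-size outer header before deciding whether the content needs decompression. The byte layout used here (a 4-byte marker, a 1-byte compression flag and two 4-byte lengths), along with the names `OuterHeaderSketch`, `OuterHeader` and `MAGIC`, is entirely hypothetical and chosen for illustration. It is not the real FileNet COLD layout, which this post does not spell out.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class OuterHeaderSketch {
    // Hypothetical layout for illustration only, NOT the real COLD format:
    //   bytes 0-3   marker
    //   byte  4     compression flag (0 = stored as-is, non-zero = compressed)
    //   bytes 5-8   stored content length
    //   bytes 9-12  original (decompressed) length
    static final int MAGIC = 0x434F4C44; // ASCII "COLD", an invented marker
    static final int HEADER_SIZE = 13;

    /** Values read from the assumed outer header. */
    static final class OuterHeader {
        final boolean compressed;
        final int storedLength;
        final int originalLength;

        OuterHeader(boolean compressed, int storedLength, int originalLength) {
            this.compressed = compressed;
            this.storedLength = storedLength;
            this.originalLength = originalLength;
        }
    }

    /** Reads and sanity-checks the header at the start of the file bytes. */
    static OuterHeader parse(byte[] file) {
        if (file.length < HEADER_SIZE) {
            throw new IllegalArgumentException("file is shorter than the header");
        }
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.BIG_ENDIAN);
        if (buf.getInt() != MAGIC) {
            throw new IllegalArgumentException("unexpected marker");
        }
        boolean compressed = buf.get() != 0;
        int storedLength = buf.getInt();
        int originalLength = buf.getInt();
        if (storedLength < 0 || originalLength < 0
                || storedLength > file.length - HEADER_SIZE) {
            throw new IllegalArgumentException("lengths do not fit the file");
        }
        return new OuterHeader(compressed, storedLength, originalLength);
    }

    public static void main(String[] args) {
        // Build a sample file: header followed by 5 bytes of (dummy) content.
        byte[] sample = ByteBuffer.allocate(HEADER_SIZE + 5)
                .putInt(MAGIC).put((byte) 1).putInt(5).putInt(40)
                .put(new byte[5]).array();
        OuterHeader h = parse(sample);
        System.out.println(h.compressed + " " + h.storedLength + " " + h.originalLength);
    }
}
```

Validating the lengths against the file size before touching the content is the main design point: with a poorly documented binary format, failing early on a malformed header is safer than producing a corrupt PDF later.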
Decrypting the FileNet COLD Format
There are multiple ways to turn the COLD format into PDF. Standard approaches include:
- FileNet Viewer – Daeja Viewer Automation – One method, similar to other transformations to PDF (such as Microsoft Office), involves launching the FileNet or Daeja viewer in an automated fashion and initiating the print-to-PDF function. This method works well for small volumes, but it requires supporting infrastructure (multiple machines, automated scripts) and typically fails for large system migrations with millions of documents to convert.
- FileNet API – Some automation can be done with the FileNet API, but this also requires the infrastructure and scripts to convert the files one by one.
- Daeja Viewer API – Similar to the FileNet API, but can be done using only the viewing application itself. It requires additional scripts to push files into the viewer, but allows the use of a more lightweight application’s API.
The first two methods rely on access to the FileNet API and the old COLD retrieval systems. TSG has found a couple of issues with these approaches, including:
- Performance of the FileNet API/Infrastructure – The old FileNet system, built for a previous era, typically converts documents slowly. As we usually recommend a rolling migration for large FileNet systems, the slow performance will not be acceptable in the majority of cases.
- Access to FileNet during migrations – Another downside is that the approach requires access to FileNet during the migration. This is unacceptable for clients that want to retire FileNet early, and for parallel migrations where content is migrated while FileNet is still in use and contention could slow down production users.
FileNet COLD – Developing a better way
As part of the OpenMigrate product set, TSG has developed a Java-based adapter for FileNet COLD transformation. The COLD adapter takes FileNet COLD documents and converts them to PDF. Components of the utility include:
- Parser for the outer COLD header
- Decompression of the COLD content as needed
- Decoding of the inner Content Header for display parameters
- Retrieval of the background image based on header info
- Conversion of the COLD file’s page(s) to a single PDF, using each decompressed page’s data, display settings and background image
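The way these components hand off to one another can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the OpenMigrate adapter's code: DEFLATE (via `java.util.zip`) stands in for the COLD compression scheme, which this post does not describe; the text is assumed to be ASCII; and `ColdPipelineSketch`, `RenderJob`, the template map and the template id are hypothetical names. The final PDF-writing step is left out, since it would need a PDF library.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class ColdPipelineSketch {
    /** Everything a PDF writer would need for one document (hypothetical shape). */
    static final class RenderJob {
        final String text;
        final byte[] background;

        RenderJob(String text, byte[] background) {
            this.text = text;
            this.background = background;
        }
    }

    /** Inflates the payload only when the outer header marks it as compressed. */
    static byte[] decompressIfNeeded(byte[] payload, boolean compressed, int originalLength) {
        if (!compressed) {
            return payload;
        }
        Inflater inflater = new Inflater(); // DEFLATE is a stand-in, not the COLD scheme
        try {
            inflater.setInput(payload);
            byte[] out = new byte[originalLength];
            int n = 0;
            while (n < out.length && !inflater.finished()) {
                int read = inflater.inflate(out, n, out.length - n);
                if (read == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // ran out of input before reaching the expected length
                }
                n += read;
            }
            if (n != originalLength) {
                throw new IllegalStateException("expected " + originalLength + " bytes, got " + n);
            }
            return out;
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed content", e);
        } finally {
            inflater.end();
        }
    }

    /** Pairs the decoded text with its background template, ready for a PDF writer. */
    static RenderJob prepare(byte[] payload, boolean compressed, int originalLength,
                             String templateId, Map<String, byte[]> templates) {
        byte[] content = decompressIfNeeded(payload, compressed, originalLength);
        byte[] background = templates.get(templateId);
        if (background == null) {
            throw new IllegalStateException("no template with id " + templateId);
        }
        return new RenderJob(new String(content, StandardCharsets.US_ASCII), background);
    }

    /** Demo helper only: produces a DEFLATE payload to feed the sketch. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "ACCOUNT 1001  BALANCE 42.00".getBytes(StandardCharsets.US_ASCII);
        Map<String, byte[]> templates = Map.of("TPL-1", new byte[] {1, 2, 3});
        RenderJob job = prepare(deflate(text), true, text.length, "TPL-1", templates);
        System.out.println(job.text + " on a " + job.background.length + "-byte template");
    }
}
```

Keeping the stages as small functions that pass plain values makes each one testable against sample files on its own, which matters when the format details have to be worked out by trial.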
By converting the document with just a Java component, the COLD adapter has the following benefits over traditional conversion approaches:
- Native File Access – The COLD adapter does not require FileNet to access and convert COLD files. Native file access covers both scenarios: where FileNet is still running and contention might degrade user performance, and where FileNet is to be retired as part of a rolling migration.
- Performance – Compared to the traditional “launch a viewer and print to PDF” approach, the TSG transformation utility works at the file level and can transform files much more quickly, better supporting rolling and bulk migrations.
- Multi-Threaded Infrastructure – Rather than requiring multiple client machines or other complicated infrastructure, the adapter can take advantage of OpenMigrate’s multi-threaded capabilities to quickly process multiple COLD files at the same time from a single machine.
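As a rough illustration of processing several COLD files at once on one machine, the sketch below fans conversions out over a fixed-size thread pool using the standard `java.util.concurrent` classes. It does not show OpenMigrate's actual threading model; `ParallelConvertSketch`, `convertAll` and the converter function passed in are hypothetical stand-ins for whatever performs a single COLD to PDF conversion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ParallelConvertSketch {
    /** Runs the converter over every input on a fixed-size pool; results keep input order. */
    static <I, O> List<O> convertAll(List<I> inputs, Function<I, O> converter, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<O>> futures = new ArrayList<>();
            for (I input : inputs) {
                Callable<O> task = () -> converter.apply(input);
                futures.add(pool.submit(task));
            }
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get()); // blocks until that conversion finishes
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while converting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("conversion failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // The lambda is a placeholder for a real single-file conversion.
        List<String> converted = convertAll(
                List.of("invoice1.cold", "invoice2.cold", "invoice3.cold"),
                name -> name.replace(".cold", ".pdf"),
                2);
        System.out.println(converted);
    }
}
```

Because each file is converted independently, a fixed pool sized to the machine's cores is usually enough, and no FileNet server sits in the path to become a bottleneck.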
Summary
One of the most difficult things about migrating off FileNet is the proprietary FileNet COLD format used to store computer-generated documents. Traditional approaches that rely on the FileNet API or FileNet viewers work for small volumes of documents but fail in large migrations or rolling migration scenarios. TSG has developed a native file adapter as part of our OpenMigrate product set that provides better performance as well as superior support for a variety of formats.
Contact us to learn more and let us know your thoughts below.