One of the biggest challenges in rolling out new interfaces for ECM is migration from older/legacy systems. We just started working with one financial services client that, in moving toward an Alfresco and HPI 2.1 solution, has initiated a hybrid approach to allow the interface to be introduced quickly while simplifying and reducing the risk of a large migration and big bang rollout. This post will detail the approach for reference and discussion.
Big Bang Migration – What are the issues for a large, legacy content repository?
Upon moving to a new system, our existing client identified a number of issues that would have to be addressed including:
- Large volume of content (10 million items) and storage (multiple terabytes)
- The majority of content is historical, but it is difficult to predict which documents will become “active”
- Existing systems that create content are tied to the legacy system
- Existing documents are stored as single-page TIFF files that would need to be converted to PDF
The initial thinking was that this massive migration effort would need to be completed before users could start using the new system. Sizing discussions pointed to a large Alfresco instance, a large migration leveraging delta migrations to gradually move all the content, and a “big bang” cutover in which the legacy system would be turned off and users would start on the new interface all at once.
Hybrid Approach and Rolling Migration
In brainstorming about the approach, we started asking, “how can we release the interface in a shorter time frame without the effort and risk of a large migration?” The legacy system, like many older systems, is a very stable custom database with a file store. Many migration efforts target old systems that are off support, or rest on a business case tied to reduced maintenance; in this case, there is no need to retire the legacy system immediately. Our approach focused on “what if we just display the old content in the new interface and allow new documents to be stored in Alfresco?”
In working through the approach, we came up with the hybrid approach and process for gradually bringing folders from the old system to the new system as presented below:
- User wants to access a folder
- If the folder doesn’t exist in Alfresco – create the folder
- Access the legacy system, transform old documents into PDF
- Store documents in Alfresco
- User accesses documents from Alfresco and creates new ones in Alfresco only
The process is slightly different if the folder already exists.
- User wants to access a folder
- If the folder exists, check the legacy system to determine whether any new documents need to be moved
- If documents need to be moved, access the legacy system and transform the new documents into PDF
- Store documents in Alfresco
- User accesses documents and creates new ones in Alfresco only
With the approach above, the user might notice a slight delay the first time they enter a folder. Subsequent access should be much quicker, as only the changed content is copied to the existing folder rather than the whole folder.
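The two flows above can be folded into one routine that runs whenever a folder is opened. The following is a minimal sketch only: `LegacyDoc`, `TargetFolder`, `convert_to_pdf` and `sync_folder` are hypothetical names, plain dictionaries stand in for the legacy store and for Alfresco, and the conversion is a placeholder rather than a real TIFF-to-PDF step.

```python
from dataclasses import dataclass, field


@dataclass
class LegacyDoc:
    doc_id: str
    tiff_pages: list  # single-page TIFF payloads (placeholder bytes here)


@dataclass
class TargetFolder:
    # Maps a legacy doc_id to the converted PDF already copied over.
    migrated: dict = field(default_factory=dict)


def convert_to_pdf(doc):
    # Placeholder for the real TIFF-to-PDF conversion (assumption).
    return b"%PDF-placeholder:" + doc.doc_id.encode()


def sync_folder(folder_id, legacy_store, target_repo):
    """Bring one folder up to date on access; return the ids copied this time."""
    # Create the folder in the new repository if it does not exist yet.
    folder = target_repo.setdefault(folder_id, TargetFolder())
    copied = []
    # Copy only the legacy documents that have not been moved before.
    for doc in legacy_store.get(folder_id, []):
        if doc.doc_id not in folder.migrated:
            folder.migrated[doc.doc_id] = convert_to_pdf(doc)
            copied.append(doc.doc_id)
    return copied


# First access copies everything; a later access copies only what is new.
legacy = {"F1": [LegacyDoc("a", [b"p1"]), LegacyDoc("b", [b"p1", b"p2"])]}
repo = {}
first_access = sync_folder("F1", legacy, repo)
legacy["F1"].append(LegacyDoc("c", [b"p1"]))
second_access = sync_folder("F1", legacy, repo)
```

Treating "folder missing" and "folder present" as the same code path keeps the first access and the later delta checks consistent: the only difference is how many documents are still outstanding.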
Hybrid/Rolling Migration – Performance Options
With the rolling approach, there are a number of options to tweak performance if required:
- Caching Folders – our client’s business process involves an automated step that places a document in a historical folder to kick off activity within that folder. Our approach allows this step to initiate the folder copy to Alfresco using the above process, so the folder is already cached and the user does not have to wait.
- Multi-Threaded Image Conversion – One of the concerns, particularly with large files, was the performance of converting the single-page TIFF documents to multi-page PDF. We are discussing ways to run these conversions in multiple threads, similar to OpenMigrate, to improve performance.
- User Interface Background Options – We could allow the user to access the folder and gradually have the converted images appear as a background activity rather than have the user wait for all documents to be converted before the interface appears.
During our initial POC activities, we will be testing the different performance options.
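To make the multi-threaded and background options more concrete, here is a rough sketch using Python's standard `concurrent.futures` module. It is an illustration under assumptions, not our implementation: `convert_document` and `convert_as_completed` are hypothetical names, and the conversion itself is a placeholder. Results are yielded as each conversion finishes, which is the behavior the background option would need so that documents can appear in the interface one by one.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def convert_document(doc_id, tiff_pages):
    # Placeholder conversion (assumption): a real implementation would merge
    # the single-page TIFFs into one multi-page PDF.
    return doc_id, b"%PDF-placeholder:" + b"|".join(tiff_pages)


def convert_as_completed(documents, max_workers=4):
    """Yield (doc_id, pdf_bytes) pairs as soon as each conversion finishes.

    `documents` maps a document id to its list of single-page TIFF payloads.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(convert_document, doc_id, pages)
            for doc_id, pages in documents.items()
        ]
        for future in as_completed(futures):
            # result() re-raises any conversion error in the calling thread.
            yield future.result()


docs = {"a": [b"1", b"2"], "b": [b"3"]}
converted = dict(convert_as_completed(docs))
```

Completion order is not guaranteed, so a caller should key on the document id rather than on position. Whether threads or separate processes give the better speed-up depends on the conversion library used, which is one of the things a POC would need to measure.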
Hybrid/Rolling Migration – Benefits for Final Migration
The Hybrid/Rolling Migration provides a number of benefits as we prepare for eventual retirement of the legacy image system.
- Gradual build-up of Users – The approach allows the client to gradually release the interface to users based on types of folders. The approach does not require all users to start using the new interface immediately.
- Gradual build-up of Alfresco size and infrastructure – The initial deployment will be a smaller Alfresco and server infrastructure that can be expanded as content and users move to the new system.
- Gradual movement of automatic feeds – As the use of Alfresco expands, the team can gradually start moving the automated feeds from the legacy system so that they feed Alfresco directly, reducing the copy effort for new content.
- Gradual Final Migration – Once the automatic feeds are complete and all users are using the new system, processes could be kicked off leveraging the same folder copy infrastructure to finish copying all the remaining folders. The Alfresco system does not need to be down or off-line during these copies.
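As a rough sketch of how the final migration could reuse the on-access copy, the routine below walks the legacy folder list and copies whatever has not been moved yet, with an optional cap per run so the work can be spread over several scheduled runs. All names here (`sweep_remaining`, `copy_folder`, `limit`) are hypothetical, and `copy_folder` stands in for whatever folder copy routine the on-access path already uses.

```python
def sweep_remaining(legacy_folder_ids, migrated, copy_folder, limit=None):
    """Copy folders not yet migrated, up to `limit` per run; return ids copied.

    `migrated` is a set of folder ids already in the new system; it is
    updated in place so a later run picks up where this one stopped.
    """
    copied = []
    for folder_id in legacy_folder_ids:
        if folder_id in migrated:
            continue  # already moved by an earlier access or run
        if limit is not None and len(copied) >= limit:
            break  # leave the rest for the next run
        copy_folder(folder_id)  # same routine as the on-access copy (assumed)
        migrated.add(folder_id)
        copied.append(folder_id)
    return copied


# Example: F2 was already copied when a user opened it.
all_folders = ["F1", "F2", "F3", "F4"]
already_migrated = {"F2"}
calls = []
first_run = sweep_remaining(all_folders, already_migrated, calls.append, limit=2)
second_run = sweep_remaining(all_folders, already_migrated, calls.append)
```

Because each run only touches folders that are still outstanding, the sweep can be stopped and resumed without tracking anything beyond the set of migrated folders.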
Summary
A Hybrid/Rolling Migration is an option worth considering for ECM clients that want to avoid a “big-bang” migration effort. Benefits include:
- Quick introduction of a new and improved interface
- Gradual migration of older content rather than a big-bang approach
- Gradual conversion of automatic feeds from old system to new system
- Gradual build-up of users and infrastructure for the new approach while the legacy system is gradually retired
Let us know your thoughts below.
Interesting article – some out-of-the-box thinking to deliver an innovative solution in a short time frame (which is increasingly becoming the most important KPI for technology projects).
We faced a similar situation in a project, but our challenge was that the customer wanted to completely overhaul their folder tree and security model while the bulk of their legacy content was stored in Centera.
The quick-win for us was to
– migrate all the legacy content to the new repository without changing the folder tree, while standardising the security model and aligning it to the new security model (to avoid duplicate administration effort)
– if required, users could find and copy legacy content to their new working folders
– Centera content did not have to be physically migrated; the new repository pointed to the legacy Centera file store and new content was archived to a new Centera file store