
Technology Services Group

Federated Search and Content Services – Is a Publishing Approach better?


October 23, 2018

Over the last couple of years, some ECM vendors have been touting a Federated Search model to “cure” the issue of access to document content contained in legacy ECM systems. Whether it comes from an ECM vendor like Nuxeo or a supporting software vendor like Simflofy, the marketing message of getting access to multiple repositories from one search hearkens back to the earlier promises of Enterprise Search vendors like Autonomy, promises that were never really fulfilled. This post discusses the problem Federated Search attempts to solve and presents some of the reasons we typically recommend alternative solutions, including a publishing approach.

Federated Search – Easier than migration?

Whether tied to digital transformation or new ECM development efforts, federated search usually arrives as a solution when content is contained in a legacy ECM system that isn’t going to be migrated or replaced by the new development efforts. As we have talked about here before, migration isn’t easy. If the new system just needs access to the content and there is no desire to change what the legacy system is doing, compelling reasons for not migrating include the difficulty of:

  • Finding a reason to move everybody and everything
  • Moving legacy users
  • Migrating legacy content
  • Moving legacy integrations
  • Accessing legacy resources

To better visualize the issue, imagine a typical Accounts Payable scenario. The team that manages invoice payment and owns the capturing of invoices would like their payment analysts to have access to the signed contracts contained in the legacy contract system. The invoice payment team is on Alfresco and the contracts are contained within the legal group’s iManage system. By connecting the Alfresco system to the iManage system, Federated Search would give the invoice payment team the ability to show both the invoices and the contracts within one interface.

Federated Search – Is it really that simple?

The marketing pitch for Federated Search typically boils down to “why move when you can just access the legacy content”, but is it really that simple? As pointed out in an excellent article from Accenture, “The primary advantage of this approach is ease of implementation because no additional indexing of content is necessary. The query federation system simply taps into existing systems and extracts results, which are then merged…” but cons include:

  • Performance issues can occur if the federator waits for the slowest remote search engine to respond
  • The merging of search results into a sensible hit list is difficult if based on relevancy, as each search engine called will score relevancy in a different way.
  • Search engines provide varying levels of query sophistication. Federation at query time usually implies a “dumbing down” to suit the least capable search engine.
  • Document-level security is a potential cause of performance issues, but this depends on the complexity of the security environment
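
The first two cons in the list above can be illustrated with a small sketch. Everything in it is hypothetical: the latency figures, the document IDs, and the relevance scores are made up, and no real Alfresco or iManage API is called. It only shows that a federator waiting on every engine is bound by the slowest one, and that sorting hits by raw score misbehaves when each engine scores on its own scale.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str   # repository that returned the hit
    doc_id: str
    score: float  # relevance on that engine's own scale


def federated_latency(latencies_ms: dict[str, float]) -> float:
    """A federator that waits for every engine is as slow as the slowest one."""
    return max(latencies_ms.values())


def merge_by_raw_score(result_lists: list[list[Hit]]) -> list[Hit]:
    """Naive merge: sort all hits by raw score, ignoring that the scales differ."""
    merged = [hit for hits in result_lists for hit in hits]
    return sorted(merged, key=lambda hit: hit.score, reverse=True)


# Hypothetical response times and scores, for illustration only.
latencies = {"alfresco": 120.0, "imanage": 900.0}
alfresco_hits = [
    Hit("alfresco", "invoice-1", 0.92),  # this engine scores between 0 and 1
    Hit("alfresco", "invoice-2", 0.40),
]
imanage_hits = [
    Hit("imanage", "contract-1", 14.0),  # this engine uses an unbounded scale
    Hit("imanage", "contract-2", 3.5),
]

slowest = federated_latency(latencies)
merged = merge_by_raw_score([alfresco_hits, imanage_hits])
merged_ids = [hit.doc_id for hit in merged]

print(slowest)     # 900.0
print(merged_ids)  # ['contract-1', 'contract-2', 'invoice-1', 'invoice-2']
```

In this made-up data every contract outranks every invoice purely because of the scoring scale, not because the contracts are better matches.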

In addition to the points raised above, we have seen our clients that have attempted federated search struggle with other issues including:

  • Security Logistics – In our AP example, legal would have some concerns about allowing access to their system, particularly DRAFT contracts. Making sure that invoice payment only has access to certain documents would require updates to iManage, something legal would not necessarily want to do and support.
  • System Logistics – Legal might be concerned about the load the new access will place on their legacy system as federated searches are not always the best performing.
  • Licensing Logistics – Users would require license access to both systems. In our AP example, all invoice payment analysts would need both Alfresco and iManage licenses.
  • System Fault Tolerance – Relying on both systems being available increases the risk that, if one becomes unavailable or struggles with performance for any reason, the end user experience will suffer. Adding more repositories increases this risk.
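
The fault tolerance point can be made concrete with some simple arithmetic. The sketch below assumes, purely for illustration, that each repository is available 99% of the time and that outages are independent; neither figure comes from a real system. Under those assumptions, the chance that every repository is up at once is the product of the individual availabilities.

```python
import math


def combined_availability(availabilities):
    """Probability that all systems are up at once, assuming independent outages."""
    return math.prod(availabilities)


# Hypothetical 99% availability per repository.
two_systems = combined_availability([0.99, 0.99])
three_systems = combined_availability([0.99, 0.99, 0.99])

print(f"{two_systems:.4f}")    # 0.9801
print(f"{three_systems:.4f}")  # 0.9703
```

Each added repository multiplies in another factor below 1, so the federated search as a whole is never more available than its least available member.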

Data Warehousing Lessons Learned – A Publishing Approach

When it comes to federated search or enterprise search, TSG sees parallels in the data warehouse approach. With a data warehouse, clients wanted access to data contained in other systems but did not want to replace those systems. Rather than federating queries, the data warehouse focuses on publishing content from the legacy system to the data warehouse. With the cost of storage continuing to fall, TSG has been recommending a publishing approach for documents. As we recommended back in 2015 when Enterprise Search was being discussed, TSG will typically recommend a publishing approach rather than a crawler or federated search.

In this publishing approach, a job is set up to monitor the business system, looking for documents of a given type that have reached a stage at which they can be pushed to the separate repository. With this push, the new repository will have all the meta-data as well as a copy of the document itself. Typically we see clients publish just a PDF of the document since it is only to be used for read access. The publishing job might also push a light version of security in the form of meta-data if required.
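
A minimal sketch of such a publishing job follows, using in-memory stand-ins for both repositories. All names here (`SourceDoc`, `publish_eligible`, `render_pdf`, the `SIGNED` status, the group names) are hypothetical and chosen for the AP example; this is not OpenMigrate's API, and the PDF step is only a placeholder for a real rendition service.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SourceDoc:
    doc_id: str
    doc_type: str
    status: str
    metadata: dict[str, str]
    content: bytes
    allowed_groups: list[str] = field(default_factory=list)


def render_pdf(content: bytes) -> bytes:
    """Placeholder for a real PDF rendition step (assumption, not a real converter)."""
    return b"%PDF-placeholder\n" + content


def publish_eligible(source_docs, target, doc_type="contract", ready_status="SIGNED"):
    """Copy documents of the wanted type and stage into the target repository.

    `target` is a plain dict standing in for the read-only search repository.
    Returns the IDs published on this run.
    """
    published = []
    for doc in source_docs:
        if doc.doc_type != doc_type or doc.status != ready_status:
            continue  # wrong type, or not yet at a publishable stage (e.g. a draft)
        if doc.doc_id in target:
            continue  # already published on an earlier run
        target[doc.doc_id] = {
            "metadata": dict(doc.metadata),
            "allowed_groups": list(doc.allowed_groups),  # light security as meta-data
            "pdf": render_pdf(doc.content),
        }
        published.append(doc.doc_id)
    return published


legal_repo = [
    SourceDoc("c-1", "contract", "SIGNED", {"vendor": "Acme"}, b"signed text", ["ap-analysts"]),
    SourceDoc("c-2", "contract", "DRAFT", {"vendor": "Acme"}, b"draft text"),
    SourceDoc("m-1", "memo", "SIGNED", {"topic": "internal"}, b"memo text"),
]
invoice_repo = {}

first_run = publish_eligible(legal_repo, invoice_repo)
second_run = publish_eligible(legal_repo, invoice_repo)

print(first_run)   # ['c-1']
print(second_run)  # []
```

Only the signed contract reaches the target; the draft and the memo never leave the legal system, and a second run publishes nothing new.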

In this manner, the legal department can ensure that access to their own system is still controlled, and documents that need to be shared can be pushed to the invoice payment system as required. Advantages of this approach over a Federated approach include:

  • Integration – Rather than having to write real-time integration to the departmental repository, the integration would only be required in the publishing job. The Search Interface could be written for just the new repository (Alfresco) and take advantage of all the capabilities of that repository.
  • Performance – Search performance is not limited by the system with the slowest response time.
  • Content Format – As part of the publishing job, content could be transformed (typically to PDF) and could also include additional items (headers, footers, etc.) to provide consistency between systems.
  • Administration – Each user would only need to be defined and maintained in the overall search repository rather than in the departmental system.

TSG has implemented the publishing approach for multiple clients with OpenMigrate. Key features include:

  • Ability to pull from a wide variety of ECM repositories including Documentum, FileNet, Alfresco and SharePoint, as well as database-driven systems (for example, custom Oracle/SAP).
  • Ability to “poll” a repository and push content on a set interval (for example, every 5 minutes or once a day).
  • Ability to transform content from a variety of formats into PDF.
  • Ability to store and index into a variety of repositories including Alfresco, Documentum as well as Lucene/Solr and Hadoop.
  • Ability to delete outdated or superseded documents from target repository.
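
Two of the capabilities above, polling on an interval and removing superseded documents from the target, can be sketched generically as follows. This is not OpenMigrate's API or configuration; `remove_stale` and `poll` are hypothetical helpers, the repositories are plain Python collections, and the sleep function is injected so the example runs without actually waiting.

```python
from __future__ import annotations

import time
from typing import Callable


def remove_stale(current_source_ids: set[str], target: dict[str, bytes]) -> list[str]:
    """Delete target documents that are no longer current in the source."""
    stale = sorted(set(target) - current_source_ids)
    for doc_id in stale:
        del target[doc_id]
    return stale


def poll(job: Callable[[], None], interval_seconds: float, runs: int,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Run `job` a fixed number of times, waiting `interval_seconds` between runs."""
    for run in range(runs):
        job()
        if run < runs - 1:
            sleep(interval_seconds)


# Hypothetical state: "c-1" was superseded in the source after it was published.
target = {"c-1": b"old", "c-2": b"current"}
source_ids = {"c-2", "c-3"}
removed_log = []
waits = []


def job() -> None:
    removed_log.append(remove_stale(source_ids, target))


# A 5-minute interval, with the waits recorded instead of slept.
poll(job, interval_seconds=300, runs=3, sleep=waits.append)

print(removed_log)     # [['c-1'], [], []]
print(waits)           # [300, 300]
print(sorted(target))  # ['c-2']
```

A production job would more likely run under a scheduler than a fixed-count loop; the bounded loop here just keeps the example finite.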

Summary

Federated Search, like Enterprise Search before it, has plenty of marketing appeal but also has some real downsides.

Quoting Alan Pelz-Sharpe from Deep Analysis:

“Federated search has been around a long while, but in my experience it's never been easy to implement and in many cases simply not worth the effort.”

Similar to Data Warehouse efforts, TSG typically recommends a publishing approach based on licensing, fault tolerance and overall user acceptance.

Filed Under: Alfresco, Documentum, FileNet, Search
