
Technology Services Group

Documentum Search – Why the Google appliance just doesn’t cut it

October 30, 2013

Many experienced Documentum customers have attempted to leverage the Google appliance as an alternative to Documentum Search or as part of an Enterprise Search effort.  One of our clients presented their experience at our user group meeting.  This post will discuss their findings.

What’s Wrong with Google – Recap and Client Experience

One of our most popular posts regarding Documentum Search has been “What to do when users just want a Google Search”.  As we pointed out in that post, it isn’t so much that clients want a Google search as that they dislike Webtop Advanced Search.  One client that added a Google search found the following issues:

You get what you asked for – It is a Google search:

  • Indexing is very time consuming (4-5 hrs) and limited to a nightly run.
  • Display is limited to messy Google-type output.
  • No sorting of search results other than the relevance sort that Google provides.
  • No ability to integrate custom metadata into the search.
  • It’s not possible to audit the index content.  The client found numerous missing documents in the index.
  • Google will only index files under 20 MB.

In addition to dealing with search result issues, the client had the following Site Caching Services (SCS) issues:

  • SCS is folder driven (rather than document driven) and relies on linked folders, which caused problems when a file to be published was not in the folder being published.
  • Site Caching jobs are very cache intensive.  DBA tuning was attempted, but the jobs could still only be run once per day.
  • SCS did not run in real time: due to the cache issues, jobs could not be published more than once a day (and Google could only index once a day as well).

Lucene/Solr, OpenMigrate and HPI

In replacing Search with a combination of HPI and OpenMigrate, our client found that:

HPI Search and Output

  • Search filters can use many data fields in combination with the search engine.
  • User driven – the user’s display output settings are held in a cookie.
  • Sorting is allowed on any data field.
  • Searching is just as fast as Google.
  • HPI provides the ability to apply a 20,000 search result limit.
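
As a rough illustration of the search behavior listed above, the following Python sketch filters on several metadata fields at once, sorts on an arbitrary field, and caps the result set. It is not HPI code: the function name `search_documents`, the field names, and the in-memory list standing in for a search index are all hypothetical. Only the 20,000 result limit comes from the list above.

```python
from datetime import date

# Result cap taken from the list above; everything else here is hypothetical.
MAX_RESULTS = 20_000


def search_documents(docs, filters, sort_field, descending=False, limit=MAX_RESULTS):
    """Return docs matching every filter, sorted on any field, capped at limit."""
    matches = [
        doc for doc in docs
        if all(doc.get(field) == value for field, value in filters.items())
    ]
    matches.sort(key=lambda doc: doc[sort_field], reverse=descending)
    return matches[:limit]


# Hypothetical sample metadata standing in for a search index.
docs = [
    {"id": "a1", "object_type": "sop", "facility": "Chicago", "modified": date(2013, 10, 1)},
    {"id": "b2", "object_type": "sop", "facility": "Chicago", "modified": date(2013, 10, 20)},
    {"id": "c3", "object_type": "form", "facility": "Chicago", "modified": date(2013, 9, 5)},
]

results = search_documents(
    docs,
    filters={"object_type": "sop", "facility": "Chicago"},
    sort_field="modified",
    descending=True,
)
print([doc["id"] for doc in results])
```

A real search engine such as Lucene/Solr would apply the filters, sort and limit inside the query itself rather than in application code; the sketch only shows how the three options combine.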

OpenMigrate Flexibility

  • OpenMigrate can be configured to fit the client’s scope.  This particular client uses the Object Type, Facility, ACL and Modified Date attributes to decide when to push a document to the search index.
  • Indexing is near real time.  Changes are captured and indexed in the search engine hourly (likely every 20 minutes once other server loads are removed).
  • The index can be audited.  The audit runs nightly to identify misfires, typically due to user processing (Rendition Import – EMC bug).
  • No file size limitation.  (Google is limited to 20 MB; WT 6.7 to 40 MB.)
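
The selection and audit steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not OpenMigrate's actual configuration or API: the function names `needs_indexing` and `audit_index`, the attribute names, and the sample values are all hypothetical. The first function decides whether a document should be pushed to the index based on object type, facility, ACL and modified date; the second compares repository IDs with index IDs to find misfires.

```python
from datetime import datetime

# Hypothetical scope rules; a real deployment would load these from configuration.
IN_SCOPE_TYPES = {"sop", "form"}
IN_SCOPE_FACILITIES = {"Chicago"}
SEARCHABLE_ACLS = {"acl_published"}


def needs_indexing(doc, last_run):
    """True if the document is in scope and changed since the last indexing run."""
    return (
        doc["object_type"] in IN_SCOPE_TYPES
        and doc["facility"] in IN_SCOPE_FACILITIES
        and doc["acl"] in SEARCHABLE_ACLS
        and doc["modified"] > last_run
    )


def audit_index(repository_ids, index_ids):
    """Compare repository and index contents; return (missing, orphaned) ID sets."""
    repository_ids, index_ids = set(repository_ids), set(index_ids)
    missing = repository_ids - index_ids    # expected in the index but absent
    orphaned = index_ids - repository_ids   # in the index but not in the repository list
    return missing, orphaned


last_run = datetime(2013, 10, 30, 8, 0)
docs = [
    {"id": "a1", "object_type": "sop", "facility": "Chicago",
     "acl": "acl_published", "modified": datetime(2013, 10, 30, 8, 30)},
    {"id": "b2", "object_type": "sop", "facility": "Chicago",
     "acl": "acl_draft", "modified": datetime(2013, 10, 30, 8, 45)},
    {"id": "c3", "object_type": "form", "facility": "Chicago",
     "acl": "acl_published", "modified": datetime(2013, 10, 29, 17, 0)},
]

to_push = [doc["id"] for doc in docs if needs_indexing(doc, last_run)]
missing, orphaned = audit_index(["a1", "c3", "d4"], ["a1", "c3", "z9"])
print(to_push, sorted(missing), sorted(orphaned))
```

Running the selection hourly and the audit nightly would mirror the schedule the client described; documents reported as missing could then be re-queued for indexing.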

Summary

Helping users get to their data more quickly and easily is something we have promoted for years.  We often quote that 70% of users just want to search and print.  Clients can be frustrated by complex Documentum search interfaces (e.g., Webtop, D2) that require the “build a search” model.  While clients might request a “Google” search, as presented in this post, the Google appliance along with Site Caching Services can bring its own set of problems.

For a screencam comparison of searching with Webtop, D2, xCP and HPI, please see our comparison available in the learning zone.

Filed Under: D2, Documentum, Lucene, Migrations, OpenContent Management Suite, OpenMigrate, Webtop, xPlore

Reader Interactions

Comments

  1. Mark Boon (@mhboon) says

    November 1, 2013 at 12:56 pm

    Thanks for sharing!
    The comments on SCS are recognizable; and I can even add more limitations.

    On Google search, the described use case does not really seem to use the Google Search Appliance capabilities, or it is based on an old version, going by my understanding of GSA and feedback from colleagues who know more about this product.

    -Index very time consuming (4-5 hrs) and limited to a nightly run.
    These are implementation choices and not an enforced standard;

    – Display output limited to messy Google-type output.
    Three scenarios possible, from standard Google style, XSLT processing or custom UI based on XML results. So this offers more ways!

    – No sorting of search results other than the relevance sort that Google provides.
    Is indeed true; though date sorting is also possible. 7.2 release (end of this year) will support custom metadata for searching. So soon not an issue anymore.

    – No ability integrate custom metadata into the search.
    Google is able to integrate Documentum metadata; so this seems to be implementation choices.

    – It’s not possible to audit the index content. The client found numerous missing documents in the index.
    The administrative interface provides all the relevant details to analyze.

    -Google will only index files under 20 MB.
    This applied to older versions.


