
Technology Services Group


Documentum xPlore Deployment – Lessons Learned from Pharma Implementation


March 27, 2012

We’re working with a large pharmaceutical company to install Documentum xPlore as a replacement for FAST.  We’ve just finished the QA environment deployment, and we’re planning the Production deployment for mid-April.  In this post, we discuss the cutover strategy as well as some lessons learned from the project.

Cutover and Reindexing Strategy

The client’s current environment contains one Content Server serving five repositories, and one full text server running a FAST indexer for each repository.  The upgrade plan is to stop FAST, install xPlore on the existing full text server and re-index all of the repositories.  We’re planning to start the installation on a Friday evening after business close.  After accounting for the time needed to upgrade the Content Server to the latest patch and perform the xPlore installation itself, we need to make sure that the re-indexing operation is complete before Monday morning.  To increase the likelihood that re-indexing finishes in time, we decided to install an extra Content Processing Server (CPS) instance on the full text server.  The full text server has enough memory and CPU to handle the extra instance, so that is the best way to increase our indexing throughput.  We recently executed the upgrade in the client’s QA environment, and our average re-indexing throughput was 70 documents per second.  Assuming we see a similar average in production, this should be more than enough to complete the re-indexing operation over a weekend.
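To make the weekend estimate concrete, here is a rough back-of-the-envelope sketch in Python. Only the 70 documents per second figure comes from the QA run above; the 48-hour window and the 10 million document count are hypothetical values chosen for illustration, not numbers from the client's environment.

```python
# Rough capacity estimate for a re-indexing window.
# RATE_PER_SEC is the QA average quoted in the post; WINDOW_HOURS and
# EXAMPLE_DOC_COUNT are assumptions made up for this illustration.
RATE_PER_SEC = 70
WINDOW_HOURS = 48
EXAMPLE_DOC_COUNT = 10_000_000


def docs_in_window(rate_per_sec: float, hours: float) -> int:
    """Number of documents indexed in `hours` at a steady rate."""
    return int(rate_per_sec * hours * 3600)


def hours_needed(doc_count: int, rate_per_sec: float) -> float:
    """Hours required to index `doc_count` documents at a steady rate."""
    return doc_count / rate_per_sec / 3600


if __name__ == "__main__":
    print(docs_in_window(RATE_PER_SEC, WINDOW_HOURS))  # 12096000
    print(round(hours_needed(EXAMPLE_DOC_COUNT, RATE_PER_SEC), 1))
```

A steady rate is itself a simplification; real throughput varies with document size and format, so it is worth leaving some headroom in the window.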

Wildcards and Fragment Searching
Since users can become frustrated when search results do not behave the way they “should” based on prior experience with the system, it is important to analyze the differences between FAST and xPlore when it comes to wildcards and fragment matching.  A good place to start is the xPlore 1.2 Administration guide, starting on page 175.  Note that search terms in the text below appear in single quotes.  The quotes mark where the search term starts and ends; the user would not actually type them.  Here’s how xPlore works out of the box:
  • Full text searches do not support wildcards or fragment matching.  This means that a search for ‘car’ does not return a document containing ‘careful’.  However, a document containing ‘blue car’ is returned.  xPlore treats wildcards as literal values.  Searching for ‘car*’ still does not return ‘careful’.
  • Metadata searches in Webtop’s advanced search work in a similar way, even when using the begins with, ends with, or contains modifiers.  For example, searching for titles that contain ‘car’ will return titles that contain ‘blue car’, but will not return titles containing ‘careful’.  However, xPlore does support wildcard characters in metadata searches.  Therefore, if the user executes the search as ‘*car*’, then documents with titles of both ‘careful’ and ‘blue car’ will be returned.  When executing a contains search on metadata, think of it as a ‘contains word’ search rather than a true contains search.
This last point is very important, since for most users it will not make sense.  They may say: “I searched for documents containing ‘123’ in the name. xPlore is broken because I’m not getting document ‘1234’ in the search results.”  Because xPlore matches whole words rather than fragments, this user would have to search for ‘123*’ to return document ‘1234’.
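The out-of-the-box rules above can be modeled with a few lines of Python. This is only a toy model of the behavior described in this post, not xPlore's actual matching logic: the function names are made up, terms are limited to a single word, and Python's `fnmatch` stands in for wildcard handling (it also treats `?` and `[` specially, which is ignored here).

```python
from fnmatch import fnmatchcase


def words(text: str) -> list[str]:
    """Split text into lowercase words on whitespace."""
    return text.lower().split()


def fulltext_match(term: str, text: str) -> bool:
    """Default full-text rule: whole-word match, '*' is a literal character."""
    return term.lower() in words(text)


def metadata_match(term: str, value: str) -> bool:
    """Default metadata rule: whole-word match, but '*' works as a wildcard."""
    pattern = term.lower()
    return any(fnmatchcase(word, pattern) for word in words(value))


if __name__ == "__main__":
    print(fulltext_match("car", "blue car"))   # True
    print(fulltext_match("car*", "careful"))   # False
    print(metadata_match("car", "careful"))    # False
    print(metadata_match("*car*", "careful"))  # True
    print(metadata_match("123*", "1234"))      # True
```

The ‘contains word’ behavior falls out of the whole-word comparison: ‘123’ is not a word in ‘1234’, so only the wildcard form matches.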
If this is a problem for your users, you can turn on FAST compatibility mode.  Although FAST compatibility mode will slow down search times, it may be worth it depending on how users are used to searching.  This mode does a number of things:
  • Wildcards are supported in simple and full-text searches.  Searching for ‘car*’ does return ‘careful’.  Note that fragments are still not matched: searching for ‘car’ will not return ‘careful’.
  • Wildcards are implicitly placed in metadata searches.  For example, if you search for documents containing ‘123’ in the object_name, the actual search would be for ‘*123*’, and in our previous example, would return document ‘1234’ as expected by the user.
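As a rough illustration of the implicit wildcard rule for metadata searches, the sketch below wraps a bare term in wildcards before matching. The helper names are hypothetical, and leaving a term untouched when the user already typed a wildcard is an assumption of this sketch rather than documented xPlore behavior.

```python
from fnmatch import fnmatchcase


def expand_contains(term: str) -> str:
    """Wrap a bare term as '*term*'; leave terms with explicit wildcards alone.

    Skipping terms that already contain '*' is an assumption of this sketch.
    """
    return term if "*" in term else f"*{term}*"


def matches_any_word(pattern: str, value: str) -> bool:
    """True if any whitespace-separated word in value matches the pattern."""
    pattern = pattern.lower()
    return any(fnmatchcase(word, pattern) for word in value.lower().split())


if __name__ == "__main__":
    # Without compatibility mode: the bare term must equal a whole word.
    print(matches_any_word("123", "1234"))                   # False
    # With implicit wildcards: the term is searched as '*123*'.
    print(matches_any_word(expand_contains("123"), "1234"))  # True
```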
Metadata Searches, Implicit Wildcards and Our Version of Webtop
For our client, we decided to turn on FAST compatibility mode.  Users of the system are used to a metadata contains search being a true contains search rather than the ‘contains word’ search described above.  However, we ran into one problem.  The client currently has Webtop 6.5 SP2 installed.  In our development environment, implicit wildcards were not added to metadata searches as documented in the xPlore Administration Guide.  This means that in the above example, searching for ‘123’ in an object_name contains search does not implicitly search for ‘*123*’.  As a workaround, the users are being instructed to add wildcards manually in metadata searches when the results are not as expected.
For example, say the user is searching for a document named ‘ABC-XYZ-1234.pdf’.  If the user executes a contains search in the advanced search for ‘1234’, the results will not contain the document.  The user would need to search for ‘1234*’.  This occurs because xPlore indexes the “word” 1234.pdf, and Webtop is incorrectly not adding the implicit wildcards in metadata searches.
Special Characters
In xPlore, certain characters are indexed as white space in the Lucene index.  The default character list, as defined in the indexserverconfig.xml, contains the following characters: @#$%^_~‘*&:()-+=<>/[]{}  For example, if a document’s text or attributes contain ‘PX-SOP-1234’, the CPS will index three tokens: ‘PX’, ‘SOP’ and ‘1234’.  According to the xPlore administration guide, a search containing a special character should be treated as a phrase search.  This means that if a user were to search on PX-SOP-1234, the document would be returned.  However, in our development environment, we were not seeing that behavior.  In the previous example, a search for PX-SOP-1234 was returning 0 results.  We didn’t get to the bottom of why this wasn’t working as described in the documentation.  Perhaps it was due to our version of Webtop (6.5 SP2).  In any case, the business users are used to including special characters in their search queries, so we decided to remove the dash and underscore from the list of special characters.  With those two characters removed, ‘PX-SOP-1234’ is indexed as a single token, so a search for PX-SOP-1234 returns the correct results.
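The effect of the special character list on tokenization can be sketched in Python. This is a simplified stand-in for what CPS does, not its real implementation: the `tokenize` helper is hypothetical, and the character list is copied as printed above (the quote-like character may have been altered by the blog's typography).

```python
import re

# Special characters as printed in the post; treated as white space.
DEFAULT_SPECIAL = "@#$%^_~‘*&:()-+=<>/[]{}"


def tokenize(text: str, special: str = DEFAULT_SPECIAL) -> list[str]:
    """Split text on whitespace and on any character in `special`."""
    splitter = re.compile("[" + re.escape(special) + r"\s]+")
    return [token for token in splitter.split(text) if token]


# The customized list: the default list minus the dash and underscore.
CUSTOM_SPECIAL = "".join(c for c in DEFAULT_SPECIAL if c not in "-_")

if __name__ == "__main__":
    print(tokenize("PX-SOP-1234"))                  # ['PX', 'SOP', '1234']
    print(tokenize("PX-SOP-1234", CUSTOM_SPECIAL))  # ['PX-SOP-1234']
    print(tokenize("ABC-XYZ-1234.pdf"))             # ['ABC', 'XYZ', '1234.pdf']
```

Note that the period is not in the list, which is why a file name such as ‘ABC-XYZ-1234.pdf’ ends up with the token ‘1234.pdf’ under the default settings.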

xPlore Admin Interface – IE Settings for Reporting

xPlore’s admin interface is definitely a big upgrade over FAST.  While testing out the admin interface in the client’s development environment, we noticed some odd behavior around reporting in Internet Explorer: every report, no matter what settings we used, would only return a # character.  No results were coming back for any report. To fix the issue, you need to add the xPlore Admin website to the list of IE’s trusted sites, and then set the security level to medium-low:
  1. Navigate to the xPlore admin website
  2. In IE, choose Tools -> Internet Options -> Security tab
  3. Click on Trusted sites and click the ‘Sites’ button to add the xPlore admin website to the list
  4. Below, set the security level to Medium-low.
After restarting IE, all reports should work as you would expect.
Overall, we feel that xPlore is a great upgrade and much better than the FAST search engine.  If you haven’t upgraded, the process is fairly straightforward.  Hopefully the lessons learned above will help you in your upgrade.  If you have executed the upgrade in your environment, please comment below regarding your lessons learned!

Filed Under: Content Server, Documentum, Search, Upgrades, Webtop, xPlore

Comments

  1. Pitch Chevalier (@findingpitch) says

    March 28, 2012 at 12:29 am

    Agreed, the behavior of ‘contains’ on metadata is not optimal and might confuse users especially when searching for a reference, a part number or an identifier. As you noticed the behavior of ‘starts with’ and ‘ends with’ in the compatible mode improved in the latest DFC. We plan to make things simpler to understand in the next version of xPlore.
    Note that soon we will release on the EDN a custom indexing annotator that would help normalize and simplify searches for ID/references like PX-SOP-1234. Stay tuned.

    Final comment. The admin interface does work and is certified with Firefox – I’m using it.

    –pitch

    • George says

      March 28, 2012 at 9:41 am

      Pitch – thanks for your comments and clarification on the Admin browser support. I’ve updated the relevant text in the post.

  2. Christopher Smith says

    March 30, 2012 at 9:52 am

    We installed xPlore a few months ago. We were having an issue with FAST just shutting off once a week. xPlore has had no issues with staying up and is faster than FAST, but it is also on a new, faster server. We have a lot of hyphens and underscores in batch numbers, etc. From the documentation I thought the default would be that these are treated as white space, but they were in the indexserverconfig.xml out of the box; no complaint, it just seemed like a discrepancy between the documentation and the install. One problem we did have was that some of our documents, though English, were getting indexed as “it” (Italian) or “pt” (?). EMC informed us to set index-default-locale to en in the indexserverconfig.xml. Most of our properties are alphanumeric; when I put an English word in, xPlore would index the document as English. Overall it was an easy install and is working well.

  3. Anhtuan Doventry says

    June 1, 2012 at 10:40 am

    Can you enable ‘FAST compatibility mode’ anytime? Or do you have to do it during installation?

    • George says

      June 1, 2012 at 12:12 pm

      FAST compatibility is enabled via the fast_wildcard_compatible attribute on xPlore’s dm_ftengine_config object in the repository, so it can be modified anytime after installation.

      • Anhtuan Doventry says

        November 29, 2012 at 11:35 am

        Hey George,

        After editing the object in the repository, what else do I have to do? I restarted the index server services, but it didn’t seem to take effect. Do I need to restart the Docbase?

        Thanks!

      • George says

        November 29, 2012 at 11:49 am

        Did you restart the main xPlore server and any secondary instances? The index agents don’t really do anything with search, so restarting those won’t make a difference. If restarting xPlore doesn’t do the trick, try restarting Webtop – it may cache the setting somehow, but I’m not sure.

  4. Ishaan Mirza says

    January 21, 2013 at 8:02 pm

    The solution I reached for the scenario where documents are not found when the value of one of their attributes looks like ‘PX-SOP-1234’ was to enforce English as the default language. This can be done by editing indexserverconfig.xml. The document is indexed as English only when CPS cannot recognize the language.

