Technology Services Group

Documentum 6.6 Upgrade – Character Encoding Fail – Part II

November 10, 2010

This week we are urgently reminding clients to look seriously, as part of their upgrade evaluation, for character encoding issues in their current Documentum content and the effect those issues have on upgrades.

This is an update to the original article written in August.  While that post highlighted character encoding issues and DFC 6.5, we are not sure readers fully realized the impact on their upgrade efforts.

The Scenario

There are two scenarios that could result in bad characters in the docbase:

  1. Over time, users “cut and paste” from Word or other applications into Webtop fields, custom applications or other Documentum interfaces.  Within the browser (Internet Explorer, Firefox, Netscape for the old timers) the character string looks fine, but in reality the field can contain “special characters” that end up being passed through to the database.
  2. Migration efforts from previous upgrades/consolidations resulted in character encoding issues that were not identified.
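
One plausible way the first scenario produces bad data is an encoding mismatch between the client and the database character set. The post does not state the exact mechanism, so the following is only a minimal sketch under that assumption; the sample string and the helper name `decodes_as_utf8` are hypothetical.

```python
# Illustration only: text pasted from Word often contains "smart" punctuation.
# The sample value below is made up for this sketch.
pasted = "\u201cQuarterly Report\u201d \u2013 Final"  # curly quotes and an en dash

# If a client writes the field as Windows-1252, the curly quotes and the dash
# become the single bytes 0x93, 0x94 and 0x96.
stored = pasted.encode("cp1252")


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8, False otherwise."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The Windows-1252 bytes are not valid UTF-8, so a strict reader rejects them,
# while the same text encoded as UTF-8 is accepted.
print(stored)
print(decodes_as_utf8(stored))
print(decodes_as_utf8(pasted.encode("utf-8")))
```

A lenient reader would display or pass such bytes through, while a strict one raises an error, which is in line with the difference in behavior described below.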

Documentum, before version 6.5, allowed storage and retrieval of these characters without an error.  As noted in the previous post, version 6.5 of the DFC does not support these formats and will throw errors on retrieval such as:

[DFC_OBJPROTO_BAD_NUMBER_FORMAT] Invalid number format for string length in serialized object

[DFC_OBJPROTO_BAD_STRING_FORMAT] Unknown string format in serialized object

Why is this a Big Deal?

The potential critical issues for clients are:

  1. An upgrade to 6.5/6.6 (whether by Migration, DB Clone or Upgrade in Place) leaves these characters in the database.
  2. Any 6.5 interface (Webtop, xCP) throws an error when it tries to retrieve content with character encoding issues.
  3. xPlore will index any content with character encoding issues, but very slowly.

The tough part is garbage in, garbage out: the thinking is to clean up all the metadata before either the upgrade or the use of DFC 6.5 or 6.6.
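
As a rough illustration of what such a cleanup could look like, here is a minimal sketch that replaces punctuation commonly introduced by pasting from Word with plain ASCII equivalents. The replacement table and the `scrub` function are assumptions made for this example, not part of OpenMigrate or DFC, and a real cleanup would need a table agreed with the business owners of the data.

```python
# Hypothetical replacement table: typographic characters mapped to ASCII.
REPLACEMENTS = {
    "\u2018": "'",    # left single quotation mark
    "\u2019": "'",    # right single quotation mark
    "\u201c": '"',    # left double quotation mark
    "\u201d": '"',    # right double quotation mark
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # horizontal ellipsis
    "\u00a0": " ",    # no-break space
}

# str.maketrans accepts a dict of single characters mapped to strings.
_TABLE = str.maketrans(REPLACEMENTS)


def scrub(value: str) -> str:
    """Return value with the characters in REPLACEMENTS swapped for ASCII."""
    return value.translate(_TABLE)


print(scrub("\u201cDraft\u201d \u2013 v2\u2026"))
```

Anything not in the table is left untouched, so legitimate international characters survive; those would still need to be checked separately against the target character set.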

We should point out that we have only seen this issue on Oracle.  We can neither confirm nor rule out that SQL Server clients would have the same issues.

Possible Resolution

Consistent with the previous post, we recommend the following:

  1. Consider leveraging OpenMigrate or a similar application to “scan” your data with DFC 6.5 to determine if any encoding errors exist prior to the upgrade.  DFC 6.5 is compatible with 5.3.
  2. During the upgrade, use OpenMigrate to migrate data into a clean repository instead of performing a typical in-place upgrade or dump and load.  Migrations are a great opportunity to “scrub” and validate existing data.  Because every document is touched during a migration, corrupt data can be more easily identified.  We are working on adding a character encoding check for typical errors.
  3. Utilize database tools to help identify potential problems.  Oracle has a Character Set Scanner Utility (CSSCAN) that can scan an entire database to verify that all data stored in the database uses the correct character encoding.
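
The idea of scanning metadata before the upgrade can be sketched as follows. This is not how OpenMigrate or CSSCAN work; it is a minimal, assumed approach that flags any character outside printable ASCII in already-exported attribute values. The functions `find_suspect_chars` and `scan_rows` and the sample object IDs are hypothetical, and the rows are assumed to have been fetched by some other means.

```python
from typing import Dict, Iterable, List, Tuple

Suspects = List[Tuple[int, str]]


def find_suspect_chars(value: str) -> Suspects:
    """Return (position, code point) for each character outside printable ASCII."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(value)
        if not (32 <= ord(ch) <= 126)
    ]


def scan_rows(rows: Iterable[Dict[str, str]]) -> List[Tuple[str, str, Suspects]]:
    """Collect (object id, attribute name, suspects) for every flagged attribute."""
    findings = []
    for row in rows:
        for attr, value in row.items():
            if attr == "r_object_id":
                continue  # the ID identifies the row; it is not scanned itself
            suspects = find_suspect_chars(value)
            if suspects:
                findings.append((row["r_object_id"], attr, suspects))
    return findings


# Made-up sample rows standing in for exported metadata.
sample = [
    {"r_object_id": "0900000000000001", "object_name": "Plain name", "title": "OK"},
    {"r_object_id": "0900000000000002", "object_name": "Bad \u201cname\u201d", "title": "Tab\there"},
]

for finding in scan_rows(sample):
    print(finding)
```

A printable-ASCII check is deliberately strict and will also flag legitimate international text, so its output is a review list rather than a list of confirmed errors.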

As one last push, we are reaching out to Documentum to ask a simple question: “Why not return the string with the bad character encoding rather than throwing the error, consistent with what pre-6.5 DFC did?”  Given that DFC is eventually going away in favor of DFS, it is worth asking.

Please comment below with any thoughts.

Filed Under: D6.5, Documentum, Migrations, OpenMigrate, Upgrades, Webtop, xCP

Comments

  1. Paras Jethwani says

    November 11, 2010 at 1:07 am

    Hi,

    Can you give some examples of ‘special characters’?

    What locales does this issue impact? English or international as well?

    – Paras

  2. Chris3192 says

    November 12, 2010 at 9:03 am

    Hi Paras.

    The characters that typically cause the issue are not usually ones that are easily seen by eye. I believe this could occur in international character sets as well. I did see it with some Chinese characters at a client. Basically, anywhere you might be converting from one character encoding to another could produce bad data.

    We find that many of the characters end up coming from fields where users have copied and pasted data from another application into the Documentum attribute. This is why it’s important to do a very thorough test of the migration data, including retrieving it through a client app to view it.

    For one client, we ran migrated data through a routine that checked to make sure the characters were valid ASCII characters for the target system. This did take a while, and the client then had to remediate any documents and metadata that were deemed “invalid” and manually process them.

    Thank you for your comment and please let us know if you found this helpful.

Copyright © 2022 · Technology Services Group, Inc.