Since the very first Momentum (1996 in a very windy Miami), the Documentum user community has pushed for a more reliable means of converting documents, mostly Microsoft Office files, into PDF. Back then, during a wrap-up luncheon, the feedback on AutoRender (a previous incarnation of DTS) was anything but positive. The main complaints, some of which are still heard today, included:
- Having to monitor/reboot the AutoRender Server throughout the day
- Unreliable PDF Transformation, including:
  - Unsupported Document Types
  - Font Replacement
  - Broken Links
At the time, Documentum threw some engineering effort into AutoRender to address some of the shortcomings. One of the changes was to have AutoRender reboot itself (not really a fix, but it did help). As with other Documentum products, TSG is occasionally asked for alternatives. This post covers some of the tools we use in non-Documentum environments that could easily be adapted to the PDF rendition needs of Documentum.
For a couple of our non-Documentum customers, we have leveraged the Adobe LiveCycle component PDF Generator. We have been very impressed with its reliability and functionality. Considering Adobe created the best-known implementation of Portable Document Format, it makes sense to rely on Adobe technology to convert your native content.
Adobe LiveCycle PDF Generator can convert native documents (Microsoft Office, OpenOffice, etc.) to PDF and other formats based on your business’s needs: TIFF, PNG, JPEG, text, EPS, HTML and many others. PDF Generator can also automatically convert existing documents into PDF/A for long-term storage and archiving, and it can transform flat image files into searchable PDF files using optical character recognition (OCR). Many clients require this when they deal with scanners and faxes.
Adobe LiveCycle PDF Generator is extremely powerful, not only because of its ease of configuration (via a user-friendly GUI) and robust toolset, but also thanks to the many integration points it provides. PDF Generator functionality can be accessed via network watched folders, e-mail, a web user interface, a virtual print driver, and Java APIs. No matter what your CMS is, LiveCycle can easily integrate.
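Of the integration points listed above, the watched folder is the easiest to picture from the CMS side: drop the native file into an input folder and wait for the PDF to show up in a result folder. The sketch below is only an illustration of that pattern using plain Java file I/O. The class name `WatchedFolderClient`, the folder locations, and the assumption that the result keeps the source's base name with a `.pdf` extension are all hypothetical; the actual folder layout and output naming depend on how the watched folder endpoint is configured.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Optional;

// Hypothetical helper: not part of LiveCycle, just plain file operations.
class WatchedFolderClient {

    // "report.docx" -> "report.pdf". Assumes the endpoint keeps the base name;
    // real output naming is set in the endpoint configuration.
    static String expectedPdfName(String sourceName) {
        int dot = sourceName.lastIndexOf('.');
        String base = (dot > 0) ? sourceName.substring(0, dot) : sourceName;
        return base + ".pdf";
    }

    // Copy the native document into the watched input folder.
    static Path drop(Path source, Path inputDir) throws IOException {
        Path target = inputDir.resolve(source.getFileName().toString());
        return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Poll the result folder until the expected PDF exists or the timeout passes.
    // Note: existence alone does not prove the converter has finished writing.
    static Optional<Path> awaitResult(Path resultDir, String pdfName,
                                      long timeoutMillis, long pollMillis)
            throws InterruptedException {
        Path expected = resultDir.resolve(pdfName);
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!Files.exists(expected)) {
            if (System.currentTimeMillis() >= deadline) {
                return Optional.empty();
            }
            Thread.sleep(pollMillis);
        }
        return Optional.of(expected);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: WatchedFolderClient <source> <inputDir> <resultDir>");
            return;
        }
        Path source = Paths.get(args[0]);
        drop(source, Paths.get(args[1]));
        String pdfName = expectedPdfName(source.getFileName().toString());
        Optional<Path> pdf = awaitResult(Paths.get(args[2]), pdfName, 60_000, 1_000);
        System.out.println(pdf.map(p -> "Rendered: " + p)
                              .orElse("Timed out waiting for " + pdfName));
    }
}
```

A CMS-side job could call `drop` when a document is checked in and then attach whatever `awaitResult` returns as the rendition. Polling keeps the sketch short; a production integration would more likely use one of the other integration points, such as the Java APIs.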
For our Alfresco practice, we leverage OpenOffice from Sun/Oracle, and TSG has been very impressed with it. OpenOffice has a built-in PDF generator that can convert many native Microsoft Office formats, OpenOffice formats and many others. To access OpenOffice’s document conversion tools, OpenOffice must be run as a service. OpenOffice will listen on a port (by default 8100) and can be accessed via a Java wrapper. Alfresco has its own Java APIs for converting documents. If you aren’t using Alfresco, there are a few open source Java wrappers that can be used instead, such as JODConverter and NOA. These can be embedded directly into your Java application, called from the command line, or exposed via a web service. TSG is considering leveraging OpenOffice for our Documentum customers by developing an OpenContent service.
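As a rough illustration of the "run as a service" step, the sketch below builds the command line commonly used to start OpenOffice 3.x headless with a socket listener on port 8100, and checks whether something is already accepting connections on that port. The class name `OpenOfficeService` and the default `soffice` path are hypothetical, and the single-dash options are an assumption tied to OpenOffice.org 3.x (LibreOffice and later releases use double-dash options such as `--headless`). A wrapper such as JODConverter would then connect to the same host and port.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for starting and probing a headless OpenOffice listener.
class OpenOfficeService {

    static final int DEFAULT_PORT = 8100;

    // Command line for a headless OpenOffice 3.x instance listening on a socket.
    // Assumption: single-dash options; adjust for the installed version.
    static List<String> startCommand(String sofficePath, int port) {
        return Arrays.asList(
                sofficePath,
                "-headless",
                "-accept=socket,host=127.0.0.1,port=" + port + ";urp;",
                "-nofirststartwizard");
    }

    // True if a TCP connection to host:port succeeds within the timeout.
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String soffice = args.length > 0 ? args[0] : "soffice";
        if (!isListening("127.0.0.1", DEFAULT_PORT, 500)) {
            // Launch the office process; it keeps running after this program exits.
            new ProcessBuilder(startCommand(soffice, DEFAULT_PORT)).inheritIO().start();
        }
        // Give the service up to ten seconds to open its port.
        boolean up = false;
        for (int i = 0; i < 20 && !up; i++) {
            up = isListening("127.0.0.1", DEFAULT_PORT, 500);
            if (!up) {
                Thread.sleep(500);
            }
        }
        System.out.println(up ? "OpenOffice is listening on port " + DEFAULT_PORT
                              : "OpenOffice did not start listening in time");
    }
}
```

Binding the listener to 127.0.0.1 keeps the conversion service reachable only from the same machine, which is usually what you want when the wrapper runs alongside it.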
While we hate to admit it, sometimes it just makes sense to allow users to manually add PDF Renditions themselves. Some specific examples include:
- AutoCAD – it makes sense to convert and verify at the desktop rather than rely on an automated means.
- Complex Graphics – with some tools (such as Quark), it makes sense to have the graphic designer perform the conversion.
- Complex Spreadsheets
- Visio Diagrams
Once a document has been manually converted, it can be uploaded into your CMS either as is or as a rendition attached to the native content. The example below shows importing the content in Documentum’s Webtop and in TSG’s HPI.