
Technology Services Group

Alfresco – Working with PDF and Renditions


May 8, 2018

One of the major recommendations we make for our clients is to leverage PDF as much as possible as both source documents and renditions within Alfresco.  This post will share some of our observations and lessons learned.

Alfresco – Why PDF?

In working with legacy ECM customers, many are used to older formats (like TIFF) or just storing native formats (like Word).  In discussing why PDF, TSG will typically bring up the following reasons:

  • Portable Document Format – PDF stands for “Portable Document Format”. PDF can easily be sent to people outside the organization without concern about whether the audience has the right application to view the document.  With PDF security settings, certain functions like altering or printing can be restricted.
  • Browser Support – As all of our ECM customers now leverage web browsers for viewing documents, PDF provides the benefit of allowing the document to be viewed within the browser without any add-ons and without launching the native application (ex: Word).  This viewing or previewing capability lets users view a document faster than a native application (Word) could be launched.  Viewing options include the browser itself, PDF.JS or our viewing/annotation/redacting product OpenAnnotate.
  • Printing Support – PDF provides benefits over other content types that might need the native application to initiate a print. See our earlier article on printing with Alfresco.
  • Overlay Support – PDF also provides the capability for overlays as we do with our OpenOverlay product.
  • Combining and Manipulating – Having PDF renditions available allows pages to be added and deleted or combined to create supersets or subsets of documents. One of the major features of OpenContent Case is the Combine PDF for our insurance clients looking to send out organized packets of documents.
  • Annotation Support – PDF has built-in annotation support with the XFDF format.  Annotations can be stored as separate files within Alfresco (with different security) or burned into the PDF itself.  See examples with OpenAnnotate.
  • Long Term Archival – PDF/A is the ISO-standardized version of the Portable Document Format for use in the archiving and long-term preservation of electronic documents. Clients with records management requirements will typically use PDF/A for long-term storage.
  • Storage Costs – With the price of storage very cheap, it no longer makes sense to avoid storing a PDF rendition of a document just to avoid the storage costs.
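
To make the annotation point above more concrete, here is a minimal sketch of what a separately stored XFDF annotation file might look like when built with Python's standard library. The element and attribute names (xfdf, annots, text, contents, f) follow the published XFDF format as we understand it; the function name build_xfdf, the file name and the sample values are assumptions for illustration only, and this is not how OpenAnnotate generates its annotations.

```python
import xml.etree.ElementTree as ET

XFDF_NS = "http://ns.adobe.com/xfdf/"


def build_xfdf(pdf_href, page, rect, author, note):
    """Return an XFDF document (as a string) holding one sticky-note annotation.

    Hypothetical helper for illustration; real tools write many more attributes.
    """
    # Serialize with XFDF as the default namespace (no prefix on the tags).
    ET.register_namespace("", XFDF_NS)
    root = ET.Element(f"{{{XFDF_NS}}}xfdf")
    annots = ET.SubElement(root, f"{{{XFDF_NS}}}annots")
    text = ET.SubElement(annots, f"{{{XFDF_NS}}}text", {
        "page": str(page),                        # zero-based page index
        "rect": ",".join(str(v) for v in rect),   # left,bottom,right,top in PDF units
        "title": author,                          # XFDF keeps the author in "title"
    })
    contents = ET.SubElement(text, f"{{{XFDF_NS}}}contents")
    contents.text = note
    # The f element points back at the PDF the annotations belong to.
    ET.SubElement(root, f"{{{XFDF_NS}}}f", {"href": pdf_href})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_xfdf("claim.pdf", 0, (50, 700, 70, 720), "reviewer", "Check the loss date"))
```

Because the annotations live in their own small XML file, they can be stored as a separate node with its own permissions while the PDF itself stays unchanged.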

Alfresco – How PDF?

There are multiple options to leverage PDF within Alfresco including:

  • Scanning – This is the oldest way to turn paper into PDF.  TSG typically recommends scanning and OCR to have the text available within the PDF for indexing in full-text search.  PDF supports native text, image-only, and combined text/image content.
  • Native PDF Content – More and more content is originating as PDF. Whether that be from outside parties or capturing a print stream, rather than print and scan, many are going directly to text PDF.
  • Native Content Transformed into PDF – This is when documents exist in Word, Excel or a variety of other formats.  Alfresco provides transformation to PDF as part of the base product leveraging LibreOffice.  Alfresco also provides an external transformation server for more difficult transformations for an added cost.  TSG has also worked with Adlib for high-quality document transformation.
  • Legacy Content Transformed into PDF – Within our OpenMigrate practice we typically work with a large volume of legacy TIFF files, particularly from FileNet, that originated from scanning. TIFF can be transformed with open source libraries into image-only PDF or OCRed into image-and-text PDF.
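
For readers who want to try a LibreOffice-based conversion outside of Alfresco, the sketch below shows one way to call LibreOffice in headless mode from Python. It assumes a soffice binary is on the PATH; the function names and the timeout value are our own choices for illustration, and this is a standalone example rather than the mechanism Alfresco uses internally.

```python
import shutil
import subprocess
from pathlib import Path


def libreoffice_pdf_command(source, out_dir, binary="soffice"):
    """Build the headless LibreOffice command that converts one file to PDF."""
    return [binary, "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(source)]


def convert_to_pdf(source, out_dir, binary="soffice", timeout=120):
    """Run the conversion and return the expected path of the PDF.

    Assumes LibreOffice is installed; raises if the binary cannot be found
    or if the process exits with a non-zero status.
    """
    if shutil.which(binary) is None:
        raise FileNotFoundError(f"{binary} not found on PATH")
    subprocess.run(libreoffice_pdf_command(source, out_dir, binary),
                   check=True, timeout=timeout)
    # LibreOffice writes <original name>.pdf into the output directory.
    return Path(out_dir) / (Path(source).stem + ".pdf")
```

Running a handful of representative Word and Excel documents through a call like convert_to_pdf("agreement.docx", "out") is a quick way to see how a LibreOffice-based transformation handles a given set of document types.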

Some of the unique components of Alfresco to keep in mind when working with PDF include:

  • Synchronous Transformation – For both the internal and external transformation server, Alfresco will transform the document when it is being stored. Having to wait until the transformation is complete can slow down ingestion, particularly on mass migrations.  TSG has built alternatives with OpenMigrate to transform asynchronously.
  • Transformation Quirks – Neither the native nor the external transformation server always transforms documents correctly, and TSG recommends clients test their document types. TSG has found that both will sometimes struggle with certain MS Word documents, particularly legal agreements that have two columns, complex tables, or tricky fonts.  Contact us if you need some example documents.
  • Versioning – Out of the box, Alfresco versioning only allows for the latest version of a document to have a PDF rendition. TSG has built the Chain Versioning upgrade to allow for both current and previous versions in Alfresco to contain renditions.  See the discussion around associations here for details as to why Alfresco cannot support separate renditions per version out of the box.
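
The difference between synchronous and asynchronous transformation can be sketched in a few lines. In the example below, each document is stored right away and the slower PDF transformation is handed to background workers, so ingestion is not held up waiting for renditions. The store and transform callables are hypothetical placeholders supplied by the caller; this is a generic illustration of the idea and not the OpenMigrate implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest_async(documents, store, transform, max_workers=4):
    """Store every document immediately; transform to PDF in the background.

    store(doc) must return a document id, transform(doc) returns a rendition.
    Returns (stored_ids, results), where results maps each id to either the
    rendition or the exception raised while transforming.
    """
    stored = []
    futures = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for doc in documents:
            doc_id = store(doc)            # fast path: content is saved first
            stored.append(doc_id)
            # Slow path: the transformation runs off the ingestion thread.
            futures[doc_id] = pool.submit(transform, doc)
    # Leaving the with-block waits for any transforms still running.
    results = {}
    for doc_id, future in futures.items():
        error = future.exception()
        results[doc_id] = error if error is not None else future.result()
    return stored, results
```

Failed transformations are returned alongside the successes rather than stopping the run, which makes it easier to retry or review problem documents after a large ingestion.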

TSG  – PDF Recommendations and Best Practices for PDF in Alfresco

In talking with our team, some recommendations for clients include:

  • Transform Legacy Content during the Migration rather than after – For our large volume clients, leaving the transformation until the end of the migration can result in a backlog of transformations that affect viewing performance as well as perception. We would recommend transforming during the migration process whenever possible.
  • Test different document types and formats – As mentioned above, check how well documents transform before choosing a transformation solution.
  • Consider PDF Overlays for print control – Overlaying the Date and User on a document is an easy step to make sure that printed documents are properly controlled.
  • Implement TSG Chain Versioning – to allow versions to have PDF renditions.
  • Transform to PDF and then Annotate rather than Annotate on Native Formats – Some tools support viewing and annotating on the native format. While the approach might seem cleaner, the tools and viewing can be expensive and troublesome.
  • Build interfaces that rely on PDF Viewing First – With PDF available, viewing and printing should focus on PDF with access to the native application added later.

Summary

PDF Renditions used correctly within an Alfresco implementation can simplify viewing, printing, annotating and manipulating documents.  Users should be aware of how Alfresco handles PDF, including synchronous transformation, transformation quirks and versioning, and implement Alfresco with those factors in mind.

Let us know your thoughts or best practices below:

Filed Under: Alfresco, OpenAnnotate, OpenContent Management Suite, OpenMigrate


Copyright © 2023 · Technology Services Group, Inc.
