
Technology Services Group


Documentum or Alfresco Reporting – Big Data to the Rescue


March 10, 2017

We commonly see clients struggle with reporting requirements from their ECM platform, whether that is Documentum, Alfresco or any other ECM repository. While most ECM repositories include a relational database in their infrastructure, that database is usually difficult to navigate with SQL-based reporting tools. This post discusses how TSG has been leveraging Open Source and Big Data tools to provide more “out of the box” reporting for ECM customers.

ECM Reporting – What are the issues?

Reporting requirements for typical ECM systems can be divided into two basic categories:

1. Reporting on Documents

Reporting on Documents can include requirements like:

  • How many documents were created, approved or retired last month, faceted by specific object types or attribute values?
  • What is my backlog of unprocessed documents?

These requirements can typically be answered from the documents and their attributes, but they can be difficult to report on because the metadata that drives the reporting might be constantly changing.
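To make the first question concrete, here is a minimal sketch of the count such a report needs. It assumes document events are already available as simple records; the field names (`action`, `object_type`, `day`) and the sample data are hypothetical and do not come from any ECM product.

```python
from collections import Counter
from datetime import date

# Hypothetical document events; field names and values are illustrative only.
events = [
    {"action": "created",  "object_type": "invoice",  "day": date(2017, 2, 3)},
    {"action": "approved", "object_type": "invoice",  "day": date(2017, 2, 9)},
    {"action": "created",  "object_type": "contract", "day": date(2017, 2, 14)},
    {"action": "created",  "object_type": "invoice",  "day": date(2017, 1, 30)},
    {"action": "retired",  "object_type": "contract", "day": date(2017, 2, 27)},
]


def monthly_counts(events, year, month):
    """Count events in the given month, faceted by (action, object_type)."""
    return Counter(
        (e["action"], e["object_type"])
        for e in events
        if e["day"].year == year and e["day"].month == month
    )


february = monthly_counts(events, 2017, 2)
for (action, object_type), count in sorted(february.items()):
    print(f"{action:<9} {object_type:<9} {count}")
```

The hard part in practice is not the counting but getting events into a flat shape like this, which is what the rest of this post addresses.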

2. Reporting on Actions

Reporting on Actions can include requirements like:

  • How many times was this document viewed?
  • Who has viewed this document?
  • How long did the approver’s task for this document take to complete?

The majority of legacy reporting tools (like Cognos or Pentaho) focus on SQL reporting against a relational database, so the difficulty with most ECM systems centers on the inability to get reporting statistics out of the underlying, abstracted relational database. The way that Alfresco and Documentum abstract and normalize their databases makes it impractical to point these tools at the database and pull any meaningful data. Most ECM relational databases are laid out across a variety of different tables so that attributes can be added easily. This structure makes it difficult to report on document metadata.

Another major issue is typically the ECM tool’s built-in audit capabilities. While the audit capabilities exist, the resulting relational table within the database can quickly become overwhelmed (tens of millions of entries). As an example, Documentum provides a capability to turn on an audit trail for document viewing. While this would seem to address our requirement of knowing who has viewed the document, Documentum counts a view not when the document file is requested (through the API, for example) but whenever any of its metadata is viewed. For a search that returns 100 results in a list, the audit trail would have 100 entries, quickly filling the audit trail.

ECM reporting with Big Data Tools – a New Approach

For multiple clients, TSG has developed solutions leveraging Big Data Open Source tools to construct robust “out of the box” reporting. The solutions are constructed so as not to affect the performance or flexibility of the underlying ECM system. The individual components we recommend make up the Open Source ELK stack:

  • Elasticsearch – Robust Open Source index (built on Apache Lucene efforts) for indexing timestamped events/actions.
  • Logstash – Robust Open Source data collection pipeline for pushing data into Elasticsearch.
  • Kibana – Reporting tool for visualizing and navigating data indexed in Elasticsearch.

To illustrate how these tools can be leveraged for ECM reporting, consider the following two scenarios.

  • File Access – Clients have used the ELK stack by logging an event into a standalone log file every time a file is retrieved. Logstash then captures and streams that log data to Elasticsearch. The event includes all of the document’s attributes (e.g., Document Status, Vendor Name) as well as the username and time of the event. Leveraging the Kibana GUI, the business can quickly construct a report of how many documents of a certain type were retrieved as well as who is reading those documents.
  • Performance – Some of our clients have included additional information in the logging, such as performance data. While more difficult to capture, the retrieval time was calculated as the interval between when the file (or search) was requested and when the results were ultimately returned to the user. In this manner, administrators can get a clear understanding of system performance by time of day as well as by user activity.
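The two scenarios above can be sketched together as follows. This is a minimal illustration, not TSG's implementation: the `logged_retrieval` helper, the event field names and the sample values are all assumptions. It writes one JSON object per line, a format Logstash can be configured to parse, and it records the elapsed retrieval time alongside the document attributes.

```python
import io
import json
import time
from contextlib import contextmanager
from datetime import datetime, timezone


@contextmanager
def logged_retrieval(log, username, document_id, attributes):
    """Time the wrapped retrieval and append one JSON event line to `log`.

    All field names here are hypothetical; use whatever your reports need.
    """
    started = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - started) * 1000
        event = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "event": "getContent",
            "username": username,
            "document_id": document_id,
            "attributes": dict(attributes),
            "retrieval_ms": round(elapsed_ms, 1),
        }
        log.write(json.dumps(event) + "\n")


# Stand-in for the standalone log file that Logstash would read.
log = io.StringIO()

with logged_retrieval(
    log,
    username="jsmith",
    document_id="doc-001",
    attributes={"document_status": "Approved", "vendor_name": "Acme"},
):
    time.sleep(0.01)  # stand-in for fetching the file from the repository

event = json.loads(log.getvalue())
print(event["username"], event["attributes"]["vendor_name"], event["retrieval_ms"])
```

Writing the event in a `finally` clause means a failed retrieval is still logged, which may or may not be what a given report wants.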

[Image: Big Data dashboard]

Adding Big Data reporting to Documentum or Alfresco

To add this functionality to your existing ECM repository, clients have added hooks at the following places to create the “event” entries:

  • Documentum – TBO Code (for low level events like create, checkout, checkin, getContent, etc.)
  • Alfresco – Alfresco Behaviour (for low level events like create, checkout, checkin, getContent, etc.)
  • Application specific events (for logging when users click on certain actions in the UI in Webtop/D2)
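The real hooks are platform code (a Documentum TBO or an Alfresco Behaviour), so the following is only a language-neutral sketch of the pattern: low level operations fire named events, and a listener turns each one into a reporting entry. The `EventHooks` class and `record_event` function are hypothetical names, not part of either product's API.

```python
class EventHooks:
    """Tiny registry mapping low level event names to listener callbacks."""

    def __init__(self):
        self._listeners = {}

    def on(self, event_name, listener):
        """Register `listener` to be called whenever `event_name` fires."""
        self._listeners.setdefault(event_name, []).append(listener)

    def fire(self, event_name, **details):
        """Call every listener registered for `event_name`, if any."""
        for listener in self._listeners.get(event_name, []):
            listener(event_name, details)


captured = []  # stand-in for the reporting log file


def record_event(event_name, details):
    """Turn a fired event into a flat reporting entry."""
    captured.append({"event": event_name, **details})


hooks = EventHooks()
for name in ("create", "checkout", "checkin", "getContent"):
    hooks.on(name, record_event)

hooks.fire("checkout", username="jsmith", document_id="doc-001")
hooks.fire("delete", username="jsmith", document_id="doc-001")  # no listener registered

print(captured)
```

Only events with a registered listener produce entries, which keeps the reporting log limited to the actions the business actually wants to report on.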

Summary

ECM customers have always struggled with reporting.  By adding Big Data reporting, users can get better reporting while not affecting their ECM infrastructure.

Filed Under: Alfresco, Documentum, Product Suite
