Technology Services Group

The Problem – Born in the mailroom scanning paper

Most legacy capture vendors started out with the electronic capture of paper.  To automate paper capture in the mailroom, legacy scanning vendors concentrated on:

  • Scanning and Recognition – Components that are unique to mailroom scanning include scanning batches of documents, separator pages, bar-code reading, Optical Character Recognition and handwriting recognition.
  • Indexing – Screens for indexing documents.  Built for paper, these screens center on the recognition of characters and fields and use confidence levels to decide which data needs to be keyed.
  • Bulk Ingestion into ECM and Data platforms – Having grown up as a separate infrastructure, vendors would supply specific adapters for wherever the content flowed after it was indexed.  For example, Captiva had adapters for Documentum and Application Extender, along with many others, before being acquired by Documentum.
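
As a rough illustration of how confidence levels can drive keying, the sketch below flags any recognized field whose confidence falls under a threshold so that an indexer reviews it.  The `RecognizedField` type, the 0.85 threshold and the sample values are assumptions made for this example and do not describe any particular capture product.

```python
from dataclasses import dataclass


@dataclass
class RecognizedField:
    name: str
    value: str
    confidence: float  # 0.0 to 1.0, as a hypothetical recognition engine might report it


def fields_needing_keying(fields, threshold=0.85):
    """Return the names of fields whose confidence is below the threshold."""
    return [f.name for f in fields if f.confidence < threshold]


# Made-up recognition results for a single scanned invoice.
fields = [
    RecognizedField("invoice_number", "INV-1042", 0.97),
    RecognizedField("total", "1,2S0.00", 0.61),  # a misread digit lowers confidence
    RecognizedField("vendor", "Acme Corp", 0.90),
]

print(fields_needing_keying(fields))  # ['total']
```

Raising the threshold sends more fields to manual keying and lowers the chance of a bad value slipping through; lowering it does the opposite.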

Next-generation capture solutions need to do all of the above and more, taking advantage of affordable, readily accessible computing power through technologies like Machine Learning/Artificial Intelligence as well as cloud capabilities.

The Solution: OpenCapture – intelligent capture with machine learning

Legacy capture tools generally rely on two approaches to data capture:

  • Location Template Approach (example – DataCap, Kofax, InputAccel) – A template defines where data is located in a given document.  A zone denotes where a piece of data resides.  For example, the tool could be told to look in a given box in the top right corner of the header to pull the “Report Number” value.  This approach only works well when the position of the data is known and very consistent across all documents.  Templates need to be created for every type of captured document.
  • Key/Value Pair Template Approach (example – Ephesoft) – A second approach is to provide a Key/Value pair template.  In this approach, instead of defining the zonal position of the data, the tool is told to look for a given key, for example “Invoice Number”, and then looks at the surrounding text to pull the value, for example preferring text to the right of or underneath the key.  This approach works well when the target data may be anywhere within the document, but runs into problems when the Key text is inconsistent.  Using our invoice example, some vendors may display Invoice Number as Invoice Num, Invoice Nbr, Invoice #, etc.  Existing capture tools have approaches for minimizing this problem, but it is still an issue for many clients.
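
The two template approaches can be sketched in a few lines.  This is a minimal illustration rather than how any of the named products work: the `Word` type, the fractional page coordinates, the zone tuple and the alias list are all assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x: float  # left edge, as a fraction of page width (assumed coordinate system)
    y: float  # top edge, as a fraction of page height


def extract_by_zone(words, zone):
    """Location template: join the words positioned inside the zone."""
    left, top, right, bottom = zone
    hits = [w for w in words if left <= w.x <= right and top <= w.y <= bottom]
    hits.sort(key=lambda w: (w.y, w.x))  # reading order
    return " ".join(w.text for w in hits)


def extract_by_key(lines, key_aliases):
    """Key/value template: find a known key, take the text to its right,
    and fall back to the line underneath when nothing follows the key."""
    # Longest alias first so "Invoice Number" wins over "Invoice Num".
    ordered = sorted(key_aliases, key=len, reverse=True)
    alternation = "|".join(re.escape(k) for k in ordered)
    pattern = re.compile(rf"(?:{alternation})\s*[:\-]?\s*(.*)", re.IGNORECASE)
    for i, line in enumerate(lines):
        match = pattern.search(line)
        if match:
            value = match.group(1).strip()
            if value:
                return value
            if i + 1 < len(lines):
                return lines[i + 1].strip()
    return None


words = [
    Word("Report", 0.70, 0.05),
    Word("Number", 0.78, 0.05),
    Word("RPT-2291", 0.88, 0.05),
    Word("Summary", 0.10, 0.20),
]
top_right_box = (0.85, 0.0, 1.0, 0.1)  # left, top, right, bottom
print(extract_by_zone(words, top_right_box))  # RPT-2291

lines = ["Acme Corp", "Invoice Nbr: 88017", "Total due 1,250.00"]
aliases = ["Invoice Number", "Invoice Num", "Invoice Nbr", "Invoice #"]
print(extract_by_key(lines, aliases))             # 88017
print(extract_by_key(lines, ["Invoice Number"]))  # None
```

The last call shows the weakness described above: with only “Invoice Number” configured, a document that prints “Invoice Nbr” yields nothing until someone adds the variant to the template.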

Both approaches are typically augmented with additional processing to look up and verify extracted values against other systems (for example, a PO number or account number).  This processing can involve both configuration and customization, depending on requirements.
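
A verification step of this kind might look like the sketch below.  The in-memory set stands in for a lookup against a purchasing system, and the function name and normalization rules are assumptions for illustration only.

```python
def verify_po_number(extracted, known_po_numbers):
    """Normalize an extracted PO number and report whether it is known."""
    normalized = extracted.strip().upper().replace(" ", "")
    return normalized, normalized in known_po_numbers


# Stand-in for a query against a purchasing system.
known_po_numbers = {"PO-10045", "PO-10046"}

print(verify_po_number(" po-10045 ", known_po_numbers))  # ('PO-10045', True)
print(verify_po_number("PO-1OO45", known_po_numbers))    # ('PO-1OO45', False)
```

The second call fails verification because the letter O was recognized in place of zeros, which is the kind of value that would be routed to a user for correction.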

OpenCapture combines the above approaches and adds Machine Learning to handle incorrect extractions that users correct during indexing.  With legacy tools, an error that is manually corrected on one document will recur on the next similar document unless the algorithm or template is changed, and that change requires a manual administrative update to the template or an entirely new template.  In practice, this means that templates aren’t updated for most corrected extraction mistakes, which leads to user frustration.  OpenCapture instead leverages machine learning to adjust the template/approach based on those corrections, gradually reducing the indexing effort for subsequent documents.
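
The idea of learning from corrections can be illustrated with a deliberately simple stand-in.  The sketch below is not OpenCapture's implementation and is not a real machine learning model: it only counts which label sits in front of the value a user keyed and promotes that label to a key alias once enough corrections agree.  The class name, the `min_votes` parameter and the "label: value" line format are assumptions for the example.

```python
from collections import Counter


class LearningKeyTemplate:
    """Key/value template that picks up new key labels from user corrections."""

    def __init__(self, aliases, min_votes=2):
        self.aliases = {a.lower() for a in aliases}
        self.min_votes = min_votes
        self.votes = Counter()

    def extract(self, lines):
        """Return the value on the first 'label: value' line with a known label."""
        for line in lines:
            label, sep, value = line.partition(":")
            if sep and label.strip().lower() in self.aliases:
                return value.strip()
        return None

    def record_correction(self, lines, corrected_value):
        """Count the label in front of the value the user keyed and promote
        it to an alias once min_votes corrections agree."""
        for line in lines:
            label, sep, value = line.partition(":")
            if sep and value.strip() == corrected_value:
                key = label.strip().lower()
                self.votes[key] += 1
                if self.votes[key] >= self.min_votes:
                    self.aliases.add(key)
                return


template = LearningKeyTemplate(["Invoice Number"])
doc1 = ["Vendor: Acme", "Invoice Nbr: 88017"]
doc2 = ["Vendor: Bolt", "Invoice Nbr: 90233"]
doc3 = ["Vendor: Cole", "Invoice Nbr: 91555"]

print(template.extract(doc1))              # None, so the user keys 88017
template.record_correction(doc1, "88017")
print(template.extract(doc2))              # None, one correction is not enough
template.record_correction(doc2, "90233")
print(template.extract(doc3))              # 91555, learned without an admin edit
```

Requiring more than one agreeing correction keeps a single mistyped value from changing the template for everyone.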

Focusing on modern technologies, OpenCapture does more than just intelligently extract content.  OpenCapture and Capture 2.0 use machine learning so that the indexing components learn over time and achieve better results.  See our blog for more information about OpenCapture.

Copyright © 2023 · Technology Services Group, Inc.
