Technology Services Group

Redaction for AWS, Alfresco, Documentum and Hadoop – Bulk Redaction upon Ingestion or Migration

October 18, 2018

As presented in our Redacting Roadmap earlier this year, one of this quarter's product development additions to OpenMigrate is the ability to bulk redact incoming documents as part of an ingestion or migration into Alfresco, Documentum, AWS, or Hadoop. As that roadmap detailed, both OpenMigrate and the OpenContent Management Suite will have capabilities for redacting specific values. This post demonstrates how OpenMigrate can redact content during ingestion or migration, with a focus on a case management scenario.

OpenMigrate Redaction Capabilities

OpenMigrate is one of the most successful enterprise migration tools for Documentum and Alfresco. It uses a high-throughput, multi-threaded, configurable approach to migrate content to, from, or within a variety of repositories (e.g. FileNet, CMOD, OpenText, and others), as well as cloud platforms such as Azure and Amazon Web Services S3. With the new capabilities added to OpenMigrate, the following redaction scenario is supported:

  • The document is extracted from either an ECM (Alfresco, Documentum, FileNet, etc.) or a file system
  • If the document is PDF Text or PDF Image with Text, redaction processing can occur immediately
  • If the document is not PDF Text or PDF Image with Text (e.g. TIF, PDF Image, or Microsoft Word), a text-searchable PDF is created leveraging Adlib, Nuance, or another vendor-specific transformation tool
  • The text-searchable PDF is analyzed, and any text matching a configured pattern is redacted. Patterns could include credit card numbers, social security numbers, or phone numbers
  • The text-searchable PDF is also analyzed for specific values tied to that particular document through its metadata, and those values are redacted. These could include case file names, addresses, or other metadata associated with the PDF document
  • The redacted document is stored in the repository either as a redacted copy or as the primary document. The original document can also be stored in the repository to support evidence rules as required
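The steps above can be sketched in a few lines of Python. This is a minimal illustration of the flow, not OpenMigrate's configuration or API: the format labels, the pattern set, the `[REDACTED]` mask, and the function names are all assumptions made for the example, and it operates on already-extracted text rather than on a real PDF.

```python
import re

# Illustrative labels for formats that already carry a text layer (assumed names).
TEXT_SEARCHABLE = {"pdf_text", "pdf_image_with_text"}

# Hypothetical pattern set; a real deployment would tune these expressions.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

MASK = "[REDACTED]"


def needs_conversion(doc_format):
    """Return True when the document must first become a text-searchable PDF."""
    return doc_format.lower() not in TEXT_SEARCHABLE


def redact_patterns(text, patterns=PATTERNS):
    """Mask every match of every configured pattern."""
    for pattern in patterns.values():
        text = pattern.sub(MASK, text)
    return text


def redact_values(text, values):
    """Mask known metadata values (e.g. a name or address), ignoring case."""
    for value in values:
        if value:  # skip empty metadata fields
            text = re.sub(re.escape(value), MASK, text, flags=re.IGNORECASE)
    return text


def redact_document(text, metadata_values):
    """Apply pattern redaction first, then metadata-driven redaction."""
    return redact_values(redact_patterns(text), metadata_values)
```

For example, `redact_document("Case Smith v. Jones, SSN 123-45-6789", ["Smith v. Jones"])` would mask both the case name and the social security number. A production tool would apply the same matches to the text layer and page regions of the PDF itself rather than to a plain string.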

For our demonstration scenario, we will migrate documents for a medical case folder. In this scenario, social security numbers are redacted automatically based on a pattern, and other personally identifiable information (PII) for the patient, such as the patient’s name, is redacted based on metadata defined for the document. As part of the ingestion process, OpenMigrate imports the medical case files from an Excel file that contains the documents’ metadata, including the patient name and patient ID. To support privacy rules, the patient ID will be the only property stored in the target ECM repository, and the patient name and other PII will be automatically redacted from the documents upon ingestion with OpenMigrate.
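The demonstration scenario can be approximated as follows. This is a sketch under stated assumptions: the metadata is read from CSV text instead of an Excel file so the example needs only the standard library, and the column names `patient_id` and `patient_name`, the mask string, and the function names are hypothetical rather than taken from OpenMigrate.

```python
import csv
import io
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
MASK = "[REDACTED]"


def load_metadata(csv_text):
    """Parse metadata rows; stands in for reading the Excel import file."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def prepare_for_ingestion(row, document_text):
    """Return the redacted text and the properties to store in the target."""
    # Pattern-based redaction: social security numbers.
    redacted = SSN_PATTERN.sub(MASK, document_text)
    # Metadata-based redaction: the patient's name from the import row.
    name = row["patient_name"].strip()
    if name:
        redacted = re.sub(re.escape(name), MASK, redacted, flags=re.IGNORECASE)
    # Only the patient ID is carried forward as a repository property.
    properties = {"patient_id": row["patient_id"]}
    return redacted, properties


# Hypothetical sample data for illustration only.
SAMPLE_METADATA = "patient_id,patient_name\nP-1001,Jane Doe\n"
SAMPLE_TEXT = "Patient Jane Doe, SSN 123-45-6789, was admitted."

rows = load_metadata(SAMPLE_METADATA)
redacted_text, stored_properties = prepare_for_ingestion(rows[0], SAMPLE_TEXT)
```

The key design point is that the patient name is used during processing to drive redaction but is deliberately left out of the properties written to the repository.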

Summary

Redacting documents as part of an ongoing ingestion or migration is a common request. OpenMigrate can now both redact by pattern for common fields like social security numbers and redact known values for specific fields (such as patient name). Look here for future posts as TSG continues to add capabilities, including redaction based on computed values (e.g. dates older than 18 years, such as birthdates) and analysis of documents for additional values (such as incident date) that could be extracted from them.
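As an illustration of what a rule like "dates older than 18 years" might involve, the sketch below masks dates in a document's text that fall at least 18 years before a reference date. It is a guess at the general technique, not a description of TSG's planned implementation: the `MM/DD/YYYY` format, the threshold parameter, and the function names are assumptions.

```python
import re
from datetime import date

# Assumes dates are written as MM/DD/YYYY; other formats would need more patterns.
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def years_between(earlier, later):
    """Whole years elapsed from earlier to later."""
    years = later.year - earlier.year
    if (later.month, later.day) < (earlier.month, earlier.day):
        years -= 1
    return years


def redact_old_dates(text, today, min_age_years=18, mask="[REDACTED]"):
    """Mask dates that are at least min_age_years before today."""

    def replace(match):
        month, day, year = (int(group) for group in match.groups())
        try:
            found = date(year, month, day)
        except ValueError:
            return match.group(0)  # not a real calendar date; leave untouched
        if years_between(found, today) >= min_age_years:
            return mask
        return match.group(0)

    return DATE_PATTERN.sub(replace, text)
```

Called as `redact_old_dates("Born 04/12/1975, seen 09/30/2018.", date(2018, 10, 18))`, this masks the 1975 date and leaves the recent visit date in place. Age alone cannot tell a birthdate from any other old date, so a real rule would likely combine this with surrounding context.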

See our previous posts for how documents can be redacted once already in the system with either manual redaction leveraging OpenAnnotate or Case Field Redaction leveraging the OpenContent Management Suite.
