Technology Services Group


Redacting PDF – What did the Manafort Lawyers do wrong?


January 29, 2019

During a demo this week for a potential OpenAnnotate redaction customer, the question was raised: “So you guys don’t do redaction like the lawyers for Paul Manafort did, do you?”  To help explain how to redact PDF documents and avoid that kind of unsuccessful redaction, we thought we would provide some additional detail on how to redact (and not redact) PDF documents.

Manafort Lawyer Redacting – What went wrong?

The issue occurred recently when Manafort’s lawyers filed a response to special counsel Robert Mueller’s team’s allegations that Manafort lied to prosecutors.  On pages five, six, seven and nine, either the lawyers or the special counsel staffers attempted to redact sensitive passages.  While the redaction blocks prevent the words from being read at first glance, anyone with Adobe Acrobat, another PDF viewing tool, or even a browser-based viewer could copy the text that still existed under the redaction blocks and paste it into another document to read the passages that had been redacted.
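To make the copy-and-paste problem concrete, here is a minimal sketch in Python. It assumes a hypothetical, uncompressed page content stream (`PAGE_CONTENT` is made up for illustration and is not taken from the actual filing) and a deliberately simplified extractor. Real PDFs usually compress their streams and encode text in more ways than this toy pattern handles, so treat it as an illustration of the idea rather than a working PDF parser.

```python
import re
from typing import List

# Hypothetical page content: one line of text, followed by a filled
# black rectangle painted over the same area of the page.
PAGE_CONTENT = (
    "BT /F1 12 Tf 72 700 Td (Sensitive passage) Tj ET\n"  # show the text
    "0 0 0 rg 70 695 200 16 re f\n"                       # black box on top
)

# Matches simple literal strings shown with the Tj operator.
# Assumption: no escapes, nested parentheses, hex strings, or TJ arrays.
TJ_PATTERN = re.compile(r"\(([^()\\]*)\)\s*Tj")


def extract_text(content: str) -> List[str]:
    """Return every string shown with Tj, regardless of what is drawn on top."""
    return TJ_PATTERN.findall(content)


print(extract_text(PAGE_CONTENT))  # ['Sensitive passage']
```

The rectangle changes what is painted on the page, but the text operand is still sitting in the content, which is what a viewer hands back on copy and paste.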

Redacting Text from a PDF – Our guess at what went wrong

A PDF can hold several different kinds of content, and the kind involved could have played a role in how the redaction went wrong.  Typically a document that is scanned in is referred to as a PDF – Image.  This document, like a fax, is made up only of black and white dots (or color dots if a color scanner is used) and does not contain any text for copy and paste.  Redacting this type of document can just involve converting all of the dots in and around the image of the text that should be redacted to black.  Given how many documents lawyers scan for signatures, we would imagine that whoever did the redacting thought this was a scanned document, as just drawing boxes over the text would have successfully hidden the black and white dots that make up the words.
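For an image-only page, the "convert the dots to black" idea can be sketched as follows. This is a toy example on a tiny grayscale grid. The function name `redact_region` and the pixel values are assumptions for illustration, not part of any particular imaging library.

```python
BLACK = 0
WHITE = 255


def redact_region(pixels, top, left, bottom, right):
    """Return a copy of a grayscale image with the rectangle
    [top, bottom) x [left, right) painted black."""
    redacted = [row[:] for row in pixels]  # leave the input untouched
    for y in range(max(top, 0), min(bottom, len(redacted))):
        row = redacted[y]
        for x in range(max(left, 0), min(right, len(row))):
            row[x] = BLACK  # the original dot value is overwritten
    return redacted


# A 3x4 "scan": each number is one dot.
page = [
    [WHITE, BLACK, WHITE, BLACK],
    [BLACK, WHITE, BLACK, WHITE],
    [WHITE, WHITE, BLACK, BLACK],
]

redacted = redact_region(page, top=0, left=1, bottom=2, right=3)
for row in redacted:
    print(row)
```

Because the dots themselves are replaced, the redacted copy holds nothing to recover inside the blacked-out region, provided the original image is not also kept in the file.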

Unfortunately, there are two other types of PDF documents that, in addition to the image of the text, contain text data.  In these PDFs, the text data is what allows searching and copy-pasting of the document’s text.

Such documents can be created in two ways.  Either the image document is run through an Optical Character Recognition (OCR) module and the text is embedded behind the image to enable search and other text capabilities like copy and paste, or the document is created from a word processor or other font-capable program directly as a PDF that includes text and fonts.  From our quick review, we would guess that the Manafort document was created directly and never scanned, as it is very clean (no stray dots from scanning) and very small compared to an image of text.  In either of these cases, simply drawing a box over the words does not remove the text from underneath.
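The difference between covering text and removing it can be sketched in Python on a toy content stream. Everything here is an assumption for illustration: the sample content, the simplified `Tj` pattern, and the `cover_only` and `redact` helpers are hypothetical, and this is not a description of how OpenAnnotate or any other product implements redaction.

```python
import re
from typing import List

# Simplified pattern for literal strings shown with the Tj operator.
# Assumption: no escapes, hex strings, TJ arrays, or compressed streams.
TJ_PATTERN = re.compile(r"\(([^()\\]*)\)\s*Tj")


def extract_text(content: str) -> List[str]:
    """Return the strings a viewer could still select and copy."""
    return TJ_PATTERN.findall(content)


def cover_only(content: str, box: str) -> str:
    """Draw a box on top of the page; the text underneath is untouched."""
    return content + box


def redact(content: str, secret: str, box: str) -> str:
    """Drop every text-showing operation containing `secret`, then draw the box."""
    def drop_if_secret(match):
        return "" if secret in match.group(1) else match.group(0)
    return TJ_PATTERN.sub(drop_if_secret, content) + box


content = (
    "BT /F1 12 Tf 72 700 Td (Public sentence.) Tj "
    "0 -16 Td (Sensitive passage) Tj ET\n"
)
box = "0 0 0 rg 70 679 200 16 re f\n"  # black rectangle over the second line

print(extract_text(cover_only(content, box)))
# ['Public sentence.', 'Sensitive passage']
print(extract_text(redact(content, "Sensitive", box)))
# ['Public sentence.']
```

Both outputs would look the same on screen, since the box hides the second line either way. Only the second one leaves nothing behind to copy.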

OpenAnnotate Redaction

In the short demo below, we will use the Manafort document with OpenAnnotate to show the difference between true text redaction and just drawing blocks over the text.

Thanks again for reading.  Let us know if you have any additional thoughts or comments below.

Filed Under: Adobe, GDPR, OpenAnnotate

