I presented last week at the Document Strategy Forum in a session, “Cloud Wars – Top Trends in Cloud Content Management for 2018”, with representatives from Box and Newgen, moderated by Joe Shepley from Doculabs. Joe concluded the session with the question: what is the major trend that you see affecting this space in the next 3 to 5 years? My response was the cloud object store, particularly offerings from Amazon, Microsoft and Google. This post will discuss those thoughts as well as how ECM vendors should best position themselves to take advantage of the disruption.
What is a Cloud Object Store?
The concept of an object store evolved in the 1990s and was productized by a variety of hardware vendors from 1995 to 2013, according to Wikipedia. As it relates to ECM, before object storage or Storage Area Networks, many ECM systems had to define their own file storage systems. For many clients, TSG would help decide on RAID technologies and had to deal with the intricacies of the ECM system owning both the database and file storage. Object stores simplified the infrastructure: the lower levels of storage were maintained by the object store and abstracted from the ECM system.
As it relates to ECM, object stores are capable of storing both the object itself as well as meta-data about the object. Storing both components together provides some benefits for distribution of the object as well as recoverability. Amazon, Microsoft, Google, IBM, Oracle and others are now offering object store capabilities via the cloud to allow a variety of different applications, including ECM, to leverage robust and cost-effective storage of all types of files and objects.
How will Cloud Object Storage Disrupt ECM?
We are seeing some early signs from multiple clients that Cloud Object Storage will have a major impact on ECM/Content Services systems. Before cloud based object storage, typical on-premise implementations would have the ECM system “owning” all the files and applications calling the ECM system to store and manage the file. Typical ingestion and retrieval looked something like:
- ECM Application or API is called to store the file from the user
- ECM system stores the document and meta-data
- ECM Application or API is called to retrieve the file
- ECM system delivers the file to the user
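The traditional flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TraditionalEcm` and its methods are hypothetical names, not any vendor's API, and in-memory dictionaries stand in for the ECM file store and database. The point it shows is that every byte of the file passes through the ECM tier on both ingestion and retrieval.

```python
import uuid


class TraditionalEcm:
    """Hypothetical stand-in for an ECM system that owns files and meta-data."""

    def __init__(self):
        self._files = {}            # document id -> file bytes (ECM-owned storage)
        self._metadata = {}         # document id -> meta-data (ECM database)
        self.bytes_through_ecm = 0  # running total of bytes handled by the ECM tier

    def store(self, content: bytes, metadata: dict) -> str:
        """Steps 1-2: the application hands the file to the ECM, which stores it."""
        doc_id = uuid.uuid4().hex
        self._files[doc_id] = content
        self._metadata[doc_id] = dict(metadata)
        self.bytes_through_ecm += len(content)
        return doc_id

    def retrieve(self, doc_id: str) -> bytes:
        """Steps 3-4: the ECM reads the file and delivers it to the user."""
        content = self._files[doc_id]
        self.bytes_through_ecm += len(content)
        return content


ecm = TraditionalEcm()
video = b"x" * 1_000_000  # made-up 1 MB payload standing in for a large file
doc_id = ecm.store(video, {"title": "Training video"})
assert ecm.retrieve(doc_id) == video
print(ecm.bytes_through_ecm)  # 2000000: the file crossed the ECM tier twice
```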
With Cloud Based Object Stores, we see the Object Store as more in control of the file itself with the ECM system holding the key. Typically, performance of storing and viewing the file, particularly large files like video, can be severely impacted by bandwidth. With more and more external users or users working remotely, cloud based object stores have a huge advantage in ingesting and delivering content. Cloud based object stores provide much better bandwidth for ingestion and transmission (along with streaming) than ECM systems where the file needs to be processed through the application server. In our benchmark tests with Alfresco and AWS for large migrations, we found ingestions leveraging the object store and linking to Alfresco to be 10x or more efficient than the traditional approach.
To take advantage of the bandwidth of cloud object stores, ECM storage and retrieval can look something like:
- Application stores the document/file and meta-data in the object store
- ECM API is called with the object key to store the meta-data
- ECM Application or API is called to retrieve the file
- ECM API passes the object key to the Application
- Application retrieves the object directly from the object store
Content Retrieval becomes more streamlined with the “key” to the object being provided by the ECM system.
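A minimal Python sketch of this key-based flow follows. Everything in it is an assumption made for illustration: `ObjectStore` and `LinkingEcm` are hypothetical in-memory stand-ins, not the API of S3, Azure, Google or any ECM product. It shows the ECM holding only the meta-data and the object key while the application moves the bytes directly to and from the object store.

```python
import uuid


class ObjectStore:
    """Hypothetical in-memory stand-in for a cloud object store."""

    def __init__(self):
        self._objects = {}  # object key -> (content, meta-data)

    def put(self, content: bytes, metadata: dict) -> str:
        key = uuid.uuid4().hex
        self._objects[key] = (content, dict(metadata))
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key][0]


class LinkingEcm:
    """Hypothetical ECM that keeps meta-data and the object key, never the bytes."""

    def __init__(self):
        self._records = {}  # document id -> {"object_key": ..., "metadata": ...}

    def register(self, object_key: str, metadata: dict) -> str:
        doc_id = uuid.uuid4().hex
        self._records[doc_id] = {"object_key": object_key, "metadata": dict(metadata)}
        return doc_id

    def object_key_for(self, doc_id: str) -> str:
        return self._records[doc_id]["object_key"]


store = ObjectStore()
ecm = LinkingEcm()
content = b"large video bytes"
metadata = {"title": "Training video"}

key = store.put(content, metadata)       # application stores file and meta-data
doc_id = ecm.register(key, metadata)     # ECM API is called with the object key
key_from_ecm = ecm.object_key_for(doc_id)  # ECM passes the key to the application
retrieved = store.get(key_from_ecm)      # application reads directly from the store
assert retrieved == content
```

In a real deployment the ECM would more likely hand out something like a time-limited signed URL than a raw key, so that it can still enforce security rules on who may fetch the object; that detail is left out of the sketch.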
Cloud based object stores already provide encryption, redundancy, de-duping, transformation and better bandwidth capabilities than any ECM system by itself. Many are adding meta-data, analysis, security, versioning and other ECM-like capabilities to the object store itself.
Which Cloud Object Store vendors are positioned for the best success?
We would predict the following about several of the cloud object store vendors:
- Amazon Web Services – has the best brand with S3 and will always be competitive on cost based on history of the company. Still is the clear leader in the Infrastructure as a Service (IaaS) space with estimated 50% growth year over year. Has the largest current client footprint both for consumer as well as business usage. For those clients worried about cloud vendors having access to content, Amazon provides a “bring your own encryption key”, which ensures Amazon cannot read the file contents.
- Microsoft Azure – a strong second place. Will look to leverage existing business relationships and the Office Online offering. Growing faster than Amazon, but from a significantly smaller number.
- Google – the wildcard. Not sure businesses trust the brand to not capture data, particularly with sensitive documents.
- Oracle – would anticipate success within the Oracle client base. They have a Cloud ECM Biz App offering.
- IBM and Dell (EMC) – will try to continue to leverage their on-premise positioning to extend to the cloud. Their only sale will be to dedicated IT, and we would see it as difficult for them to compete with Amazon or Microsoft for the rest of the IaaS offering. Hard to see them being successful given the amount of revenue they generate with on-premise hardware and having to cannibalize it for the cloud.
- Box – might consider themselves an object store but more of a collaboration utility. Lacks their own infrastructure to compete with the bigger players. Will have a strategy that adds onto the capabilities of Azure and Amazon but can’t price as low as direct.
How will ECM vendors adapt?
ECM vendors should feel threatened as more and more of the typical ECM vault capabilities are being moved into the object store. They are also competing with the biggest players in the IT space with an unlimited cash reserve and the technical chops to add whatever they feel would add value. ECM vendors that hope to survive the disruption of the cloud object store need to find a way to add value to the object store rather than compete as a file storage mechanism and take advantage of the fact that they are not specifically tied to just one cloud storage vendor. Additional value that our team brainstormed for ECM vendors includes:
- Object Store Support – Adding capabilities to manage the links to a variety of object stores to allow easy leverage for both cloud native as well as on-premise/cloud hybrid approaches. Should support both storing through the ECM system as in the traditional approach as well as storing in the object store and linking to the ECM system.
- Indexing – Adding capabilities to scan object stores and capture keys based on business rules to provide for the management of a diverse community uploading content.
- Applications – Interfaces should be able to take advantage of the bandwidth of the object store for retrieval and ingestion.
- Security – Providing a buffer to limit access to certain documents/objects based on business rules.
- Audit Trail – Ability to see activity against the object store by application and user.
- Lifecycle and Versioning – More than simple versioning, robust lifecycle with version and security support.
- User Alerts – Ability to add user alerts for when content in the object store is updated, modified or accessed.
- Analyzing the content of the store – We do see the cloud vendors pushing more analytics. Ability to apply analytics across cloud vendors or deliver more benefits to the object store itself.
- Overlays – Ability to deliver content with metadata overlays.
- Redaction – Ability to redact information from the content. Could include scanning the object store(s) for additional content to be redacted.
- Object Relationships – Associating objects to other objects based on meta-data to provide quicker access to the object store.
- Monitoring – While multiple applications might be directly storing content in object stores, ECM tools having the keys could provide monitoring of the object store and added value.
- Records Management – Adding a DOD certified records management capability on top of the object store would add value for corporate compliance.
- Transformation – Ability to render content into different formats, with own tools or cloud tools and store additional objects. Note that Amazon is in this space as well with their Elastic Transcoder product, but currently it is only available for media files.
- Annotation – Ability to securely annotate different documents and store the annotation layers with security.
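As one illustration of the list above, the Indexing item (scanning an object store and capturing keys based on business rules) could be sketched roughly as follows. The `StoredObject` type, the `index_by_rules` helper, the rule names and the sample listing are all hypothetical and made up for this example; a real implementation would page through the object store's own listing API instead of a Python list.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class StoredObject:
    """Hypothetical view of one object-store entry: its key and meta-data."""
    key: str
    metadata: Dict[str, str]


Rule = Callable[[StoredObject], bool]


def index_by_rules(objects: Iterable[StoredObject],
                   rules: Dict[str, Rule]) -> Dict[str, List[str]]:
    """Scan the listing once and capture the keys that match each business rule."""
    index: Dict[str, List[str]] = {name: [] for name in rules}
    for obj in objects:
        for name, rule in rules.items():
            if rule(obj):
                index[name].append(obj.key)
    return index


# Made-up listing standing in for the contents of an object store.
listing = [
    StoredObject("invoices/2018/0001.pdf", {"department": "finance"}),
    StoredObject("videos/onboarding.mp4", {"department": "hr"}),
    StoredObject("invoices/2018/0002.pdf", {"department": "finance"}),
]

# Made-up business rules: each maps a rule name to a predicate on the object.
rules = {
    "finance_invoices": lambda o: o.key.startswith("invoices/")
    and o.metadata.get("department") == "finance",
    "media": lambda o: o.key.endswith(".mp4"),
}

index = index_by_rules(listing, rules)
print(index["finance_invoices"])  # ['invoices/2018/0001.pdf', 'invoices/2018/0002.pdf']
print(index["media"])             # ['videos/onboarding.mp4']
```

Keeping the rules as plain predicates means the same scan could be pointed at listings from more than one object store, which fits the multi-vendor positioning discussed in this post.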
While all of the above capabilities might eventually be added by the cloud vendors, ECM vendors have a significant head start in understanding the requirements and in documented success. Their ability to provide these capabilities across both on-premise storage and multiple cloud object storage vendors could help them maintain their footprint in providing ECM content services.
Summary
Cloud Object Storage already does, and will continue to do, some of the things that ECM systems have typically done, and it can leverage better bandwidth for ingestion and distribution of content. Successful ECM vendors need to adapt and evolve from managing the files themselves to linking to the files, and to provide additional capabilities not found in the cloud object store. To compete against the big vendors of cloud storage, ECM vendors need to differentiate themselves with distinct capabilities as well as their ability to work with multiple cloud vendors.
Let me know your thoughts below: