IBM recently announced a new Quick Start capability for FileNet to run in AWS. This post shares our thoughts and tries to better explain the “cloud native” concept by comparing FileNet’s offering, Alfresco’s offering, and TSG’s own DynamoDB offering, with some detail on how each of the three runs in Amazon Web Services.
FileNet – Understanding Kubernetes and Containerization
Understanding the FileNet AWS cloud offering requires an understanding of Kubernetes. Google developed Kubernetes as an application container orchestration platform for configuration and automation of its production workloads and open sourced it in 2014. Kubernetes is a framework for organizing and managing containers which run application code. Kubernetes can monitor container execution and the underlying servers, and take corrective action to keep an application deployment stable. Collectively, these functions are known as a control plane. The FileNet Quick Start deploys containerized FileNet application code within AWS’ Elastic Kubernetes Service (EKS). For the Quick Start, EKS is the management control plane that runs and organizes the containers. The Quick Start specifies how the containers are deployed, how storage is mounted, and how the containers are monitored, load balanced and scaled.
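The "monitor and take corrective action" behavior described above is often explained as a reconciliation loop: compare the desired state with the observed state and emit whatever actions close the gap. The following is a toy Python sketch of that idea only. It is not Kubernetes code, and the `reconcile` function and the `cpe-N` container names are hypothetical.

```python
from typing import Dict, List


def reconcile(desired_replicas: int, running: Dict[str, bool]) -> List[str]:
    """Return the actions that move the observed state toward the desired state.

    `running` maps a (hypothetical) container name to True if it is healthy.
    """
    names = sorted(running)
    keep, surplus = names[:desired_replicas], names[desired_replicas:]

    # Scale down: stop anything beyond the desired replica count.
    actions = [f"stop {name}" for name in surplus]
    # Self-heal: restart kept containers that failed their health check.
    actions += [f"restart {name}" for name in keep if not running[name]]
    # Scale up: start new containers until the desired count is reached.
    actions += [f"start cpe-{i}" for i in range(len(keep), desired_replicas)]
    return actions


if __name__ == "__main__":
    # One unhealthy container and one missing replica.
    print(reconcile(3, {"cpe-0": True, "cpe-1": False}))
```

A real control plane runs this kind of comparison continuously against live cluster state; the sketch evaluates it once against a dictionary.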
AWS offers the option to run a single container or clusters of containers on EC2 servers using AWS EKS or AWS ECS (Elastic Container Service). A newer service from AWS is Fargate, which runs containers in a serverless architecture. In a serverless configuration, the user only needs to specify the memory and CPU requirements of the container; AWS figures out how to organize and manage the server resources the containers require.
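To make the "specify only memory and CPU" point concrete, here is a minimal sketch of the request body one might build for an ECS task definition that targets Fargate. The helper name, the task family, and the image name are hypothetical, and this is not part of the FileNet Quick Start, which uses EKS. Note that nothing in the request names an EC2 instance type.

```python
from typing import Any, Dict


def fargate_task_definition(family: str, image: str,
                            cpu_units: int, memory_mib: int) -> Dict[str, Any]:
    """Build keyword arguments shaped for the ECS RegisterTaskDefinition call.

    Fargate only accepts certain CPU/memory pairings; this sketch does not
    validate them.
    """
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",      # the network mode Fargate tasks use
        "cpu": str(cpu_units),        # task-level CPU units, passed as a string
        "memory": str(memory_mib),    # task-level memory in MiB, as a string
        "containerDefinitions": [
            {"name": family, "image": image, "essential": True}
        ],
    }


if __name__ == "__main__":
    # Hypothetical family and image names, for illustration only.
    print(fargate_task_definition("demo-app", "example/demo-app:latest", 512, 1024))
```

With boto3 installed and credentials configured, a dictionary like this could be passed as `boto3.client("ecs").register_task_definition(**task)`; a real deployment would usually also need an execution role and logging configuration.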
Why Containerization?
Software vendors including IBM and Alfresco are moving towards containerization to improve the portability and reliability of their applications as well as increase speed to market. Containers are preferred over virtual machines because they are easier to deploy: the container includes all the software needed to run the application it contains, including the operating system libraries the application depends on (the container shares the host's kernel rather than shipping its own). When a vendor releases a containerized version of the software, it is virtually guaranteed to run in any environment that has supporting container hosting software, like AWS EKS. These characteristics make deploying containerized applications, whether internally at a company or to multiple cloud providers (cloud-neutral), well defined and, in many cases, automated through what is known as a DevOps pipeline.
How Does the FileNet Quick Start Leverage AWS?
The FileNet Quick Start leverages several AWS services but essentially lifts and shifts a system that has traditionally run on-premise so that, with containerization, it can now run in the AWS cloud or other clouds. The Quick Start automates a FileNet P8 deployment; the automated deployment takes 2 hours and is definitely an improvement over a manual deployment. The Quick Start uses several existing AWS Quick Starts as a foundation: Kubernetes, VPC, and Microsoft Active Directory.
Even though the FileNet Quick Start architecture takes advantage of core AWS services, namely EC2, RDS (Oracle-only), EFS, and S3, it uses those services in much the same way that corresponding cloud services might be used in another public or private cloud. Essentially, by wrapping the central FileNet Content Manager software in containers, IBM has made the deployment cloud neutral, or cloud capable, rather than cloud native.
To execute the Quick Start template, the user must first have an entitlement key from IBM. This key allows downloading of the FileNet Content Manager containers. The user also needs to download and post the database drivers for Oracle to an S3 bucket since they were not included in the FileNet container build.
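The driver upload step above can be scripted. The sketch below assumes a boto3-style S3 client (whose `upload_file` method takes `Filename`, `Bucket`, and `Key`); the bucket name, the `drivers/` prefix, and the jar file name are all hypothetical, and the key layout the Quick Start template actually expects should be taken from its documentation.

```python
from pathlib import Path


def upload_driver(s3_client, driver_path: str, bucket: str,
                  prefix: str = "drivers/") -> str:
    """Upload a local JDBC driver file to S3 and return the object key used.

    `s3_client` is expected to behave like `boto3.client("s3")`.
    The default prefix is an assumption, not a Quick Start requirement.
    """
    key = prefix + Path(driver_path).name
    s3_client.upload_file(Filename=driver_path, Bucket=bucket, Key=key)
    return key


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   upload_driver(boto3.client("s3"), "/tmp/ojdbc8.jar", "my-quickstart-bucket")
```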
From our review, we would only recommend the FileNet cloud offering for those clients that want to “lift and shift” their existing P8 instance to Amazon. Some significant concerns with this approach:
- Amazon Services – Since FileNet P8 is wrapped much like an on-premise installation, FileNet doesn’t take advantage of AWS native services that are used by Alfresco and TSG, including Amazon Aurora, DynamoDB, Elasticsearch and S3 for content storage durability.
- Support – We anticipate that the majority of FileNet installations will remain onsite. While containerization does isolate FileNet from the AWS native services, when something goes wrong we expect that getting IBM support would be particularly difficult, given the small and unique group of customers running the AWS cloud version. See our related post on FileNet Support.
- Pricing – Given a “lift and shift”, the cloud version wouldn’t reduce the price of FileNet. While AWS might offer better server and infrastructure costs, most times the hardware used for legacy on-premise FileNet installations has already been purchased and depreciated. We anticipate the costs of the software will remain the same with additional costs for AWS and for migration/testing/support efforts for the new environment.
- Migration – Due to how P8 encrypts content links, we anticipate that migrations will have to use an API approach to retrieve and store content, making the migration more difficult as well as time consuming. Costs for migration would be an issue as well. See our related post on FileNet Migrations.
- Uses network cloud storage instead of object storage for content – the Quick Start uses Elastic File System (EFS) instead of S3. Using EFS for critical content storage may pose a risk since AWS does not have the same safeguards or prompts around deleting EFS file systems as it has around S3 buckets. It is fairly simple to delete an EFS file system through the console, whereas for S3, AWS will prompt and require the user to enter the bucket name, and if the bucket is not empty it will warn again. TSG recommends that when using EFS for critical content, an IAM policy be put in place to tightly control who or what can delete EFS file systems.
- No Solr or Elasticsearch – The Quick Start deploys with FileNet P8’s out-of-the-box Search Service, which is based on Verity. Verity is a legacy search vendor and the product has since been replaced by Solr and Elasticsearch in ECMs with more modern architectures. Alfresco’s latest release uses Solr 6 and TSG’s products can leverage both Solr and Elasticsearch.
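As an illustration of the IAM recommendation in the EFS bullet above, the following sketch builds a policy document that denies `elasticfilesystem:DeleteFileSystem` on one file system to every principal except a short allow-list of roles. The account ID, file system ID, and role name are placeholders, and this is one possible shape for such a policy rather than a vetted production control; where it is attached (identity policy, service control policy, or similar) is left as a deployment decision.

```python
import json
from typing import Any, Dict, List


def deny_efs_delete_policy(file_system_arn: str,
                           allowed_role_arns: List[str]) -> Dict[str, Any]:
    """Build an IAM policy document that blocks deleting one EFS file system."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyEfsDeleteExceptAllowList",
                "Effect": "Deny",
                "Action": ["elasticfilesystem:DeleteFileSystem"],
                "Resource": file_system_arn,
                # The deny applies unless the caller's ARN matches an allowed role.
                "Condition": {
                    "ArnNotLike": {"aws:PrincipalArn": allowed_role_arns}
                },
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    policy = deny_efs_delete_policy(
        "arn:aws:elasticfilesystem:us-east-1:111122223333:file-system/fs-0123456789abcdef0",
        ["arn:aws:iam::111122223333:role/ContentStoreAdmin"],
    )
    print(json.dumps(policy, indent=2))
```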
Alfresco – AWS Cloud Native Aurora
Alfresco also offers AWS Quick Starts for Content Services and Process Services. These Quick Starts differ in that they use AMIs for deployment instead of containers. The images are built and hardened by Alfresco for security. The Alfresco Quick Start uses an AWS RDS MySQL instance and stores content in S3. There is no usage of EFS. The Quick Starts only show one way to deploy Alfresco on AWS. However, since Alfresco is more cloud native than FileNet it has several other possible configurations.
- RDS – Alfresco can natively connect to Amazon Aurora as well as many of the other database flavors. Aurora has superior performance and resiliency when compared to legacy databases.
- S3 – Content storage; Alfresco can store content in S3 as well as link in existing S3 content into the repository.
- EC2 – Alfresco can run on Amazon Linux as well as multiple other operating systems that exist as Amazon images (AMIs)
- Autoscaling – Alfresco’s servers self-identify when launched on the same network and will automatically cluster when launched in an AWS VPC
- EFS – not required but can be used optionally as a file system instead of S3
- Containers – Alfresco can also run in Docker containers and use Kubernetes for a more cloud neutral approach if desired. These containers are available for download from Alfresco.
- Searching – Alfresco uses Solr for its search indexing and does not rely on proprietary software
TSG – AWS Cloud Native DynamoDB/Elasticsearch
TSG has worked with several on-premise FileNet systems and we see little difference between what is offered in the Quick Start and what is deployed by our clients. AWS has several management plane controls built in to ease the management and scaling of the FileNet solution, but these same controls are also available to Alfresco’s Quick Start and TSG’s HBase and DynamoDB solutions. How thoroughly services such as CloudFormation, CloudWatch Logs and Events, EC2 Autoscaling, EC2 User Data (Bootstrapping), and IAM security are leveraged can make a significant difference in the running, testing, and supporting of a solution on AWS. Read more about TSG’s AWS platform here.
Our DynamoDB solution leveraging native cloud services has several benefits:
- Reduction in emergency changes – pre-configure the solution for automated monitoring, remediation, and automated scaling
- Reduction in cost – for serverless solutions like DynamoDB, only pay for what you use
- Reduction in downtime – shifting management of infrastructure to AWS reduces the risk of downtime being caused by infrastructure outages, especially if serverless services like DynamoDB are used
In 2019, TSG developed and released our DynamoDB solution using Solr in the AWS Marketplace. While the marketplace solution uses Solr, TSG executed an 11 billion document benchmark using AWS’ Elasticsearch service. This whitepaper discusses more about this highly scalable solution. The benchmark migrated 20,000 documents per second, and the analysis of the performance showed it could be scaled for even faster ingestion.
TSG designed the DynamoDB ECM solution to maximize the use of AWS’ cloud native services. It uses DynamoDB as a fully managed, auto-scalable NoSQL database, freeing system administrators from creating and monitoring any database, even Aurora. For search, it uses the Elasticsearch service but may also use Solr. While setting up an Elasticsearch cluster requires specification of server sizes and numbers, the administrators are freed from installing and managing the individual servers. AWS is also constantly improving its offerings, and TSG expects AWS to eventually release an update to the ES service that moves it toward more of a serverless offering, removing the need to specify server numbers or types.
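To show what "fully managed" and "only pay for what you use" can look like at the API level, here is a minimal sketch of the parameters for a DynamoDB `CreateTable` request using on-demand capacity. The table name and the `documentId` key are hypothetical and do not describe TSG's actual schema; on-demand billing is used here as one way to get pay-per-use behavior, not as a statement of how the TSG product is configured.

```python
from typing import Any, Dict


def on_demand_table_params(table_name: str,
                           key_attribute: str = "documentId") -> Dict[str, Any]:
    """Build keyword arguments shaped for the DynamoDB CreateTable call."""
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": key_attribute, "AttributeType": "S"}  # string key
        ],
        "KeySchema": [
            {"AttributeName": key_attribute, "KeyType": "HASH"}  # partition key
        ],
        # On-demand mode: no read/write capacity is provisioned up front,
        # so no server counts or throughput figures appear in the request.
        "BillingMode": "PAY_PER_REQUEST",
    }


if __name__ == "__main__":
    print(on_demand_table_params("ecm-documents"))
```

With boto3, the result could be passed as `boto3.client("dynamodb").create_table(**params)`. Compare this with the Elasticsearch cluster setup described above, where server sizes and counts still have to be chosen.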
Summary
The AWS FileNet Quick Start is an example of a cloud neutral rather than cloud native architecture. The product software is containerized and is simply lifted and shifted from an on-premise solution. While the Quick Start does take advantage of AWS’ management control plane and utilizes the Elastic Kubernetes Service (EKS), it is an architecture that, with a few modifications, could be deployed anywhere. Alfresco’s AWS solutions vary in how they are deployed on AWS. Alfresco can follow a lift and shift pattern similar to FileNet or be modified to leverage AWS services natively, such as EC2 AMIs, EFS, S3, Aurora, and Glacier. TSG’s DynamoDB solution was designed from the ground up to leverage AWS’ fully managed services, particularly DynamoDB, freeing up resources from managing server infrastructure and worrying about monitoring a database deployment. With a few configurations, TSG’s DynamoDB solution can automatically scale up and down rapidly to accommodate spikes in usage and release resources to save money when demand is low.
We’re interested in hearing more about how ECM users are using Quick Starts and planning their work in the cloud. Please leave any questions or comments below.