One of the great things about Alfresco is that it offers all-in-one installation packages to make installation simple on both Windows and Linux servers. With a few clicks, you can have an Alfresco repository deployed and running in a matter of a few minutes. This is especially useful when setting up development environments.
When using the all-in-one installer, some of the repository configuration is determined based on how you answer questions in the installation wizard, like whether or not to deploy a PostgreSQL database, Solr, and/or LibreOffice. Other configuration is determined automatically by the installer based on the environment that Alfresco is being deployed to, like the amount of memory to allocate to the Alfresco Java virtual machine (JVM). Alfresco takes a guess at the rest of the configuration and deploys the repository with default configuration values. Depending on the volume of content and your intended use for Alfresco, some of the default configuration can be modified right out of the gate to improve system performance.
Some key points to consider include:
- “Day Zero” Configuration
Running through Alfresco’s suggested day zero configuration guide to optimize the repository can significantly boost repository performance. Some of the topics covered in day zero configuration include:
- Disabling Unused Features
- JVM Tuning
- Database Settings
Additional details about day zero configuration can be found in our previous post.
- Solr Index Server Deployment
The Solr index server deployed with Alfresco can have a considerable impact on system performance. Full-text indexing and searching can be processor and memory intensive. The Alfresco all-in-one installation packages deploy Solr on the same Tomcat and JVM instance as the repository. As content volume increases, especially the volume of full-text indexable content, we recommend the following:
- Deploy Solr on a separate Tomcat/JVM on the same server as the Alfresco repository. This is a best practice that we recommend to all of our clients, regardless of content volume, and requires no additional licensing. Deploying on a separate JVM allows the JVM parameters for the repository and index server to be tuned independently. It also provides a layer of isolation, preventing any issues with the index server from bringing down the repository.
- Deploy Solr on a separate server from the Alfresco repository. While this model requires additional licensing from Alfresco, it transfers the full-text indexing workload off of the Alfresco server onto different hardware. This model is suggested for systems with a high volume of content and high rates of ingestion of full-text indexable content.
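In either model, the repository needs to know where the index server is running. As a rough sketch, this is normally done in alfresco-global.properties. The host name and port numbers below are placeholders chosen for illustration, and the correct value of index.subsystem.name depends on your Alfresco version, so check the documentation for your release before using it:

```properties
# Sketch only: point the repository at a Solr instance running in its own Tomcat/JVM.
# The subsystem name varies by Alfresco version (for example solr or solr4).
index.subsystem.name=solr

# Placeholder host: use localhost for a second Tomcat on the same server,
# or the index server's host name when Solr runs on separate hardware.
solr.host=localhost

# Placeholder ports: these must match the connector ports of the Tomcat that hosts Solr.
solr.port=8081
solr.port.ssl=8444
```

Solr's own configuration must also point back at the repository so that it can track and index new transactions.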
- Full-Text Indexing
Alfresco automatically full-text indexes the content of documents added to the system that are of text searchable format. Full-text searching is a powerful feature that’s provided by Alfresco. However, when implementing a new system, it’s important to consider if full-text searching is a required feature. If not, full-text indexing can be disabled to significantly improve performance, especially for high-volume systems. Full-text indexing can be disabled system wide, or for individual object types via content model configuration.
An example of a case where full-text indexing might not be required would be an Accounts Receivable system containing a high volume of customer invoices. Chances are, the metadata that needs to be searched should already be configured as part of the content model. Full-text search is generally not useful for this type of document because most invoices will contain the same or similar text.
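For the system-wide case, one option we are aware of is a setting in each Solr core's solrcore.properties file that tells the index tracker to skip content and index metadata only. Treat the fragment below as an assumption to verify against the Solr version bundled with your Alfresco release, since the property name and file location can differ between versions:

```properties
# Sketch only: set in solrcore.properties for each core (for example the workspace and archive cores).
# When false, document content is not transformed or indexed; metadata is still indexed.
alfresco.index.transformContent=false
```

A change like this affects only content indexed after it takes effect, so an existing index may need to be rebuilt to reclaim the space already used by full-text data.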
- Thumbnails
As of Alfresco 4.2, Alfresco automatically generates thumbnail images of content that is uploaded into the repository. The ability to see thumbnails is a nice feature that is exposed via the Alfresco Share interface. Generating the thumbnails, however, can be processor intensive. If your Alfresco implementation does not utilize the Share interface and/or does not have use for thumbnails, disabling automatic thumbnail generation can improve system performance, especially for systems with high rates of content ingestion. Disabling thumbnails can also reduce the content storage requirements. Automatic thumbnail generation can be easily disabled by adding the following to your alfresco-global.properties file:
#Automatic Thumbnail Generation - Disable if thumbnails are not needed
system.thumbnail.generate=false
- Transformation
Alfresco performs transformations of many document formats to other formats based on configuration. For example, in order for the document viewer to work in Alfresco Share, the system transforms Office documents (Word, Excel, PowerPoint) to PDF. Depending on the file formats, transformations are performed by Alfresco using tools like LibreOffice, ImageMagick, and GhostScript. Transformation processing can be resource intensive. For systems where high volumes of transformations are required, external transformation servers can be deployed for document and media transformations to reduce the load on the Alfresco server. Additional licensing is required for external transformation servers.
- Timestamp Propagation
The default behavior in Alfresco is to update the modify date of a folder when new content is added to the folder. This is similar to how Windows Explorer behaves when you add files to a folder. When the modify date is updated on a folder in Alfresco, the folder must be reindexed by Solr. This seems like a minor nuance, but for systems with high volumes of transactions, constantly reindexing folders can impact performance. Folders that contain a large number of documents can take a while to reindex. If your system requirements do not call for timestamp propagation, it can be disabled by adding the following to your alfresco-global.properties file:
#Automatic Timestamp Propagation - Disable to avoid SOLR reindex
system.enableTimestampPropagation=false