The document provides an overview and best practices for tuning an Alfresco installation for performance. It discusses disabling unused services, limiting folder hierarchies and group nesting, monitoring resources, tuning Solr indexes and caches, and using separate servers for specific tasks like indexing. General tips include testing changes thoroughly before deploying, adjusting sizing for increased usage, and following the standard performance methodology.
Alfresco DevCon 2019: Performance Tools of the Trade, by Luis Colorado
Discover tips and tools that will help you to keep your Alfresco environment in shape. Most of the best tools are free or Open Source, and this presentation will guide you through the steps to improve the performance of your system.
Alfresco Content Modelling and Policy Behaviours, by J V
Alfresco DevCon 2010 (Paris and New York)
This session starts by giving an overview of components of an Alfresco content model. We then examine the various forms of call-backs and hook-points available to the developer and give some examples of how these can be used to enforce custom business logic and model consistency.
Moving Gigantic Files Into and Out of the Alfresco Repository, by Jeff Potts
This talk is a technical case study showing how Metaversant solved a problem for one of their clients, Noble Research Institute. Researchers at Noble deal with very large files which are often difficult to move into and out of the Alfresco repository.
This session will provide a guide to Alfresco truststores and keystores. Several live examples will be shown, including the replacement of existing cryptographic stores or certificates. Additionally, a troubleshooting configuration guide for mTLS communication will be provided.
The objective of this article is to describe what to monitor in and around Alfresco in order to have a good understanding of how the applications are performing and to be aware of potential issues.
Sizing an Alfresco infrastructure has always been an interesting topic with many open questions. There is no perfect formula that can accurately define the right sizing for your architecture given your use case. However, we can provide you with valuable guidance on how to size your Alfresco solution by asking the right questions, collecting the right numbers, and making the right assumptions in a very interesting sizing exercise.
How many Alfresco servers will you need in your Alfresco cluster? How many CPUs/cores do you need on those servers to handle your estimated user concurrency? How do you estimate the sizing and growth of your storage? How much memory do you need on your Solr servers? How many Solr servers do you need to get the response times you require? What are the golden rules that can drive and maintain the success of an Alfresco project?
Practical information for Alfresco integration with AOS (Sharepoint Protocol), Google Drive, Microsoft 365, ONLYOFFICE and Collabora Online.
Additionally, ADW support for ONLYOFFICE is provided by https://github.com/atolcd/adf-onlyoffice-extension#installation
This is the session delivered during the Alfresco Developers Conference in Lisbon, January 2018. Learn everything you need to know to implement a proper backup and disaster recovery strategy: from a single-server installation with hundreds of documents to a large deployment with multiple nodes, layers, and databases holding millions of documents. What is the best approach for each case?
In this session, we'll discuss architectural, design, and tuning best practices for building rock-solid and scalable Alfresco solutions. We'll cover the typical use cases for highly scalable Alfresco solutions, like massive injection and high concurrency, also introducing the 3.3 and 3.4 Transfer/Replication services for building complex high-availability enterprise architectures.
Features of Alfresco Search Services.
Features of Alfresco Search & Insight Engine.
Future plans for the product
---
DEMO GUIDE
[1] Queries: Share > Node Browser
ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'
SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')
[2] Queries: Share > JS Console
var ctxt = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var searchService = ctxt.getBean('SearchService', org.alfresco.service.cmr.search.SearchService);
var StoreRef = Packages.org.alfresco.service.cmr.repository.StoreRef;
var SearchService = Packages.org.alfresco.service.cmr.search.SearchService;
// Store the result in its own variable instead of overwriting a class reference
var resultSet =
    searchService.query(
        StoreRef.STORE_REF_WORKSPACE_SPACESSTORE,
        SearchService.LANGUAGE_FTS_ALFRESCO,
        "ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'");
logger.log(resultSet.getNodeRefs());
---
var ctxt = Packages.org.springframework.web.context.ContextLoader.getCurrentWebApplicationContext();
var searchService = ctxt.getBean('SearchService', org.alfresco.service.cmr.search.SearchService);
var StoreRef = Packages.org.alfresco.service.cmr.repository.StoreRef;
var SearchService = Packages.org.alfresco.service.cmr.search.SearchService;
// Store the result in its own variable instead of overwriting a class reference
var resultSet =
    searchService.query(
        StoreRef.STORE_REF_WORKSPACE_SPACESSTORE,
        SearchService.LANGUAGE_CMIS_ALFRESCO,
        "SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')");
logger.log(resultSet.getNodeRefs());
---
var def = {
    query: "ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'",
    language: "fts-alfresco"
};
var results = search.query(def);
logger.log(results);
[3] Queries: api-explorer
{
"query": {
"language": "afts",
"query": "ASPECT:\"cm:titled\" AND cm:title:\"*Sample\" AND TEXT:\"code\""
}
}
---
{
"query": {
"language": "cmis",
"query": "SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')"
}
}
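The api-explorer bodies above can also be sent from any HTTP client against the public Search REST API. A minimal sketch in Node.js; the localhost:8080 host and admin/admin credentials are assumptions for a default install, and `buildSearchRequest` is a hypothetical helper, not part of any Alfresco SDK:

```javascript
// Build the same request body as the api-explorer examples above.
// Host, port, and credentials are assumptions for a default install.
function buildSearchRequest(language, query) {
  return {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Basic ' + Buffer.from('admin:admin').toString('base64')
    },
    body: JSON.stringify({ query: { language: language, query: query } })
  };
}

const req = buildSearchRequest(
  'afts',
  'ASPECT:"cm:titled" AND cm:title:"*Sample*" AND TEXT:"code"'
);

// Uncomment to run against a live repository:
// fetch('http://localhost:8080/alfresco/api/-default-/public/search/versions/1/search', req)
//   .then(r => r.json())
//   .then(json => console.log(json.list.entries.length));
```

The same helper works for the CMIS body: pass 'cmis' as the language and the SELECT statement as the query.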
[4] Queries: CMIS Workbench > Groovy Console
rs = session.query("SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')", false)
for (res in rs) {
    println(res.getPropertyValueById('cmis:objectId'))
}
[5] Queries: SOLR Web Console > (alfresco) > Query
/afts
ASPECT:'cm:titled' AND cm:title:'*Sample*' AND TEXT:'code'
---
/cmis
SELECT * FROM cm:titled WHERE cm:title like '%Sample%' AND CONTAINS('code')
---
Infrastructure, use cases, and performance considerations for an enterprise-grade ECM implementation of up to 1B documents on AWS (Amazon Web Services EC2 and Aurora), based on the Alfresco (http://www.alfresco.com) platform, a leading open-source Enterprise Content Management system.
Alfresco Web Scripts have become an important part of any Alfresco developer's tool kit and in this session we will take a deep dive into how Web Scripts can be used to provide public APIs for Alfresco extensions. After briefly reviewing the anatomy of a Web Script and discussing Alfresco's approach to Service development, we will work through an example that extends Alfresco with a simple service and creates a REST API using Web Scripts.
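To give a flavor of the kind of extension the session describes, here is a minimal controller sketch in repository JavaScript. The file names and greeting logic are illustrative assumptions, not the session's actual example; a matching hello.get.desc.xml descriptor and hello.get.json.ftl template would complete the Web Script:

```javascript
// hello.get.js - illustrative Web Script controller (names are assumptions).
// The framework binds `args` (URL parameters) and `model` (template data);
// keeping the logic in a plain function makes it testable outside Alfresco.
function buildModel(args) {
  return { greeting: 'Hello ' + ((args && args.name) || 'world') };
}

// In the real controller the last line would be:
// model.greeting = buildModel(args).greeting;
```

Requesting /alfresco/service/hello?name=DevCon (under the assumed descriptor URL) would then render the model through the FreeMarker template.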
Alfresco node lifecycle, services and zones, by Sanket Mehta
This presentation explains the Alfresco node lifecycle, including which Alfresco database tables are affected by node operations such as creation and deletion. It also explains which case-sensitive Alfresco service should be used (nodeService vs NodeService, searchService vs SearchService) in order to maintain security in your application. Lastly, it covers zones in Alfresco (authentication-related zones and application-related zones).
No Docker? No Problem: Automating installation and config with Ansible, by Jeff Potts
In this talk I show how to bring stability and repeatability to your Alfresco installation by automating install and config management with Ansible.
This talk was originally given at Alfresco DevCon 2020 (virtual edition).
Alfresco DevCon 2019 (Edinburgh)
"Transforming the Transformers" for Alfresco Content Services (ACS) 6.1 & beyond
https://community.alfresco.com/community/ecm/blog/2019/02/07/alfresco-transform-service-new-with-acs-61
Alfresco provides various content transformation options across the Digital Business Platform (DBP). In this talk, we will explore the new independently-scalable Alfresco Transform Service. This enables a new option for transforms to be asynchronously off-loaded by Alfresco Content Services (ACS).
https://devcon.alfresco.com/speaker/jan-vonka/
How to migrate from Alfresco Search Services to Alfresco Search Enterprise, by Angel Borroy López
Presentation on how to move from the Alfresco Search Services product, based on Apache Solr, to the new Alfresco Search Enterprise, integrated with Elasticsearch and Amazon OpenSearch.
Organizations continue to adopt Solr because of its ability to scale to meet even the most demanding workflows. Recently, LucidWorks has been leading the effort to identify, measure, and expand the limits of Solr. As part of this effort, we've learned a few things along the way that should prove useful for any organization wanting to scale Solr. Attendees will come away with a better understanding of how sharding and replication impact performance. Also, no benchmark is useful without being repeatable; Tim will also cover how to perform similar tests using the Solr-Scale-Toolkit in Amazon EC2.
Performance tuning Grails Applications, GR8Conf US 2014, by Lari Hotari
Grails has great performance characteristics but as with all full stack frameworks, attention must be paid to optimize performance. In this talk Lari will discuss common missteps that can easily be avoided and share tips and tricks which help profile and tune Grails applications.
Silicon Valley Code Camp 2015 - Advanced MongoDB - The Sequel, by Daniel Coupal
MongoDB presentation from Silicon Valley Code Camp 2015.
Walkthrough developing, deploying and operating a MongoDB application, avoiding the most common pitfalls.
The venerable Servlet Container still has some performance tricks up its sleeve - this talk will demonstrate Apache Tomcat's stability under high load, describe some do's (and some don'ts!), explain how to performance test a Servlet-based application, troubleshoot and tune the container and your application and compare the performance characteristics of the different Tomcat connectors. The presenters will share their combined experience supporting real Tomcat applications for over 20 years and show how a few small changes can make a big, big difference.
MTC learnings from ISV & enterprise interaction, by Govind Kanshi
This is one of the dated presentations that I keep getting requests for; please do reach out to me for the status of various things, as Azure keeps fixing and innovating things every day.
There are a bunch of other things I can help you with to ensure you can take advantage of the Azure platform for OSS, .NET frameworks, and databases.
MTC learnings from ISV & enterprise (dated: Dec 2014), by Govind Kanshi
This is a slightly dated deck of our learnings; I keep getting multiple requests for it. I have removed one slide on access permissions (RBAC, which is now available).
Real-time Big Data Analytics Engine using Impala, by Jason Shih
Cloudera Impala is an open-source engine under the Apache License that enables real-time, interactive analytical SQL queries of data stored in HBase or HDFS. The work was inspired by the Google Dremel paper, which is also the basis for Google BigQuery. It provides access to the same unified storage platform through its own distributed query engine and does not use MapReduce. In addition, it uses the same metadata, SQL syntax (HiveQL-like), ODBC driver, and user interface (Hue Beeswax) as Hive. Beyond the traditional Hadoop approach, which aims to provide a low-cost solution for resilient, batch-oriented distributed data processing, we see more and more effort in the Big Data world pursuing the right solution for ad-hoc, fast queries and real-time processing of large datasets. In this presentation, we'll explore how to run interactive queries inside Impala, the advantages of the approach, its architecture, and how it optimizes data systems, including a practical performance analysis.
In order to obtain the best performance possible out of your AEP server, the core architecture provides methods to reuse job processes multiple times. This talk will cover how the mechanism functions, what performance improvements you might expect as well as what potential problems you might encounter, how to use pooling in protocols and applications, and how the administrator or package developers can configure and debug specialized job pools for their particular applications
MariaDB Galera Cluster for High Availability, by OSSCube
Want to understand how to set high availability solutions for MySQL using MariaDB Galera Cluster? Join this webinar, and learn from experts. During this webinar, you will also get guidance on how to implement MariaDB Galera Cluster.
Securing your Kubernetes cluster: a step-by-step guide to success!, by KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
Pushing the limits of ePRTC: 100ns holdover for 100 days, by Adtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs, by Alex Pruden
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ..., by James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor..., by SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Transcript: Selling digital books in 2024: Insights from industry leaders - T..., by BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Elevating Tactical DDD Patterns Through Object Calisthenics, by Dorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
A tale of scale & speed: How the US Navy is enabling software delivery from l..., by sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
3. Agenda
1 - General best practices on tuning
2 - Common mistakes
3 - Sizing, what to expect from a single server
4 - Solr tuning
5 - JVM tuning (Part 2)
6 - Caches (Part 2)
7 - Alfresco is running slow... where to start? (Part 2)
4. 1 - General Best Practices on Tuning
Disable unused services and features
• Disable virtual file-systems
• cifs.enabled=false, ftp.enabled=false
• webdav.enabled=false, nfs.enabled=false, imap.enabled=false
• Disable thumbnails and document previews
• system.thumbnail.generate=false
• Disable the Share web preview (in share-config-custom.xml)
<config evaluator="string-compare" condition="DocumentDetails" replace="true">
<document-details>
<!-- display web previewer on document details page -->
<display-web-preview>false</display-web-preview>
</document-details>
</config>
• Disable replication
• replication.enabled=false
• transferservice.receiver.enabled=false
5. 1 - General Best Practices on Tuning
Disable unused services and features (cont.)
• Disable cloud-sync features
• syncService.mode=OFF
• sync.mode=OFF
• sync.pullJob.enabled=false
• sync.pushJob.enabled=false
• Disable user quotas
• system.usages.enabled=false
• system.usages.clearBatchSize=0
• Disable eager creation of home folders
• home.folder.creation.eager=false
• Disable activities feed
• activities.feed.notifier.enabled=false
• activities.feed.cleaner.enabled=false
• activities.post.cleaner.enabled=false
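Taken together, the toggles from the last two slides live in alfresco-global.properties. A consolidated sketch follows, using the property names exactly as shown on the slides; exact names can vary between Alfresco versions, so verify each one against your release before deploying:

```properties
# alfresco-global.properties -- disable services and features that are not in use
cifs.enabled=false
ftp.enabled=false
nfs.enabled=false
webdav.enabled=false
imap.enabled=false

# thumbnails and previews
system.thumbnail.generate=false

# replication
replication.enabled=false
transferservice.receiver.enabled=false

# cloud sync
syncService.mode=OFF
sync.mode=OFF
sync.pullJob.enabled=false
sync.pushJob.enabled=false

# user quotas
system.usages.enabled=false

# activities feed
activities.feed.notifier.enabled=false
activities.feed.cleaner.enabled=false
activities.post.cleaner.enabled=false
```

Restart the repository tier after changing these: they are read at subsystem startup.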
6. 1 - General Best Practices on Tuning
Golden Rules for the Repository
• Limit the group hierarchy to 5 levels of nesting
• Use an inheritance-based permission model
• Limit the maximum number of nodes in a folder
• Keep control over the number of sites: do you really need 10,000 sites?
• Keep the number of group memberships per user low
• Limit the depth of the folder hierarchy
7. 2 – Common Mistakes
• Not keeping extended configuration and customizations separate in the shared directory. Do not put them in the configuration root; if you do, you will lose them during upgrades.
• Not testing the backup strategy.
• Insufficient monitoring and troubleshooting tools
• Making changes to the system without testing them thoroughly on a test and pre-production machine first
• Forgetting to adjust the system sizing for increased users and sessions
• Increase the database connection pool
• Tune maxThreads in Tomcat
• Tune the JVM and GC
• Run benchmarks and stress tests
• Using a shared database with other applications
• Network/infrastructure constraints
• Not following the standard performance methodology (SPM)
8. 2 – Common Mistakes
• Customizations / Custom Code mistakes
• Not closing search result sets in try...finally blocks (memory leaks)
• Incorrect usage of policies/behaviors (collisions, poor code quality)
• Using the lower-case versions of Alfresco beans
• Direct access to the database (use Spring and the existing DAOs instead)
• Usage of private APIs
9. 2 – Common Mistakes
• Customizations / Custom Code mistakes
• Using the raw TransactionService instead of RetryingTransactionHelper
• Not using the CMIS query language with SearchService
• Improper exception handling
10. 3 – Sizing - what to expect from a single server
Let's assume we're running Alfresco on a single server with the following hardware:
Red Hat Linux 64-bit, 16 GB RAM, 2 quad-core CPUs at 3.2 GHz, local SSD disk
We will have 3 web applications running in the same JVM and container (i.e. Tomcat):
• Alfresco Repository
• Alfresco Share UI
• Solr
According to our internal benchmarks, and depending heavily on the specifics of each use case, this server should be able to handle 200 concurrent users or up to 2,000 casual users.
11. 3 – Sizing - what to expect from a single server
The following factors will affect the sizing and architecture:
• Use case
• Concurrent users
• Document types, sizes and distribution ratios
• Architecture (virtualization? failover? replication? integrations? component stack?)
• Authority structure
• Operations
• Components, protocols and APIs
• Batch operations
• Response-time requirements
13. 3 – Sizing: Divide and Conquer
• Know when, where and what processes are running on your server, and which resources those processes affect.
• Do it with appropriate monitoring
• JavaMelody as a simple approach (DEMO)
• https://github.com/miguel-rodriguez/alfresco-monitoring
• Use support tools for troubleshooting (DEMO)
• https://github.com/Alfresco/alfresco-support-tools
• Have specific servers dedicated to specific tasks
• Offload the user-facing nodes
16. 4 – Solr Tuning
Golden Rules for Solr
• Do you search on deleted content? If not, disable the archive core.
• Go to solrHome and edit the solr.xml file, commenting out the archive core
• Also disable the archive core backup scheduled task
• Do you search on content or only metadata? You can disable full-text indexing:
• alfresco.index.transformContent=false
• Alfresco can make use of transactional metadata queries (fetched from the database)
• Is SSL really needed? Inside an intranet it can be disabled to reduce complexity.
• Optimize your ACL policy: re-use your permissions, use inheritance and use groups
17. 4 – Solr Tuning
Golden Rules for Solr Indexing
• Keep indexes local (no shared folders or NFS) and use fast hardware (RAID, SSD, ...)
• Tune the mergeFactor: 25 is ideal for indexing, while 2 is ideal for search.
• Tune your RAM buffer size (ramBufferSizeMB) in solrconfig.xml; it is 32 MB by default
• Analyze your indexing processes (check the Alfresco repository health)
• Tune the transformations that occur on the repository side; set a transformation timeout.
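As a sketch, the two index knobs above sit in each core's solrconfig.xml. The exact enclosing element (indexDefaults/mainIndex versus indexConfig) depends on the Solr version shipped with your Alfresco release, so treat the placement below as illustrative and verify against your own file:

```xml
<!-- solrconfig.xml (illustrative placement; check your Solr version) -->
<indexDefaults>
  <!-- 25 is ideal while bulk indexing; drop to 2 once the core is search-heavy -->
  <mergeFactor>25</mergeFactor>
  <!-- default is 32; 64-128 often helps if free memory allows -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
</indexDefaults>
```

A core restart (or reload) is needed for the change to take effect.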
18. 4 – Solr Tuning
Golden Rules for Solr Indexing
• Closely monitor the Solr JVM (especially GC and heap usage)
• Enable GC logs, analyze GC performance, tune the GC algorithm
• Do you need tracking to happen every 15 seconds?
• Use a dedicated tracking Alfresco instance; there are several architecture options
• Increase your index batch counts to get more results per call to the indexing webscript
• In each core's solrcore.properties, raise the batch count to 2000
• Impacting factors in indexing
• JVM memory and CPU usage on the repository layer (text extraction/transformations)
• JVM memory, CPU, disk I/O and disk cache size on the Solr layer
• Number of indexing threads, Solr caches
19. 4 – Solr Tuning
Golden Rules for Solr Search
• Keep indexes local (no shared folders or NFS) and use fast hardware (RAID, SSD, ...)
• Tune the mergeFactor: 2 is ideal for search.
• Increase your query caches and the RAM buffer
• Avoid PATH queries (they are slow); avoid * searches and ALL searches
• Avoid sorting in the query; you can sort results on the client side using JavaScript or any client-side framework of your choice.
• Search is CPU-intensive rather than RAM-intensive: increase CPU power.
• Keep your Alfresco release current with the latest service packs and hotfixes. They contain the latest Solr improvements and bug fixes, which can have a great impact on overall search performance.
20. 4 – Solr Tuning
Solr Caches
Tracking the usage of the Solr caches can help you tune them for your use case.
• http://<solr_server>:<solr_port>/solr/alfresco/admin/stats.jsp#cache
The URL above shows statistics on cache usage. If you see many evictions, consider increasing that cache so all elements can fit (but don't overdo it; adjust it and see what fits your setup). It is likewise a good idea to decrease the size of caches that have many unused slots.
The goal is to get the hit rate as close to 1.00 as possible (1.00 being a 100% hit ratio).
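The eviction and hit-rate reasoning above can be sketched as a small helper. This is purely illustrative (the function name and thresholds are mine, not Alfresco's or Solr's); it mirrors how you would read one cache's row on the stats page:

```python
def cache_advice(lookups, hits, evictions, size, max_size):
    """Interpret one Solr cache's statistics row.

    Returns the hit ratio plus a rough tuning suggestion, following the
    rules of thumb from the slides: evictions -> grow the cache, many
    unused slots -> consider shrinking it.
    """
    hit_ratio = round(hits / lookups, 2) if lookups else 0.0
    if evictions > 0:
        suggestion = "increase size (elements are being evicted)"
    elif size < max_size // 2:
        suggestion = "consider decreasing size (many unused slots)"
    else:
        suggestion = "leave as-is"
    return hit_ratio, suggestion
```

For example, a cache showing 1000 lookups, 950 hits and 20 evictions gives a 0.95 hit ratio and a suggestion to grow; aim for ratios approaching 1.00.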
21. 4 – Solr Tuning
Solr usage on Alfresco Share
• Solr indexing and search performance will positively affect overall Share performance. Share relies on Solr in the following situations:
• Full Text Search (search field in top right corner)
• Advanced Search
• Filters
• Tags
• Categories (implemented as facets)
• Dashlets such as the Recently Modified Documents
• Wildcard searches for People, Groups, Sites (uses database search if not wildcard)
Editor's Notes
Disabling features that are not being used will release important resources, allowing them to be used for active tasks and contributing to increased performance.
Transformations
When users access Alfresco via the Share interface and open the document details page, a full document preview is generated. This involves calls to various third-party tools such as OpenOffice, Ghostscript and ImageMagick to create a Flash version of the document. If previews are not being used, we can prevent their creation by including the following snippet of XML in the share-config-custom.xml file.
User Quotas
Checking user quotas can add some overhead to Alfresco. When not needed, this feature can also be disabled.
User home folders
Alfresco creates a home folder for each new user automatically. If your users are not using this folder for any business-related tasks, you can disable the automatic creation of home folders for new users.
Activities feed
If this feature is not necessary and is not being used, disabling it will prevent the regular checks of the activity feeds and again save system resources.
activities.feed.notifier.enabled=false
activities.feed.cleaner.enabled=false
activities.post.cleaner.enabled=false
1, 2 - ACL checks are known to slow down performance when the maximum group hierarchy depth exceeds 5 levels. Our advice is, when possible, to limit the maximum group hierarchy depth to 5.
3 - When using Share or another client to browse a repository folder, Alfresco needs to perform a series of actions (permission checking, etc.) before it actually renders or delivers the content. The more nodes that reside inside a folder, the slower the response time. We recommend, when possible, limiting the maximum number of document nodes inside a folder to 2,000.
4 - The number of sites in the system has some influence on performance, especially when checking site membership for users. Although this is the factor with the least impact of the three limits suggested here, we recommend keeping the number of sites below 5,000.
5 - The number of groups a user belongs to impacts performance when rendering some client pages (especially some Share dashlets, like the My Sites dashlet). Alfresco runs some complex queries (based on the user's group membership) when determining the assets to render on some Share pages. We advise keeping the number of groups per user low: when possible, and to optimize Share client rendering performance, a user should not belong to more than 5 or 6 groups.
6 - The depth of the folder hierarchy also has an impact when browsing and performing document actions under a certain folder. We recommend, when possible, limiting the maximum depth of a folder hierarchy to 15 levels.
Not closing resources
Certain resources in Alfresco (specifically search result objects) are not cleaned up automatically and must therefore be cleaned up correctly by extension code (i.e. in a "finally" block). Failing to clean up such resources results in leaks, not only of memory but also, in some cases, of "real" operating system resources (such as file handles).
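The pattern this note prescribes can be sketched as follows; Python is used here for brevity, and the ResultSet class is a stand-in of my own, not Alfresco's Java class. The point is that the close call sits in a finally block, so the underlying handles are released even when processing throws:

```python
class ResultSet:
    """Stand-in for a search result object that pins OS resources
    (e.g. index file handles) until close() is called."""
    def __init__(self, rows):
        self.rows = rows
        self.closed = False

    def close(self):
        self.closed = True


def count_results(result_set):
    """Process a result set and guarantee cleanup, even on error."""
    try:
        return len(result_set.rows)
    finally:
        result_set.close()  # always runs: no leaked handles
```

The same shape applies in extension Java code: acquire the result set, work inside try, and close it in finally.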
Lower case beans
The "lower case" versions of the Alfresco service beans (i.e. those whose name starts with a lowercase letter, e.g. nodeService) are configured to bypass Alfresco's security, transaction and auditing checks, with no recourse for the administrator to turn them back on. There have been persistent (but incorrect) rumors in the Alfresco community that these versions of the services perform significantly better than the official ("upper case") versions, but that has not been the case since at least Alfresco v2.x.
Content policies
Content policies are wired in at a very low level in the repository and as a result can be called many hundreds of times a second in some cases (e.g. when content is being manipulated via CIFS). In addition they are executed synchronously within each Alfresco transaction. The result is that even the slightest poor performance in a custom content policy or behavior can have a profound impact on Alfresco performance. For that reason it is critically important that custom content policies / behaviours are either fast (conduct minimal computation and only perform minimal I/O to the repository) or are made asynchronous.
Direct access to database
Alfresco’s database schema and the SQL the product uses has been carefully tuned. Uncontrolled access to the same tables can interfere with Alfresco’s normal operation, impacting both performance and (in some cases) stability. Note that this includes reads (SELECTs), as this can block concurrent write operations in some circumstances.
Use of private API’s
Only the public Alfresco Java APIs may be used in a certified extension; the private APIs should not be used and are not supported. http://docs.alfresco.com/4.2/concepts/java-public-api-list.html
Using Transaction Service instead of RetryingTransactionHelper
As the name implies, RetryingTransactionHelper contains retry logic for certain recoverable, expected database exceptions (deadlocks, basically). It also uses Spring’s “template” pattern to ensure a transaction is always completed (committed or rolled back) correctly, regardless of what happens in the logic inside the transaction.
The "raw" TransactionService provides neither of these benefits and is for that reason considered unsafe for use in extensions.
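As an illustration of what the retry template gives you, here is a minimal, generic sketch of the same pattern in Python. TransientError stands in for a recoverable database deadlock, and none of these names are Alfresco APIs; the real helper is Java's RetryingTransactionHelper:

```python
class TransientError(Exception):
    """Stand-in for a recoverable database error, e.g. a deadlock."""


def do_in_transaction(callback, max_retries=5):
    """Run a unit of work, retrying only on recoverable errors.

    Mirrors the template idea: the transaction is always completed
    (committed on success, rolled back on failure), and only transient
    failures trigger a retry.
    """
    for attempt in range(1, max_retries + 1):
        # begin transaction (elided)
        try:
            result = callback()
            # commit (elided)
            return result
        except TransientError:
            # roll back (elided); give up after the final attempt
            if attempt == max_retries:
                raise
        # the real helper also waits a randomized back-off interval here
```

A callback that deadlocks twice and then succeeds completes transparently on the third attempt, which is exactly the behavior the raw TransactionService does not give you.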
CMIS query Language
Alfresco’s SearchService API supports different “languages” (XPath, Lucene, SOLR and CMIS), which roughly correlate to the different underlying search implementations in Alfresco. Of these languages, only CMIS is fully abstracted away from the underlying implementation, and so is the only language that provides some guarantee of consistent behavior, regardless of how a given Alfresco instance has been configured (SOLR vs Lucene, or MDQ, for example). Note however that SearchService doesn’t fully implement CMIS-QL – specifically, the “SELECT” clause in CMIS queries sent to the SearchService are not processed (they are silently ignored). SearchService, regardless of the query language used, always returns sets of matching NodeRefs.
Improper exception handling
Exceptions should either be caught and recovered from, or allowed to flow up the call stack. It is almost never appropriate to “swallow” an exception (catch it and do nothing), and excessive wrapping of exceptions inside other exceptions should be minimized (it makes triage more difficult).
Catching or throwing java.lang.Error instances, and catching java.lang.Throwables is also inappropriate – these classes (java.lang.Error, specifically) indicate fatal JVM problems and therefore cannot be safely caught or thrown.
Alfresco sizing, like that of other systems, has many subtleties that are hard to fully systematize. Each deployment has specifics that demand consideration when estimating sizing for the different layers (Share front end, repository, indexing, transformation, storage). In any case, the numbers presented above assume certain fairly generic steps and concerns that are mostly common across use cases; changes to those assumptions can drastically alter the predictions.
For memory calculations, consider the repository L2 cache, plus the initial JVM overhead, plus the basic Alfresco system memory.
This means that you can run the Alfresco repository and web client, with many users accessing the system, on a basic single-CPU server. However, you must add memory as your user base grows, and add CPUs depending on the complexity of the tasks you expect your users to perform and how many concurrent users are accessing the client.
The terms concurrent users and casual users are used. Concurrent users are users who are constantly accessing the system through Alfresco, with only a small pause between requests (3-10 seconds maximum) and continuous access 24/7. Casual users are users occasionally accessing the system through the Alfresco or WebDAV/CIFS interfaces, with a large gap between requests (for example, occasional document access during the working day).
Common use cases
Alfresco use cases vary considerably because of the elasticity of the platform, and although we can enumerate some generic common use cases, the details in which each real implementation differs may be very important from an architecture and sizing perspective. Generally, Alfresco solutions can be classified into one of the two following cases:
• Collaboration
• Backend Repository
Authority Structure
We know from recent benchmark comparisons that the authority structure has a direct and important impact on performance, especially for Solr. When sizing Solr it is therefore important to consider the importance of search operations for the solution, the types of searches being executed, and the repository size and characteristics, but also, equally important, the authority structure of the corresponding use case. Collaboration use cases will thus in general be more demanding (keeping the other mentioned factors constant) than backend use cases with simpler authority structures.
Injection rate: the repository growth or document upload rate. The document types and sizes will impact transformation, text extraction and indexing. The impact of repository size on performance will grow with the ratio of search operations expected in the solution.
The injection rate has an obvious impact on near-future repository sizes, but also on the capability of the different solution layers to handle the throughput, which stresses not only the content storage and database (metadata extraction/upload/rules) but also the indexing layer. Depending on the amount of document injection, this may imply reserving dedicated nodes for injection (clustered or not with the front-end service nodes) and scaling the Solr layer up or out. The requirements around uploading and downloading large or small documents, and indexing and transforming different types of documents, will vary. This can have architectural consequences: a preference for certain protocols and mechanisms for large files (for example FTP or bulk ingestion), or the use of dedicated caching solutions for large-document downloads. It also has consequences at the sizing level: indexing large documents is very different from indexing smaller ones, and it also differs between document types.
Repository size is also dynamic, and the project may expect, even in the near term, to go through different phases: first migrating a legacy repository, then new content roll-out, archival, etc. Sizing estimates should therefore cover the different phases, especially in the short term.
One of the secrets of a successful architecture is to know exactly what, when and where processes are occurring and which resources those processes affect. Having this information gives the architect the power to "Divide and Conquer".
Working with a fairly flexible technology, the architect can wisely divide the overall processes across the resources (servers), achieving the necessary balance. Consider, for example, the scheduled jobs that Alfresco executes: in a distributed architecture there are many advantages to offloading some of those jobs to a specific server, releasing important resources from the servers that are actually serving user requests.
From an Alfresco perspective, offloading (disabling) these scheduled jobs from the front-end servers amounts to configuring their cron expressions to execute far in the future, and having a dedicated server (normally separate from the cluster) execute the jobs.
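For example, on a user-facing node you might push a job's cron expression far into the future so it effectively never fires there, keeping the real schedule only on the dedicated maintenance node. The property below (the content store cleaner's cron) is just one illustration of the trick; verify the exact property name against your Alfresco version, and apply the same idea to any scheduled job:

```properties
# front-end nodes: effectively disable the job by scheduling it for 2099
system.content.orphanCleanup.cronExpression=* * * * * ? 2099

# maintenance node: keep a real schedule (e.g. daily at 04:00)
# system.content.orphanCleanup.cronExpression=0 0 4 * * ?
```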
Capacity planning is the science and art of estimating the space, computer hardware, software and connection infrastructure resources that will be needed over some future period of time. It is a means to predict the types, quantities and timing of the critical resource capacities needed within an infrastructure to meet accurately forecasted workloads.
Predicting and sizing a system depends on a good understanding of user behavior. The following diagram represents the stages of a capacity-planning scenario.
Validate your predictions with stress tests; compare the obtained data with live data from production, adjusting your stress-test scenarios to the real user behaviours.
Performing a regular analysis of the monitoring/capacity-planning data will help you know exactly when and how you need to scale your architecture, allowing for incremental and continuous improvement. The more data that gets indexed into Elasticsearch over the application life cycle, the more accurate your capacity predictions will be: they represent the "real" application usage and how that usage affects the various layers of your application. This plays a very important role when modelling and sizing the architecture for future business requirements.
The peak-period methodology is an efficient way to implement a capacity-planning strategy, allowing you to analyze vital performance information when the system is under the most load/stress; furthermore, it represents YOUR system. The peak period may be an hour, a day, 15 minutes or any other period used to analyze the collected utilization statistics. Assumptions may be estimated based on business requirements or on specific benchmarks of a similar implementation.
The diagram shows the most important factors in the deployment that should be analyzed on a regular basis. Note that certain inspection targets add overhead while they are being inspected; these targets might not be appropriate for long-term monitoring, or may require tuning to minimize their impact.
Pay attention to alfresco transformations and text extraction
Alfresco executes a high number of transformations while working with documents, including text extraction (for indexing), preview generation, thumbnail generation and renditions. It is wise to monitor the health and performance of transformations regularly: check the longest-running transformations, measure transformation times, and use transformation limits when applicable.
HTTP Requests and Responses
Debugging HTTP requests can yield useful information to help you tune your applications: find out which components are making a page take so long to load, or make sure the JSON your web script returns looks like you expect. There are several tools that can be used to debug HTTP requests and responses, such as Charles or Fiddler; others are built into the browsers (Firebug, the Chrome inspector).
Disable all full-text indexing
We can disable all full-text indexing activities and tune our search layer for performance on metadata-based searches, making use of a recent Alfresco feature called transactional metadata queries. To disable full-text search we need to configure the workspace-SpacesStore Solr core: edit the solrcore.properties file and set the following property:
alfresco.index.transformContent=false
Disable archive core
If you are not planning to search for deleted content, we can safely disable the indexing of archived content; Alfresco never searches inside files that have been deleted/archived.
We can disable the indexing of archived content by going into solrHome and editing the solr.xml file in the root of that folder: comment out the archive core as shown in the XML below.
<?xml version='1.0' encoding='UTF-8'?><solr sharedLib="lib" persistent="true">
<cores adminPath="/admin/cores" adminHandler="org.alfresco.solr.AlfrescoCoreAdminHandler">
<!-- <core name="archive" instanceDir="archive-SpacesStore"/>-->
<core name="alfresco" instanceDir="workspace-SpacesStore"/>
</cores>
</solr>
Note that we are commenting out the Solr core named "archive". This prevents Solr from indexing archived content, saving disk space, Solr memory, CPU during indexing and overall resources.
We also need to disable the archive core backup scheduled task. We do this by setting its cron expression to a date far in the future. You should do this on every Alfresco node (including the new tracking instance):
# disabling the archive backup as we are not using archive search
solr.backup.archive.cronExpression=* * * * * ? 2199
solr.backup.archive.numberToKeep=0
Optimize your ACL policy: re-use your permissions, use inheritance and use groups. Don't set up specific permissions for users or groups at the folder level; try to re-use your ACLs.
ramBufferSizeMB
ramBufferSizeMB sets the amount of RAM that Solr indexing may use for buffering added documents and deletions before they are flushed to disk.
Increasing this to 64 or even 128 has generally proven to improve performance, but it depends on the amount of free memory you have available.
Analyze Indexing process
During the indexing process, plug in a monitoring tool (e.g. YourKit) to check the repository health. During indexing, the repository layer sometimes executes heavy, I/O-, CPU- and memory-intensive operations, such as transforming content to text in order to send it to Solr for indexing. This can become a bottleneck when, for example, the transformations are not working properly or the GC cycles are taking a lot of time.
GC Tuning
Solr operations are memory intensive, so tuning the garbage collector is an important step to achieve good performance. jClarity's Censum tool is really good for analyzing GC logs, but there are others.
Tracking frequency
Consider whether you really need tracking to happen every 15 seconds (the default). This can be configured in the Solr configuration files via the cron frequency property:
alfresco.cron=0/15 * * * * ? *
This property can heavily affect performance, for example during bulk injection of documents or during a Lucene-to-Solr migration. You can change it to 30 seconds or more while you are re-indexing.
This allows more time for the indexing threads to perform their work before more is added to their queue.
Increase your index batch counts to get more results per call to the indexing webscript on the repository side. In each core's solrcore.properties, raise the batch count to 2000 or more: alfresco.batch.count=2000
For index updates, Solr relies on fast bulk reads and writes. One way to satisfy these requirements is to ensure that a large disk cache is available. Use local indexes and the fastest disks possible.
In a nutshell, you want enough memory available in the OS disk cache that the important parts of your index, or ideally your entire index, fit into the cache. Say you have a Solr index size of 8 GB: if your OS, Solr's Java heap and all other running programs require 4 GB of memory, then an ideal memory size for that server is at least 12 GB. You might be able to make it work with 8 GB of total memory (leaving 4 GB for the disk cache), but that also might NOT be enough.
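The arithmetic above is simple enough to sketch as a helper; this is my own illustration of the rule of thumb, not an official Alfresco or Solr formula:

```python
def ram_for_solr_server(index_size_gb, other_usage_gb):
    """Estimate server RAM so the OS disk cache can hold the index.

    other_usage_gb covers the OS, Solr's Java heap and everything else.
    Returns (bare_minimum, ideal): the minimum caches half the index,
    the ideal caches all of it.
    """
    minimum = other_usage_gb + index_size_gb / 2
    ideal = other_usage_gb + index_size_gb
    return minimum, ideal
```

For the 8 GB index and 4 GB of other usage in the text, this gives (8.0, 12): 8 GB of RAM might work, while 12 GB is the ideal.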
Troubleshooting common Indexing problems
Database – if it is a database performance issue, adding more connections to the connection pool can normally increase performance.
I/O – I/O problems typically occur when using virtualized environments. Use hdparm to check read/write disk speed if you are running on a Linux-based system (there are variations for Windows). For example: sudo hdparm -Tt /dev/sda
The rule for troubleshooting is to test and measure initial performance, apply some tuning and parameter changes, then retest and measure again until you reach the necessary performance. I strongly advise plugging a profiling tool such as YourKit into both the repository and Solr servers to help with the troubleshooting.
Make sure you are using only one transformation subsystem. Check alfresco-global.properties and verify that you are using either OOoDirect or JodConverter; never enable both subsystems.
Typical issues with Searching
It can happen that you are searching and indexing at the same time; this causes concurrent access to the indexes, which is known to cause performance issues. There are some workarounds for this situation.
To start, plug in a profiler and look for commit issues (I/O locks); this will let you check whether you are facing this problem.
Solr makes some statistics available by default. Analyzing those statistics can help you tune your Solr caches for your use case.
Read the results (when analyzing the caches) as follows: if you see many evictions, consider increasing that cache so all elements can fit (but don't overdo it; adjust it and see what fits your setup). It is likewise a good idea to decrease the size of caches that have many unused slots.
The goal is to get the hit rate as close to 1.00 as possible (1.00 being a 100% hit ratio).
If your project relies on the Share client offered by Alfresco, you should know that tuning your Solr indexing and search performance will positively affect overall Share performance.