"9η Ιουλίου 1821" Βασίλη Μιχαηλίδη- "Ελεύθεροι Πολιορκημένοι" Διονυσίου ΣολωμούFlora Kyprianou
5. WHY SHOULD YOU CARE?
• Search is the most significant change for AEM developers between CRX2 and Oak.
6. WHY SHOULD YOU CARE?
CRX2 Search – Limited Optimization Opportunities
Baseline Search Performance – OK
No “Plan” Output
Single Index
Minimal Configuration
7. WHY SHOULD YOU CARE?
Oak Search – Many Optimization Opportunities
Baseline Performance – Slow
Viewable Plan
Different Index Types
10. SEEING THE PLAN
Oak supports an “explain” query prefix, similar to what many RDBMSs support.
explain /jcr:root/content/geometrixx/en/products//element(*, nt:unstructured)[@sling:resourceType = 'geometrixx/components/title']
The result shows you which index was used. Read it from the "plan" value of the first row:
queryResult.getRows().nextRow().getValue("plan")
11. SEEING THE PLAN – EXPLAIN QUERY TOOL
[Screenshot of the Explain Query tool; labeled areas: Plan, Explanation]
12. INDEX DEFINITIONS
Stored in the repository as nodes under /oak:index
Node Type is oak:QueryIndexDefinition
Single mandatory property – “type”
Optional generic properties:
async – set to “async” to do index updates asynchronously
reindex – set to true to trigger a reindex
declaringNodeTypes – one or more node types to restrict indexing
entryCount – used to weight indexes
13. SYNC VS. ASYNC INDEX
Sync indexes (the default) update in the context of a save() call
Async indexes do not.
Every 5 seconds, the diff between the last successfully indexed state and the HEAD state is read and used to update the index
CONSEQUENCE - async indexes may not return up-to-date results
The OOTB ordered and Lucene indexes are defined as async.
All external indexes (e.g. Solr) should also be async.
15. VIEWING INDEX CONTENT
Many indexes store their content in the repository, but hidden.
Cannot be viewed using CRXDE Lite.
Must use oak-run
TarMK – use either “explore” (GUI) or “console” (CLI) command
MongoMK – use “console” command
• Vote for OAK-2096 to get “explore” support working for MongoMK
16. CREATING AN INDEX
Created as content via CRXDE Lite / deployed using content package
Created through code.
Created through configuration.
17. WHEN SHOULD YOU REINDEX?
When the configuration changes
For example, changing the declaringNodeTypes
But not the entryCount
(Sometimes) After updating Oak
Check the Release Notes, this should be prominently indicated.
But not arbitrarily…
Reindexing is a resource intensive process.
Reindexing will NOT help query performance.
18. COST CALCULATION
Each Index calculates a relative cost for the query
Number between 0 and Infinity
0 = “Pick me!”
Infinity = “Don’t Pick Me!”
19. DEBUGGING COST CALCULATION
Enable DEBUG logging on org.apache.jackrabbit.oak.query.QueryImpl
Per Index Type Cost
Enable DEBUG logging on org.apache.jackrabbit.oak.plugins.index.property.PropertyIndex
Detailed Property Cost
Enable DEBUG logging on org.apache.jackrabbit.oak.plugins.index.property.OrderedPropertyIndex
Detailed Ordered Property Cost
Enable DEBUG logging on org.apache.jackrabbit.oak.plugins.index.lucene
Detailed Lucene Cost
20. SAMPLE DEBUG OUTPUT
Query = /jcr:root/content/geometrixx/en/products//element(*, nt:unstructured)[@sling:resourceType = 'geometrixx/components/title' and @jcr:title = 'Triangle']
cost for aggregate lucene is Infinity
cost for reference is Infinity
cost for ordered is Infinity
cost for nodeType is Infinity
property cost for sling:resourceType is 10003.0
property cost for jcr:title is Infinity
Cheapest property cost is 10003.0 for property sling:resourceType
cost for property is 10003.0
cost for traverse is 199996.0
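The selection implied by this output can be sketched in plain Java. This is only an illustration of "lowest cost wins" using the per-index costs listed above; the class and method names are hypothetical and Oak's real query engine is not involved.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CheapestIndex {

    // Returns the name of the entry with the lowest cost, or null for an empty map.
    // Hypothetical helper: it models "lowest cost wins", not Oak's internals.
    public static String cheapest(Map<String, Double> costs) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, Double> e : costs.entrySet()) {
            if (best == null || e.getValue() < bestCost) {
                best = e.getKey();
                bestCost = e.getValue();
            }
        }
        return best;
    }

    // Costs copied from the sample debug output for this query.
    public static Map<String, Double> sampleCosts() {
        Map<String, Double> costs = new LinkedHashMap<>();
        costs.put("aggregate lucene", Double.POSITIVE_INFINITY);
        costs.put("reference", Double.POSITIVE_INFINITY);
        costs.put("ordered", Double.POSITIVE_INFINITY);
        costs.put("nodeType", Double.POSITIVE_INFINITY);
        costs.put("property", 10003.0);
        costs.put("traverse", 199996.0);
        return costs;
    }

    public static void main(String[] args) {
        System.out.println(cheapest(sampleCosts())); // prints "property"
    }
}
```

An index that reports Infinity can never win against one that reports a finite cost, which is why only the property index and traversal are real candidates here.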
21. SAMPLE DEBUG OUTPUT
Query = /jcr:root/content/geometrixx/en/products//element(*, nt:unstructured)[@sling:resourceType = 'geometrixx/components/title' and @type='large']
cost for aggregate lucene is Infinity
cost for reference is Infinity
cost for ordered is Infinity
cost for nodeType is Infinity
property cost for sling:resourceType is 10003.0
property cost for type is 21.0
Cheapest property cost is 21.0 for property type
cost for property is 21.0
cost for traverse is 199996.0
22. INDEX IMPLEMENTATIONS
Index types you can create new instances of:
Property
Ordered Property
Solr
Lucene
Index types you shouldn’t create new instances of:
Reference
Node Type
And then there is a special one:
Traversing
23. PROPERTY INDEX
Stores node paths indexed by a particular property value
Example: /oak:index/slingResourceType
Can be unique (unique = true)
Examples: rep:principalName & jcr:uuid
Only usable with sync indexes
27. PROPERTY INDEX – COST CALCULATION
Generalized Cost Calculation:
Cost per Execution + (Estimated Matches * Cost per Entry)
Cost per Execution – 2
Cost per Entry – 1
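As a worked example, the formula can be written as a small Java method. The constants come from the slide; the class name and the match counts passed in (19 and 10001) are assumptions chosen so that the results reproduce the 21.0 and 10003.0 figures seen in the sample debug output. The slides do not state the match counts behind those figures.

```java
public class PropertyIndexCost {

    // Constants as stated on the slide.
    static final double COST_PER_EXECUTION = 2.0;
    static final double COST_PER_ENTRY = 1.0;

    // cost = cost per execution + (estimated matches * cost per entry)
    public static double cost(long estimatedMatches) {
        return COST_PER_EXECUTION + estimatedMatches * COST_PER_ENTRY;
    }

    public static void main(String[] args) {
        System.out.println(cost(19));    // prints 21.0
        System.out.println(cost(10001)); // prints 10003.0
    }
}
```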
28. PROPERTY INDEX – ESTIMATING MATCHES
For name=value queries (e.g. [@sling:resourceType='foundation/components/text']), including lists
If entry count provided, the estimated cost is entry count / key count + number of values in the query
• Key count defaults to entry count / 10000, but can be manually specified
Otherwise, counts up to 100 matches across the first three values.
If > 100 matches, estimated matches are 1.1 ^ (the average depth of matches)
If > 3 values, estimated matches are extrapolated from the first three values.
For exists queries (e.g. [@sling:resourceType])
If entry count provided, it is the estimated count.
Otherwise, counts up to 100 matches across all values.
If > 100 matches, estimated matches are 1.1 ^ (the average depth of matches)
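The entry-count branch for name=value queries can be sketched as follows. This is a literal reading of the slide, not Oak's implementation: the class and method names are hypothetical, and the slide calls the result an "estimated cost" while this sketch treats it as the estimated number of matches. The branches that count matches in the index are not modeled.

```java
public class MatchEstimate {

    // Estimate for name=value queries when an entry count is configured:
    //   entry count / key count + number of values in the query
    // keyCount may be null; the slide's default is then entry count / 10000.
    public static double estimate(long entryCount, Long keyCount, int valuesInQuery) {
        if (entryCount <= 0) {
            throw new IllegalArgumentException("entryCount must be positive");
        }
        if (keyCount != null && keyCount <= 0) {
            throw new IllegalArgumentException("keyCount must be positive");
        }
        double keys = (keyCount != null) ? keyCount : entryCount / 10000.0;
        return entryCount / keys + valuesInQuery;
    }

    public static void main(String[] args) {
        // Default key count, one value in the query.
        System.out.println(estimate(50000, null, 1)); // prints 10001.0
        // Key count specified manually.
        System.out.println(estimate(50000, 100L, 1)); // prints 501.0
    }
}
```

Read this way, an index with an entry count but no key count always yields roughly 10000 plus the number of values, so specifying a key count is what makes the estimate reflect how selective the property really is.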
29. ORDERED INDEX
Stores node paths indexed by a particular property value
Has extra :next property on each value node to handle ordering
Example: /oak:index/cqLastModified
WARNING – only supports lexicographic sorting
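To make the warning concrete, the following Java sketch shows what string-based ordering does to numeric-looking values. It only demonstrates plain string comparison; the class name is hypothetical and nothing here reflects how the ordered index stores its entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class LexicographicOrder {

    // Sorts values as plain strings, comparing character by character.
    public static List<String> sortAsStrings(List<String> values) {
        List<String> copy = new ArrayList<>(values);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortAsStrings(Arrays.asList("9", "10", "100", "25")));
        // prints [10, 100, 25, 9]
    }
}
```

Values whose string form already sorts in the intended order, such as ISO 8601 timestamps of equal length, are not affected by this.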
34. NODE TYPE INDEX
Special type of Property Index
Note that not all node types are indexed by default
Has a default entryCount of a very high value
41. LUCENE
Customize the Tika configuration
Configurable analyzers (OAK-2177)
Synonyms
Boost Terms at index time (OAK-2178)
42. SOLR
Based on Lucene
Fault Tolerant
Rich Document Handlers
Geospatial Search
Load Balancing
AEM 6.0 Configurable:
Full Text Search
Indexing
Native Queries
43. SOLR CONFIGURATION
There are 4 configurable components
Oak Solr embedded server
Oak Solr indexing / search
Oak remote server
Oak Solr server provider
44. SOLR DEFINITION
oak:QueryIndexDefinition
type = solr
async = async
reindex = true
45. SOLR FULL TEXT QUERIES
//*[jcr:contains(., 'Experience Manager')]
Solr enables restrictions based on:
• Path
• Property
• Primary Type
46. FLOW
jcr:contains query detected → Remote Solr index queried → Results returned
• In oak-solr-core 1.0.1+ (AEM 6 SP1) you can add property, path & primary type restrictions to your query
47. SOLR TYPES
Types of Solr that Oak uses
Embedded Solr
Primarily used for development work. The Solr instance runs within AEM and can be configured similarly to the remote instance.
Remote Solr
Used for non-development environments. Typically these instances take advantage of the fault-tolerant features of SolrCloud. In many cases, existing Solr instances are used.
49. LUCENE VS. SOLR
Main differences with the Lucene index
You create and control the Solr config
Analyzers
Schema
• You must have a schema.xml that accurately reflects the properties and fields you want indexed (and queried), which is similar to how the property indexes are configured.
Currency
Language
Enabling additional Solr native functionality (example: mlt - more like this)
Some indexing overhead offloaded
All of this is configured on the Solr servers
50. NATIVE QUERIES
//*[rep:native('lucene', 'wine OR beer')]
The native function takes the query type (solr or lucene) and the query.
select [jcr:path] from [nt:base]
where native('solr', 'mlt?q=Wine&mlt.fl=text&mlt.mindf=1&mlt.mintf=1')
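The SQL-2 statement above embeds a Solr request inside the native function. As a minimal sketch, the string can be assembled from a handler name and a parameter map. The helper name `solr_native_query` is hypothetical and not part of Oak or AEM; it only formats text and does not talk to a repository.

```python
from urllib.parse import urlencode


def solr_native_query(handler, params):
    """Format a JCR-SQL2 statement wrapping a native Solr request.

    Hypothetical helper for illustration only: it builds the string
    shown on the slide and performs no query execution.
    """
    # urlencode percent-encodes reserved characters, including quotes,
    # so the result is safe to place inside the single-quoted literal.
    native = f"{handler}?{urlencode(params)}"
    return f"select [jcr:path] from [nt:base] where native('solr', '{native}')"


statement = solr_native_query(
    "mlt", {"q": "Wine", "mlt.fl": "text", "mlt.mindf": 1, "mlt.mintf": 1}
)
print(statement)
```

Keeping the Solr parameters in a map makes it easier to vary one setting (for example `mlt.mindf`) without editing the statement by hand.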
51. JCR BASED SOLR QUERIES
• Oak index cost is factored
• Transparent to executor
• Familiar JCR query syntax
• Easy access to repository objects
56. ADOBE EXPERIENCE MANAGER Developer Resources
Training: http://bit.ly/AEMTraining
Documentation: http://bit.ly/AEM5Docs & http://bit.ly/AEM6Docs
GEMs Webinar Knowledge Exchange: www.adobe.com/go/gems
Mobile Dev: Get started with Adobe PhoneGap
https://github.com/blefebvre/aem-phonegap-kitchen-sink
https://github.com/blefebvre/aem-phonegap-starter-kit
Community: Meet with your peers online and in person, get technical help from the community, access community articles
• AEM Technologist Community: http://adobe.ly/Qe5BBw
• Evolve for AEM Technologists: http://bit.ly/EvolveDev
• AEM Help Forum: http://adobe.ly/OYdtY0
PackageShare: Sign in to the Adobe Marketing Cloud to access packages: http://bit.ly/AMCPKGSHARE
Marketing Cloud Exchange: http://bit.ly/MCXChange
Editor's Notes
This is a bold statement, but the facts back it up.
Changes to clustering and Mongo mostly impact operations
Support for flat node structures is limited.
At a high level, this is how a query is processed. Keep in mind that this is actually split into two separate JCR API calls – execute() and getRows() or getNodes().
First the query is parsed into an Abstract Syntax Tree. If the query is an XPath query, it is first transformed to SQL-2.
Then each index is consulted to estimate the cost for the query.
Then the results from the cheapest index are retrieved.
Finally, these results are filtered, both to ensure that the current user has read access to the result and that the result matches the complete query.
Let’s look at a simple query. Here we are doing two property matches, a node type match, and a path restriction.
The cheapest index is the index on sling resource type. We’ll talk shortly about how this is determined.
In the sling resource type index, there are 100 nodes with the selected value.
This then gets filtered down to just a single result.
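The narrowing described above (100 candidates from the index, one final result) can be sketched with toy data. This is an illustration under stated assumptions, not Oak's implementation: the paths, property values and helper names below are invented, since the slide's actual query is not reproduced in these notes.

```python
# Toy repository: 100 nodes share the indexed sling:resourceType value,
# but only one also satisfies the remaining restrictions.
RESOURCE_TYPE = "myapp/components/page"  # invented value

nodes = [
    {
        "path": f"/content/other/page{i}",
        "sling:resourceType": RESOURCE_TYPE,
        "jcr:title": "Other",
        "jcr:primaryType": "nt:unstructured",
    }
    for i in range(99)
]
nodes.append(
    {
        "path": "/content/site/home",
        "sling:resourceType": RESOURCE_TYPE,
        "jcr:title": "Home",
        "jcr:primaryType": "nt:unstructured",
    }
)


def from_index(all_nodes, prop, value):
    """Stand-in for the cheapest index: returns every node with this value."""
    return [n for n in all_nodes if n.get(prop) == value]


def matches_full_query(node):
    """Remaining restrictions the chosen index could not evaluate."""
    return (
        node["path"].startswith("/content/site/")
        and node["jcr:title"] == "Home"
        and node["jcr:primaryType"] == "nt:unstructured"
    )


candidates = from_index(nodes, "sling:resourceType", RESOURCE_TYPE)
results = [n["path"] for n in candidates if matches_full_query(n)]
print(len(candidates), results)
```

The point of the sketch is that every candidate returned by the index is still read and checked, which is why an index returning far more rows than the final result set is costly.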
To determine what index is used, you can run an explain query.
This is the query prefixed with explain.
The tricky part is that you have to look for this specific plan value and neither CRXDE Lite nor CRX Explorer can do that, at least not yet.
ACS AEM Tools, however, comes with this Explain Query tool which allows you to provide a query and get the plan. It will also attempt to decode the plan into a simple explanation. This doesn’t always work as the plan syntax is evolving. But if you see a plan which isn’t properly explained, please let us know.
One known issue is that the plan output doesn’t differentiate between property and ordered property indexes.
Index definitions are stored in the repository. There is a special node type, oak:QueryIndexDefinition, and you will sometimes hear these referred to as QIDs.
There is just a single mandatory property named type which governs what type of index it is.
Note that the node name of an index isn’t particularly relevant, although you should keep it reasonable.
There are also some generic properties which are useable across several different index types.
If you want to view the current indexes, one option is the Oak Index Manager from ACS AEM Commons. This lists the current indexes in a table and allows easy access to reindexing.
The index content for several index types is stored in the repository, but as hidden nodes. So you can’t just view them with CRXDE Lite or CRX Explorer. You have to use oak-run.
For TarMK, this means shutting down AEM and using either the explore or console command.
For MongoMK, you don’t have to shut anything down, but you can only use the console command.
Later in the presentation, we’ll see some screenshots of what the index content looks like.
For Solr, you can view the raw index content in the Solr HTTP interface.
There are several different ways of creating an index definition.
You can create it as content using CRXDE Lite and deploy it in a content package.
You can also write code which creates the appropriate nodes.
And in ACS AEM Commons, we have a configuration-based utility for creating indexes. This only supports property indexes for now.
As in CRX2, reindexing requires traversing the entire repository. Unlike CRX2, however, since there are multiple indexes, you can reindex one index at a time.
You need to reindex when a configuration changes which impacts the indexed content, for example changing the declaring node types.
Sometimes, especially before Service Pack 1, some Oak updates required reindexing. This hopefully won’t be the case in the future, but it is worth checking the release notes.
You should not reindex for fun. It is a resource intensive process.
For each query, the indexes are asked to estimate the cost. This is a relative value between 0 and Infinity. The index with the lowest cost wins and will be asked to actually execute the query.
The index’s cost should in theory represent the number of reads it will take to execute the query.
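The selection rule can be sketched in a few lines: each index reports an estimated cost, and the lowest one wins. The index names and numbers below are invented for illustration, and `pick_index` is a hypothetical helper rather than an Oak API. An index that cannot serve the query is modelled here as reporting an infinite cost.

```python
import math


def pick_index(costs):
    """Return the (name, cost) pair with the lowest estimated cost."""
    return min(costs.items(), key=lambda item: item[1])


# Invented estimates: one property has no usable index, one does,
# and traversal is always available as an expensive fallback.
costs = {
    "property(jcr:title)": math.inf,
    "property(sling:resourceType)": 120.0,
    "traverse": 50000.0,
}

winner, cost = pick_index(costs)
print(winner, cost)
```

Because the comparison is purely relative, an index only has to be cheaper than the alternatives to be chosen; it does not have to be cheap in absolute terms.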
Here’s some sample debug output. I’ve removed the logger names so the text is legible.
Purple text is the output from QueryImpl
Orange text is the output from PropertyIndex
If you look at the orange text, you can see that the cost for the jcr:title property is Infinity. This is because there is no index on this property.
We also see in this output the first mention of ‘traverse’. This is the worst-case scenario where no index is usable and some portion of the repository needs to be traversed.
Here’s another example, this time with two indexed properties.
Purple text is the output from QueryImpl
Orange text is the output from PropertyIndex
There are a number of out-of-the-box index types and in fact you can write your own index type, although we won’t go into that in this presentation.
The Traversing index isn’t configured – it is hardcoded in the Oak index implementation. It is basically the worst case scenario – where a repository tree needs to be traversed node by node in order to find matches.
Property indexes index property values. They store node paths in a tree structure under each property value.
Property indexes can be defined as unique in which case they are a way to enforce a property’s uniqueness.
You can see the index data using the Oak Explorer. Here we are looking at the sling resource type index for the value foundation/components/image.
The nodes which match the property value have a match property set to true.
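As a rough mental model of the structure described above, the sketch below keeps one tree of path segments per property value and flags matching nodes with a match marker. It is a simplified illustration, not Oak's storage format; the class name `PropertyIndexSketch` and the sample paths are invented.

```python
def _new_node():
    # Each tree node records whether it is a match and holds its children.
    return {"match": False, "children": {}}


class PropertyIndexSketch:
    """Toy property index: value -> tree of path segments."""

    def __init__(self, unique=False):
        self.unique = unique
        self.entries = {}

    def add(self, value, path):
        if self.unique and value in self.entries:
            # A unique index rejects a second node with the same value.
            raise ValueError(f"value {value!r} is already indexed")
        node = self.entries.setdefault(value, _new_node())
        for segment in path.strip("/").split("/"):
            node = node["children"].setdefault(segment, _new_node())
        node["match"] = True

    def lookup(self, value):
        root = self.entries.get(value)
        if root is None:
            return []
        found = []

        def walk(node, prefix):
            if node["match"]:
                found.append(prefix)
            for name, child in node["children"].items():
                walk(child, f"{prefix}/{name}")

        walk(root, "")
        return sorted(found)


index = PropertyIndexSketch()
index.add("foundation/components/image", "/content/a/image")
index.add("foundation/components/image", "/content/b/image")
print(index.lookup("foundation/components/image"))
```

Storing paths as a tree rather than a flat list means a path restriction can be answered by descending into one subtree instead of scanning every entry for the value.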
The Ordered Property Index is similar to the Property Index. The key difference is that each index value node has a special next property indicating the next value.
This index, at present, is basically broken for any non-string type as it only supports lexicographic sorting.
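The problem with lexicographic ordering of non-string values is easy to reproduce outside the repository. The sample numbers below are invented. The zero-padding shown at the end is a general string-sorting workaround, included only as an illustration and not as a documented Oak feature.

```python
values = [9, 10, 100, 25]

# Sorting the string forms compares character by character.
as_strings = sorted(str(v) for v in values)

# Sorting the numbers themselves gives the order users expect.
numeric = sorted(values)

# Fixed-width zero padding makes string order agree with numeric order.
padded = sorted(f"{v:05d}" for v in values)

print(as_strings)  # ['10', '100', '25', '9']
print(numeric)     # [9, 10, 25, 100]
print(padded)      # ['00009', '00010', '00025', '00100']
```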
Point AEM to ZooKeeper; ZooKeeper directs the query request to a “live” shard.
Troubleshooting purposes
I would be remiss if I didn’t take this opportunity to mention one other thing – XPath still works. The reasons it is deprecated in the spec are complex and not worth going into here.
But it isn’t going away in AEM. In fact, the XPath query parser will in many cases, especially with “or” clauses, handle some optimizations.