The document provides a deep dive into the lifecycle of a Solr search request, from the initial HTTP request to the generation of the response. It describes each stage of processing, including how the request is routed through the Solr core, how the query and filters are parsed and executed against the index, how various caches and plugins can be leveraged, and how the final response is generated. It uses examples of simple and more complex queries to demonstrate how each component interacts throughout the processing pipeline.
The next major release of Solr is right around the corner! Join Solr Committer Cassandra Targett and Lucidworks SVP of Engineering Trey Grainger for a first look into what’s included in the upcoming release.
Ingesting and Manipulating Data with JavaScript - Lucidworks
Data in the wild isn’t always in the right format we need for search or even mere usability. Lucidworks Fusion offers powerful pipelines, parsers, and stages to wrangle your data into the right format to make it more findable and friendly. However, there are some cases where more obscure data will require the power of scripting.
Your data may need a complex transformation, a custom decryption algorithm, or you may already have existing code for handling a piece of data. Even in these more complex cases, Fusion’s JavaScript capabilities have got you covered.
Solr™ is the popular, blazing-fast, open source enterprise search platform from the Apache Lucene™ project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable, and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration, and more. Solr powers the search and navigation features of many of the world's largest internet sites, including AOL, Yahoo, Buy.com, CNET, CitySearch, Netflix, Zappos, StubHub, Digg, E*Trade, Disney, Apple, NASA, and MTV.
Got data? Let's make it searchable! This presentation will demonstrate getting documents into Solr quickly, will provide some tips in adjusting Solr's schema to match your needs better, and finally will discuss how to showcase your data in a flexible search user interface. We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging. Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production.
The talk presents the sfSolrPlugin which transparently integrates the Solr search engine into symfony.
The talk explains:
* the features of the Solr search engine
* how to integrate the search engine into symfony
* complex search: faceted and geolocated search
* usage examples: http://www.menugourmet.com and http://resolutionfinder.org
Lucene powers the search capabilities of practically all library discovery platforms, by way of Solr, etc. The Lucene project evolves rapidly, and it's a full-time job to keep up with the ever improving features and scalability. This talk will distill and showcase the most relevant(!) advancements to date.
You’re Solr-powered and need to customize its capabilities. Apache Solr is flexibly architected, with practically everything pluggable. Under the hood, Solr is driven by the well-known Apache Lucene. Lucene for Solr Developers will guide you through the various ways in which Solr can be extended, customized, and enhanced with a bit of Lucene API know-how. We’ll delve into improving analysis with custom character mapping, tokenizing, and token filtering extensions; show why and how to implement specialized query parsing; and show how to add your own search and update request handling.
Overview of Solr 6.2 examples, including the features they have and the challenges they present. A contrasting demonstration of a minimum viable example. A step-by-step deconstruction of the "films" example to show which parts of the shipped examples are not actually needed.
Faster Data Analytics with Apache Spark using Apache Solr - Chitturi Kiran
Apache Spark is a fast and general engine for big data processing, with built-in modules for streaming, SQL, machine learning, and graph processing. Spark SQL allows users to execute relational queries in Spark with distributed in-memory computations. Though Spark gives us faster in-memory computations, Solr is blazing fast for some analytic queries. In this talk, we will take a deep dive into how to optimize SQL queries from Spark to Solr by plugging into the Spark LogicalPlanner using pushdown strategies. The key takeaways from the talk will be:
How to perform Spark SQL queries with Apache Solr?
What happens inside a Spark SQL query?
How to plug into Spark Logical Planner?
What type of push-down strategies are optimal with Solr?
Examples of push-down strategies
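The core idea behind the pushdown strategies listed above is translating SQL-style predicates into Solr filter queries (fq clauses) so that filtering happens inside Solr instead of in Spark. A minimal illustrative sketch of that translation follows; it is not the spark-solr implementation, and the predicate representation is hypothetical:

```python
def to_solr_fq(filters):
    """Translate simple (op, field, value) predicates into Solr fq clauses,
    so filtering can be pushed down to Solr rather than done in Spark."""
    ops = {
        "eq": lambda f, v: f"{f}:{v}",
        "gt": lambda f, v: f"{f}:{{{v} TO *]",   # '{' = exclusive lower bound in Solr range syntax
        "lt": lambda f, v: f"{f}:[* TO {v}}}",   # '}' = exclusive upper bound
    }
    return [ops[op](field, value) for op, field, value in filters]

fqs = to_solr_fq([("eq", "genre", "drama"), ("gt", "year", 2000)])
# fqs == ["genre:drama", "year:{2000 TO *]"]
```

Each fq clause can then be sent to Solr as a cached filter, returning only matching documents to Spark.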
Presented at Lucene Revolution - http://sched.co/BAwV
Building your own search engine with Apache Solr - Biogeeks
Andrew Clegg : Building your own search engine with Apache Solr
Apache Solr (http://lucene.apache.org/solr/) is an open-source search engine based on the popular Lucene library with a huge variety of features. In this talk, Andrew describes how he used it to build a high-performance search tool for protein and domain structures at CATH, and talks about some of the surprisingly cool things you can do with it beyond simple searching.
Webinar: Solr & Spark for Real Time Big Data Analytics - Lucidworks
Lucidworks Senior Engineer and Lucene/Solr Committer Tim Potter presents common use cases for integrating Spark and Solr, access to open source code, and performance metrics to help you develop your own large-scale search and discovery solution with Spark and Solr.
Apache Solr serves search requests at enterprises and some of the largest companies around the world. Built on top of the top-notch Apache Lucene library, Solr makes it straightforward to integrate indexing and search into your applications.
Solr provides faceted navigation, spell checking, highlighting, clustering, grouping, and other search features. Solr also scales query volume with replication and collection size with distributed capabilities. Solr can index rich documents such as PDF, Word, HTML, and other file types.
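The faceted navigation mentioned above is driven by plain query parameters on Solr's select endpoint. As a hedged sketch (the collection and field names here are hypothetical), a facet request can be built like this:

```python
from urllib.parse import urlencode

def facet_query(base_url, q, facet_fields, rows=10):
    """Build a Solr /select URL that requests facet counts for the given fields."""
    params = [("q", q), ("rows", str(rows)), ("facet", "true")]
    # One facet.field parameter per field to count on
    params += [("facet.field", f) for f in facet_fields]
    return base_url + "/select?" + urlencode(params)

url = facet_query("http://localhost:8983/solr/products", "laptop", ["brand", "category"])
```

The response then includes a facet_counts section with per-value document counts for each requested field.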
Come learn how you can get your content into Solr and integrate it into your applications!
Solr 4.0 dramatically improves scalability, performance, and flexibility. An overhauled Lucene underneath sports near real-time (NRT) capabilities allowing indexed documents to be rapidly visible and searchable. Lucene’s improvements also include pluggable scoring, much faster fuzzy and wildcard querying, and vastly improved memory usage. These Lucene improvements automatically make Solr much better, and Solr magnifies these advances with “SolrCloud.” SolrCloud enables highly available and fault tolerant clusters for large scale distributed indexing and searching. There are many other changes that will be surveyed as well. This talk will cover these improvements in detail, comparing and contrasting to previous versions of Solr.
Search Engine Building with Lucene and Solr (SoCal Code Camp San Diego 2014) - Kai Chan
Slides for my presentation at SoCal Code Camp, June 29, 2014
(http://www.socalcodecamp.com/socalcodecamp/session.aspx?sid=6337660f-37de-4d6e-a5bc-46ba54478e5e)
Slides to the Hands On Spring Data lab, presented in Paris on Dec 10th, 2012. Code exercises are here: https://github.com/ericbottard/hands-on-spring-data
code4lib 2011 preconference: What's New in Solr (since 1.4.1) - Erik Hatcher
code4lib 2011 preconference, presented by Erik Hatcher of Lucid Imagination.
Abstract: The library world is fired up about Solr. Practically every next-gen catalog is using it (via Blacklight, VuFind, or other technologies). Solr has continued improving in some dramatic ways, including geospatial support, field collapsing/grouping, extended dismax query parsing, pivot/grid/matrix/tree faceting, autosuggest, and more. This session will cover all of these new features, showcasing live examples of them all, including anything new that is implemented prior to the conference.
Solr at Zvents: 6 Years Later & Still Going Strong - lucenerevolution
Presented by Amit Nithianandan, Lead Engineer Search/Analytics New Platforms, Zvents/Stubhub
Zvents has been a user of Apache Solr since 2007, when the project was still in its early days. Since then, the team has made extensive use of its various features and most recently completed an overhaul of the search engine to Solr 4.0. We'll touch on a variety of development and operational topics, including how we manage the build lifecycle of the search application using Maven, release the deployment package using Capistrano, and monitor using New Relic, as well as the extensive use of virtual machines to simplify node management. We’ll also talk about application-level details such as our unique federated search product, and the integration of technologies such as Hypertable, RabbitMQ, and EHCache to power more real-time ranking and filtering based on traffic statistics and ticket inventory.
Introduction to the basics of Information Retrieval (IR) with an emphasis on Apache Solr/Lucene. A lecture I gave during the JOSA Data Science Bootcamp.
Search is the Tip of the Spear for Your B2B eCommerce Strategy - Lucidworks
With ecommerce experiencing explosive growth, it seems intuitive that the B2B segment of that ecosystem is mirroring the same trajectory. That said, B2B has very different needs when it comes to transacting with the same style of experiences that we see in B2C. For instance, B2B ecommerce is about precision findability, whereas B2C customers can convert at higher rates when they’re just browsing online. In order for the B2B buying experience to be successful, search needs to be tuned to meet the unique needs of the segment.
In this webinar with Forrester senior analyst Joe Cicman, you’ll learn:
-Which verticals in B2B will drive the most growth, and how machine-learning powered personalization tactics can be deployed to support those specific verticals
-Why an omnichannel selling approach must be deployed in order to see success in B2B
-How deploying content search capabilities will support a longer sales cycle at scale
-What the next steps are to support a robust B2B commerce strategy supported by new technology
Speakers
Joe Cicman, Senior Analyst, Forrester
Jenny Gomez, VP of Marketing, Lucidworks
Customer loyalty starts with quickly responding to your customer’s needs. When it comes to resolving open support cases, time is of the essence. Time spent searching for answers adds up and creates inefficiencies in resolving cases at scale. Relevant answers need to be a few clicks away and easily accessible for agents directly from their service console.
We will explore how Lucidworks’ Agent Insights application automatically connects agents with the correct answers and resources. You’ll learn how to:
-Configure a proactive widget in an agent’s case view page to access resources across third-party systems (such as Sharepoint, Confluence, JIRA, Zendesk, and ServiceNow).
-Easily set up query pipelines to autonomously route assets and resources that are relevant to the case-at-hand—directly to the right agent.
-Identify subject matter experts within your support data and access tribal knowledge with lightning-fast speed.
How Crate & Barrel Connects Shoppers with Relevant Products - Lucidworks
Lunch and Learn during Retail TouchPoints #RIC21 virtual event.
***
Crate & Barrel’s previous search solution couldn’t provide its shoppers with an online search and browse experience consistent with the customer-centric Crate & Barrel brand. Meanwhile, Crate & Barrel merchandisers spent the bulk of their time manually creating and maintaining search rules. The search experience impacted customer retention, loyalty, and revenue growth.
Join this lunch & learn for an interactive chat on how Crate & Barrel partnered with Lucidworks to:
-Improve search and browse by modernizing the technology stack with ML-based personalization and merchandising solutions
-Enhance the experience for both shoppers and merchandisers
-Explore signals to transform the omnichannel shopping experience
Questions? Visit https://lucidworks.com/contact/
Learn how to guide customers to relevant products using eCommerce search, hyper-personalisation, and recommendations in our ‘Best-In-Class Retail Product Discovery’ webinar.
Nowadays, shoppers want their online experience to be engaging, inspirational, and fulfilling. They want to find what they’re looking for quickly and easily. If the sought-after item isn’t available, they want the next-best product or content surfaced to them. They want a website to understand their goals as though they were talking to a sales assistant in person, in-store.
In this webinar, we explore IMRG industry data insights and a best-in-class example of retail product discovery. You’ll learn:
- How AI can drive increased revenue through hyper-personalised experiences
- How user intent can be easily understood and results displayed immediately
- How merchandisers can be empowered to curate results and product placement – all without having to rely on IT.
Presented by:
Dave Hawkins, Principal Sales Engineer - Lucidworks
Matthew Walsh, Director of Data & Retail - IMRG
Connected Experiences Are Personalized Experiences - Lucidworks
Many companies claim personalization and omnichannel capabilities are top priorities. Few are able to deliver on those experiences.
For a recent Lucidworks-commissioned study, Forrester Consulting surveyed 350+ global business decision-makers to see what gets in the way of achieving these goals. They discovered that inefficient technology, lack of behavioral insights, and failure to tie initiatives to enterprise-wide goals are some of the most frequent blockers to personalization success.
Join guest speaker, Forrester VP and Principal Analyst, Brendan Witcher, and Lucidworks CEO, Will Hayes, to hear the results of the Forrester Consulting study, how to avoid “digital blindness,” and how to apply VoC data in real-time to delight customers with personalized experiences connected across every touchpoint.
In this webinar, you’ll learn:
- Why companies who utilize real-time customer signals report more effective personalization
- How to connect employees and customers in a shared experience through search and browse
- How Lucidworks clients Lenovo, Morgan Stanley and Red Hat fast-tracked improvements in conversion, engagement and customer satisfaction
Featuring
- Will Hayes, CEO, Lucidworks
- Brendan Witcher, VP, Principal Analyst, Forrester
Intelligent Insight Driven Policing with MC+A, Toronto Police Service and Luc... - Lucidworks
Intelligent Policing. Leveraging Data to more effectively Serve Communities.
Policing in the next decade is anticipated to be very different from historical methods: more data driven, more focused on the intricacies of the communities they serve, and more open and collaborative, to make informed recommendations a reality. Whether it's social populations, NIBRS, or organizational improvement that's the driver, the IT requirement is largely the same: provide access to large volumes of siloed data to gain a full 360-degree understanding of existing connections and patterns for improved insight and recommendation.
Join us for a round table discussion of how the Toronto Police Service is better serving their community through deploying a unified intelligent data platform.
Data innovation improves officers' engagement with existing data and streamlines investigation workflows by enhancing collaboration. This improved visibility into existing police data allows for a more intelligent and responsive police force.
In this webinar, we'll cover:
-The technology needs of an intelligent police force.
-How a Global Search improves an officer's interaction with existing data.
Featuring:
-Simon Taylor, VP, Worldwide Channels & Alliances, Lucidworks
-Michael Cizmar, Managing Director, MC+A
-Ian Williams, Manager of Analytics & Innovation, Toronto Police Service
Accelerate The Path To Purchase With Product Discovery at Retail Innovation C... - Lucidworks
Wish your conversion rates were higher? Can’t figure out how to efficiently and effectively serve all the visitors on your site? Embarrassed by the quality of your product discovery experience? The bar is high and the influx of online shopping over recent months has reminded us that the opportunities are real. We’re all deep in holiday prep, but let’s take a few minutes to think about January 2021 and beyond. How can we position ourselves for success with our customers and against our competition?
Grab your lunch and let’s dive into three strategies that need to be part of your 2021 roadmap. You don’t need an army to get there. But you do need to take action and capitalize on the shoppers abandoning the product discovery journey on your site.
In this session, attendees will find out how to:
-Take control of merchandising at scale;
-Implement hands-free search relevancy; and
-Address personalization challenges.
AI-Powered Linguistics and Search with Fusion and Rosette - Lucidworks
For a personalized search experience, search curation requires robust text interpretation, data enrichment, relevancy tuning and recommendations. In order to achieve this, language and entity identification are crucial.
For teams working on search applications, advanced language packages allow them to achieve greater recall without sacrificing precision.
Join us for a guided tour of our new Advanced Linguistics packages, available in Fusion, thanks to the technology partnership between Lucidworks and Basistech.
We’ll explore the application of language identification and entity extraction in the context of search, along with practical examples of personalizing search and enhancing entity extraction.
In this webinar, we’ll cover:
-How Fusion uses the Rosette Basic Linguistics and Entity Extraction packages
-Tips for improving language identification and treatment as well as data enrichment for personalization
-Speech2 demo modeling Active Recommendation
-Use Rosette’s packages with Fusion Pipelines to build custom entities for specific domain use cases
Featuring:
-Radu Miclaus, Director of Product, AI and Cloud, Lucidworks, Lucidworks
-Robert Lucarini, Senior Software Engineer, Lucidworks
-Nick Belanger, Solutions Engineer, Basis Technology
The Service Industry After COVID-19: The Soul of Service in a Virtual Moment - Lucidworks
Before COVID-19, almost 80% of the US workforce worked in service jobs that involve in-person interaction with strangers. Now, leaders of service organizations must reshape their offerings during the pandemic and prepare for whatever the new normal turns out to be. Our three panelists will share ideas for adapting their service businesses, now that closer-than-six-feet isn’t an option.
Join Lucidworks as we talk shop with 3 service business leaders, covering:
-Common impacts of the pandemic on service businesses (and what to do about them),
-How service teams can maintain a human touch across virtual channels, and
-Plans for the future, before and after the pandemic subsides.
Featuring
-Sara Nathan, President & CEO, AMIGOS
-Anthony Carruesco, Founder, AC Fly Fishing
-Sara Bradley, Chef and Proprietor, Freight House
-Justin Sears, VP Product Marketing, Lucidworks
Webinar: Smart Answers for Employee and Customer Support After COVID-19 - Europe - Lucidworks
The COVID-19 pandemic has forced companies to support far more customers and employees through digital channels than ever before. Many are turning to chatbots to help meet increasing demand, but traditional rules-based approaches can’t keep up. Our new Smart Answers add-on to Lucidworks Fusion makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
Smart Answers for Employee and Customer Support After COVID-19 - Lucidworks
Watch our on-demand webinar showcasing Smart Answers on Lucidworks Fusion. This technology makes existing chatbots and virtual assistants more intelligent and more valuable to the people you serve.
In this webinar, we’ll cover:
-How search and deep learning extend conversational frameworks for improved experiences
-How Smart Answers improves customer care, call deflection, and employee self-service
-A live demo of Smart Answers for multi-channel self-service support
Applying AI & Search in Europe - featuring 451 Research - Lucidworks
In the current climate, it’s now more important than ever to digitally enable your workforce and customers.
Hear from Simon Taylor, VP Global Partners & Alliances, Lucidworks and Matt Aslett, Research Vice President, 451 Research to get the inside scoop on how industry leaders in Europe are developing and executing their digital transformation strategies.
In this webinar, we’ll discuss:
-The top challenges and aspirations European business and technology leaders are solving using AI and search technology
-Which search and AI use cases are making the biggest impact in industries such as finance, healthcare, retail and energy in Europe
-What technology buyers should look for when evaluating AI and search solutions
Webinar: 5 Must-Have Items You Need for Your 2020 Ecommerce Strategy - Lucidworks
In this webinar with 451 Research, you'll understand how retailers are using AI to predict customer intent and learn which key performance metrics are used by more than 120 online retailers in Lucidworks’ 2019 Retail Benchmark Survey.
In this webinar, you’ll learn:
● What trends and opportunities are facing the ecommerce industry in 2020
● Why search is the universal path to understanding customer intent
● How large online retailers apply AI to maximize the effectiveness of their personalization efforts
Where Search Meets Science and Style Meets Savings: Nordstrom Rack's Journey ... - Lucidworks
Nordstrom Rack | Hautelook curates and serves customers a wide selection of on-trend apparel, accessories, and shoes at an everyday savings of up to 75 percent off regular prices. With over a million visitors shopping across different platforms every day, and a realization that customers have become accustomed to robust and personalized search interactions, Nordstrom Rack | Hautelook launched an initiative over a year ago to provide data science-driven digital experiences to their customers.
In this session, we’ll discuss Nordstrom Rack | Hautelook’s journey of operationalizing a hefty strategy, optimizing a fickle infrastructure, and rallying troops around a single vision of building an expansible machine-learning driven product discovery engine.
The audience will learn about:
-The key technical challenges and outcomes that come with onboarding a solution
-The lessons learned of creating and executing operational design
-The use of Lucidworks Fusion to plug custom data science models into search and browse applications to understand user intent and deliver personalized experiences
Apply Knowledge Graphs and Search for Real-World Decision IntelligenceLucidworks
Knowledge graphs and machine learning are on the rise as enterprises hunt for more effective ways to connect the dots between the data and the business world. With newer technologies, the digital workplace can dramatically improve employee engagement, data-driven decisions, and actions that serve tangible business objectives.
In this webinar, you will learn
-- Introduction to knowledge graphs and where they fit in the ML landscape
-- How breakthroughs in search affect your business
-- The key features to consider when choosing a data discovery platform
-- Best practices for adopting AI-powered search, with real-world examples
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Removing Uninteresting Bytes in Software FuzzingAftab Hussain
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing xml documents, and Binutil's readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format). Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Threats to mobile devices are more prevalent and increasing in scope and complexity. Users of mobile devices desire to take full advantage of the features
available on those devices, but many of the features provide convenience and capability but sacrifice security. This best practices guide outlines steps the users can take to better protect personal devices and information.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfPaige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party will share these foundational concepts to build on:
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...SOFTTECHHUB
The choice of an operating system plays a pivotal role in shaping our computing experience. For decades, Microsoft's Windows has dominated the market, offering a familiar and widely adopted platform for personal and professional use. However, as technological advancements continue to push the boundaries of innovation, alternative operating systems have emerged, challenging the status quo and offering users a fresh perspective on computing.
One such alternative that has garnered significant attention and acclaim is Nitrux Linux 3.5.0, a sleek, powerful, and user-friendly Linux distribution that promises to redefine the way we interact with our devices. With its focus on performance, security, and customization, Nitrux Linux presents a compelling case for those seeking to break free from the constraints of proprietary software and embrace the freedom and flexibility of open-source computing.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
A tale of scale & speed: How the US Navy is enabling software delivery from l...sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
UiPath Test Automation using UiPath Test Suite series, part 4
Lifecycle of a Solr Search Request - Chris "Hoss" Hostetter, Lucidworks
1. Lifecycle of a
Solr Search
Request
Chris "Hoss" Hostetter - 2017-09-14
https://home.apache.org/~hossman/rev2017/
https://twitter.com/_hossman
https://www.lucidworks.com/
Abstract:
This intermediate session for existing Solr users will provide a
Deep Dive look into the lifecycle of a Solr Search Request. We
will drill down through each layer of code, discussing what
happens at each stage -- including when & how inter-node
communication takes place in a multi-node SolrCloud cluster.
Along the way, we will also review the various places where
users can configure existing (or custom written) plugins to
override or amend the default behavior.
Lifecycle of a Solr Search Request https://people.apache.org/~hossman/rev2017/
1 of 24 10/4/17, 4:32 PM
2. Agenda
Deep Dive look into the lifecycle of 4 Solr Search Requests...
Single Node: Single SolrCore
1. Simple Query
2. Facet Query
SolrCloud: 2 Shards + 2 Replicas
3. Simple Query
4. Facet Query
...and where various types of Plugins can be used.
3. Simple Query
Single Node: Single SolrCore
bin/solr -e techproducts
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& rows = 10
This sample paginated query is based on the techproducts
example configs & data that have been included in every release of Solr
since it was first open sourced.
I have a nostalgic affection for this silly little dataset.
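Such a request can be assembled with any HTTP client. A minimal Python sketch (host, port, and collection match the techproducts example above; `urlencode` handles the spaces and commas in the param values):

```python
from urllib.parse import urlencode

# The same paginated query shown on the slide, built programmatically.
params = {
    "q": "ipod",
    "sort": "inStock desc, score desc",
    "fl": "id, name",
    "rows": 10,
}
url = "http://localhost:8983/solr/techproducts/select?" + urlencode(params)
# `url` can now be fetched with any HTTP client (urllib.request, curl, ...)
```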
4. HTTP (Jetty)
SolrDispatchFilter
Solr Webapp/solr ➔
CoreContainer
/techproducts ➔ SolrCore
/select? ➔ RequestHandler
SolrCore
foo
SolrCore
etc...
wt=json ➔ ResponseWriter
...:8983/solr/techproducts/select?...
UI:HTML,Javascript,
Images,CSS
SolrCore
techproducts
Purple: The HTTP layer, currently implemented by Jetty
Blue: Solr runs as a "webapp" inside the Jetty Servlet container (but
that's just an implementation detail)
Black: The key pieces of the Solr webapp: misc "flat files" that power
the Solr UI, and the SolrDispatchFilter which is responsible
for mapping all HTTP request/responses into their internal Solr
representations and executing them
Red: CoreContainer is the singleton responsible for managing the
lifecycle of SolrCores
Green: each SolrCore encapsulates the configs & data for a single
"index" (which in a SolrCloud configuration would be a replica of
some shard of some collection)
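As a rough illustration of the routing layers above, here is a hypothetical, heavily simplified Python model of the path resolution; the `cores` dict and `dispatch` function are invented for this sketch and do not correspond to Solr's actual Java classes:

```python
# CoreContainer maps the core name in the path to a SolrCore; the SolrCore
# then maps the remaining path to a named RequestHandler.
cores = {
    "techproducts": {
        "/select": "SearchHandler",
        "/update": "UpdateRequestHandler",
    },
}

def dispatch(path):
    # /solr/<core>/<handler>  ->  (core name, handler name)
    _, core_name, handler = path.lstrip("/").split("/", 2)
    return core_name, cores[core_name]["/" + handler]

core, handler = dispatch("/solr/techproducts/select")
```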
5. SolrCore: techproducts
SolrRequestHandlers SearchComponents
QueryComponent: query
- prepare()
- df=text&q=ipod ➔ Query
- etc...
- process()
- etc...
SearchHandler: /select
- initParams
- df = text (default)
- components (implicit)
- query
- etc...
SearchHandler: /etc...
UpdateRequestHandler : /etc...
FacetComponent: facet
etc...
Green: The SolrCore used for this (HTTP) request
Black: Named instances of (plugable) SolrRequestHandlers.
SearchHandler is the most common, and it uses a configurable
list of SearchComponents
Red: Named instances of (plugable) SearchComponents,
QueryComponent is the only one used in this simple request
All SearchComponents implement prepare() & process()
methods, which are called by SearchHandler
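The prepare()/process() contract can be modeled with a small Python sketch (a simplified illustration, not Solr's actual Java API):

```python
class SearchComponent:
    """Simplified model of the SearchComponent contract."""
    def prepare(self, req, rsp): ...
    def process(self, req, rsp): ...

class QueryComponent(SearchComponent):
    def prepare(self, req, rsp):
        # validate & parse request params (q, sort, fl, rows, ...)
        rsp["parsed_q"] = req["q"]
    def process(self, req, rsp):
        # execute the parsed query against the SolrIndexSearcher
        rsp["results"] = "docs matching " + rsp["parsed_q"]

class SearchHandler:
    def __init__(self, components):
        self.components = components
    def handle(self, req):
        rsp = {}
        for c in self.components:  # first pass: every component prepares
            c.prepare(req, rsp)
        for c in self.components:  # second pass: every component processes
            c.process(req, rsp)
        return rsp

rsp = SearchHandler([QueryComponent()]).handle({"q": "ipod"})
```

The two-pass structure is why a later component (e.g. FacetComponent) can set flags in prepare() that influence how QueryComponent's process() executes the search.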
6. SolrIndexSearcher
query
IndexSchema
- SchemaFields ➔ FieldTypes
QueryComponent.prepare()
+ rows=10 ➔ ok?
fl=id,name ➔ ok?
/ q ➔ LuceneQParser
LuceneQParser + (df=text ➔ text) + "ipod" ➔ TermQuery
( "inStock desc" ➔ bool ➔ BoolField.getSortField(inStock,desc)
+ "score desc" ➔ SortField.SCORE ) ➔ Sort
TextField: text
- Analyzer
- Similarity
- etc...
TextField: etc..
- Analyzer
- Similarity
- etc...
BoolField: bool
- Analyzer
- Similarity
- getSortField
- etc...
LuceneQParser
DismaxQParser
etc...
Red: QueryComponent.prepare() and its logic for
validating & parsing the basic request params
Green: Named instances of (pluggable) QParserPlugins for
parsing query strings (q & fq params). Here the (implicit) default
LuceneQParser
Orange: The IndexSchema which contains...
Named SchemaFields (or dynamicFields) which map
to...
Purple: Named instances of (pluggable) FieldTypes which
dictate how the field names mapped to them are parsed,
indexed, sorted, queried, etc...
Blue: The SolrIndexSearcher is ultimately what will be
queried with these parsed queries & sort objects
8. Red: QueryComponent.process() which uses the
SolrIndexSearcher to execute the Query created by its
prepare() method
Blue: the SolrIndexSearcher includes several caches in
addition to the InvertedIndex, and when executing a query, first
evaluates the start/rows requested to fit a configured "window size"
so that "page #2" type requests can result in a cache hit & re-use the
results computed for "page #1"
Orange: The low level InvertedIndex & the
queryResultCache that can be used in its place when
executing basic searches, & the DocList containing a sorted
list of (internal) doc#s and their scores for the requested
start+rows of this query
Purple: The Stored Fields of the documents in the index & the
documentCache used by SolrIndexSearcher to
reduce disk reads when popular documents are frequently
matched by searches
Green: Named instances of (pluggable)
QueryResponseWriters which dictate how the data structures
produced once a request is processed get serialized into bytes (for
the HTTP response returned to the original client by Jetty)
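The "window size" rounding described above can be sketched as follows; `window_size=20` is an assumed `queryResultWindowSize` config value, used here only for illustration:

```python
def rows_to_collect(start, rows, window_size=20):
    """Round the requested doc range up to a multiple of the configured
    queryResultWindowSize so adjacent pages share one cache entry."""
    needed = start + rows
    # round up to the next multiple of window_size
    return ((needed + window_size - 1) // window_size) * window_size

page1 = rows_to_collect(start=0, rows=10)   # collects 20 doc ids
page2 = rows_to_collect(start=10, rows=10)  # also 20: served from the
                                            # queryResultCache entry for page 1
```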
9. More Complex Query
Single Node: Single SolrCore
http://localhost:8983/solr/techproducts/select
? q = ipod
& fq = price:[* TO 1000]
& sort = div(popularity,price) asc,
score desc
& fl = id, name, why:[explain style=nl]
& facet = true
& facet.field = cat
This slightly more interesting query builds off the previous example by:
Adding a "filter query" on the (numeric) price field
Changing the primary sort criteria to be a mathematical function
against 2 fields
Requesting an additional pseudo-field explaining the score of each
document
Faceting on the "cat" (aka: category) field
10. HTTP (Jetty)
SolrDispatchFilter
Solr Webapp/solr ➔
CoreContainer
/techproducts ➔ SolrCore
/select? ➔ RequestHandler
SolrCore
foo
SolrCore
etc...
wt=json ➔ ResponseWriter
...:8983/solr/techproducts/select?...
UI:HTML,Javascript,
Images,CSS
SolrCore
techproducts
The HTTP, Webapp, DispatchFilter, CoreContainer, SolrCore, and
RequestHandler layers all function exactly as in our previous (simpler)
example. It's only once the SearchHandler starts looping over the
components that things get more interesting....
11. query
IndexSchema
- SchemaFields ➔ FieldTypes
QueryComponent.prepare()
etc...
"price:[* TO 1000]" ➔ float
➔ PointRangeQuery(...) ➔ filters[]
div(popularity,price)
➔ ValueSource(IntFieldSource,...)
FloatPointField: float
- ValueSource
- getRangeQuery()
- etc...
IntPointField: int
- ValueSource
- etc...
FacetComponent.prepare()
facet=true ✔
facet.field=cat ➔ ok?
needDocSet = true
SolrIndexSearcher
div()
sum()
etc...
Most items identical to those shown in the "simple" query are omitted for
brevity. Of the new items shown here...
Red: In addition to some extra logic in the
QueryComponent.prepare() method (to parse the filter
query and more complex sort) we now also see the
FacetComponent.prepare() method, which does its own
validation & sets a flag indicating that it needs extra info (the
DocSet) once SolrIndexSearcher is asked to execute the
Query
Green: Named instances of (pluggable) ValueSourceParsers
for parsing function strings -- used here in our sort, but could also be
used in queries
Orange: As before the IndexSchema, now showing that
FieldTypes are also responsible for providing the range query
(filter) and ValueSources (used by the functions)
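How a ValueSource composes per-document values for sorting can be illustrated with a toy Python model; the `field`/`div` helpers are invented for this sketch (Solr's real ValueSources are Java classes resolved via ValueSourceParsers):

```python
def field(name):
    # ValueSource over a field: per-document value lookup
    return lambda doc: doc[name]

def div(a, b):
    # ValueSource composing two others, mirroring sort=div(popularity,price)
    return lambda doc: a(doc) / b(doc)

sort_key = div(field("popularity"), field("price"))

docs = [
    {"id": "A", "popularity": 10, "price": 5.0},   # 10 / 5.0 = 2.0
    {"id": "B", "popularity": 9,  "price": 1.0},   #  9 / 1.0 = 9.0
]
ordered = sorted(docs, key=sort_key)  # ascending, matching "asc" in the sort
```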
12. SolrIndexSearcher
query
QueryComponent.process()
search(...) ➔〈DocList,DocSet〉
etc...
JsonResponseWriter
DocList {
+ searcher.doc(#)
➔ Stored Fields
+ [explain ...]
}
+ Facet Counts
➔ Bytes ➔ HTTP...
ExplainAugmenter
ChildDocTransformer
query
FacetComponent.process()
For Each "cat" Index Terms:
➔ Intersect with DocSet
SubQueryAugmenter
etc...
searcher.explain(#)
documentCache
queryResultCache
filterCache
IndexReader
- InvertedIndex
- Stored Fields
Most items identical to those shown in the "simple" query are omitted for
brevity. Of the new items shown here...
Red: Now when QueryComponent.process() executes the
search, the "needDocSet" flag set by
FacetComponent.prepare() is also used.
FacetComponent.process() can then use the resulting
DocSet (an unordered set of all matching doc# -- regardless of sort)
to compute the facet counts.
Olive: Named instances of (pluggable) DocTransformers (or
Augmenters) which can be used to annotate individual documents
returned in the results. For this query in particular we see the
ExplainAugmenter which uses the SolrIndexSearcher to
get a (debugging) data structure "explaining" how the score of each
document was computed.
Green: the JsonResponseWriter not only returns the Stored
Fields of each document, but also the results of any
DocTransformers. It also serializes the Facet Counts.
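The per-term intersection performed by FacetComponent.process() can be modeled in a few lines of Python (toy data; real Solr uses optimized bitset and doc-values implementations rather than Python sets):

```python
# For each indexed term of the facet field, intersect that term's
# (inverted-index) doc set with the DocSet of all query matches.
inverted_index = {            # term -> doc ids containing it (toy data)
    "electronics": {1, 2, 3, 5},
    "music":       {2, 4},
    "memory":      {5, 6},
}
doc_set = {1, 2, 5}           # unordered set of all docs matching the query

facet_counts = {
    term: len(docs & doc_set)
    for term, docs in inverted_index.items()
}
```

This is also why the DocSet (unordered, sort-independent) is enough for faceting, while the DocList (sorted, paginated) is what drives the returned results.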
13. Simple Query
SolrCloud: 4 Nodes, 2 Shards, 2 Replicas
bin/solr -e cloud
...
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& rows = 10
This is the same as our original simple query, still using the
techproducts sample configs & data, but from here on we'll assume
we're using a 4 node SolrCloud cluster, with the techproducts
collection configured to have 2 shards, with a replication factor of 2.
14. SolrDispatchFilter
/techproducts ➔ tech_s1_r2
Jetty: http://host1:8983
SolrDispatchFilter
/techproducts ?➔ host4
Jetty: http://host3:8983
SolrDispatchFilter
/techproducts ?➔ tech_s2_r2
Jetty: http://host2:8983
SolrDispatchFilter
/techproducts ➔ tech_s2_r1
Jetty: http://host4:8983
techproducts
tech_s1_r2
foo
foo_s1_r1
foo
foo_s2_r1
techproducts
tech_s1_r1
techproducts
tech_s2_r1
foo
foo_s1_r2
techproducts
tech_s2_r2
foo
foo_s2_r2
Purple: 4 Jetty instances, running on (the same port 8983 of) 4
different hosts
Black: The 4 SolrDispatchFilters running inside each of
these 4 Jetty instances, and how each of them resolves requests for
the techproducts collection.
Green: the individual SolrCores (which are each a replica of some
shard of a collection) running in each Solr node. Note that for the
purposes of illustrating the different possible ways a Solr request may be
routed, host3 does not contain any SolrCores that are part of the
techproducts collection.
(Other Layers such as the Solr webapp and the CoreContainer have
been omitted to save space)
15. coordinator shard1
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β1: ids=X,Y,Z&fl=name ➔ ...
shard2
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β2: ids=A,..,G&fl=name ➔ ...
SearchHandler: /select
Repeat until done:
query.distributedProcess
➔ ShardRequests (α,β)
Loop: ShardRequests
query.handleResponse
QueryComponent:
distributedProcess()
α: shard top10 + sort values
β: full fl for final top10 ids
FacetComponent
16. Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator'
node, and 2 nodes each hosting a replica of the 2 shards for the
collection
Black: SearchHandler. On the coordinator node,
SearchHandler executes new logic to issue sub-requests
created by its SearchComponents to arbitrarily selected replicas
of each shard. On the replicas handling these sub-requests, the
SearchHandler processes these requests just as if they were
simple (single node) queries.
Red: SearchComponent methods. On the coordinator node
SearchHandler loops over every component calling
SearchComponent.distributedProcess() to
create/modify sub-requests for the individual shards, and then calls
SearchComponent.handleResponse() to merge the
results from each shard and decide if/when/what additional
information may be needed. This process repeats until all calls to
distributedProcess() on all SearchComponents
indicate that they are finished.
Green & Blue: The 2 stages (α & β) of shard sub-requests needed to
process this simple query. Note that the α-requests are identical for
both shards, but the β-requests are slightly different to request the
fl fields for the matches specific to that shard.
17. Shard Request α
q=ipod&fl=id&fsv=true&rows=10
sort=inStock desc, score desc
numFound=42+314=356
Z, Zebra
F, Frog
B, Boat
D, Deer
C, Car
X, X-Ray
G, Gong
A, Apple
Y, Yo-Yo
E, Ear
Merged
Shard 1
numFound=42
F〈true,6〉
B〈true,6〉
D〈true,5〉
C〈true,3〉
G〈true,2〉
A〈true,1〉
E〈false,5〉
Shard 2
numFound=314
Z〈true,6〉
X〈true,3〉
Y〈false,9〉
Shard Request β
q=ipod&ids=...&fl=name
Shard 1
A, Apple
B, Boat
C, Car
D, Deer
E, Ear
F, Frog
G, Gong
Shard 2
X, X-Ray
Y, Yo-Yo
Z, Zebra
Here we see hypothetical α request+responses, hypothetical β
requests+responses, & the final Merged results from both -- showing how
the IDs and sort values from the α request are used to determine which
documents will be in the final results, and in which order. For these specific
documents, the β requests+responses fill in the fl fields for the final
client.
Red & Blue: The responses from shard1 & shard2 for the α request
Green & Purple: The responses from shard1 & shard2 for the β
request
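The α-response merge above can be simulated directly (data taken from the diagram; ties on identical sort values are broken arbitrarily in this sketch, so Z/F/B may merge in a different order than shown on the slide):

```python
# Each shard's α response: (id, inStock, score), already sorted per shard.
shard1 = [("F", True, 6), ("B", True, 6), ("D", True, 5), ("C", True, 3),
          ("G", True, 2), ("A", True, 1), ("E", False, 5)]
shard2 = [("Z", True, 6), ("X", True, 3), ("Y", False, 9)]

# Coordinator merge: sort = inStock desc, score desc; keep rows=10 ids.
merged = sorted(shard1 + shard2,
                key=lambda d: (d[1], d[2]), reverse=True)[:10]
top_ids = [d[0] for d in merged]

# The β requests then ask each shard for fl=name of just its own ids.
shard1_ids = {d[0] for d in shard1}
beta_ids = {
    "shard1": [i for i in top_ids if i in shard1_ids],
    "shard2": [i for i in top_ids if i not in shard1_ids],
}
```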
18. Complex Query*
SolrCloud: 4 Nodes, 2 Shards, 2 Replicas
http://localhost:8983/solr/techproducts/select
? q = ipod
& sort = inStock desc, score desc
& fl = id, name
& facet = true
& facet.field = cat
In the interest of time, this query is not as "Complex" as the "Complex"
Single Core query we looked at before. I've omitted things like fq params,
sorting on functions, and the use of DocTransformers in the fl
because nothing about how those are handled in a Single Core query
changes when they are requested by a coordinator node in a SolrCloud
query.
19. coordinator shard1
QueryComponent:
prepare() + process()
α: q=ipod&fl=id&fsv=true
➔ top ids + sort values
β1: ids=X,Y,Z&fl=name ➔...
FacetComponent:
prepare() + process()
α: facet.limit=N + extra
➔ top terms w/counts
β1: ..._terms=aa,qq,... ➔...
QueryComponent:
distributedProcess()
α: shard top10 + sort values
β: full fl for final top10 ids
shard2
FacetComponent:
distributedProcess()
α: facet.field=cat
w/facet.limit overrequest
β: request missing counts
for final top terms
SearchHandler: /select
➔ ShardRequests (α, β)
20. Purple: The HTTP Layer showing 3 hosts: an arbitrary 'coordinator'
node, and 2 nodes each hosting a replica of the 2 shards for the
collection. To save space, the (largely redundant) details of the
requests to shard2 are not shown.
Black: SearchHandler. To save space, the details (shown in
previous diagrams) regarding how SearchHandler processes
requests when acting as a coordinator have been omitted -- the key
thing to note is that even with the added complexity of the
FacetComponent, there are still only 2 stages of sub-requests to
each shard (α & β)
Red: SearchComponent methods:
QueryComponent behaves exactly as before
Now that FacetComponent is in use, it can modify the sub-
requests created by QueryComponent to "piggy back" on
them and request additional information from each shard.
Green & Blue: The 2 stages (α & β) of shard sub-requests needed to
process this query. Although the details of the requests to shard2 are
omitted for brevity, the α-requests are identical for both shards, and
(as before) the β-requests are slightly different to request both the
fl fields for the document matches specific to that shard, as well
as the facet counts for any "candidate" terms that were not included
in the α response from that shard.
22. Here we see the additional information involved in α & β
requests+responses+merging for our more complex queries compared to
what we looked at before. The information requested & merged by
QueryComponent is omitted for brevity, and we focus solely on how
FacetComponent modifies those requests to "overrequest" the
original facet.limit and what it does with the results.
In the α request, over-request additional terms from each shard beyond
what the user asked for; in the β request, ask each shard for the details
about any terms that are "candidates" for the final results but were NOT
already returned by this shard in the α response.
Each term that is a candidate for the final response is shown in a unique
color. Black/Grey is used to indicate terms where incomplete information
is available to the coordinator, but enough is known to be confident that
they can't possibly be candidates for the final results. Faded terms (in
italics) show at what stage the coordinating FacetComponent knows
that particular term can be eliminated from consideration.
(While the "..." ellipses are used to denote the possibility of many
additional terms depending on the value of facet.limit=N (which
defaults to 100), viewers may find the easiest way to understand how
these results are merged & refined is to assume N=3 and imagine the
ellipses do not exist in the diagram)
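The overrequest + refinement flow can be sketched as follows. The `limit * 1.5 + 10` formula reflects Solr's documented defaults for `facet.overrequest.ratio` and `facet.overrequest.count`; the per-shard counts below are invented toy data:

```python
def shard_facet_limit(limit, ratio=1.5, count=10):
    # Each shard is asked for more terms than the client requested, so terms
    # sitting just below the cutoff on one shard still reach the coordinator.
    return int(limit * ratio) + count

alpha = {  # α responses: per-shard top terms with counts (toy data)
    "shard1": {"aa": 10, "bb": 8, "cc": 5},
    "shard2": {"aa": 4, "dd": 9, "ee": 2},
}

# Every term any shard reported is a candidate for the final result.
candidates = set().union(*alpha.values())

# β refinement: ask each shard for exact counts of candidate terms it
# did NOT already report in its α response.
refine = {shard: sorted(candidates - set(terms))
          for shard, terms in alpha.items()}
```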
23. Q & A