This document provides an overview of Lucene and Solr. It introduces Erik Hatcher, who is a committer to Lucene and Solr projects and co-founder of Lucid Imagination, a company that provides commercial support for Lucene and Solr. It then provides brief descriptions of Lucene, its inverted index structure, segments and merging, and scoring. Finally, it discusses Solr architecture and some extension points for customizing Lucene and Solr functionality.
Think *inside* the box. Inside the *search* box, that is.
The "best"* search results incorporate many more factors than (just) textual matching and relevancy. Search experience owners manage query context rules, signals automatically feed back machine learned factors, users implicit and explicit behaviors filter and weight future interactions. Synergy emerges with several cooperating (just) searches.
This talk will showcase and detail several (just) search examples including rules, typeahead/suggest, signals, and location awareness, bringing them all together into a cohesive search experience.
code4lib 2011 preconference: What's New in Solr (since 1.4.1) - Erik Hatcher
code4lib 2011 preconference, presented by Erik Hatcher of Lucid Imagination.
Abstract: The library world is fired up about Solr. Practically every next-gen catalog is using it (via Blacklight, VuFind, or other technologies). Solr has continued improving in some dramatic ways, including geospatial support, field collapsing/grouping, extended dismax query parsing, pivot/grid/matrix/tree faceting, autosuggest, and more. This session will cover all of these new features, showcasing live examples of them all, including anything new that is implemented prior to the conference.
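To make the listed features concrete, here is a hedged sketch of the query parameters behind several of them (extended dismax parsing, geospatial filtering, field collapsing/grouping, and pivot faceting). The core name `catalog` and the field names (`title_t`, `author_t`, `format_s`, `location_p`) are assumptions for illustration, not from the talk; the stdlib `urllib` is used so the sketch runs without extra dependencies.

```python
from urllib.parse import urlencode

def build_discovery_query(user_query, lat, lon, radius_km):
    """Combine edismax parsing, a geofilt filter, grouping, and pivot
    faceting into one Solr select request (illustrative field names)."""
    params = [
        ("q", user_query),
        ("defType", "edismax"),                  # extended dismax query parser
        ("qf", "title_t^2 author_t"),            # weighted query fields
        ("fq", "{!geofilt sfield=location_p}"),  # geospatial filter query
        ("pt", f"{lat},{lon}"),                  # center point
        ("d", str(radius_km)),                   # radius
        ("group", "true"),                       # field collapsing/grouping
        ("group.field", "format_s"),
        ("facet", "true"),
        ("facet.pivot", "format_s,author_t"),    # pivot (tree) faceting
    ]
    return "http://localhost:8983/solr/catalog/select?" + urlencode(params)

url = build_discovery_query("dublin history", 53.34, -6.26, 10)
```

Sending this URL with any HTTP client against a core that has these fields would exercise each feature in a single request.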
The talk presents the sfSolrPlugin which transparently integrates the Solr search engine into symfony.
The talk explains:
* the features of the Solr search engine
* how to integrate the search engine into symfony
* complex search: faceted and geolocated search
* usage examples: http://www.menugourmet.com and http://resolutionfinder.org
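Independent of the PHP/symfony integration the talk covers, the faceted-search piece boils down to reading facet counts out of Solr's JSON response. A minimal sketch (the field name `category_s` is a made-up example):

```python
import json

def facet_counts(response_text, field):
    """Return {value: count} for one facet field from a Solr JSON response."""
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_fields"][field]
    # Solr returns field facets as a flat [value, count, value, count, ...] list
    return dict(zip(flat[::2], flat[1::2]))

# Simulated response body, shaped like Solr's wt=json facet output
sample = json.dumps({
    "facet_counts": {"facet_fields": {"category_s": ["starters", 12, "mains", 30]}}
})
counts = facet_counts(sample, "category_s")
```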
Lucene powers the search capabilities of practically all library discovery platforms, by way of Solr, etc. The Lucene project evolves rapidly, and it's a full-time job to keep up with the ever improving features and scalability. This talk will distill and showcase the most relevant(!) advancements to date.
After a thorough overview of the main features and benefits of Apache Solr (an open source search server), the architecture of Solr and strategies to adopt it for your PHP application and data model will be presented. The main lessons learned around dealing with a mix of structured and unstructured content, multilingual aspects, tuning, and the various state-of-the-art features of Solr will be shared as well.
Solr 4.0 dramatically improves scalability, performance, and flexibility. An overhauled Lucene underneath sports near real-time (NRT) capabilities allowing indexed documents to be rapidly visible and searchable. Lucene’s improvements also include pluggable scoring, much faster fuzzy and wildcard querying, and vastly improved memory usage. These Lucene improvements automatically make Solr much better, and Solr magnifies these advances with “SolrCloud.” SolrCloud enables highly available and fault tolerant clusters for large scale distributed indexing and searching. There are many other changes that will be surveyed as well. This talk will cover these improvements in detail, comparing and contrasting to previous versions of Solr.
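The NRT behavior described above is typically exercised via `commitWithin`: documents posted with it become searchable within the given window without an explicit hard commit. A hedged sketch using only the stdlib; the core name `demo` and field `title_t` are assumptions:

```python
import json
from urllib.request import Request

def nrt_update_request(docs, commit_within_ms=1000):
    """Build a Solr update request asking for the docs to become
    visible within commit_within_ms milliseconds (near real time)."""
    url = f"http://localhost:8983/solr/demo/update?commitWithin={commit_within_ms}"
    body = json.dumps(docs).encode("utf-8")
    return Request(url, data=body, headers={"Content-Type": "application/json"})

req = nrt_update_request([{"id": "1", "title_t": "NRT test"}])
```

Opening the request with `urllib.request.urlopen(req)` against a running Solr would index the document and make it searchable within roughly one second.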
Got data? Let's make it searchable! This presentation will demonstrate getting documents into Solr quickly, will provide some tips in adjusting Solr's schema to match your needs better, and finally will discuss how to showcase your data in a flexible search user interface. We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging. Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production.
Building your own search engine with Apache Solr - Biogeeks
Andrew Clegg : Building your own search engine with Apache Solr
Apache Solr (http://lucene.apache.org/solr/) is an open-source search engine based on the popular Lucene library, with a huge variety of features. In this talk, Andrew describes how he used it to build a high-performance search tool for protein and domain structures at CATH, and talks about some of the surprisingly cool things you can do with it beyond simple searching.
This session will introduce and demonstrate several techniques for enhancing the search experience by augmenting documents during indexing. First we'll survey the analysis components available in Solr, and then we'll delve into using Solr's update processing pipeline to modify documents on the way in. The session will build on Erik's "Poor Man's Entity Extraction" blog at http://www.searchhub.org/2013/06/27/poor-mans-entity-extraction-with-solr/
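The "modify documents on the way in" idea lives in an update request processor chain in solrconfig.xml. The fragment below is an illustrative sketch of the kind of chain the session describes, not its exact configuration; the chain name, field names, and script file are assumptions (the processor factory classes are standard Solr ones):

```xml
<!-- Hypothetical solrconfig.xml chain that augments documents at index time -->
<updateRequestProcessorChain name="augment">
  <!-- keep a copy of the raw text before augmentation touches it -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">text_t</str>
    <str name="dest">text_raw_s</str>
  </processor>
  <!-- run custom augmentation (e.g. poor man's entity extraction) in a script -->
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">extract-entities.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <!-- finally hand the (modified) document to the normal indexing path -->
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

Requests routed to an update handler that references this chain would see each document cloned, script-augmented, logged, and then indexed.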
Building Intelligent Search Applications with Apache Solr and PHP5 - israelekpo
ZendCon 2010 - Building Intelligent Search Applications with Apache Solr and PHP5. This is a presentation on how to create intelligent web-based search applications using PHP 5 and the out-of-the-box features available in Solr 1.4.1. After we finish the illustration of adding, updating, and removing data from the Solr index, we will discuss how to add features such as auto-completion, hit highlighting, faceted navigation, and spelling suggestions.
Tutorial on developing a Solr search component plugin - searchbox-com
In this set of slides we give a step-by-step tutorial on how to develop a fully functional Solr search component plugin. Additionally, we provide links to the full source code, which can be used as a template to rapidly start creating your own search components.
Solr is a highly scalable and fast open source enterprise search platform from the Apache Lucene project. Let's explore why some of the largest Internet sites in the world prefer it for its many exciting features.
You’re Solr powered, and needing to customize its capabilities. Apache Solr is flexibly architected, with practically everything pluggable. Under the hood, Solr is driven by the well-known Apache Lucene. Lucene for Solr Developers will guide you through the various ways in which Solr can be extended, customized, and enhanced with a bit of Lucene API know-how. We’ll delve into improving analysis with custom character mapping, tokenizing, and token filtering extensions; show why and how to implement specialized query parsing, and how to add your own search and update request handling.
Search is everywhere, and therefore so is Apache Lucene. While providing amazing out-of-the-box defaults, there’s enough projects weird enough to require custom search scoring and ranking. In this talk, I’ll walk through how to use Lucene to implement your custom scoring and search ranking. We’ll see how you can achieve both amazing power (and responsibility) over your search results. We’ll see the flexibility of Lucene’s data structures and explore the pros/cons of custom Lucene scoring vs other methods of improving search relevancy.
"Solr Update" at code4lib '13 - ChicagoErik Hatcher
Solr is continually improving. Solr 4 was recently released, bringing dramatic changes in the underlying Lucene library and Solr-level features. It's tough for us all to keep up with the various versions and capabilities.
This talk will blaze through the highlights of new features and improvements in Solr 4 (and up). Topics will include: SolrCloud, direct spell checking, surround query parser, and many other features. We will focus on the features library coders really need to know about.
For enterprises, it's rarely a single function causing your OSS problem; it's a combination of architecture, packages, or networks. Using three real-world examples, these slides, from our recent webinar, walk through identifying the infrastructure needs, the technology stack selection process, and the final architected solution for each environment (e-commerce, PaaS, and HPC machine learning).
Anyone who has tried integrating search into their application knows how good and powerful Solr is, but has always wished it were simpler to get started and simpler to take it to production.
I will talk about the recent features added to Solr that make it easier for users, and about some of the changes we plan to add soon to make the experience even better.
In the big data world, our data stores communicate over an asynchronous, unreliable network to provide a facade of consistency. However, to really understand the guarantees of these systems, we must understand the realities of networks and test our data stores against them.
Jepsen is a tool which simulates network partitions in data stores and helps us understand the guarantees of our systems and their failure modes. In this talk, I will help you understand why you should care about network partitions and how we can test datastores against partitions using Jepsen. I will explain what Jepsen is, how it works, and the kinds of tests it lets you create. We will try to understand the subtleties of distributed consensus and the CAP theorem, and demonstrate how different data stores such as MongoDB, Cassandra, Elasticsearch, and Solr behave under network partitions. Finally, I will describe the results of the tests I wrote using Jepsen for Apache Solr and discuss the kinds of rare failures that were found by this excellent tool.
Multi-faceted responsive search, autocomplete, feeds engine & logging - lucenerevolution
Presented by Remi Mikalsen, Search Engineer, The Norwegian Centre for ICT in Education
Learn how utdanning.no leverages open source technologies to deliver a blazing fast multi-faceted responsive search experience and a flexible and efficient feeds engine on top of Solr 3.6. Among the key open source projects that will be covered are Solr, Ajax-Solr, SolrPHPClient, Bootstrap, jQuery and Drupal. Notable highlights are ajaxified pivot facets, multiple-parent hierarchical facets, ajax autocomplete with edge-n-gram and grouping, integrating our search widgets on any external website, custom Solr logging, and using Solr to deliver Atom feeds. utdanning.no is a governmental website that collects, normalizes and publishes study information related to secondary school and higher education in Norway. With 1.2 million visitors each year and 12,000 indexed documents, we focus on precise information and a high degree of usability for students, potential students and counselors.
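The edge-n-gram autocomplete mentioned above is configured in the schema; the field type below is an illustrative Solr 3.x-era sketch, not utdanning.no's actual schema (the type name and gram sizes are assumptions):

```xml
<!-- Hypothetical schema.xml type: index-time edge n-grams so typed
     prefixes ("u", "ut", "utd", ...) match whole indexed terms -->
<fieldType name="text_autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <!-- query side stays un-grammed so the user's prefix matches directly -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The asymmetric index/query analyzers are the key design point: grams are produced only at index time, so a query for "utda" matches the stored gram rather than being exploded itself.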
Gimme shelter: Tips on protecting proprietary and open source code - Rogue Wave Software
Presented at ESC Minneapolis - September 2016. This presentation aims to train and retain by walking through real examples of the top security defects and open source liability issues for embedded systems today. Based on data from the National Vulnerability Database and recent court decisions, attendees will be exposed to lesser known but vital research on security and licensing to better prepare their teams to combat risks.
Using Apache Lucene and Solr search technologies, information and knowledge have become vastly more searchable, findable, and accessible. Because scholars and researchers are some of the most demanding users of search systems, the problems encountered by the implementers are complex. For example, many of the applications built on these technologies also thrive on intentionally designed-in serendipitous discovery capabilities, bringing to light previously unknown, yet related and potentially interesting, content.
Libraries and other public knowledge-sharing environments, such as Wikipedia, generally embrace "open source" and community improving contributions as core principles, making a lovely synergy with the power, features, and community-driven ecosystem provided by Lucene and Solr.
This talk will introduce you to several Solr powered library-related systems, detail how they work, and leave you with lessons learned that can be applied to your applications.
Faceted Search – the 120 Million Documents Story - Sourcesense
Upayavira's presentation at Online Information 2010 in London: the case study of an Enterprise-critical migration from custom Lucene indexes to Apache Solr, with a significant focus on scalability.
The solution needed to provide search against rapidly changing data-sets and multi-million-document indexes, enabling complex queries with sub-second responses while maintaining high availability.
Apache Solr serves search requests at enterprises and some of the largest companies around the world. Built on top of the top-notch Apache Lucene library, Solr makes indexing and searching integration into your applications straightforward.
Solr provides faceted navigation, spell checking, highlighting, clustering, grouping, and other search features. Solr also scales query volume with replication and collection size with distributed capabilities. Solr can index rich documents such as PDF, Word, HTML, and other file types.
How to achieve security, reliability, and productivity in less time - Rogue Wave Software
This introductory session lays the foundation for boosting the effectiveness of mission-critical systems testing by covering industry best practices for code security, software reliability, and team productivity. For each area, you will learn how to mitigate the top issues by seeing real examples and understanding the tools and techniques to overcome them. This includes: The value of different testing methods; The importance of standards compliance; and understanding how DevOps and continuous integration fit in.
Search engines, and Apache Solr in particular, are quickly shifting the focus away from “big data” systems storing massive amounts of raw (but largely unharnessed) content, to “smart data” systems where the most relevant and actionable content is quickly surfaced instead. Apache Solr is the blazing-fast and fault-tolerant distributed search engine leveraged by 90% of Fortune 500 companies. As a community-driven open source project, Solr brings in diverse contributions from many of the top companies in the world, particularly those for whom returning the most relevant results is mission critical.
Out of the box, Solr includes advanced capabilities like learning to rank (machine-learned ranking), graph queries and distributed graph traversals, job scheduling for processing batch and streaming data workloads, the ability to build and deploy machine learning models, and a wide variety of query parsers and functions allowing you to very easily build highly relevant and domain-specific semantic search, recommendations, or personalized search experiences. These days, Solr even enables you to run SQL queries directly against it, mixing and matching the full power of Solr’s free-text, geospatial, and other search capabilities with a prominent query language already known by most developers (and which many external systems can use to query Solr directly).
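The SQL capability mentioned above is exposed over HTTP on a per-collection `/sql` endpoint. A hedged stdlib sketch of building such a request; the collection and field names are made up for illustration:

```python
from urllib.parse import urlencode
from urllib.request import Request

def solr_sql_request(collection, stmt):
    """Build a POST request carrying a SQL statement to Solr's /sql handler."""
    url = f"http://localhost:8983/solr/{collection}/sql"
    body = urlencode({"stmt": stmt}).encode("utf-8")
    return Request(url, data=body)

req = solr_sql_request(
    "products",
    "SELECT category, COUNT(*) AS n FROM products "
    "GROUP BY category ORDER BY n DESC LIMIT 10",
)
```

Executed against a SolrCloud collection, a statement like this mixes SQL aggregation with Solr's own indexing and scoring machinery.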
Due to the community-oriented nature of Solr, the ecosystem of capabilities also spans well beyond just the core project. In this talk, we’ll also cover several other projects within the larger Apache Lucene/Solr ecosystem that further enhance Solr’s smart data capabilities: bi-directional integration of Apache Spark and Solr’s capabilities, large-scale entity extraction, semantic knowledge graphs for discovering, traversing, and scoring meaningful relationships within your data, auto-generation of domain-specific ontologies, running SPARQL queries against Solr on RDF triples, probabilistic identification of key phrases within a query or document, conceptual search leveraging Word2Vec, and even Lucidworks’ own Fusion project which extends Solr to provide an enterprise-ready smart data platform out of the box.
We’ll dive into how all of these capabilities can fit within your data science toolbox, and you’ll come away with a really good feel for how to build highly relevant “smart data” applications leveraging these key technologies.
Let's Build an Inverted Index: Introduction to Apache Lucene/Solr - Sease
The University Seminar series aims to provide a basic understanding of Open Source Information Retrieval and its application in the real world through the Apache Lucene/Solr technologies.
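To make the seminar's core idea concrete, here is a toy inverted index: a map from each term to the documents containing it, which is the structure Lucene builds (with far more sophistication) under the hood. This is a teaching sketch, not Lucene's implementation:

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: {doc_id: text}. Returns {term: sorted list of doc_ids},
    i.e. a postings list per term."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    # sorted postings make intersection (AND queries) a simple merge
    return {term: sorted(ids) for term, ids in index.items()}

index = build_inverted_index({
    1: "open source search",
    2: "search engines rank documents",
})
```

Answering the query "search" is then a single dictionary lookup rather than a scan of every document, which is the whole point of the structure.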
Solr Recipes provides quick and easy steps for common use cases with Apache Solr. Bite-sized recipes will be presented for data ingestion, textual analysis, client integration, and each of Solr’s features including faceting, more-like-this, spell checking/suggest, and others.
Building Search & Recommendation Engines - Trey Grainger
In this talk, you'll learn how to build your own search and recommendation engine based on the open source Apache Lucene/Solr project. We'll dive into some of the data science behind how search engines work, covering multi-lingual text analysis, natural language processing, relevancy ranking algorithms, knowledge graphs, reflected intelligence, collaborative filtering, and other machine learning techniques used to drive relevant results for free-text queries. We'll also demonstrate how to build a recommendation engine leveraging the same platform and techniques that power search for most of the world's top companies. You'll walk away from this presentation with the toolbox you need to go and implement your very own search-based product using your own data.
All you need to start with Apache Solr (an alternative to Elasticsearch). This presentation includes all the information about Solr for beginners: what it is, installation, indexing, and searching.
Self-learned Relevancy with Apache Solr - Trey Grainger
Search engines are known for "relevancy", but the relevancy models that ship out of the box (BM25, classic tf-idf, etc.) are just scratching the surface of what's needed for a truly insightful application.
What if your search engine could automatically tune its own domain-specific relevancy model based on user interactions? What if it could learn the important phrases and topics within your domain, learn the conceptual relationships embedded within your documents, and even use machine-learned ranking to discover the relative importance of different features and then automatically optimize its own ranking algorithms for your domain? What if you could further use SQL queries to explore these relationships within your own BI tools and return results in ranked order to deliver relevance-driven analytics visualizations?
In this presentation, we'll walk through how you can leverage the myriad of capabilities in the Apache Solr ecosystem (such as the Solr Text Tagger, Semantic Knowledge Graph, Spark-Solr, Solr SQL, learning to rank, probabilistic query parsing, and Lucidworks Fusion) to build self-learning, relevance-first search, recommendations, and data analytics applications.
Iterator - a powerful but underappreciated design pattern - Nitin Bhide
The Iterator design pattern is described in the GoF ‘Design Patterns’ book. It is used in many places (e.g. a SQL cursor is an ‘iterator’), and the C++ Standard Template Library uses iterators heavily. .NET LINQ interfaces are based on IEnumerable (i.e. an iterator). However, I rarely see projects creating or using ‘custom’ iterator classes, even though many problems can be solved ‘elegantly’ by the use of customized iterators. This talk is about the ‘power of iterators’ and how custom iterators can solve common problems and help create modular, reusable code components.
Key Discussion Points
Typical examples of iterators in common use
Kinds of problems that can be ‘elegantly’ solved with iterators
When to use custom iterators?
How to write custom iterators in C++/C#
From a webinar I did on TechGig:
http://www.techgig.com/expert-speak/Iterator-a-powerful-but-underappreciated-pattern-449
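The custom-iterator idea can be sketched in a few lines of Python (the webinar itself shows C++/C#): a lazy iterator over a simulated cursor, so callers can loop over results without the producer loading everything up front. The class and batch size here are illustrative inventions:

```python
class BatchedCursor:
    """Iterate over rows fetched in fixed-size batches, the way a SQL
    cursor hands results back, while callers see one plain iterable."""
    def __init__(self, rows, batch_size=2):
        self._rows = rows
        self._batch_size = batch_size

    def __iter__(self):
        for start in range(0, len(self._rows), self._batch_size):
            # each batch is "fetched" lazily, only when iteration reaches it
            yield from self._rows[start:start + self._batch_size]

rows = list(BatchedCursor([1, 2, 3, 4, 5]))
```

The caller writes an ordinary `for` loop; the batching detail stays hidden inside the iterator, which is exactly the modularity the talk argues for.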
Solr now smoothly integrates with Lucene-level payloads.
Payloads provide optional per-term metadata, numeric or otherwise. Payloads help solve challenging use cases such as per-store product pricing and per-term confidence/weighting.
This session will present the payload feature from the Lucene layer up to the Solr integration, including per-store pricing, per-term weighting, and more.
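The per-store pricing use case above works by indexing delimited "key|value" payload tokens and reading them back with the `payload()` function at query time. A hedged sketch of the request side only; the field name `price_dpf`, the core, and the store ids are assumptions for illustration:

```python
from urllib.parse import urlencode

# A document whose payload field carries one price per store,
# e.g. to be analyzed by a delimited-payload filter at index time
doc = {"id": "sku-42", "price_dpf": "store1|9.99 store2|12.49"}

def per_store_price_query(store_id):
    """Build query params that surface and sort by one store's payload price."""
    return urlencode([
        ("q", "*:*"),
        ("fl", f"id,price:payload(price_dpf,{store_id})"),   # expose the price
        ("sort", f"payload(price_dpf,{store_id}) asc"),      # cheapest first
    ])

qs = per_store_price_query("store1")
```

The same shape covers per-term weighting: swap prices for confidence scores and the payload function feeds them into ranking instead of display.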
In this talk, Solr's built-in query parsers will be detailed, including when and how to use them. Solr has nested query parsing capability, allowing for multiple query parsers to be used to generate a single query. The nested query parsing feature will be described and demonstrated. In many domains, e-commerce in particular, parsing queries often means interpreting which entities (e.g. products, categories, vehicles) the user likely means; this talk will conclude with techniques to achieve richer query interpretation.
Apache Solr serves search requests at enterprises and some of the largest companies around the world. Built on top of the top-notch Apache Lucene library, Solr makes indexing and searching integration into your applications straightforward. Solr provides faceted navigation, spell checking, highlighting, clustering, grouping, and other search features. Solr also scales query volume with replication and collection size with distributed capabilities. Solr can index rich documents such as PDF, Word, HTML, and other file types.
Come learn how you can get your content into Solr and integrate it into your applications!
Got data? Let's make it searchable! This interactive presentation will demonstrate getting documents into Solr quickly, provide some tips in adjusting Solr's schema to match your needs better, and finally showcase your data in a flexible search user interface. We'll see how to rapidly leverage faceting, highlighting, spell checking, and debugging. Even after all that, there will be enough time left to outline the next steps in developing your search application and taking it to production.
Solr Flair: Search User Interfaces Powered by Apache Solr (ApacheCon US 2009) - Erik Hatcher
Solr powers library, government, and enterprise search systems in thousands of applications. This talk showcases various technologies and techniques used to build effective user search, browse, and find interfaces on top of Solr.
Solr Flair: Search User Interfaces Powered by Apache Solr - Erik Hatcher
Solr powers library, government, and enterprise search systems in thousands of applications. This talk will showcase the various technologies and techniques used to build effective user search, browse, and find interfaces on top of Solr. Several of the full featured open-source library Solr front-ends will be shown, including Blacklight and VuFind. We’ll also demonstrate several front-end frameworks including:
• SolrJS - a JavaScript widget library
• Solr Flare - a Ruby on Rails plugin featuring Simile Timeline integration, Ajax suggest, and more
• Solritas - a built-in lightweight UI templating framework
Additionally, we’ll take a look under the covers of http://search.lucidimagination.com and see what makes it shine.
UiPath Test Automation using UiPath Test Suite series, part 3 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deployment Firewall - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He has around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology pushes into IT, I asked myself, as an "infrastructure container Kubernetes guy": how does this fancy AI technology get managed from an infrastructure operations view? Is it possible to apply our beloved cloud native principles as well? What benefits could the two technologies bring to each other?
Let me take these questions and guide you on a short journey through existing deployment models and use cases for AI software. Using practical examples, we will discuss what cloud/on-premise strategy we may need to apply it to our own infrastructure and make it work from an enterprise perspective. I want to give an overview of the infrastructure requirements and technologies that could benefit or limit your AI use cases in an enterprise environment. An interactive demo will give you some insights into the approaches I have already gotten working for real.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build, inspired by diverse, explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies need to be explicitly articulated, and we need to develop theories of change in the context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud, and open source: exploring how these areas are likely to mature and develop over the short and long term, and how organisations can position themselves to adapt and thrive.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques.
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, companies must adapt and embrace new ideas to keep up with the competition. However, fostering a culture of innovation takes real work: it takes vision, leadership, and a willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Lucene for Solr Developers
1. Lucene for Solr
Developers
NFJS - Raleigh, August 2011
Presented by Erik Hatcher
erik.hatcher@lucidimagination.com
Lucid Imagination
http://www.lucidimagination.com
2. About me...
• Co-author, "Lucene in Action" (and "Java
Development with Ant" / "Ant in Action"
once upon a time)
• "Apache guy" - Lucene/Solr committer;
member of Lucene PMC, member of
Apache Software Foundation
• Co-founder, evangelist, trainer, coder @
Lucid Imagination
3. About Lucid Imagination...
• Lucid Imagination provides commercial-grade
support, training, high-level consulting and value-
added software for Lucene and Solr.
• We make Lucene ‘enterprise-ready’ by offering:
• Free, certified, distributions and downloads.
• Support, training, and consulting.
• LucidWorks Enterprise, a commercial search
platform built on top of Solr.
4. What is Lucene?
• An open source search library (not an application)
• 100% Java
• Continuously improved and tuned over more than
10 years
• Compact, portable index representation
• Programmable text analyzers, spell checking and
highlighting
• Not a crawler or a text extraction tool
5. Inverted Index
• Lucene stores input data in what is known as an
inverted index
• In an inverted index each indexed term points to a
list of documents that contain the term
• Similar to the index provided at the end of a book
• In this case "inverted" simply means that the list of terms
points to documents
• It is much faster to find a term in an index than to
scan all the documents
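The term-to-documents mapping can be sketched in plain Java. This is a toy model, not Lucene's actual on-disk representation, but it shows why lookup is fast: finding a term is a single map access rather than a scan over every document.

```java
import java.util.*;

// Toy inverted index: maps each term to the sorted set of document ids
// that contain it -- the structure the slide describes.
public class ToyInvertedIndex {
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    public void addDocument(int docId, String text) {
        // naive whitespace tokenization, lowercased
        for (String term : text.toLowerCase().split("\\s+")) {
            postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
        }
    }

    // Term lookup is one map access -- no document scan required.
    public SortedSet<Integer> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(),
                Collections.emptySortedSet());
    }
}
```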
7. Segments and Merging
• A Lucene index is a collection of one or more sub-indexes
called segments
• Each segment is a fully independent index
• A multi-way merge algorithm is used to periodically merge
segments
• New segments are created when an IndexWriter flushes new
documents and pending deletes to disk
• Trying for a balance between large-scale performance vs. small-
scale updates
• Optimization merges all segments into one
9. Segments
• When a document is deleted it still exists
in an index segment until that segment is
merged
• At certain trigger points, these Documents
are flushed to the Directory
• Can be forced by calling commit
• Segments are periodically merged
14. Lucene Scoring
• Lucene uses a similarity scoring formula to rank results by measuring the
similarity between a query and the documents that match the query. The
factors that form the scoring formula are:
• Term Frequency: tf (t in d). How often the term occurs in the document.
• Inverse Document Frequency: idf (t). A measure of how rare the term is in
the whole collection, based on the number of documents that
contain the term.
• Terms that are rare throughout the entire collection score higher.
15. Coord and Norms
• Coord: The coordination factor, coord (q, d).
Boosts documents that match more of the
search terms than other documents.
• If 4 of 4 terms match coord = 4/4
• If 3 of 4 terms match coord = 3/4
• Length Normalization - Adjust the score based
on length of fields in the document.
• shorter fields that match get a boost
16. Scoring Factors (cont)
• Boost: (t.field in d). A way to boost a field
or a whole document above others.
• Query Norm: (q). Normalization value
for a query, given the sum of the squared
weights of each of the query terms.
• You will often hear the Lucene scoring
simply referred to as
TF·IDF.
17. Explanation
• Lucene has a feature called Explanation
• Solr uses the debugQuery parameter to
retrieve scoring explanations
0.2987913 = (MATCH) fieldWeight(text:lucen in 688), product of:
1.4142135 = tf(termFreq(text:lucen)=2)
9.014501 = idf(docFreq=3, maxDocs=12098)
0.0234375 = fieldNorm(field=text, doc=688)
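The numbers in that explanation can be reproduced with the classic Lucene formulas: tf = sqrt(termFreq) and idf = 1 + ln(maxDocs / (docFreq + 1)). The fieldNorm is taken as given from the output, since Lucene stores norms in a lossy one-byte encoding. A minimal sketch:

```java
// Recomputing the explanation output above with the classic
// TF-IDF formulas used by Lucene's default similarity.
public class ExplainMath {
    // tf(t in d) = sqrt(term frequency in the document)
    public static double tf(int termFreq) {
        return Math.sqrt(termFreq);
    }

    // idf(t) = 1 + ln(numDocs / (docFreq + 1))
    public static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // fieldWeight = tf * idf * fieldNorm (norm passed in, as Lucene
    // decodes it from its lossy one-byte encoding)
    public static double fieldWeight(int termFreq, int docFreq,
                                     int numDocs, double fieldNorm) {
        return tf(termFreq) * idf(docFreq, numDocs) * fieldNorm;
    }
}
```

Plugging in the values from the explanation (termFreq=2, docFreq=3, maxDocs=12098, fieldNorm=0.0234375) recovers the 0.2987913 score shown above.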
20. Customizing - Don't do it!
• Unless you need to.
• In other words... ensure you've given the built-in
capabilities a try, asked on the e-mail list, and
spelunked into at least Solr's code a bit to make
some sense of the situation.
• But we're here to roll up our sleeves, because we
need to...
21. But first...
• Look at Lucene and/or Solr source code as
appropriate
• Carefully read javadocs and wiki pages - lots of tips
there
• And, hey, search for what you're trying to do...
• Google, of course
• But try out LucidFind and other Lucene ecosystem
specific search systems -
http://www.lucidimagination.com/search/
23. Factories
• FooFactory (most) everywhere.
Sometimes there's BarPlugin style
• for the sake of discussion... let's just skip the
"factory" part
• In Solr, Factories and Plugins are used by
configuration loading to parameterize and
construct
24. "Installing" plugins
• Compile .java to .class, JAR it up
• Put JAR files in either:
• <solr-home>/lib
• a shared lib when using multicore
• anywhere, and register location in
solrconfig.xml
• Hook in plugins as appropriate
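A sketch of the registration step, using Solr's standard `<lib>` directive in solrconfig.xml (the directory paths here are illustrative):

```xml
<!-- solrconfig.xml: point Solr at plugin JARs -->
<lib dir="./lib" />
<lib dir="/path/to/shared/lib" regex=".*\.jar" />
```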
30. CharFilter
• extend BaseCharFilter
• enables pre-tokenization filtering/morphing
of incoming field value
• only affects tokenization, not stored value
• Built-in CharFilters: HTMLStripCharFilter,
PatternReplaceCharFilter, and
MappingCharFilter
31. Tokenizer
• common to extend CharTokenizer
• implement -
• protected abstract boolean isTokenChar(int c);
• optionally override -
• protected int normalize(int c)
• extend Tokenizer directly for finer control
• Popular built-in Tokenizers include: WhitespaceTokenizer,
StandardTokenizer, PatternTokenizer, KeywordTokenizer,
ICUTokenizer
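The isTokenChar/normalize contract can be sketched in plain Java without the Lucene dependency. This is a toy letter tokenizer, not the real CharTokenizer, but it follows the same pattern: isTokenChar decides which code points belong to tokens, normalize maps each kept code point.

```java
import java.util.*;

// Toy tokenizer following the CharTokenizer contract:
// keep runs of letters, lowercase them, split on everything else.
public class LetterTokenizerSketch {
    protected boolean isTokenChar(int c) { return Character.isLetter(c); }
    protected int normalize(int c) { return Character.toLowerCase(c); }

    public List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < input.length(); ) {
            int c = input.codePointAt(i);
            if (isTokenChar(c)) {
                current.appendCodePoint(normalize(c));
            } else if (current.length() > 0) {
                // non-token char ends the current token
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(c);
        }
        if (current.length() > 0) tokens.add(current.toString());
        return tokens;
    }
}
```

Subclasses customize behavior by overriding isTokenChar (e.g., also accepting digits) or normalize, mirroring how CharTokenizer subclasses work.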
32. TokenFilter
• a TokenStream whose input is another
TokenStream
• Popular TokenFilters include:
LowerCaseFilter, CommonGramsFilter,
SnowballFilter, StopFilter,
WordDelimiterFilter
33. Lucene's analysis APIs
• tricky business, what with Attributes
(Source/Factory's), State, characters, code
points, Version, etc...
• Test!!!
• BaseTokenStreamTestCase
• Look at Lucene and Solr's test cases
38. Built-in QParsers
from QParserPlugin.java
/** internal use - name to class mappings of builtin parsers */
public static final Object[] standardPlugins = {
LuceneQParserPlugin.NAME, LuceneQParserPlugin.class,
OldLuceneQParserPlugin.NAME, OldLuceneQParserPlugin.class,
FunctionQParserPlugin.NAME, FunctionQParserPlugin.class,
PrefixQParserPlugin.NAME, PrefixQParserPlugin.class,
BoostQParserPlugin.NAME, BoostQParserPlugin.class,
DisMaxQParserPlugin.NAME, DisMaxQParserPlugin.class,
ExtendedDismaxQParserPlugin.NAME, ExtendedDismaxQParserPlugin.class,
FieldQParserPlugin.NAME, FieldQParserPlugin.class,
RawQParserPlugin.NAME, RawQParserPlugin.class,
TermQParserPlugin.NAME, TermQParserPlugin.class,
NestedQParserPlugin.NAME, NestedQParserPlugin.class,
FunctionRangeQParserPlugin.NAME, FunctionRangeQParserPlugin.class,
SpatialFilterQParserPlugin.NAME, SpatialFilterQParserPlugin.class,
SpatialBoxQParserPlugin.NAME, SpatialBoxQParserPlugin.class,
JoinQParserPlugin.NAME, JoinQParserPlugin.class,
};
39. Local Parameters
• {!qparser_name param=value}expression
• or
• {!qparser_name param=value v=expression}
• Can substitute $references from request
parameters
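For example, with the built-in dismax, term, and prefix parsers (field names and values here are illustrative):

```
q={!dismax qf="title^2 text"}ipod
q={!term f=category v=$cat}&cat=electronics
fq={!prefix f=title v=sol}
```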
41. Custom QParser
• Implement a QParserPlugin that creates your
custom QParser
• Register in solrconfig.xml
• <queryParser name="myparser"
class="com.mycompany.MyQParserPlugin"/>
43. Built-in Update
Processors
• RunUpdateProcessor
• Actually performs the operations, such as
adding the documents to the index
• LogUpdateProcessor
• Logs each operation
• SignatureUpdateProcessor
• duplicate detection and optionally rejection
45. Update Processor
Chain
• UpdateProcessors sequence into a chain
• Each processor can abort the entire update
or hand processing to next processor in
the chain
• Chains of update processor factories are
specified in solrconfig.xml
• Update requests can specify an
update.processor parameter
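A sketch of a chain definition in solrconfig.xml; the custom factory class name is hypothetical, while the log and run factories are the built-ins from the previous slide:

```xml
<!-- solrconfig.xml: a custom update processor chain -->
<updateRequestProcessorChain name="mychain">
  <processor class="com.mycompany.MyUpdateProcessorFactory"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

An update request then selects it with update.processor=mychain, per the slide above.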
46. Default update
processor chain
From SolrCore.java
// construct the default chain
UpdateRequestProcessorFactory[] factories =
new UpdateRequestProcessorFactory[]{
new RunUpdateProcessorFactory(),
new LogUpdateProcessorFactory()
};
Note: these steps have been swapped on trunk recently
47. Example Update
Processor
• What are the best facets to show for a particular
query? Wouldn't it be nice to see the distribution of
document "attributes" represented across a result
set?
• Learned this trick from the Smithsonian, who were
doing it manually - add an indexed field containing the
field names of the interesting other fields on the
document.
• Facet on that field "of field names" initially, then
request facets on the top values returned.
49. FieldsUsedUpdateProcessorFactory
public class FieldsUsedUpdateProcessorFactory extends UpdateRequestProcessorFactory {
// package-visible so FieldsUsedUpdateProcessor can read the configuration
String fieldsUsedFieldName;
Pattern fieldNamePattern;
public UpdateRequestProcessor getInstance(SolrQueryRequest req, SolrQueryResponse rsp,
UpdateRequestProcessor next) {
return new FieldsUsedUpdateProcessor(req, rsp, this, next);
}
// ... next slide ...
}
50. FieldsUsedUpdateProcessorFactory
@Override
public void init(NamedList args) {
if (args == null) return;
SolrParams params = SolrParams.toSolrParams(args);
fieldsUsedFieldName = params.get("fieldsUsedFieldName");
if (fieldsUsedFieldName == null) {
throw new SolrException
(SolrException.ErrorCode.SERVER_ERROR,
"fieldsUsedFieldName must be specified");
}
// TODO check that fieldsUsedFieldName is a valid field name and multiValued
String fieldNameRegex = params.get("fieldNameRegex");
if (fieldNameRegex == null) {
throw new SolrException
(SolrException.ErrorCode.SERVER_ERROR,
"fieldNameRegex must be specified");
}
fieldNamePattern = Pattern.compile(fieldNameRegex);
super.init(args);
}
51. class FieldsUsedUpdateProcessor extends UpdateRequestProcessor {
private final FieldsUsedUpdateProcessorFactory factory;
public FieldsUsedUpdateProcessor(SolrQueryRequest req,
SolrQueryResponse rsp,
FieldsUsedUpdateProcessorFactory factory,
UpdateRequestProcessor next) {
super(next);
// keep a handle to the configured pattern and field name
this.factory = factory;
}
@Override
public void processAdd(AddUpdateCommand cmd) throws IOException {
SolrInputDocument doc = cmd.getSolrInputDocument();
ArrayList<String> usedFields = new ArrayList<String>();
for (String f : doc.getFieldNames()) {
if (factory.fieldNamePattern.matcher(f).matches()) {
usedFields.add(f);
}
}
doc.addField(factory.fieldsUsedFieldName, usedFields.toArray());
super.processAdd(cmd);
}
}
54. Example - auto facet
select
• It sure would be nice if you could have Solr automatically
select field(s) for faceting dynamically, based on the
profile of the results. For example, you're indexing
disparate types of products, all with varying attributes
(color, size - like for apparel, memory_size - for
electronics, subject - for books, etc), and a user searches
for "ipod" where most products match products with
color and memory_size attributes... let's automatically
facet on those fields.
• https://issues.apache.org/jira/browse/SOLR-2641
55. AutoFacetSelection
Component
• Too much code for a slide, let's take a look in
an IDE...
• Basically -
• process() gets autofacet.field and autofacet.n
request params, facets on field, takes top N
values, sets those as facet.field's
• Gotcha - need to call rb.setNeedDocSet(true)
in prepare() as faceting needs it