This is a presentation I gave to our internal SharePoint User Group on Friday, September 19th, 2014. It covered some of the basics of SharePoint searching, taking them beyond the "type in a couple of words and hope for the best" approach.
The document provides an overview of HTML and XHTML topics including:
1. It defines HTML as a markup language used to define the structure and layout of web pages using tags. XHTML is defined as a stricter version of HTML that follows XML syntax rules.
2. Key HTML topics covered include basic tags for headings, paragraphs, colors, fonts, lists, links, images and tables. It also discusses HTML forms, headers and bodies.
3. The document contrasts XHTML with HTML and outlines requirements for XHTML documents such as mandatory DOCTYPEs and proper nesting of tags.
This document provides an overview of Cascading Style Sheets (CSS) including what CSS is, how to write CSS code, and the different ways to include CSS in an HTML document. CSS allows separation of document content from page layout and visual design. CSS code uses selectors, properties, and values to style HTML elements. Styles can be included inline, internally in the <head> using <style> tags, or externally in a .css file linked via the <link> tag. The cascade (specificity and source order), together with inheritance, determines which styles take precedence.
The document discusses a CSS studio class. It provides responses to student questions about using CSS and HTML to structure and style documents. Key points include:
- CSS allows adding styles like fonts, colors, positioning to HTML and XML to control layout and presentation. It works by associating style rules with elements using selectors.
- The DOM represents elements in a tree structure that CSS can target with selectors like tags, classes, IDs. CSS declarations then specify property-value pairs to style those elements.
- External CSS files allow separating presentation from structure/content for better maintenance. CSS rules cascade, with later rules overriding earlier ones of equal specificity.
The document discusses using JSON-LD and RDF to add semantic meaning to web APIs while maintaining compatibility with existing JSON formats. It explains how RDF uses triples to make statements about resources, and how JSON-LD allows embedding RDF semantics in JSON without changing the format. This allows merging data from multiple sources and facilitates data interchange and evolution of schemas over time.
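As a rough illustration of the idea in the summary above, the following Python sketch shows how a context can map plain JSON keys to IRIs so that an ordinary JSON object can be read as RDF triples. This is a deliberately simplified, hand-rolled reading of a flat node, not a full JSON-LD processor; the example.org IRIs, the sample record, and the `to_triples` helper are assumptions made for illustration.

```python
import json

# Hypothetical API response. The "@context" maps plain JSON keys to IRIs,
# so existing clients can keep reading "name" and "homepage" as before.
doc = json.loads("""
{
  "@context": {
    "name": "http://schema.org/name",
    "homepage": "http://schema.org/url"
  },
  "@id": "http://example.org/people/alice",
  "name": "Alice",
  "homepage": "http://example.org/~alice"
}
""")


def to_triples(node):
    """Turn one flat JSON-LD-style node into (subject, predicate, object) triples.

    Simplified: handles only a flat node with a term-to-IRI context.
    """
    context = node.get("@context", {})
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @id and @context are not properties
        predicate = context.get(key)
        if predicate is None:
            continue  # keys without a mapping carry no RDF meaning here
        triples.append((subject, predicate, value))
    return triples


triples = to_triples(doc)
for triple in triples:
    print(triple)
```

Because every statement is reduced to the same three-part shape, triples from several sources can be merged by simply combining the lists.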
Cascading Style Sheets (CSS) is used to separate a document's semantics from its presentation. CSS allows content to be displayed differently on different devices. CSS rules consist of selectors and declaration blocks. The CSS box model represents elements as boxes that can be sized and positioned with properties like width, height, padding, borders, and margins. CSS handles conflicts between rules through specificity, source order, and inheritance to determine which styles get applied.
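The conflict resolution mentioned above can be sketched in a few lines of Python. This is a simplified model under stated assumptions: it counts only ID, class, and element selectors (attribute selectors, pseudo-classes, `!important`, and inheritance are ignored), and the `specificity` and `winning_rule` helpers and the sample rules are hypothetical names invented for this example.

```python
import re


def specificity(selector):
    """Return (ids, classes, types) for a simple selector.

    Simplified: counts #id, .class and element names only.
    """
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+", selector))
    # Element names appear at the start or after whitespace or a combinator.
    types = len(re.findall(r"(?:^|[\s>+~])([a-zA-Z][\w-]*)", selector))
    return (ids, classes, types)


def winning_rule(rules):
    """Pick the value that applies for one property.

    Higher specificity wins; among equal specificity, the later rule wins.
    """
    ranked = [
        (specificity(selector), order, value)
        for order, (selector, value) in enumerate(rules)
    ]
    return max(ranked)[2]


# Hypothetical rules that all set `color` on the same <p class="note">.
rules = [
    ("p", "black"),
    (".note", "blue"),
    ("p.note", "green"),
    (".note", "red"),
]
print(winning_rule(rules))
```

Here `p.note` wins because one class plus one element outranks a class alone, even though a `.note` rule appears later in the source.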
"We want something like Google ... why do we get so many results?" : implemen... - CIGScotland
Description of the unified resource discovery service developed at Durham, and the challenges and rewards of integrating library, archival, museum and archaeological collections.
Presented at the CIG Scotland seminar 'Resource Discovery : from catalogues to discovery services' at the National Library of Scotland, Edinburgh, 21st March 2018
Responsive web design with html5 and css3 - Divya Tiwari
The document discusses responsive web design using HTML5 and CSS3. It begins with an introduction to CSS and its evolution. It then covers CSS syntax, selectors, and different ways to insert CSS into HTML documents. The document also discusses CSS3 features like new color properties, typography, box shadows, gradients, and transitions/animations. It provides examples to illustrate CSS3 properties and how they can be used to create stunning visual effects and responsive designs.
HTML is a markup language used to structure and present content on the web. It uses tags to mark elements like headings, paragraphs, lists, links, images and more. Forms allow collecting user input with different controls like text fields, checkboxes, radio buttons and more. Tables arrange data into rows and columns. Links connect pages together and frames divide pages into sections.
This presentation gives an introduction to Lucene and Solr. It also provides an overview of powerful Solr search features and the different types of queries.
It is useful for getting oriented in the initial phases of search application development.
1. The document discusses different topics in CSS including the basics of CSS, background properties, fonts, text properties, the box model, lists, styling links, and positioning.
2. It provides examples and explanations of key CSS concepts like selectors, declarations, background images and colors, fonts, padding, borders, margins, and different positioning techniques.
3. The document is intended to teach the fundamentals of CSS through clear explanations, syntax examples, and diagrams of the box model.
SPARQL is a query language for retrieving and manipulating data stored in RDF format. It allows users to write queries against remote SPARQL endpoints to query RDF triples stored in a database. SPARQL queries are composed of triple patterns, similar to RDF triples, that can include variables to retrieve variable bindings from the queried data. Query results are returned as solutions that assign values to the variables. Common queries include SELECT, ASK, CONSTRUCT, and DESCRIBE. SPARQL endpoints provide programmatic access to issue SPARQL queries against remote SPARQL-accessible stores.
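To make the notion of triple patterns and variable bindings concrete, here is a small Python sketch that matches a single pattern against an in-memory list of triples. It is not a SPARQL engine and does not talk to an endpoint; the `ex:` names, the sample data, and the `match_pattern` helper are assumptions for illustration. The pattern used corresponds roughly to `SELECT ?s ?o WHERE { ?s ex:knows ?o }`.

```python
def match_pattern(pattern, triples):
    """Return one binding dict per triple that matches the pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    solutions = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            solutions.append(binding)
    return solutions


# Hypothetical RDF-style data as (subject, predicate, object) tuples.
data = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:knows", "ex:carol"),
]

solutions = match_pattern(("?s", "ex:knows", "?o"), data)
for solution in solutions:
    print(solution)
```

Each returned dictionary plays the role of one query solution: an assignment of values to the variables in the pattern.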
Tutorial on SPARQL 1.1 given at SWAT4LS 2012 in Paris to a full room. This material covers enough to get started and includes working with TopBraid Composer.
The document provides an overview of basic CSS (Cascading Style Sheets) concepts including what CSS is, why it is used, CSS syntax, selectors like element, class, ID and pseudo selectors, and common CSS properties for styling elements like color, background, fonts, text, lists, and borders. CSS is used to control the presentation and layout of HTML documents and is linked to HTML pages through <link> or <style> tags in the <head> section.
This document provides an overview of XML, including its basic structure and components. XML documents use elements to structure and tag content. Elements must be properly nested within a single root element and can have attributes. The relationships between these elements form a tree structure. XML documents also support comments, processing instructions, and character encoding. CSS and XSLT can be used to display and transform XML for web users. While databases are better for structured data, XML is well suited for loosely structured or large records.
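The tree structure described above can be seen by parsing a small document with Python's standard `xml.etree.ElementTree` module. The sample `library` record below is hypothetical and exists only to show a single root element, nested child elements, attributes, and a comment.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record: one root element with nested children.
xml_text = """<library>
  <!-- comments are allowed inside the document -->
  <book id="b1">
    <title>Example Title</title>
  </book>
  <book id="b2">
    <title>Another Title</title>
  </book>
</library>"""

root = ET.fromstring(xml_text)

# Walk the tree: the root has <book> children, each with an attribute
# and a nested <title> element.
ids = [book.get("id") for book in root.findall("book")]
titles = [book.find("title").text for book in root.findall("book")]

print(root.tag, ids, titles)
```

A document with improperly nested or unclosed elements would make `ET.fromstring` raise `ET.ParseError`, which reflects the well-formedness requirement mentioned in the summary.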
Understanding Taxonomy, Drupal Camp Colorado, June 2009 - David Lanier
The power and flexibility of Drupal's taxonomy (classification) system is one thing that sets it apart from other CMSs. Yet many Drupal builders fail to fully harness what Drupal gives them in taxonomy. This session will help you get the most from it.
Session Overview:
- Introduction to taxonomy: what it is.
- Clearing up some terminology: what it means.
- Current uses, by Drupal core modules, by contributed modules, and on live sites.
- How taxonomy relates to the rest of the Drupal framework.
- When to use taxonomy and when to use something else, such as custom fields or custom content types.
- Modules that further expand the usefulness of taxonomy.
This document provides an introduction to CSS (Cascading Style Sheets), covering topics such as:
- What CSS is and why it's used
- How to reference a CSS stylesheet from an HTML document
- CSS syntax including selectors, properties, and values
- Common CSS tags, properties, and positioning techniques
- Tools for inspecting and debugging CSS
This document introduces common Ruby data types including integers, floats, strings, arrays, and hashes. It explains that everything in Ruby is an object. Strings are defined as any text between quotes. Arrays are ordered collections that use integers as indexes starting from 0. Hashes are collections of unique key-value pairs. The document recommends checking the official Ruby documentation for methods available for each data type.
Sage Research Method Database is an online tool that provides access to over 640 books, dictionaries, encyclopedias, handbooks, and journal articles related to research methods. It contains a taxonomy of over 1,400 research methods terms linked to authoritative content. Users can search or browse content using basic or advanced search options, filters by type, subject, date and more. Results provide citation tools, options to view or download full details and contents.
The document discusses various document management features in SharePoint including content types, document sets, managed metadata, and drop off libraries. It provides step-by-step instructions for setting up each feature, including how to create and publish content types and document sets, build term sets and taxonomies for metadata, and configure a drop off library and content organizer rules to route documents. Contact information is provided for the author to ask additional questions.
CSS (Cascading Style Sheets) is a style sheet language used to describe the presentation of HTML documents, including how elements should be rendered on screen, paper, or in other media. CSS saves a lot of work by enabling web developers to change the appearance and layout of multiple pages at once by editing just one CSS file. CSS solves the problem of formatting documents that originally arose with HTML by separating document content from document presentation.
HTML (Hypertext Markup Language) is used to create web pages and define their structure. It uses tags like <html> and <body> to define overall page structure. Other common tags include <h1> - <h6> for headings, <p> for paragraphs, <img> for images, <a> for links, and <table> for tables. HTML forms can collect user input using tags like <input>, <select>, and <textarea>. Various tags are available to format text and add multimedia content to pages.
This document discusses DOIs (Digital Object Identifiers) and CrossRef's role in registering DOIs for book publishers. It provides an overview of DOIs and the International DOI Foundation (IDF) which oversees the DOI system. CrossRef is introduced as the largest DOI registration agency. The benefits of assigning CrossRef DOIs to books are described, including persistent linking between books and other scholarly content. Best practices for registering book DOIs at CrossRef are outlined, covering metadata requirements, linking, and displaying DOIs in citations.
Overview of how book publishers can improve discoverability of their content by assigning and linking CrossRef DOIs. Presented to the American Association of University Presses (AAUP) June 2014, New Orleans, LA, United States
Zotero is a great reference tool. However, finding out 'all the things' can be a challenge. Here I've attempted to collate 'all the things' that I know about it.
The document describes two Mexican contemporary dance companies: the Ballet Nacional de México and the Ballet Teatro del Espacio. The Ballet Nacional de México was founded in 1948 and was devoted to bringing modern dance to the Mexican public until its dissolution in 2006. The Ballet Teatro del Espacio is a company directed by Gladiola Orozco and Michel Descombey with the aim of creating an open art that reflects the reality of Mexico. Both companies have had an important influence on Mexican dance ...
The document discusses the concept of power and the will to power. It defines power as the ability to influence others and produce effects. Power can be held through various means such as delegated authority, social class, charisma, expertise, knowledge, force, group dynamics, and resources. There are different types and categorizations of power. Five common bases of power are discussed: legitimate power, referent power, expert power, reward power, and coercive power. The document also explores why people want power, discussing theories from Nietzsche, Machiavelli, Maslow's hierarchy of needs, and perspectives on using power responsibly versus through force and deception.
A summary of the document is as follows:
1. Friedrich Nietzsche discusses the concept of the will to power as a natural human drive to grow and dominate.
2. He suggests accepting the will to power as part of human nature rather than condemning it.
3. Nietzsche views life positively and invites people to celebrate ...
Saint Thomas Aquinas was a 13th century philosopher and Catholic saint who studied under Albert the Great. He wrote monumental works like the Summa Theologica that explored man's ultimate destiny. Aquinas believed the goal of human existence is union with God through the beatific vision after death, which provides perfect happiness. On earth, individuals must order their will toward charity, peace, and holiness to experience happiness and properly orient themselves toward their final goal of union with God. Aquinas saw man as having a divine purpose - to become what God is by being born into God's family through Christ.
This document discusses setting up Drupal 8, Elasticsearch, and Docker. It provides instructions for pulling relevant Docker images, running Elasticsearch and Drupal containers, and configuring them to integrate. Elasticsearch is configured as the search backend in Drupal. Views is used to create a search page that queries the Elasticsearch index. The demo shows Drupal and Elasticsearch running in Docker and communicating to enable search capabilities in Drupal.
An Analysis and Interpretation of Plato's Allegory of the Cave - guest71fae1
The document discusses Plato's Allegory of the Cave and provides three alternative interpretations:
1. G.M.A. Grube's interpretation focuses on education and the four stages of transcending desires.
2. J.G. Ingersoll's Hindu interpretation views the cave as the mind and maps the allegory to Hindu concepts.
3. Simone Weil presents an alternative view where the cave represents the world and shadows symbolize imagination, which she believes society sees as a prison.
Plato's Allegory of the Cave describes people living chained in an underground den seeing only shadows projected on a wall in front of them. The prisoners believe the shadows to be reality. If released, the prisoners' eyes would be pained by the sunlight outside the cave. Over time, they would understand their previous limited view.
Plato uses this allegory to represent ignorance (the cave) versus education and enlightenment (outside the cave). Even 2500 years later, the allegory remains relevant - people today may be like the prisoners, ignorant of reality beyond what they see on television. Modern politics can also manipulate the ill-informed masses like shadows on the cave wall.
Plato uses the Allegory of the Cave to demonstrate the flawed existence of human beings who are trapped perceiving shadows rather than true forms of reality. In the cave, people are in the lowest forms of knowledge, but some may be freed and experience a journey through higher forms of knowledge by moving from shadows to reflections to objects to the sun itself. This represents the potential ascent from lower to higher epistemic states and the acquisition of true knowledge and understanding, allowing one to become a philosopher.
Friedrich Nietzsche was a German philosopher born in 1844 who developed an unconventional philosophy that criticized Western culture and traditional morality. He had a broad education in classical literature and studied theology and classical philology before becoming a university professor. At the age of 44 he began to suffer mental breakdowns, and he died in 1900. His philosophy proposed the death of God, perspectivism, and the ethics of the Übermensch as alternatives to ...
Friedrich Nietzsche was a 19th century German philosopher known for his radical questioning of traditional Western values and criticism of Christianity. Some key aspects of Nietzsche's thought discussed in the document include his views on master and slave morality, the Übermensch or Superman, the revaluation of all values, and his critique of religion, morality, and modern society. The document provides sample essay questions on Nietzsche's philosophy and lists some important terms related to his thought like the will to power, Dionysian/Apollonian, and the noble/herd mentality.
Saint Thomas Aquinas was a 13th century Roman Catholic philosopher and theologian. He believed that knowledge begins with sense perception and can grow through reason applied to experience. He used Aristotle's theories of perception and knowledge through the senses to write his influential work Summa Theologica. While disagreeing with Plato's rationalism, Aquinas was considered an Aristotelian empiricist, believing empirical sense experience was the foundation of knowledge.
The document presents a biography of Friedrich Nietzsche (1844-1900) and an account of his work. It summarizes his critique of Western culture, including metaphysics, science, and traditional morality. It argues that Western culture rests on a great lie, the pretense of knowing the truth through reason, and proposes a natural morality grounded in life as opposed to the anti-natural morality of the West.
Friedrich Nietzsche was a 19th century German philosopher known for his influential philosophies including master-slave morality, nihilism, the Übermensch, the will to power, and eternal recurrence. Some of his most famous published books were Thus Spoke Zarathustra, Beyond Good and Evil, and The Antichrist. Nietzsche believed that traditional Western morality arose from the resentment of the weak rather than representing objective truths, and proposed the Übermensch as a new model of humanity that could overcome nihilism.
The period of the Greek triumvirate of Socrates, Plato, and Aristotle is considered the golden era of Greek philosophy. During this period, philosophy reached its highest level of perfection, which coincided with Greece's political dominance. These philosophers shifted focus from material substances to inquiries about human virtues, justice, happiness, and the state. Using what is now known as the Socratic method, Socrates questioned others to expose ignorance and inconsistencies in their beliefs, establishing the foundation of philosophical reasoning that influenced both Plato and Aristotle.
This document discusses the teachings and quotes of Socrates, the famous Greek philosopher. It notes that Socrates believed wisdom begins with wonder and questioning. A key teaching of Socrates was that if a person ruins their soul through wrong actions, then achieving worldly success is meaningless. The document ends by quoting Socrates' last words about departing from life, with only God knowing whether death or life is better.
Saint Thomas Aquinas was a 13th century Catholic priest and philosopher who synthesized Aristotle's philosophy with Catholic theology. His two most influential works were the Summa Theologica and Summa Contra Gentiles. In the Summa Theologica, he extensively discusses man, arguing that man is substantially both body and soul, with the soul being the principle of life and action in the body. Aquinas' philosophy is considered highly influential to this day.
Solr is an open source enterprise search platform built on Apache Lucene. It allows full-text search across various data formats through its REST-like API. Documents are indexed and stored in JSON, XML, CSV or binary formats. Queries are sent via HTTP and results are returned in the same formats. Solr features include fuzzy and proximity search, filtering, faceting, highlighting, statistics, spellchecking, grouping, and an admin panel. It is commonly used by enterprises for search capabilities on their websites.
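Since queries are sent over HTTP, a request is essentially a URL with parameters. The Python sketch below only builds such a URL and does not contact a server. The host, port, core name `articles`, field names, and the `build_select_url` helper are assumptions for illustration; `q`, `fq`, `rows`, `wt`, `facet`, and `facet.field` are standard Solr query parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_select_url(base_url, core, text, filters=(), facet_field=None, rows=10):
    """Build a Solr /select URL with a main query, filters and optional faceting."""
    params = [("q", text), ("rows", str(rows)), ("wt", "json")]
    for filter_query in filters:
        params.append(("fq", filter_query))  # each filter narrows the result set
    if facet_field:
        params += [("facet", "true"), ("facet.field", facet_field)]
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical server, core and field names.
url = build_select_url(
    "http://localhost:8983",
    "articles",
    "title:solr",
    filters=["type:slides"],
    facet_field="author",
)
print(url)

# Parse the URL back to show which parameters it carries.
query = parse_qs(urlsplit(url).query)
```

Keeping the main query in `q` and restrictions in `fq` mirrors the distinction between searching and filtering listed among the features above.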
This document provides a summary of the Solr search platform. It begins with introductions from the presenter and about Lucid Imagination. It then discusses what Solr is, how it works, who uses it, and its main features. The rest of the document dives deeper into topics like how Solr is configured, how to index and search data, and how to debug and customize Solr implementations. It promotes downloading and experimenting with Solr to learn more.
The document provides an overview and agenda for an Apache Solr crash course. It discusses topics such as information retrieval, inverted indexes, metrics for evaluating IR systems, Apache Lucene, the Lucene and Solr APIs, indexing, searching, querying, filtering, faceting, highlighting, spellchecking, geospatial search, and Solr architectures including single core, multi-core, replication, and sharding. It also provides tips on performance tuning, using plugins, and developing a Solr-based search engine.
This document summarizes a Solr Recipes Workshop presented by Erik Hatcher of Lucid Imagination. It introduces Lucene and Solr, describes how to index different content sources into Solr including CSV, XML, rich documents, and databases, and provides an overview of using the DataImportHandler to index from a relational database.
All you need to start with Apache Solr (elastic search). This presentation includes all the information of Solr i.e. what it is, installation, indexing & searching for beginners.
Introduction to Solr. A brief introduction to Solr for the resources who wants to get trained on Solr.
1. Introduction to Solr
2. Solr Terminologies
3.Installation and Configuration
4. Configuration files schema.xml and solrconfig.xml
5. Features of SOLR
a. Hit Highlighting
Auto Complete / Suggester
Stop words
Synonyms
SpellCheck
Geo Spatial Search
Result Grouping
Query Syntax
Query Boosting
Content Spotlighting
Block Record / Remove URL Feature
Content Spotlighting / Merchandising / Banner / Elevate
Block Record / Remove URL Feature
6. Indexing the Data
7. Search Queries
8. DataImportHandler - DIH
9. Plugins to index various types of Data (XML, CSV, DB, Filesystem)
10. Solr Client APIs
11. Overview of SOLRJ API
12. Running Solr on Tomcat
13. Enabling SSL on Solr
14. Zookeeper Configuration
15. Solr Cloud Deployment
16. Production Indexing Architecture
17. Production Serving Architecture
18. Solr Upgradation
19. References
This document provides an introduction to Apache Lucene and Solr. It begins with an overview of information retrieval and some basic concepts like term frequency-inverse document frequency. It then describes Lucene as a fast, scalable search library and discusses its inverted index and indexing pipeline. Solr is introduced as an enterprise search platform built on Lucene that provides features like faceting, scalability and real-time indexing. The document concludes with examples of how Lucene and Solr are used in applications and websites for search, analytics, auto-suggestion and more.
This document provides an introduction to Apache Solr, an open-source enterprise search platform built on Apache Lucene. It discusses how Solr indexes content, processes search queries, and returns results with features like faceting, spellchecking, and scaling. The document also outlines how Solr works, how to configure and use it, and examples of large companies that employ Solr for search.
This document provides an introduction to Lucene and related projects. It discusses how Lucene allows for indexing and searching of text documents, with features like relevance sorting, wildcard searches, and range queries. The document also introduces related projects like Solr, which provides a search server and REST API on top of Lucene, and Tika, which allows extracting text and metadata from various file formats. Overall, the document gives a high-level overview of Lucene and its ecosystem for text search and information retrieval.
This document provides an overview of library web mashups and APIs. It defines mashups as web applications that combine data from multiple sources. Some examples of library mashups are presented. The key technologies that power mashups, such as web services, JSON, XML, and scripting languages are described. Several specific library vendor and general web services APIs are also outlined, including the WorldCat and Serial Solutions APIs. Finally, the document discusses creating simple mashups with widgets and Yahoo Pipes and provides code walkthroughs for sample mashups.
This document provides an overview and introduction to Apache Solr, including:
- What Solr is and its main features like being based on Lucene, using inverted indexes, and having REST APIs.
- The basics of indexing and searching in Solr.
- An overview of SolrCloud which allows distributing a Solr index across multiple servers for scalability.
Schema.org: What It Means For You and Your LibraryRichard Wallis
This document summarizes a presentation about Schema.org given to the LITA Forum in Albuquerque, NM on November 7th, 2014. The presentation discussed what Schema.org is, the SchemaBibEx extension for bibliographic data, and examples of Schema.org being used. It also covered the challenges involved in mapping library metadata to Schema.org and proposals made by SchemaBibEx to address these challenges.
Solr Recipes provides quick and easy steps for common use cases with Apache Solr. Bite-sized recipes will be presented for data ingestion, textual analysis, client integration, and each of Solr’s features including faceting, more-like-this, spell checking/suggest, and others.
Introduction to Lucene & Solr and UsecasesRahul Jain
Rahul Jain gave a presentation on Lucene and Solr. He began with an overview of information retrieval and the inverted index. He then discussed Lucene, describing it as an open source information retrieval library for indexing and searching. He discussed Solr, describing it as an enterprise search platform built on Lucene that provides distributed indexing, replication, and load balancing. He provided examples of how Solr is used for search, analytics, auto-suggest, and more by companies like eBay, Netflix, and Twitter.
This document discusses building distributed search applications using Apache Solr. It provides an overview of Solr architecture and components like schema, indexing, querying etc. It also describes hands-on activities to index sample data from disk, database using Data Import Handler and SolrJ client. Query syntax for different types of queries and configuration of search handlers is also covered.
Apache solr is an enterprise search engine. It facilitates indexing of large number of documents of any size and provides very robust search techniques. This ppt provides brief introduction of it.
The document provides an overview of full text search and different approaches to implementing it including wild card database queries, using database-specific full text search functionality, leveraging third party search engines, and using text indexing libraries. It focuses on using Lucene, describing how to index and search text data with Lucene including the key classes, steps, and options involved. It also demonstrates Lucene functionality through code examples and mentions other search technologies that can be used beyond Lucene like Solr, Compass and ElasticSearch.
8. What is “Search”?
• Information/Document Retrieval
• Basic Definition:
• Finding previously seen documents that are related to some user-supplied terms.
• Advanced Definition:
• Finding relevant content for some query by understanding the contextual meaning of terms in the search index and query.
• Semantic Search
30. Solr Documents
• A document represents a distinct piece of content that can be stored/retrieved
• Bible Verse
• Journal Article
• Commentary Chapter/Section
• Web Page
50. Solr Fields
• The “String” Field Type
• <fieldType name="string" class="solr.StrField" />
• No Filter; No Tokenizer
• Field content won’t be split or changed
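To make the "won't be split or changed" point concrete, here is a minimal Python sketch, not Solr code. `string_field_terms` mirrors what the slide says about the string type (the whole value stays one term). `text_field_terms` is a hypothetical contrast that assumes a whitespace tokenizer plus lowercasing; the slides do not specify that configuration.

```python
def string_field_terms(value):
    """String field: the whole value is kept verbatim as a single term."""
    return [value]


def text_field_terms(value):
    """Hypothetical tokenized field: lowercase, then split on whitespace."""
    return value.lower().split()


verse = "Jesus Wept"
string_terms = string_field_terms(verse)  # one unchanged term
text_terms = text_field_terms(verse)      # one term per word
```

With these definitions, a search for "wept" can match the tokenized terms but not the string field, which only matches the exact original value.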
58. Put Data in Solr
• Remember, Solr communicates using XML over HTTP
• No concept of updating a document - delete, then add
• To add, POST XML to update handler
• http://localhost:8080/solr/bible/update
59. Add XML
<add>
<doc>
<field name="id">1</field>
<field name="net">In the beginning God created the heavens and the earth.</field>
</doc>
</add>
60. PHP API
• No XML!
• $client = new SolrClient($options);
$doc = new SolrInputDocument();
$doc->addField('id', 1); //Must be Integer
$doc->addField('net', 'In the beginning God created the heavens and the earth.');
$client->addDocument($doc);
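For readers without the PHP extension, the "POST XML to the update handler" step can be sketched with the Python standard library. This is an illustration under assumptions, not the presenter's code: the helper names `build_add_xml` and `build_update_request` are hypothetical, and the request is only prepared, never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_add_xml(fields):
    """Return a Solr <add> message for one document as UTF-8 bytes."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        # Each value goes into a <field name="..."> element.
        field = ET.SubElement(doc, "field", name=name)
        field.text = str(value)
    return ET.tostring(add, encoding="utf-8")


def build_update_request(update_url, fields):
    """Prepare (but do not send) the HTTP POST that carries the add message."""
    return urllib.request.Request(
        update_url,
        data=build_add_xml(fields),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = build_update_request(
    "http://localhost:8080/solr/bible/update",
    {"id": 1, "net": "In the beginning God created the heavens and the earth."},
)
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Added documents normally become searchable only after a commit is issued to the same update handler.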
66. Querying Solr
• HTTP GET Request
• http://localhost:8080/solr/bible3/select?q=god
• URL parts: path to Solr (http://localhost:8080/solr), core (bible3), handler (select), query (q=god)
• Returns XML By Default
• Can return JSON and more
73. Querying Solr
• Queries the defaultSearchField by default
• <defaultSearchField>all_index</defaultSearchField>
• Can query other fields by using the syntax field:value
• http://localhost:8080/solr/bible3/select?q=id:27974
• Multiple queries / Booleans
• http://localhost:8080/solr/bible3/select?q=god%20AND%20book:40
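The query URLs above can be assembled programmatically so that spaces and colons are escaped consistently. The sketch below uses only the Python standard library. The helper name `select_url` is hypothetical, and `wt=json` is shown as one way to request JSON instead of the default XML.

```python
from urllib.parse import urlencode


def select_url(base, core, query, **params):
    """Build a select-handler URL of the form <base>/<core>/select?q=..."""
    all_params = {"q": query, **params}
    return f"{base}/{core}/select?{urlencode(all_params)}"


base = "http://localhost:8080/solr"
default_field = select_url(base, "bible3", "god")            # default search field
by_id = select_url(base, "bible3", "id:27974")               # field:value syntax
boolean = select_url(base, "bible3", "god AND book:40", wt="json")  # boolean query
```

`urlencode` writes spaces as `+` and colons as `%3A`, which are equivalent to the hand-written forms shown on the slides.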
85. Search Multiple Translations
• + Quasi Synonym term/phrase injection
• + Less variation across translations leads to stronger possible matches
• + Matches verses when the source translation isn’t known
• - No control over which translation gets more weight
• - No control over scoring of matches
86. Search Multiple Translations
• Another way: Dismax
• Can score a document (verse) match based on scores/matches from multiple fields.
• net_index^1 kjv_index^1
• Not exponents - weights
• We’re searching the net_index and kjv_index fields, each with a boost/weight of 1.
• net_index^6 kjv_index^.5
• http://localhost:8080/solr/bible4/select?q=respect%20for%20god&defType=dismax&tie=.1&qf=net_index^1%20kjv_index^1&fl=score
• http://localhost:8080/solr/bible4/select?q=respect%20for%20god&defType=dismax&tie=.1&qf=net_index^6%20kjv_index^.5&fl=score
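How the `qf` weights and the `tie` parameter interact can be sketched in a few lines of Python. This is a simplified model for a single query term, not Solr's implementation: the best weighted field score wins, and every other field contributes only `tie` times its weighted score. The per-field scores below are made-up numbers for illustration.

```python
def dismax_score(field_scores, boosts, tie=0.1):
    """Combine per-field scores: best weighted field + tie * the rest."""
    weighted = [
        score * boosts.get(field, 1.0) for field, score in field_scores.items()
    ]
    if not weighted:
        return 0.0
    best = max(weighted)
    return best + tie * (sum(weighted) - best)


# Made-up scores for one verse matched in both translations.
verse = {"net_index": 0.8, "kjv_index": 0.5}

equal = dismax_score(verse, {"net_index": 1, "kjv_index": 1})        # qf=net_index^1 kjv_index^1
favor_net = dismax_score(verse, {"net_index": 6, "kjv_index": 0.5})  # qf=net_index^6 kjv_index^.5
```

With equal weights the result is 0.8 + 0.1 × 0.5. With the second set of weights the NET match dominates, which is the control over translation weighting that the plain combined-field approach lacked.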
95. Scoring
• $\mathrm{score}(q,d) = \mathrm{coord}(q,d) \cdot \mathrm{queryNorm}(q) \cdot \sum_{t \in q} \big( \mathrm{tf}(t \in d) \cdot \mathrm{idf}(t)^2 \cdot \mathrm{norm}(t,d) \big)$
• Basic Factors
• Term Frequency in a document (↑ is better)
• Term Frequency in Corpus (↓ is Better)
• Length of matching document (↓ is Better)
• “Jesus Wept” - John 11:35
• http://localhost:8080/solr/bible3/select?q=wept
• http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
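The three basic factors can be tried out on a toy corpus. The sketch below assumes the classic Lucene-style definitions (tf as the square root of the term count, idf as 1 + ln(numDocs / (docFreq + 1)), norm as 1 / sqrt(number of terms)) and leaves out coord and queryNorm, which do not change the ordering for a one-term query. The three-verse corpus is sample text chosen for the illustration.

```python
import math


def term_score(term, doc_tokens, corpus):
    """tf * idf^2 * norm for one term in one document."""
    freq = doc_tokens.count(term)
    if freq == 0:
        return 0.0
    doc_freq = sum(1 for tokens in corpus if term in tokens)
    tf = math.sqrt(freq)                                # more occurrences: higher
    idf = 1.0 + math.log(len(corpus) / (doc_freq + 1))  # rarer in corpus: higher
    norm = 1.0 / math.sqrt(len(doc_tokens))             # shorter document: higher
    return tf * idf ** 2 * norm


corpus = [
    "jesus wept".split(),
    "and they lifted up their voice and wept again".split(),
    "in the beginning god created the heavens and the earth".split(),
]
scores = [term_score("wept", doc, corpus) for doc in corpus]
```

Both matching verses contain "wept" once, so the two-word verse ranks first purely because of the length norm, which is the effect the "Jesus Wept" example points at.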
110. Topic Tagging
• Use a topically-tagged Bible/concordance to mark up each verse, or just key verses
• Helpful for “theme” based queries.
• “Social Justice” - no good matches
• “Satan” - Many Names
• Name Tagging in general can be very helpful
116. Searching Strong’s
• Add a field for Strong’s: strongs_index
• 1473 1510 2316 11 2316 2464 2532 2316 2384 1510 3756 2316 3498 235 2198
• Most of the benefits of text searching
• “Word” frequency
• Document vs. corpus frequency of search terms
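Because each Strong's number is just a token, the usual frequency statistics apply to it. Here is a minimal sketch using the number sequence from the slide; the `idf` helper and the corpus figures passed to it are illustrative assumptions in the spirit of classic TF-IDF, not values from a real index.

```python
import math
from collections import Counter

# The strongs_index value shown on the slide, treated as whitespace-separated terms.
verse = "1473 1510 2316 11 2316 2464 2532 2316 2384 1510 3756 2316 3498 235 2198"
tokens = verse.split()

# Document (verse) frequency of each "word".
term_freq = Counter(tokens)
print(term_freq.most_common(2))


def idf(doc_freq, num_docs):
    """Inverse document frequency: terms found in fewer documents score higher."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical corpus numbers, only to show the direction of the effect.
common_term = idf(doc_freq=1200, num_docs=8000)
rare_term = idf(doc_freq=15, num_docs=8000)
print(common_term < rare_term)
```

A term that is frequent within one verse but rare across the corpus is a strong signal; a term that appears in most verses contributes little.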
122. Searching Articles
• Similar approach to text-based queries
• Stem words
• Use Synonyms
• Remove Stop Words
• Without manual tagging, there’s no automatic way to index/search by Bible reference
126. Searching Articles
• Article contains reference: “John 3”
• User searches for “John 3:16” or “John 2-4”
• Results: no meaningful matches at best (unless the document happens to match the query term “John”)
132. Searching Articles
• Solr-based Solutions:
• Identify and index references and their composite verses using a grammar.
• John 1:1-3 -> John 1:1; John 1:2; John 1:3
• Store in a multivalued field - each reference is a “term”
• Must also parse and expand references in queries in order to match
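The expansion step can be sketched as follows. This is a minimal illustration, not the grammar used in the talk: it uses a single regular expression, handles only verse ranges within one chapter, and the function name `expand_reference` is hypothetical. Chapter-only references such as "John 3" or chapter ranges such as "John 2-4" would additionally need a table of verse counts per chapter, which is not included here.

```python
import re

# Book (optionally prefixed with 1-3), chapter, verse, and an optional end verse.
REFERENCE = re.compile(
    r"^\s*(?P<book>(?:[1-3]\s)?[A-Za-z]+)\s+"
    r"(?P<chapter>\d+):(?P<start>\d+)(?:-(?P<end>\d+))?\s*$"
)


def expand_reference(ref):
    """Expand 'John 1:1-3' into ['John 1:1', 'John 1:2', 'John 1:3']."""
    match = REFERENCE.match(ref)
    if match is None:
        raise ValueError(f"unsupported reference: {ref!r}")
    book = match.group("book")
    chapter = int(match.group("chapter"))
    start = int(match.group("start"))
    end = int(match.group("end") or start)
    if end < start:
        raise ValueError(f"descending verse range: {ref!r}")
    return [f"{book} {chapter}:{verse}" for verse in range(start, end + 1)]


# Index side: each expanded verse becomes one value of a multivalued field.
indexed_terms = expand_reference("John 3:14-18")
# Query side: expand the same way, then look for overlapping terms.
query_terms = expand_reference("John 3:16")
print(set(indexed_terms) & set(query_terms))
```

Running the same expansion at index time and at query time is what lets "John 3:16" match an article that only cites "John 3:14-18".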
139. Searching Articles
• Relational database-based solution:
• Assign an id to every verse
• Store: id, articleId, verseId
• Parse user query to ids.
• SELECT articleId, COUNT(id)
FROM ARTICLE_VERSE_TABLE
WHERE verseId IN (ID_LIST)
GROUP BY articleId
• Higher count -> the article is more likely to be about that reference than other articles with a lower count
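The relational approach can be tried end to end with an in-memory SQLite database. This is a minimal sketch: the table name `article_verse`, the sample rows, and the verse ids are made up for illustration, and the `ORDER BY` clause is an addition so that the most relevant article comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article_verse ("
    "id INTEGER PRIMARY KEY, articleId INTEGER, verseId INTEGER)"
)
# One row per verse occurrence: (articleId, verseId). Sample data only.
occurrences = [(1, 101), (1, 102), (1, 102), (2, 102), (3, 500)]
conn.executemany(
    "INSERT INTO article_verse (articleId, verseId) VALUES (?, ?)", occurrences
)


def rank_articles(conn, verse_ids):
    """Count matching verse occurrences per article, highest count first."""
    placeholders = ",".join("?" for _ in verse_ids)
    sql = (
        "SELECT articleId, COUNT(id) AS hits FROM article_verse "
        f"WHERE verseId IN ({placeholders}) "
        "GROUP BY articleId ORDER BY hits DESC, articleId"
    )
    return conn.execute(sql, list(verse_ids)).fetchall()


# Pretend the user's query was parsed to verse ids 101, 102 and 103.
ranking = rank_articles(conn, [101, 102, 103])
print(ranking)
```

Article 1 cites the requested verses three times and article 2 once, so article 1 ranks first; article 3 cites none of them and does not appear.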
147. Searching Articles
• Relational database-based solution:
• Large number of rows.
• 15,000 Journal articles have > 9,000,000 rows (verse occurrences)
• Can store id, articleId, verseId, count
• Then SUM() the counts for each articleId.
• Negligibly faster.
• Only approx. 3,000,000 rows
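The count-column variant collapses repeated (article, verse) pairs into a single row. Below is a minimal sketch with SQLite; the table name `article_verse_count` and the sample data are hypothetical, and the column name `count` is quoted to keep it clearly distinct from the SQL function.

```python
import sqlite3
from collections import Counter

# Raw verse occurrences as (articleId, verseId) pairs. Sample data only.
occurrences = [
    (1, 101), (1, 102), (1, 102),
    (2, 102), (2, 102), (2, 102), (2, 102),
    (3, 500),
]
# Collapse duplicates: one row per (articleId, verseId) with its count.
counted = Counter(occurrences)

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE article_verse_count (
           id INTEGER PRIMARY KEY,
           articleId INTEGER,
           verseId INTEGER,
           "count" INTEGER)"""
)
conn.executemany(
    'INSERT INTO article_verse_count (articleId, verseId, "count") VALUES (?, ?, ?)',
    [(article, verse, n) for (article, verse), n in counted.items()],
)

# SUM the stored counts instead of counting individual occurrence rows.
ranking = conn.execute(
    """SELECT articleId, SUM("count") AS hits
       FROM article_verse_count
       WHERE verseId IN (?, ?)
       GROUP BY articleId
       ORDER BY hits DESC, articleId""",
    (101, 102),
).fetchall()
print(len(occurrences), "occurrences stored as", len(counted), "rows")
print(ranking)
```

The ranking is the same as counting raw occurrence rows; the saving is in table size, which matches the slide's observation that the speedup itself is negligible.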
153. Heterogeneous Indexes
• Not all content is created equal.
• Content quality and its effect on the quality of your results becomes a factor when you move from one resource to more than one
• One Bible, One website, One Journal
• Apply a field or document boost to help normalize results
• Some content gets bumped up and some down
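The effect of a per-source boost can be illustrated outside of Solr. This sketch only shows the arithmetic: the source names, boost values, and raw scores are invented, and in a real index the boost would be applied by the search engine at index or query time instead of in a post-processing step like this.

```python
# Hypothetical boosts: trusted sources get bumped up, noisier ones down.
SOURCE_BOOST = {"bible": 2.0, "journal": 1.0, "website": 0.5}


def rerank(hits, boosts, default=1.0):
    """Multiply each raw score by its source boost and sort best-first.

    hits is a list of (doc_id, source, raw_score) tuples.
    """
    boosted = [
        (doc_id, source, score * boosts.get(source, default))
        for doc_id, source, score in hits
    ]
    return sorted(boosted, key=lambda hit: hit[2], reverse=True)


raw_hits = [
    ("web-1", "website", 3.0),
    ("bible-1", "bible", 1.2),
    ("journal-1", "journal", 2.0),
]
ranked = rerank(raw_hits, SOURCE_BOOST)
for doc_id, source, score in ranked:
    print(doc_id, source, score)
```

Before boosting, the website hit ranks first on raw score; after boosting, the Bible verse moves to the top and the website hit drops to the bottom.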