SlideShare a Scribd company logo
1 of 24
Download to read offline
www.edureka.co/apache-solr
New-Age Search through Apache Solr
View Apache Solr course details at www.edureka.co/apache-solr
For Queries:
Post on Twitter @edurekaIN: #askEdureka
Post on Facebook /edurekaIN
For more details please contact us:
US : 1800 275 9730 (toll free)
INDIA : +91 88808 62004
Email Us : sales@edureka.co
Slide 2
LIVE Online Class
Class Recording in LMS
24/7 Post Class Support
Module Wise Quiz
Project Work
Verifiable Certificate
www.edureka.co/apache-solr
How it Works?
Slide 3 www.edureka.co/apache-solr
Objectives
At the end of this module, you will be able to understand:
The need for search engine for enterprise grade applications
The objectives & challenges of search engine
How is Indexing & Searching Handled in Lucene
Solr and its Architecture
Near Real Time Search with Solr
Leveraging Solr Capabilities with Hadoop
Solr with YARN
About job opportunity for Solr Developers
Slide 4Slide 4Slide 4 www.edureka.co/apache-solr
Why Do I Need Search Engines ?
Slide 5Slide 5Slide 5 www.edureka.co/apache-solr
Search Engine: Why do I need them?
1. Text Based Search
2. Filter
3. Documents
1
2
3
Slide 6Slide 6Slide 6 www.edureka.co/apache-solr
Search Engine – What it should be?
If you need a storage engine to search records / documents using text-based keywords it should support following
features:
1. Should be optimized for faster text searches
2. Should have flexible schema
3. Should support sorting of documents
4. Web Scale - Should be optimized for reads
5. Should be document oriented
Slide 7Slide 7Slide 7 www.edureka.co/apache-solr
Cleartrip Spatial Search
Slide 8Slide 8Slide 8 www.edureka.co/apache-solr
What is Lucene ?
 Lucene is a powerful Java search library that lets you easily add search or Information Retrieval (IR) to applications
 Used by LinkedIn, Twitter, … and many more (see http://wiki.apache.org/lucene-java/PoweredBy )
 Scalable & High-performance Indexing
 Powerful, Accurate and Efficient Search Algorithms
 Cross-Platform Solution
» Open Source & 100% pure Java
» Implementations in other programming languages available that are index-compatible
Doug Cutting “Creator”
Slide 9Slide 9Slide 9 www.edureka.co/apache-solr
Indexing – How it works?
I like edureka courses
Edureka teaches big
data courses
Edureka helps learn new
technologies easily
Document - 1 (“D1”) Document - 2 (“D2”) Document - 3 (“D3”)
“edureka” = {D1, D2, D3}
“courses” = {D1, D2}
“teaches” = {D2}
“big” = {D2}
“data” = {D2}
“helps” = {D3}
“edureka”
Slide 10Slide 10Slide 10 www.edureka.co/apache-solr
Lucene – Writing to Index
Field
Field
Field
Field
Analyzer IndexWriter Directory
Document
Classes used when indexing documents with Lucene
Slide 11Slide 11Slide 11 www.edureka.co/apache-solr
Lucene – Searching In Index
QueryParser
Analyzer
IndexSearcherExpression
Query object
Text fragments
 Query Parser translates a textual expression from the end into an arbitrarily complex query for searching
Slide 12Slide 12Slide 12 www.edureka.co/apache-solr
Solr is an open source enterprise search server / web application
Solr Uses the Lucene Search Library and extends it
Solr exposes lucene Java API’s as RESTful services
You put documents in it (called "indexing") via XML, JSON, CSV or binary over HTTP
You query it via HTTP GET and receive XML, JSON, CSV or binary results
What is Solr ?
Slide 13Slide 13Slide 13 www.edureka.co/apache-solr
Advanced Full-Text Search Capabilities
Optimized for High Volume Web Traffic
Standards Based Open Interfaces - XML, JSON and HTTP
Comprehensive HTML Administration Interfaces
Server statistics exposed over JMX for monitoring
Near Real-time indexing and Adaptable with XML Configuration
Linearly scalable, auto index replication, auto, Extensible Plugin Architecture
Solr: Key Features
Slide 14Slide 14Slide 14 www.edureka.co/apache-solr
Solr: Architecture
Slide 15Slide 15Slide 15 www.edureka.co/apache-solr
Request
Handler
Query Parser
Response
Writer
Index
qt: selects a RequestHandler for a query using/select(by default, the DisMaxRequestHandler is used)
defType : selects a query parser for the query
(by default, uses whatever has been
configured for the RequestHandler)
qf: selects which fields to query
in the index(by default, all fields
are required)
wt: selects a response writer
for formatting the query
response
fq: filters query by applying an additional query to
the initial query’s results, caches the results
Rows:
specifies the
number of rows
to be displayed
at one time
Start: specifies an
offset(by default 0)
into the query results
where the returned
response should begin
Solr: Search Process
Slide 16Slide 16Slide 16 www.edureka.co/apache-solr
Near Real-Time Search
 Near Real Time (NRT) search means that documents are available for search almost immediately after being
indexed: additions and updates to documents are seen in 'near' real time
http://localhost:8983/solr/update?stream.body=<add><doc><fieldname="id">testdoc</field></doc></add>&co
mmit=true
Slide 17Slide 17Slide 17 www.edureka.co/apache-solr
Real-Time Get
 The realtime get feature allows retrieval (by unique-key) of the latest version of any documents without the
associated cost of reopening a searcher
 This is primarily useful when using Solr as a NoSQL data store and not just a search index
Slide 18Slide 18Slide 18 www.edureka.co/apache-solr
Leveraging Solr Capabilities with Hadoop
 Solr provides us fast, efficient, powerful full-text search and near real-time indexing and SolrCloud is flexible
distributed search and indexing, and will do things like automatic fail over etc.
 Hence its very suitable as NoSQL replacement for traditional databases in many situations, especially when the size of
the data exceeds what is reasonable with a typical RDBMS
 We can do scalable indexing using Hadoop MapReduce or PIG job and then load the indexed data in Solr
 In all the major Hadoop distribution like Cloudera, Hortonworks, MapR you can integrate Solr easily
Slide 19Slide 19Slide 19 www.edureka.co/apache-solr
PDF
Word
HTML
.
.
.
Raw Files
Lucene
SolR SolR SolR
Query Response
Search
Web App
MapReduce
Indexing Job
Raw Files Indexed
HDFS
(Hadoop Distributed File System)
Scalable Indexing
Input Data
Slide 20Slide 20Slide 20 www.edureka.co/apache-solr
Solr with YARN
Slide 21Slide 21Slide 21 www.edureka.co/apache-solr
Job trends for Apache Solr
Slide 22Slide 22Slide 22 www.edureka.co/apache-solr
Disclaimer
Criteria and guidelines mentioned in this presentation may change. Please visit our website for
latest and additional information on Apache Solr
Slide 23Slide 23Slide 23 www.edureka.co/apache-solr
Course Topics
 Module 5
» Solr Searching
 Module 6
» Solr Extended Features
 Module 7
» Solr Cloud & Administration
 Module 8
» Final Project
 Module 1
» Introduction to Apache Lucene
 Module 2
» Exploring Lucene
 Module 3
» Introduction to Apache Solr
 Module 4
» Solr Indexing
New-Age Search through Apache Solr

More Related Content

What's hot

Oracle Autonomous Database for Developers
Oracle Autonomous Database for DevelopersOracle Autonomous Database for Developers
Oracle Autonomous Database for DevelopersTércio Costa
 
Oracle SQL Developer Tips & Tricks
Oracle SQL Developer Tips & TricksOracle SQL Developer Tips & Tricks
Oracle SQL Developer Tips & TricksJeff Smith
 
PLSQL Developer tips and tricks
PLSQL Developer tips and tricksPLSQL Developer tips and tricks
PLSQL Developer tips and tricksPatrick Barel
 
Search engine optimization (seo) from Endeca & ATG
Search engine optimization (seo) from Endeca & ATGSearch engine optimization (seo) from Endeca & ATG
Search engine optimization (seo) from Endeca & ATGVignesh sitaraman
 
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.Jim Czuprynski
 
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...Jim Czuprynski
 
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)Juan Fabian
 
Persitance Data with sqlite
Persitance Data with sqlitePersitance Data with sqlite
Persitance Data with sqliteArif Huda
 
Oracle Endeca Developer's Guide
Oracle Endeca Developer's GuideOracle Endeca Developer's Guide
Oracle Endeca Developer's GuideKeyur Shah
 
Enterprise Level Application Architecture with Web APIs using Entity Framewor...
Enterprise Level Application Architecture with Web APIs using Entity Framewor...Enterprise Level Application Architecture with Web APIs using Entity Framewor...
Enterprise Level Application Architecture with Web APIs using Entity Framewor...Akhil Mittal
 
PL/SQL Interview Questions
PL/SQL Interview QuestionsPL/SQL Interview Questions
PL/SQL Interview QuestionsSrinimf-Slides
 
SharePoint 2010 Application Development Overview
SharePoint 2010 Application Development OverviewSharePoint 2010 Application Development Overview
SharePoint 2010 Application Development OverviewRob Windsor
 
SQLite Database Tutorial In Android
SQLite Database Tutorial In AndroidSQLite Database Tutorial In Android
SQLite Database Tutorial In AndroidAndroid 5
 
Introduction to SQLite: The Most Popular Database in the World
Introduction to SQLite: The Most Popular Database in the WorldIntroduction to SQLite: The Most Popular Database in the World
Introduction to SQLite: The Most Popular Database in the Worldjkreibich
 

What's hot (19)

Oracle Autonomous Database for Developers
Oracle Autonomous Database for DevelopersOracle Autonomous Database for Developers
Oracle Autonomous Database for Developers
 
Oracle SQL Developer Tips & Tricks
Oracle SQL Developer Tips & TricksOracle SQL Developer Tips & Tricks
Oracle SQL Developer Tips & Tricks
 
PLSQL Developer tips and tricks
PLSQL Developer tips and tricksPLSQL Developer tips and tricks
PLSQL Developer tips and tricks
 
Search engine optimization (seo) from Endeca & ATG
Search engine optimization (seo) from Endeca & ATGSearch engine optimization (seo) from Endeca & ATG
Search engine optimization (seo) from Endeca & ATG
 
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.
What's Your Super-Power? Mine is Machine Learning with Oracle Autonomous DB.
 
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
Fast and Furious: Handling Edge Computing Data With Oracle 19c Fast Ingest an...
 
Intro To Asp
Intro To AspIntro To Asp
Intro To Asp
 
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)
SQL Server for Dynamics 365 Finance and Operations (LBD - On Premises)
 
Persitance Data with sqlite
Persitance Data with sqlitePersitance Data with sqlite
Persitance Data with sqlite
 
Oracle Endeca Developer's Guide
Oracle Endeca Developer's GuideOracle Endeca Developer's Guide
Oracle Endeca Developer's Guide
 
Enterprise Level Application Architecture with Web APIs using Entity Framewor...
Enterprise Level Application Architecture with Web APIs using Entity Framewor...Enterprise Level Application Architecture with Web APIs using Entity Framewor...
Enterprise Level Application Architecture with Web APIs using Entity Framewor...
 
PL/SQL Interview Questions
PL/SQL Interview QuestionsPL/SQL Interview Questions
PL/SQL Interview Questions
 
SharePoint 2010 Application Development Overview
SharePoint 2010 Application Development OverviewSharePoint 2010 Application Development Overview
SharePoint 2010 Application Development Overview
 
Metadata API
Metadata  APIMetadata  API
Metadata API
 
SQLite Database Tutorial In Android
SQLite Database Tutorial In AndroidSQLite Database Tutorial In Android
SQLite Database Tutorial In Android
 
Introduction to SQLite: The Most Popular Database in the World
Introduction to SQLite: The Most Popular Database in the WorldIntroduction to SQLite: The Most Popular Database in the World
Introduction to SQLite: The Most Popular Database in the World
 
Sqlite
SqliteSqlite
Sqlite
 
SQLite database in android
SQLite database in androidSQLite database in android
SQLite database in android
 
Android Database
Android DatabaseAndroid Database
Android Database
 

Viewers also liked

Python webinar 4th june
Python webinar 4th junePython webinar 4th june
Python webinar 4th juneEdureka!
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solrsagar chaturvedi
 
An Introduction to Solr
An Introduction to SolrAn Introduction to Solr
An Introduction to Solrtomhill
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucenelucenerevolution
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginsearchbox-com
 
Solr: Search at the Speed of Light
Solr: Search at the Speed of LightSolr: Search at the Speed of Light
Solr: Search at the Speed of LightErik Hatcher
 
Apache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaApache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaDropsolid
 
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksState of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksLucidworks
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-WebinarEdureka!
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache SolrChristos Manios
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy SokolenkoProvectus
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr DevelopersErik Hatcher
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data AnalyticsEdureka!
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solrTrey Grainger
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?lucenerevolution
 

Viewers also liked (19)

Python webinar 4th june
Python webinar 4th junePython webinar 4th june
Python webinar 4th june
 
Enterprise Search Using Apache Solr
Enterprise Search Using Apache SolrEnterprise Search Using Apache Solr
Enterprise Search Using Apache Solr
 
An Introduction to Solr
An Introduction to SolrAn Introduction to Solr
An Introduction to Solr
 
Faceted Search with Lucene
Faceted Search with LuceneFaceted Search with Lucene
Faceted Search with Lucene
 
Tutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component pluginTutorial on developing a Solr search component plugin
Tutorial on developing a Solr search component plugin
 
Solr: Search at the Speed of Light
Solr: Search at the Speed of LightSolr: Search at the Speed of Light
Solr: Search at the Speed of Light
 
Apache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 AcquiaApache Solr Search Course Drupal 7 Acquia
Apache Solr Search Course Drupal 7 Acquia
 
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, LucidworksState of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
State of Solr Security 2016: Presented by Ishan Chattopadhyaya, Lucidworks
 
Apache Solr-Webinar
Apache Solr-WebinarApache Solr-Webinar
Apache Solr-Webinar
 
Apache Solr Workshop
Apache Solr WorkshopApache Solr Workshop
Apache Solr Workshop
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Apache Solr/Lucene Internals by Anatoliy Sokolenko
Apache Solr/Lucene Internals  by Anatoliy SokolenkoApache Solr/Lucene Internals  by Anatoliy Sokolenko
Apache Solr/Lucene Internals by Anatoliy Sokolenko
 
Lucene for Solr Developers
Lucene for Solr DevelopersLucene for Solr Developers
Lucene for Solr Developers
 
Python webinar 2nd july
Python webinar 2nd julyPython webinar 2nd july
Python webinar 2nd july
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data Analytics
 
Introduction to Apache Solr
Introduction to Apache SolrIntroduction to Apache Solr
Introduction to Apache Solr
 
Scaling Solr with Solr Cloud
Scaling Solr with Solr CloudScaling Solr with Solr Cloud
Scaling Solr with Solr Cloud
 
Building a real time big data analytics platform with solr
Building a real time big data analytics platform with solrBuilding a real time big data analytics platform with solr
Building a real time big data analytics platform with solr
 
What is in a Lucene index?
What is in a Lucene index?What is in a Lucene index?
What is in a Lucene index?
 

Similar to New-Age Search through Apache Solr

New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache SolrEdureka!
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialSourcesense
 
Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...jaxLondonConference
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkC4Media
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)dnaber
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher lucenerevolution
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersLucidworks
 
1 extreme performance - part i
1   extreme performance - part i1   extreme performance - part i
1 extreme performance - part isqlserver.co.il
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Paulo Gutierrez
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solrlucenerevolution
 
Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingPaco Nathan
 
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worldsOUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worldsAndrew Morgan
 

Similar to New-Age Search through Apache Solr (20)

New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache Solr
 
Dev8d Apache Solr Tutorial
Dev8d Apache Solr TutorialDev8d Apache Solr Tutorial
Dev8d Apache Solr Tutorial
 
Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...Large scale, interactive ad-hoc queries over different datastores with Apache...
Large scale, interactive ad-hoc queries over different datastores with Apache...
 
Unified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache SparkUnified Big Data Processing with Apache Spark
Unified Big Data Processing with Apache Spark
 
Apache Lucene Searching The Web
Apache Lucene Searching The WebApache Lucene Searching The Web
Apache Lucene Searching The Web
 
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)Apache Lucene: Searching the Web and Everything Else (Jazoon07)
Apache Lucene: Searching the Web and Everything Else (Jazoon07)
 
Solr Architecture
Solr ArchitectureSolr Architecture
Solr Architecture
 
Solr 101
Solr 101Solr 101
Solr 101
 
Rapid Prototyping with Solr
Rapid Prototyping with SolrRapid Prototyping with Solr
Rapid Prototyping with Solr
 
Rapid prototyping with solr - By Erik Hatcher
Rapid prototyping with solr -  By Erik Hatcher Rapid prototyping with solr -  By Erik Hatcher
Rapid prototyping with solr - By Erik Hatcher
 
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, LucidworksngineersSQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
SQL Analytics for Search Engineers - Timothy Potter, Lucidworksngineers
 
1 extreme performance - part i
1   extreme performance - part i1   extreme performance - part i
1 extreme performance - part i
 
Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要Spark + AI Summit 2020 イベント概要
Spark + AI Summit 2020 イベント概要
 
Make your gui shine with ajax solr
Make your gui shine with ajax solrMake your gui shine with ajax solr
Make your gui shine with ajax solr
 
Neo4j Vision and Roadmap
Neo4j Vision and Roadmap Neo4j Vision and Roadmap
Neo4j Vision and Roadmap
 
Agile Data Science
Agile Data ScienceAgile Data Science
Agile Data Science
 
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark StreamingTiny Batches, in the wine: Shiny New Bits in Spark Streaming
Tiny Batches, in the wine: Shiny New Bits in Spark Streaming
 
Symphony Driver Essay
Symphony Driver EssaySymphony Driver Essay
Symphony Driver Essay
 
20170126 big data processing
20170126 big data processing20170126 big data processing
20170126 big data processing
 
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worldsOUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
OUG Scotland 2014 - NoSQL and MySQL - The best of both worlds
 

More from Edureka!

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaEdureka!
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaEdureka!
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaEdureka!
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaEdureka!
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaEdureka!
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaEdureka!
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaEdureka!
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaEdureka!
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaEdureka!
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaEdureka!
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | EdurekaEdureka!
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEdureka!
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEdureka!
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaEdureka!
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaEdureka!
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaEdureka!
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Edureka!
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaEdureka!
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaEdureka!
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | EdurekaEdureka!
 

More from Edureka! (20)

What to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | EdurekaWhat to learn during the 21 days Lockdown | Edureka
What to learn during the 21 days Lockdown | Edureka
 
Top 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | EdurekaTop 10 Dying Programming Languages in 2020 | Edureka
Top 10 Dying Programming Languages in 2020 | Edureka
 
Top 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | EdurekaTop 5 Trending Business Intelligence Tools | Edureka
Top 5 Trending Business Intelligence Tools | Edureka
 
Tableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | EdurekaTableau Tutorial for Data Science | Edureka
Tableau Tutorial for Data Science | Edureka
 
Python Programming Tutorial | Edureka
Python Programming Tutorial | EdurekaPython Programming Tutorial | Edureka
Python Programming Tutorial | Edureka
 
Top 5 PMP Certifications | Edureka
Top 5 PMP Certifications | EdurekaTop 5 PMP Certifications | Edureka
Top 5 PMP Certifications | Edureka
 
Top Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | EdurekaTop Maven Interview Questions in 2020 | Edureka
Top Maven Interview Questions in 2020 | Edureka
 
Linux Mint Tutorial | Edureka
Linux Mint Tutorial | EdurekaLinux Mint Tutorial | Edureka
Linux Mint Tutorial | Edureka
 
How to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| EdurekaHow to Deploy Java Web App in AWS| Edureka
How to Deploy Java Web App in AWS| Edureka
 
Importance of Digital Marketing | Edureka
Importance of Digital Marketing | EdurekaImportance of Digital Marketing | Edureka
Importance of Digital Marketing | Edureka
 
RPA in 2020 | Edureka
RPA in 2020 | EdurekaRPA in 2020 | Edureka
RPA in 2020 | Edureka
 
Email Notifications in Jenkins | Edureka
Email Notifications in Jenkins | EdurekaEmail Notifications in Jenkins | Edureka
Email Notifications in Jenkins | Edureka
 
EA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | EdurekaEA Algorithm in Machine Learning | Edureka
EA Algorithm in Machine Learning | Edureka
 
Cognitive AI Tutorial | Edureka
Cognitive AI Tutorial | EdurekaCognitive AI Tutorial | Edureka
Cognitive AI Tutorial | Edureka
 
AWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | EdurekaAWS Cloud Practitioner Tutorial | Edureka
AWS Cloud Practitioner Tutorial | Edureka
 
Blue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | EdurekaBlue Prism Top Interview Questions | Edureka
Blue Prism Top Interview Questions | Edureka
 
Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka Big Data on AWS Tutorial | Edureka
Big Data on AWS Tutorial | Edureka
 
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | EdurekaA star algorithm | A* Algorithm in Artificial Intelligence | Edureka
A star algorithm | A* Algorithm in Artificial Intelligence | Edureka
 
Kubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | EdurekaKubernetes Installation on Ubuntu | Edureka
Kubernetes Installation on Ubuntu | Edureka
 
Introduction to DevOps | Edureka
Introduction to DevOps | EdurekaIntroduction to DevOps | Edureka
Introduction to DevOps | Edureka
 

Recently uploaded

WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data SciencePaolo Missier
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxMarkSteadman7
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...caitlingebhard1
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Zilliz
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard37
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightSafe Software
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Decarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceDecarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceIES VE
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformWSO2
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMKumar Satyam
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuidePixlogix Infotech
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Bhuvaneswari Subramani
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxRemote DBA Services
 

Recently uploaded (20)

Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
WSO2 Micro Integrator for Enterprise Integration in a Decentralized, Microser...
 
Design and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data ScienceDesign and Development of a Provenance Capture Platform for Data Science
Design and Development of a Provenance Capture Platform for Data Science
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...Stronger Together: Developing an Organizational Strategy for Accessible Desig...
Stronger Together: Developing an Organizational Strategy for Accessible Desig...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Decarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational PerformanceDecarbonising Commercial Real Estate: The Role of Operational Performance
Decarbonising Commercial Real Estate: The Role of Operational Performance
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
Introduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDMIntroduction to use of FHIR Documents in ABDM
Introduction to use of FHIR Documents in ABDM
 
JavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate GuideJavaScript Usage Statistics 2024 - The Ultimate Guide
JavaScript Usage Statistics 2024 - The Ultimate Guide
 
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
TrustArc Webinar - Unified Trust Center for Privacy, Security, Compliance, an...
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 

New-Age Search through Apache Solr

  • 1. www.edureka.co/apache-solr New-Age Search through Apache Solr View Apache Solr course details at www.edureka.co/apache-solr For Queries: Post on Twitter @edurekaIN: #askEdureka Post on Facebook /edurekaIN For more details please contact us: US : 1800 275 9730 (toll free) INDIA : +91 88808 62004 Email Us : sales@edureka.co
  • 2. Slide 2 LIVE Online Class Class Recording in LMS 24/7 Post Class Support Module Wise Quiz Project Work Verifiable Certificate www.edureka.co/apache-solr How it Works?
  • 3. Slide 3 www.edureka.co/apache-solr Objectives At the end of this module, you will be able to understand: The need for search engine for enterprise grade applications The objectives & challenges of search engine How is Indexing & Searching Handled in Lucene Solr and its Architecture Near Real Time Search with Solr Leveraging Solr Capabilities with Hadoop Solr with YARN About job opportunity for Solr Developers
  • 4. Slide 4Slide 4Slide 4 www.edureka.co/apache-solr Why Do I Need Search Engines ?
  • 5. Slide 5Slide 5Slide 5 www.edureka.co/apache-solr Search Engine: Why do I need them? 1. Text Based Search 2. Filter 3. Documents 1 2 3
  • 6. Slide 6Slide 6Slide 6 www.edureka.co/apache-solr Search Engine – What it should be? If you need a storage engine to search records / documents using text-based keywords it should support following features: 1. Should be optimized for faster text searches 2. Should have flexible schema 3. Should support sorting of documents 4. Web Scale - Should be optimized for reads 5. Should be document oriented
  • 7. Slide 7Slide 7Slide 7 www.edureka.co/apache-solr Cleartrip Spatial Search
  • 8. Slide 8Slide 8Slide 8 www.edureka.co/apache-solr What is Lucene ?  Lucene is a powerful Java search library that lets you easily add search or Information Retrieval (IR) to applications  Used by LinkedIn, Twitter, … and many more (see http://wiki.apache.org/lucene-java/PoweredBy )  Scalable & High-performance Indexing  Powerful, Accurate and Efficient Search Algorithms  Cross-Platform Solution » Open Source & 100% pure Java » Implementations in other programming languages available that are index-compatible Doug Cutting “Creator”
  • 9. Slide 9Slide 9Slide 9 www.edureka.co/apache-solr Indexing – How it works? I like edureka courses Edureka teaches big data courses Edureka helps learn new technologies easily Document - 1 (“D1”) Document - 2 (“D2”) Document - 3 (“D3”) “edureka” = {D1, D2, D3} “courses” = {D1, D2} “teaches” = {D2} “big” = {D2} “data” = {D2} “helps” = {D3} “edureka”
  • 10. Slide 10Slide 10Slide 10 www.edureka.co/apache-solr Lucene – Writing to Index Field Field Field Field Analyzer IndexWriter Directory Document Classes used when indexing documents with Lucene
  • 11. Slide 11Slide 11Slide 11 www.edureka.co/apache-solr Lucene – Searching In Index QueryParser Analyzer IndexSearcherExpression Query object Text fragments  Query Parser translates a textual expression from the end into an arbitrarily complex query for searching
  • 12. Slide 12Slide 12Slide 12 www.edureka.co/apache-solr Solr is an open source enterprise search server / web application Solr Uses the Lucene Search Library and extends it Solr exposes lucene Java API’s as RESTful services You put documents in it (called "indexing") via XML, JSON, CSV or binary over HTTP You query it via HTTP GET and receive XML, JSON, CSV or binary results What is Solr ?
  • 13. Slide 13Slide 13Slide 13 www.edureka.co/apache-solr Advanced Full-Text Search Capabilities Optimized for High Volume Web Traffic Standards Based Open Interfaces - XML, JSON and HTTP Comprehensive HTML Administration Interfaces Server statistics exposed over JMX for monitoring Near Real-time indexing and Adaptable with XML Configuration Linearly scalable, auto index replication, auto, Extensible Plugin Architecture Solr: Key Features
  • 14. Slide 14Slide 14Slide 14 www.edureka.co/apache-solr Solr: Architecture
  • 15. Slide 15Slide 15Slide 15 www.edureka.co/apache-solr Request Handler Query Parser Response Writer Index qt: selects a RequestHandler for a query using/select(by default, the DisMaxRequestHandler is used) defType : selects a query parser for the query (by default, uses whatever has been configured for the RequestHandler) qf: selects which fields to query in the index(by default, all fields are required) wt: selects a response writer for formatting the query response fq: filters query by applying an additional query to the initial query’s results, caches the results Rows: specifies the number of rows to be displayed at one time Start: specifies an offset(by default 0) into the query results where the returned response should begin Solr: Search Process
  • 16. Slide 16Slide 16Slide 16 www.edureka.co/apache-solr Near Real-Time Search  Near Real Time (NRT) search means that documents are available for search almost immediately after being indexed: additions and updates to documents are seen in 'near' real time http://localhost:8983/solr/update?stream.body=<add><doc><fieldname="id">testdoc</field></doc></add>&co mmit=true
  • 17. Slide 17Slide 17Slide 17 www.edureka.co/apache-solr Real-Time Get  The realtime get feature allows retrieval (by unique-key) of the latest version of any documents without the associated cost of reopening a searcher  This is primarily useful when using Solr as a NoSQL data store and not just a search index
  • 18. Slide 18Slide 18Slide 18 www.edureka.co/apache-solr Leveraging Solr Capabilities with Hadoop  Solr provides us fast, efficient, powerful full-text search and near real-time indexing and SolrCloud is flexible distributed search and indexing, and will do things like automatic fail over etc.  Hence its very suitable as NoSQL replacement for traditional databases in many situations, especially when the size of the data exceeds what is reasonable with a typical RDBMS  We can do scalable indexing using Hadoop MapReduce or PIG job and then load the indexed data in Solr  In all the major Hadoop distribution like Cloudera, Hortonworks, MapR you can integrate Solr easily
  • 19. Slide 19Slide 19Slide 19 www.edureka.co/apache-solr PDF Word HTML . . . Raw Files Lucene SolR SolR SolR Query Response Search Web App MapReduce Indexing Job Raw Files Indexed HDFS (Hadoop Distributed File System) Scalable Indexing Input Data
  • 20. Slide 20Slide 20Slide 20 www.edureka.co/apache-solr Solr with YARN
  • 21. Slide 21Slide 21Slide 21 www.edureka.co/apache-solr Job trends for Apache Solr
  • 22. Slide 22Slide 22Slide 22 www.edureka.co/apache-solr Disclaimer Criteria and guidelines mentioned in this presentation may change. Please visit our website for latest and additional information on Apache Solr
  • 23. Slide 23Slide 23Slide 23 www.edureka.co/apache-solr Course Topics  Module 5 » Solr Searching  Module 6 » Solr Extended Features  Module 7 » Solr Cloud & Administration  Module 8 » Final Project  Module 1 » Introduction to Apache Lucene  Module 2 » Exploring Lucene  Module 3 » Introduction to Apache Solr  Module 4 » Solr Indexing