Using SOLR as Open-Source Search Platform for Organizational Research Experts Retrieval
Dr. Gan Keng Hoon
School of Computer Sciences
Universiti Sains Malaysia
GKH USM 1
7th Annual Open Source Summit 2020
Day 5 - 15th December
Talk Outline
Text Search with SOLR
What is SOLR
What is Text Search
Text Search in an Organization
Information Retrieval Concept and SOLR in Use
Demo of Organizational Research Expert Retrieval
Part I: Text Search with SOLR
What is SOLR?
SOLR
Solr is the popular, blazing-fast, open source enterprise search platform built on Apache Lucene™.
Solr Resources
Official Website: https://lucene.apache.org/solr/
Documentation: https://lucene.apache.org/solr/resources.html
What is Text Search?
Are you thinking about Google?
Let’s start with something we are familiar with …
Users want to type in a few simple keywords and get back great results.
This involves matching query terms to documents.
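Matching query terms to documents is commonly done with an inverted index, which maps each term to the documents that contain it. The following is a minimal Python sketch of that idea, not Solr's or Lucene's actual implementation. The sample documents, the whitespace tokenizer, and the function names are assumptions made for illustration.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def match_all_terms(index, query):
    """Return ids of documents that contain every query term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, for illustration only.
DOCS = {
    1: "open source search platform",
    2: "enterprise search with solr",
    3: "open source database",
}
INDEX = build_inverted_index(DOCS)
```

Calling match_all_terms(INDEX, "open source") looks up each term once and intersects the two sets, so the cost depends on the number of matching documents rather than on the size of the whole collection.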
Ranked Results
Return “ranked” documents for a query.
A search engine returns documents sorted in descending order by a score that indicates the strength of the match of the document to the query.
Ranking by relevancy is important.
Side Question: How is relevancy determined?
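One classic answer to the side question is TF-IDF weighting: a term counts for more when it occurs often in a document and rarely in the collection. The sketch below illustrates scoring and sorting in descending order. It is a simplified formula chosen for illustration and is not Solr's scoring function; the sample documents and function names are assumptions.

```python
import math
from collections import Counter


def tf_idf_scores(docs, query):
    """Score each document against the query with a basic TF-IDF sum."""
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)
    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized.values() if term in other)
            if df == 0:
                continue
            idf = math.log(n_docs / df) + 1.0
            score += counts[term] * idf
        scores[doc_id] = score
    return scores


def rank(docs, query):
    """Return (doc_id, score) pairs for matching documents, best match first."""
    scores = tf_idf_scores(docs, query)
    hits = [(doc_id, s) for doc_id, s in scores.items() if s > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample documents, for illustration only.
DOCS = {
    "a": "solr search engine",
    "b": "search search search tips",
    "c": "cooking recipes",
}
```

With this naive formula, a document that repeats a query term many times can outrank one that matches more distinct terms, which is one reason production engines normalize term frequency and document length.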
What Else
A search engine also:
Returns images, videos, social feeds, products, etc.
Provides search suggestions, autocomplete, etc.
Gives answers, facts, files, etc.
Text Search Solution in Your Organization
How do you perform search within your organization?
With the document-finding tool in your operating system?
By querying data stored in your database using SQL?
Would you implement a Google-like search engine for your organization?
Questions for Audience
1. List one TEXT or DOCUMENT LOOKUP/SEARCH/NAVIGATION PROBLEM that you have encountered in your organization.
2. Is there any existing search tool or feature implemented within your organization that you can adapt to address the problem?
Enterprise Search
Managing search solutions within an organization or for the benefit of an organization …
IR and Search Engines
Information Retrieval:
- Relevance: effective ranking
- Evaluation: testing and measuring
- Information needs: user interaction
Search Engines:
- Performance: efficient search and indexing
- Incorporating new data: coverage and freshness
- Scalability: growing with data and users
- Adaptability: tuning for applications
- Specific problems: e.g. spam
Enterprise Search Engines
Enterprise Search Features
1. Not only Texts.
Unifying structured and unstructured data.
2. Not only Search.
Search + Analytics.
3. Not for Everyone.
Not addressing a general problem, but specific to application/domain/business needs.
Part II: IR Concept & SOLR in Use
High Level Information Retrieval Concept
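[Diagram: Documents are turned into a Document Representation by Indexing; Information Needs are turned into a Query by Formulation; a Retrieval Function matches the Query against the Document Representation to produce Retrieved Documents; Relevance Feedback from the Retrieved Documents feeds back into the Query.]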
Diagram of the main components of Solr 4
Image Source: Solr in Action, Grainger & Potter
Labels overlaid on the diagram: Documents, Indexing, Document Representation, Query, Retrieval Function, Retrieved Documents
Glimpse of Solr In Use
SOLR Downloads Site
Unzipping SOLR into Directory
For Windows users, we highly recommend that you extract Solr to a directory that doesn’t have spaces in the name; that is, avoid extracting Solr into directories like C:\Documents and Settings or C:\Program Files. For example, use a path like c:\solr-8.2.0 instead.
For Linux users, choose a location like /opt/solr/.
View SOLR in Your Directory
Example directory listing of the solr-8.2.0 installation after extracting the
downloaded archive on your computer. We’ll refer to the top-level
directory as $SOLR_INSTALL/ throughout the rest of the slides.
Start Solr
To start Solr, run the solr script located in the bin folder.
For example, if you placed Solr at c:\solr-8.2.0, open a command line and enter the following:
$ cd \ #this gets you to the base directory
$ cd $SOLR_INSTALL #this gets you to your Solr folder
$ bin/solr start #this runs the solr script in the bin folder (on Windows: bin\solr.cmd start)
Note: cd means change directory
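Once the start command returns, one way to confirm that the server is answering is to request Solr's system information endpoint. This is a minimal Python sketch that assumes the default address http://localhost:8983/solr used in these slides; the function names are illustrative.

```python
import json
import urllib.error
import urllib.request


def system_info_url(base_url="http://localhost:8983/solr"):
    """Build the URL of Solr's system information endpoint."""
    return base_url.rstrip("/") + "/admin/info/system?wt=json"


def solr_is_up(base_url="http://localhost:8983/solr", timeout=3.0):
    """Return True if the endpoint answers with HTTP 200 and valid JSON."""
    try:
        with urllib.request.urlopen(system_info_url(base_url), timeout=timeout) as resp:
            json.load(resp)  # raises ValueError if the body is not JSON
            return resp.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return False
```

Calling solr_is_up() right after bin/solr start gives a quick yes/no answer before opening the admin console in a browser.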
Start Solr
Admin Console - http://localhost:8983/solr/
Create Your First Core
Your running server is empty.
Create your first core, called “techproduct”.
At the command prompt, run:
$ bin/solr create -c techproduct
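Besides the admin page, the new core can be checked through the CoreAdmin API's STATUS action. The sketch below builds the request URL and inspects a parsed response. It assumes the default address from these slides, and the sample responses are abridged, illustrative shapes rather than real server output.

```python
from urllib.parse import urlencode


def core_status_url(core, base_url="http://localhost:8983/solr"):
    """Build the CoreAdmin STATUS URL for one core."""
    params = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base_url.rstrip('/')}/admin/cores?{params}"


def core_exists(status_response, core):
    """True if the parsed STATUS response carries details for the core.

    For an unknown core, the entry under "status" is expected to be empty.
    """
    return bool(status_response.get("status", {}).get(core))


# Abridged, illustrative response shapes (not real server output).
FOUND = {"status": {"techproduct": {"name": "techproduct"}}}
MISSING = {"status": {"techproduct": {}}}
```

Fetching core_status_url("techproduct") and passing the decoded JSON to core_exists gives a scriptable equivalent of looking the core up in the admin console.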
Meet Your First Core
View Properties of The Core
* You can also Add/Rename Core at the admin page.
Add Some Example Documents
When you first start Solr, there are no documents in the index. It’s an empty server waiting to be filled with data to search.
Let’s add some documents from the example\exampledocs directory.
What are the example file types in your $SOLR_INSTALL\example\exampledocs?
Use Post Tool to Add Documents
For Unix users, call the Post Tool from bin:
$ bin/post -c techproduct example/exampledocs/*.xml
For Windows users, navigate to the example\exampledocs folder and run post.jar:
$ cd \
$ cd $SOLR_INSTALL\example\exampledocs
$ java -Dc=techproduct -jar post.jar *.xml
-Dc=techproduct specifies the core.
*.xml specifies the files to be added. In this case, we are adding all files with the .xml type.
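The Post Tool is a convenience wrapper; documents can also be sent straight to the core's update handler over HTTP. The following Python sketch builds such a request for a JSON document. The document and its field names are hypothetical, and the sketch assumes the core accepts those fields.

```python
import json
import urllib.request


def build_update_request(core, docs, base_url="http://localhost:8983/solr"):
    """Build a POST request that adds docs to a core and commits them."""
    url = f"{base_url.rstrip('/')}/{core}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document; the field names are illustrative only.
SAMPLE_DOCS = [{"id": "demo-1", "name": "A sample product"}]
```

With Solr running, urllib.request.urlopen(build_update_request("techproduct", SAMPLE_DOCS)) would send the request. Building the request separately keeps the sketch usable without a live server.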
Status of Added Files
What is the speed of indexing?
Let’s Search
Go to http://localhost:8983/solr/
Select the techproduct core.
Select the Query tab.
Enter *:* in the query form.
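The Query tab sends its form values to the core's select handler, so the same search can be issued by URL. This is a small Python sketch of building that URL, assuming the default address and the techproduct core from these slides; the helper name is illustrative.

```python
from urllib.parse import urlencode


def select_url(core, q="*:*", base_url="http://localhost:8983/solr", **params):
    """Build a select-handler URL; extra keyword arguments become query parameters."""
    query = {"q": q, **params}
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(query)}"
```

Note that urlencode percent-encodes the *:* query, so the URL may look different from what was typed into the form while meaning the same thing.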
Search results from executing the find-all-documents query.
View More Search Results
In the query form,
• Change start to 0
• Change rows to 32
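The start and rows parameters are how a results page implements paging: start is the offset of the first result and rows is the page size. A minimal sketch of the arithmetic follows, with illustrative function names and a zero-based page number as an assumption.

```python
def page_params(page, rows=10):
    """Return start/rows parameters for a zero-based page number."""
    if page < 0 or rows <= 0:
        raise ValueError("page must be >= 0 and rows must be > 0")
    return {"start": page * rows, "rows": rows}


def page_count(num_found, rows=10):
    """Number of pages needed to show num_found results."""
    return -(-num_found // rows)  # ceiling division
```

Setting start to 0 and rows to 32, as in the slide, is the special case where one page is large enough to hold every result.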
14 files were indexed, but the search *:* found 32 documents, because a single file can contain more than one document.
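The difference between files and documents comes from the XML update format, in which one add element may wrap several doc elements. The Python sketch below counts them in a made-up sample; it is not one of the shipped example files.

```python
import xml.etree.ElementTree as ET


def count_docs(xml_text):
    """Count top-level <doc> elements in a Solr XML update message."""
    root = ET.fromstring(xml_text)
    if root.tag == "doc":
        return 1
    return len(root.findall("doc"))


# Made-up sample in the <add><doc>...</doc></add> shape, for illustration only.
SAMPLE = """
<add>
  <doc><field name="id">demo-1</field></doc>
  <doc><field name="id">demo-2</field></doc>
  <doc><field name="id">demo-3</field></doc>
</add>
"""
```

Running count_docs over each file in the exampledocs folder and summing the results is one way to check the total against the number reported by the *:* query.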
Part III: Demo of Organizational Research Expert Retrieval
Organization Research Experts Retrieval
Target Users: Students, Researchers, Collaborators looking for expertise from School of Computer Sciences, Universiti Sains Malaysia.
Data Set: Scopus publication data for all academics at the school.
Status: Prototype.
Purpose: In-house solution.
Focused search and analytics capabilities.
1. Design Document/Retrieval Unit
2. Create Solr Core
Create a new core to store the collection.
3. Perform Indexing
Perform indexing at the backend using addDocuments() from the Solarium PHP library (https://github.com/solariumphp).
4. Implement Search Front End
5. Format the Response into Results Page
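Formatting the response amounts to walking the JSON that the select handler returns, where response.numFound holds the hit count and response.docs holds the matching documents. The sketch below is written in Python for illustration and is not the prototype's code. The title field and the sample documents are hypothetical.

```python
def format_results(solr_response, title_field="title"):
    """Turn a parsed Solr JSON response into numbered result lines."""
    body = solr_response.get("response", {})
    docs = body.get("docs", [])
    start = body.get("start", 0)
    lines = [f"{body.get('numFound', 0)} result(s) found"]
    for offset, doc in enumerate(docs, start=1):
        title = doc.get(title_field, "(untitled)")
        if isinstance(title, list):  # multi-valued fields come back as lists
            title = "; ".join(str(value) for value in title)
        lines.append(f"{start + offset}. {title} [id={doc.get('id', '?')}]")
    return "\n".join(lines)


# Hypothetical response in the shape returned by the select handler.
SAMPLE_RESPONSE = {
    "responseHeader": {"status": 0},
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {"id": "p1", "title": ["A hypothetical paper on cryptography"]},
            {"id": "p2", "title": "Another hypothetical paper"},
        ],
    },
}
```

Numbering results from start + 1 keeps the positions correct on later pages, where start is greater than zero.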
6. Demo
Visit the prototype at http://ir.cs.usm.my/exsearch4/
Try searching for “cryptography”.
Thank you
Visit our school at cs.usm.my
See the work of IR research at ir.cs.usm.my
Drop me an email at khganATusm.my

More Related Content

Similar to OSS 2020 Using SOLR as Open-Source Search Platform.pdf

Object oriented software_engg
Object oriented software_enggObject oriented software_engg
Object oriented software_enggAnnie Thomas
 
Whats new-in-solr-8-typo3-camp
Whats new-in-solr-8-typo3-campWhats new-in-solr-8-typo3-camp
Whats new-in-solr-8-typo3-camptimohund
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrLucidworks (Archived)
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrLucidworks (Archived)
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrLucidworks (Archived)
 
Content search api in sitecore 8.1
Content search api in sitecore 8.1Content search api in sitecore 8.1
Content search api in sitecore 8.1Anindita Bhattacharya
 
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )Assamese search engine using SOLR by Moinuddin Ahmed ( moin )
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )'Moinuddin Ahmed
 
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve content
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve contentOpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve content
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve contentAlkacon Software GmbH & Co. KG
 
Knolidge - Discover What You Have
Knolidge - Discover What You HaveKnolidge - Discover What You Have
Knolidge - Discover What You Haveknolidge
 
Wordpress as a framework
Wordpress as a frameworkWordpress as a framework
Wordpress as a frameworkAggelos Synadakis
 
Getting Started with Splunk Enterprise Hands-On Breakout Session
Getting Started with Splunk Enterprise Hands-On Breakout SessionGetting Started with Splunk Enterprise Hands-On Breakout Session
Getting Started with Splunk Enterprise Hands-On Breakout SessionSplunk
 
Reduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective SearchReduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective SearchLucidworks
 
Introduction to Behavior Driven Development
Introduction to Behavior Driven Development Introduction to Behavior Driven Development
Introduction to Behavior Driven Development Robin O'Brien
 
Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.inovex GmbH
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache SolrEdureka!
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampKais Hassan, PhD
 

Similar to OSS 2020 Using SOLR as Open-Source Search Platform.pdf (20)

Object oriented software_engg
Object oriented software_enggObject oriented software_engg
Object oriented software_engg
 
Whats new-in-solr-8-typo3-camp
Whats new-in-solr-8-typo3-campWhats new-in-solr-8-typo3-camp
Whats new-in-solr-8-typo3-camp
 
ICONUK 2015 - Gradle Up!
ICONUK 2015 - Gradle Up!ICONUK 2015 - Gradle Up!
ICONUK 2015 - Gradle Up!
 
Knolidge
KnolidgeKnolidge
Knolidge
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with Solr
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with Solr
 
Indexing Text and HTML Files with Solr
Indexing Text and HTML Files with SolrIndexing Text and HTML Files with Solr
Indexing Text and HTML Files with Solr
 
Content search api in sitecore 8.1
Content search api in sitecore 8.1Content search api in sitecore 8.1
Content search api in sitecore 8.1
 
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )Assamese search engine using SOLR by Moinuddin Ahmed ( moin )
Assamese search engine using SOLR by Moinuddin Ahmed ( moin )
 
3 google hacking
3 google hacking3 google hacking
3 google hacking
 
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve content
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve contentOpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve content
OpenCms Days 2012 - OpenCms 8.5: Using Apache Solr to retrieve content
 
Knolidge - Discover What You Have
Knolidge - Discover What You HaveKnolidge - Discover What You Have
Knolidge - Discover What You Have
 
Wordpress as a framework
Wordpress as a frameworkWordpress as a framework
Wordpress as a framework
 
Getting Started with Splunk Enterprise Hands-On Breakout Session
Getting Started with Splunk Enterprise Hands-On Breakout SessionGetting Started with Splunk Enterprise Hands-On Breakout Session
Getting Started with Splunk Enterprise Hands-On Breakout Session
 
Reduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective SearchReduce Query Time Up to 60% with Selective Search
Reduce Query Time Up to 60% with Selective Search
 
Introduction to Behavior Driven Development
Introduction to Behavior Driven Development Introduction to Behavior Driven Development
Introduction to Behavior Driven Development
 
Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.Suche mit Apache Lucene & Co.
Suche mit Apache Lucene & Co.
 
New-Age Search through Apache Solr
New-Age Search through Apache SolrNew-Age Search through Apache Solr
New-Age Search through Apache Solr
 
Information Retrieval - Data Science Bootcamp
Information Retrieval - Data Science BootcampInformation Retrieval - Data Science Bootcamp
Information Retrieval - Data Science Bootcamp
 
Basic Android
Basic AndroidBasic Android
Basic Android
 

More from Gan Keng Hoon

A View of Text Analytics from Word, Sentence and Document Levels
A View of Text Analytics from Word, Sentence and Document Levels A View of Text Analytics from Word, Sentence and Document Levels
A View of Text Analytics from Word, Sentence and Document Levels Gan Keng Hoon
 
Keywords Discovery with Simple Text Mining using R
Keywords Discovery with Simple Text Mining using RKeywords Discovery with Simple Text Mining using R
Keywords Discovery with Simple Text Mining using RGan Keng Hoon
 
Procrastination and Phd.pdf
Procrastination and Phd.pdfProcrastination and Phd.pdf
Procrastination and Phd.pdfGan Keng Hoon
 
Guest Lecture for Principles of Data Analytics.pdf
Guest Lecture for Principles of Data Analytics.pdfGuest Lecture for Principles of Data Analytics.pdf
Guest Lecture for Principles of Data Analytics.pdfGan Keng Hoon
 
Knowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdfKnowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdfGan Keng Hoon
 
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...Gan Keng Hoon
 
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...Gan Keng Hoon
 
Text and Sentiment Analytics for Business Intelligence
Text and Sentiment Analytics for Business IntelligenceText and Sentiment Analytics for Business Intelligence
Text and Sentiment Analytics for Business IntelligenceGan Keng Hoon
 
Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Gan Keng Hoon
 
Semantics in Retrieval
Semantics in Retrieval Semantics in Retrieval
Semantics in Retrieval Gan Keng Hoon
 
Concepts and Challenges of Text Retrieval for Search Engine
Concepts and Challenges of Text Retrieval for Search EngineConcepts and Challenges of Text Retrieval for Search Engine
Concepts and Challenges of Text Retrieval for Search EngineGan Keng Hoon
 
Faceted Search for Finding Expertise Bibliographies
Faceted Search for Finding Expertise BibliographiesFaceted Search for Finding Expertise Bibliographies
Faceted Search for Finding Expertise BibliographiesGan Keng Hoon
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challengeGan Keng Hoon
 
ACIS 2015 Bibliographical-based Facets for Expertise Search
ACIS 2015 Bibliographical-based Facets for Expertise SearchACIS 2015 Bibliographical-based Facets for Expertise Search
ACIS 2015 Bibliographical-based Facets for Expertise SearchGan Keng Hoon
 
A Brief Introduction to Knowledge Acquisition, Representation and Publishing
A Brief Introduction to Knowledge Acquisition, Representation and PublishingA Brief Introduction to Knowledge Acquisition, Representation and Publishing
A Brief Introduction to Knowledge Acquisition, Representation and PublishingGan Keng Hoon
 
Wi 2015 demo_preview
Wi 2015 demo_previewWi 2015 demo_preview
Wi 2015 demo_previewGan Keng Hoon
 
An overview of text mining and sentiment analysis for Decision Support System
An overview of text mining and sentiment analysis for Decision Support SystemAn overview of text mining and sentiment analysis for Decision Support System
An overview of text mining and sentiment analysis for Decision Support SystemGan Keng Hoon
 

More from Gan Keng Hoon (17)

A View of Text Analytics from Word, Sentence and Document Levels
A View of Text Analytics from Word, Sentence and Document Levels A View of Text Analytics from Word, Sentence and Document Levels
A View of Text Analytics from Word, Sentence and Document Levels
 
Keywords Discovery with Simple Text Mining using R
Keywords Discovery with Simple Text Mining using RKeywords Discovery with Simple Text Mining using R
Keywords Discovery with Simple Text Mining using R
 
Procrastination and Phd.pdf
Procrastination and Phd.pdfProcrastination and Phd.pdf
Procrastination and Phd.pdf
 
Guest Lecture for Principles of Data Analytics.pdf
Guest Lecture for Principles of Data Analytics.pdfGuest Lecture for Principles of Data Analytics.pdf
Guest Lecture for Principles of Data Analytics.pdf
 
Knowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdfKnowledge Representation Reasoning and Acquisition.pdf
Knowledge Representation Reasoning and Acquisition.pdf
 
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...
Project: Interfacing Chatbot with Data Retrieval and Analytics Queries for De...
 
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...
Interfacing Chatbot with Data Retrieval and Analytics Queries for Decision Ma...
 
Text and Sentiment Analytics for Business Intelligence
Text and Sentiment Analytics for Business IntelligenceText and Sentiment Analytics for Business Intelligence
Text and Sentiment Analytics for Business Intelligence
 
Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...Category & Training Texts Selection for Scientific Article Categorization in ...
Category & Training Texts Selection for Scientific Article Categorization in ...
 
Semantics in Retrieval
Semantics in Retrieval Semantics in Retrieval
Semantics in Retrieval
 
Concepts and Challenges of Text Retrieval for Search Engine
Concepts and Challenges of Text Retrieval for Search EngineConcepts and Challenges of Text Retrieval for Search Engine
Concepts and Challenges of Text Retrieval for Search Engine
 
Faceted Search for Finding Expertise Bibliographies
Faceted Search for Finding Expertise BibliographiesFaceted Search for Finding Expertise Bibliographies
Faceted Search for Finding Expertise Bibliographies
 
Information retrieval concept, practice and challenge
Information retrieval   concept, practice and challengeInformation retrieval   concept, practice and challenge
Information retrieval concept, practice and challenge
 
ACIS 2015 Bibliographical-based Facets for Expertise Search
ACIS 2015 Bibliographical-based Facets for Expertise SearchACIS 2015 Bibliographical-based Facets for Expertise Search
ACIS 2015 Bibliographical-based Facets for Expertise Search
 
A Brief Introduction to Knowledge Acquisition, Representation and Publishing
A Brief Introduction to Knowledge Acquisition, Representation and PublishingA Brief Introduction to Knowledge Acquisition, Representation and Publishing
A Brief Introduction to Knowledge Acquisition, Representation and Publishing
 
Wi 2015 demo_preview
Wi 2015 demo_previewWi 2015 demo_preview
Wi 2015 demo_preview
 
An overview of text mining and sentiment analysis for Decision Support System
An overview of text mining and sentiment analysis for Decision Support SystemAn overview of text mining and sentiment analysis for Decision Support System
An overview of text mining and sentiment analysis for Decision Support System
 

Recently uploaded

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxJohnnyPlasten
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxolyaivanovalion
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 

Recently uploaded (20)

Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in  KishangarhDelhi 99530 vip 56974 Genuine Escort Service Call Girls in  Kishangarh
Delhi 99530 vip 56974 Genuine Escort Service Call Girls in Kishangarh
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
Ravak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptxRavak dropshipping via API with DroFx.pptx
Ravak dropshipping via API with DroFx.pptx
 
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 

OSS 2020 Using SOLR as Open-Source Search Platform.pdf

  • 1. Using SOLR as Open-Source Search Platform for Organizational Research Experts Retrieval Dr. Gan Keng Hoon School of Computer Sciences Universiti Sains Malaysia GKH USM 1 7th Annual Open Source Summit 2020 Day 5 - 15th December
  • 2. Talk Outline Text Search with SOLR What is SOLR What is Text Search Text Search in an Organization Information Retrieval Concept and SOLR in Use Demo of Organizational Research Expert Retrieval GKH USM 2
  • 3. Part I: Text Search with SOLR GKH USM 3
  • 5. SOLR Solr is the enterprise search platform built on Apache Lucene™. GKH USM 5 popular blazing-fast open source
  • 6. Solr Resources Official Website: https://lucene.apache.org/solr/ Documentation: https://lucene.apache.org/solr/resources.html GKH USM 6
  • 7. What is Text Search? Are you thinking about Google? GKH USM 7
  • 8. Let’s start with something we are familiar with … Users want to type in a few simple keywords and get back great results. Involved matching query terms to documents GKH USM 8
  • 9. Ranked Results Return “ranked” documents for a query. A search engine returns documents sorted in descending order by a score that indicates the strength of the match of the document to the query. Ranking by relevancy is important. Side Question: How to determine the relevancy? GKH USM 9
  • 10. What Else: A search engine also returns images, videos, social feeds, products, etc.; provides search suggestions, autocomplete, etc.; and gives answers, facts, files, etc.
  • 11. Text Search Solution in Your Organization: How do you perform search within your organization? With the document-finding tool in your operating system? By querying data stored in your database using SQL? Will you implement a search engine (similar to Google) for your organization?
  • 12. Questions for Audience: 1. List one TEXT or DOCUMENT LOOKUP/SEARCH/NAVIGATION PROBLEM that you have encountered in your organization. 2. Is there any existing search tool or feature implemented within your organization that you can adapt to address the problem?
  • 13. Enterprise Search: Managing search solutions within an organization or for the benefit of an organization …
  • 14. IR and Search Engines. Information Retrieval: relevance (effective ranking); evaluation (testing and measuring); information needs (user interaction). Search Engines: performance (efficient search and indexing); incorporating new data (coverage and freshness); scalability (growing with data and users); adaptability (tuning for applications); specific problems (e.g. spam). Enterprise Search Engines.
  • 15. Enterprise Search Features 1. Not only Texts. Unifying structured and unstructured data. 2. Not only Search. Search + Analytics. 3. Not for Everyone. Not addressing a general problem, but specific to application/domain/business needs.
  • 16. Part II: IR Concept & SOLR in Use
  • 17. High-Level Information Retrieval Concept. [Diagram: Documents -> Indexing -> Document Representation; Information Needs -> Formulation -> Query; Document Representation + Query -> Retrieval Function -> Retrieved Documents; Relevance Feedback]
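The components named on this slide can be sketched in a few lines of Python: indexing turns documents into a document representation (here an inverted index), and a retrieval function matches a query against it. This is a toy model under stated assumptions, not how Solr or Lucene is implemented; the sample documents and the helper names `build_index` and `retrieve` are invented for the illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Indexing: map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Retrieval function: ids of documents containing every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents.
docs = {
    1: "open source search platform",
    2: "research experts retrieval",
    3: "search for research experts",
}
index = build_index(docs)
hits = retrieve(index, "research experts")
print(sorted(hits))
```

This boolean AND retrieval only decides which documents match; ordering the matches by relevance is a separate scoring step.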
  • 18. Diagram of the main components of Solr 4 (image source: Solr in Action, Grainger & Potter), annotated with the IR concepts: Documents, Indexing, Document Representation, Query, Retrieval Function, Retrieved Documents.
  • 19. Glimpse of Solr In Use
  • 21. Unzipping SOLR into Directory: For Windows users, we highly recommend that you extract Solr to a directory that doesn’t have spaces in the name; that is, avoid extracting Solr into directories like C:\Documents and Settings or C:\Program Files. For example, use a path like c:\solr-8.2.0 instead. For Linux users, choose a location like /opt/solr/.
  • 22. View SOLR in Your Directory: Example directory listing of the solr-8.2.0 installation after extracting the downloaded archive on your computer. We’ll refer to the top-level directory as $SOLR_INSTALL/ throughout the rest of the slides.
  • 23. Start Solr: To start Solr, run the solr script located in the bin folder. For example, if you placed Solr at c:\solr-8.2.0, open a command line and enter the following:
      $ cd                 # this gets you to the base directory
      $ cd $SOLR_INSTALL   # this gets you to your Solr folder
      $ bin/solr start     # runs the solr script inside the bin folder
    Note: cd means change directory. If you change into the bin folder first (cd bin), run solr start instead of bin/solr start.
  • 25. Admin Console - http://localhost:8983/solr/
  • 26. Create Your First Core: Your running server is empty. Create your first core, called “techproduct”. Go to the command prompt and enter: $ bin/solr create -c techproduct
  • 27. Meet Your First Core
  • 28. View Properties of The Core. * You can also add or rename a core on the admin page.
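Besides the admin page, the properties of a core can also be read over HTTP through Solr's CoreAdmin API. The sketch below only builds the STATUS request URL for the “techproduct” core; the helper name `core_status_url` is made up for this example, and the host and port are assumed to be the defaults shown on the Admin Console slide.

```python
from urllib.parse import urlencode

# Assumed default address of a locally running Solr server.
SOLR_BASE = "http://localhost:8983/solr"


def core_status_url(core, base=SOLR_BASE):
    """Build the CoreAdmin STATUS URL for one core."""
    params = urlencode({"action": "STATUS", "core": core, "wt": "json"})
    return f"{base}/admin/cores?{params}"


url = core_status_url("techproduct")
print(url)
```

With Solr running, opening this URL in a browser (or fetching it with `urllib.request.urlopen`) should return a JSON description of the core.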
  • 29. Add Some Example Documents: When you first start Solr, there are no documents in the index. It’s an empty server waiting to be filled with data to search. Let’s add some documents from the example\exampledocs directory. What are the example file types in your $SOLR_INSTALL\example\exampledocs?
  • 30. Use Post Tool to Add Documents: For Unix users, call the Post Tool from bin:
      $ bin/post -c techproduct example/exampledocs/*.xml
    For Windows users, navigate to the example\exampledocs folder:
      $ cd
      $ cd $SOLR_INSTALL\example\exampledocs
      $ java -jar -Dc=techproduct post.jar *.xml
    Here -Dc specifies the core, and *.xml specifies the files to be added. In this case, we are adding all files with the .xml type.
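The Post Tool sends the files to the core's update handler over HTTP, so the same step can be sketched without it. The example below prepares (but does not send) such a request in Python; the helper name `build_update_request` and the one-document sample payload are assumptions for illustration, and the core is assumed to be the “techproduct” core created earlier.

```python
import urllib.request


def build_update_request(core, xml_bytes, base="http://localhost:8983/solr"):
    """Prepare a POST of Solr XML to the core's /update handler."""
    url = f"{base}/{core}/update?commit=true"  # commit so the docs become searchable
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Hypothetical payload in Solr's <add><doc> XML update format.
sample = b'<add><doc><field name="id">demo-1</field></doc></add>'
req = build_update_request("techproduct", sample)
print(req.get_method(), req.full_url)
```

To actually index the document against a running server, pass the request to `urllib.request.urlopen(req)`.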
  • 31. Status of Added Files. What is the speed of indexing?
  • 32. Let’s Search: Go to http://localhost:8983/solr/. Select the techproduct core. Select the Query tab. Enter *:* in the query form.
  • 33. Search results from executing the find-all-documents query (*:*).
  • 34. View More Search Results: In the query form, change start to 0 and change rows to 32.
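The query form in the admin console builds a request to the core's select handler, and the same request can be written as a URL. The sketch below constructs it with the `q`, `start`, and `rows` parameters from the slides; the helper name `select_url` is invented for this example, and the address assumes the default local server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def select_url(core, q="*:*", start=0, rows=10, base="http://localhost:8983/solr"):
    """Build a /select URL; start is the offset and rows is the page size."""
    params = urlencode({"q": q, "start": start, "rows": rows})
    return f"{base}/{core}/select?{params}"


# Ask for the first 32 matches of the match-all query, as on the slide.
url = select_url("techproduct", rows=32)
params = parse_qs(urlsplit(url).query)
print(url)
print(params)
```

Note that `urlencode` percent-encodes the `*` and `:` characters in the URL; Solr decodes them back to `*:*` on the server side.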
  • 35. 14 files were indexed, but the search *:* found 32 documents.
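One likely explanation for the mismatch is that a single Solr XML file can hold several `<doc>` elements inside its `<add>` root, so the number of files and the number of indexed documents need not agree. The sketch below counts documents per file for two made-up files; the file names and contents are hypothetical and are not the actual example files shipped with Solr.

```python
import xml.etree.ElementTree as ET


def count_docs(xml_text):
    """Count the <doc> elements directly under the <add> root."""
    root = ET.fromstring(xml_text)
    return len(root.findall("doc"))


# Hypothetical file contents: one file with one document, one with three.
files = {
    "single.xml": "<add><doc><field name='id'>a1</field></doc></add>",
    "multi.xml": "<add>"
    + "".join(f"<doc><field name='id'>b{i}</field></doc>" for i in range(3))
    + "</add>",
}
per_file = {name: count_docs(text) for name, text in files.items()}
total_docs = sum(per_file.values())
print(len(files), "files,", total_docs, "documents")
```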
  • 36. Part III: Demo of Organizational Research Expert Retrieval
  • 37. Organization Research Experts Retrieval. Target Users: Students, researchers, and collaborators looking for expertise from the School of Computer Sciences, Universiti Sains Malaysia. Data Set: Scopus publication data for all academics at the school. Status: Prototype. Purpose: In-house solution with focused search and analytics capabilities.
  • 38. 1. Design Document/Retrieval Unit
  • 39. 2. Create Solr Core: Create a new core to store the collection.
  • 40. 3. Perform Indexing: Perform indexing at the backend using addDocuments() from the Solarium PHP library (https://github.com/solariumphp).
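The prototype indexes through Solarium in PHP; the sketch below is not that code but a stand-in that shows the same step as a plain HTTP request from Python, using Solr's JSON update format. Everything specific here is an assumption for illustration: the core name `experts`, the field names (`id`, `author`, `title`, `year`), the sample records, and the helper name `build_json_update`.

```python
import json
import urllib.request


def build_json_update(core, docs, base="http://localhost:8983/solr"):
    """Prepare a POST that adds a list of documents as a JSON array."""
    return urllib.request.Request(
        f"{base}/{core}/update?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical retrieval units: one record per publication.
publications = [
    {"id": "pub-1", "author": "A. Author", "title": "A study of text retrieval", "year": 2019},
    {"id": "pub-2", "author": "B. Author", "title": "Notes on cryptography", "year": 2020},
]
req = build_json_update("experts", publications)
print(req.full_url, len(publications), "documents prepared")
```

The request is only built here; sending it with `urllib.request.urlopen(req)` requires a running server with a core of that name.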
  • 41. 4. Implement Search Front End
  • 42. 5. Format the Response into Results Page
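A minimal sketch of this formatting step, assuming the front end receives Solr's JSON response: the hit count is read from `response.numFound` and the matching documents from `response.docs`. The sample response, its field names (`id`, `title`), and the helper name `format_results` are made up for the example and do not reflect the prototype's actual schema or layout.

```python
import json


def format_results(response_text):
    """Turn a Solr JSON response into a list of display lines."""
    body = json.loads(response_text)["response"]
    first_rank = body.get("start", 0) + 1  # keep numbering correct on later pages
    lines = [f"{body['numFound']} result(s) found"]
    for rank, doc in enumerate(body["docs"], start=first_rank):
        title = doc.get("title", "(untitled)")
        lines.append(f"{rank}. {title} [{doc['id']}]")
    return lines


# Hypothetical response in the shape Solr returns for a select query.
sample_response = json.dumps({
    "responseHeader": {"status": 0, "QTime": 1},
    "response": {
        "numFound": 2,
        "start": 0,
        "docs": [
            {"id": "p1", "title": "Lattice-based cryptography survey"},
            {"id": "p2"},
        ],
    },
})
lines = format_results(sample_response)
print("\n".join(lines))
```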
  • 43. 6. Demo: Visit the prototype at http://ir.cs.usm.my/exsearch4/ and try searching for “cryptography”.
  • 44. Thank you. Visit our school at cs.usm.my. See the work of the IR research group at ir.cs.usm.my. Drop me an email at khganATusm.my