HOW TO BUILD YOUR
  SEARCH ENGINE
     by Searchbox.com
INTRODUCTION
About Searchbox:
 Highly configurable search framework on top of Solr
 Search frontend / UI
 Available as a Service

About Solr:
 Blazing fast open source enterprise search platform
 Lucene-based search server
 Has REST-like HTTP/XML and JSON APIs
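
To make the "REST-like HTTP" point concrete, here is a minimal sketch of how a client might build a Solr select URL that asks for JSON output. The host, the core name "demo" and the field name "title" are assumptions for illustration, not values from the slides; `q`, `rows` and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def build_select_url(base_url, core, query, rows=10):
    """Build a Solr /select URL that requests a JSON response."""
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


# Hypothetical local Solr instance and core name.
url = build_select_url("http://localhost:8983", "demo", "title:search")
print(url)
```

Fetching that URL with any HTTP client returns the raw search results, which is exactly what the next slides argue is not yet a user interface.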
SOLR IS GREAT BUT...

 It remains a search server

    XML, JSON or CSV output

    Solritas frontend / Velocity templates for quick
    prototyping




Users expect a lot when it comes to Search Experience
SOLR OUTPUT SAMPLE
Solritas   XML Output
LOTS OF WORK IN
       PERSPECTIVE
Building your own UI
on top of Solr will be a
BIG project. Think
about:

  Advanced filters,

  Presets (datasources),

  Faceted search,

  Result highlight, ...
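
As a rough illustration of what a hand-built UI has to assemble for each request, the sketch below combines filters, facets and highlighting into one Solr query string. The parameter names (`fq`, `facet`, `facet.field`, `hl`, `hl.fl`) are standard Solr parameters; the helper name and the field names "source", "title" and "content" are assumptions.

```python
from urllib.parse import urlencode


def build_ui_params(user_query, facet_fields, highlight_fields, filters=()):
    """Assemble the query string for a faceted, highlighted, filtered search."""
    params = [("q", user_query), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        # One facet.field entry per field the UI wants counts for.
        params += [("facet.field", name) for name in facet_fields]
    if highlight_fields:
        params.append(("hl", "true"))
        params.append(("hl.fl", ",".join(highlight_fields)))
    # Each active filter becomes its own fq (filter query) parameter.
    params += [("fq", fq) for fq in filters]
    return urlencode(params)


query_string = build_ui_params(
    "solr", ["source"], ["title", "content"], ['source:"site"']
)
print(query_string)
```

This covers only the request side; rendering the facet counts and highlighted snippets is additional work.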
INTRODUCING SEARCHBOX




Can be seen on: http://www.opportunity-finder.com
A SEARCH PROJECT
      WITH SEARCHBOX
1. Identify your information sources

2. Index those sources into our Solr Backend using:

  2.1. Our Connector framework (RSS, WEB, XML,
     CMIS / SharePoint, TYPO3, ...)

  2.2. The standard Solr API with a client library

  2.3. Custom DataImportHandlers for large datasets

3. Configure / Shape the search experience
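
For option 2.2, a minimal sketch of indexing through Solr's JSON update handler might look like the following. It only prepares the HTTP request so that it runs offline. The host, the core name "demo" and the document fields are assumptions, and a real project would more likely use a Solr client library than raw HTTP.

```python
import json
from urllib.request import Request


def build_update_request(base_url, core, docs):
    """Prepare a POST to Solr's /update handler carrying a JSON list of documents."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url=f"{base_url}/solr/{core}/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical document: the id is a URL, as in the template example later in the deck.
docs = [{"id": "http://example.com/page-1", "title": "First page", "source": "site"}]
request = build_update_request("http://localhost:8983", "demo", docs)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.get_method(), request.full_url)
```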
CONFIGURE / SHAPE THE
 SEARCH EXPERIENCE
AGENDA

Here we assume you have signed up for a free trial and indexed some data



1. Look at the available fields
2. Define a search preset
3. Define required fields / search criteria
4. Create a visualization template for your data
5. Configure user filters / facets
This is the search framework (searchbox.com) backend
In this example we have 204 documents
A preset can’t work without a unique key and a title
Our preset
We haven’t defined any fields yet
We weight the title more than the rest
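
In plain Solr, weighting one field above the others is commonly expressed with the `qf` parameter of the eDisMax query parser. The sketch below shows that idea only; how Searchbox stores the weights internally is not shown in the slides, and the field name "content" and the boost values are assumptions.

```python
def build_qf(weights):
    """Turn {field: boost} into a Solr qf string such as 'title^3 content^1'."""
    return " ".join(f"{field}^{boost:g}" for field, boost in weights.items())


params = {
    "defType": "edismax",  # the query parser that understands qf
    "q": "search engine",
    "qf": build_qf({"title": 3, "content": 1}),  # a title match counts three times as much
}
print(params["qf"])
```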
We now have three fields
Now we have three fields on the result page ... but no template
This is a pretty basic template; in this case the id is a URL
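
The actual Searchbox template syntax is not shown in the slides. As a stand-in, here is a minimal sketch of the same idea using Python's `string.Template`: each result document is rendered into a small HTML snippet, and the id doubles as the link target. The markup and the field names other than id and title are assumptions.

```python
from html import escape
from string import Template

# Hypothetical result markup; the document id is used as the href.
RESULT_TEMPLATE = Template(
    '<div class="result"><a href="$id">$title</a><p>$source</p></div>'
)


def render_result(doc):
    """Render one result document, escaping every value before it reaches the HTML."""
    safe = {
        key: escape(str(doc.get(key, "")), quote=True)
        for key in ("id", "title", "source")
    }
    return RESULT_TEMPLATE.substitute(safe)


html_snippet = render_result(
    {"id": "http://example.com/a", "title": "A & B", "source": "site"}
)
print(html_snippet)
```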
Query completion + live search
We now have a basic search experience
Now we create a facet based on the “source”
Sticky facets based on the data source
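
Behind a facet like this, Solr's default JSON output returns field facet counts as a flat list that alternates value and count. A small sketch of turning that list into pairs a UI can render follows. The response below is made up for illustration: the counts and the second source value "rss" are not taken from the demo.

```python
def facet_pairs(flat_counts):
    """Convert Solr's flat [value, count, value, count, ...] list into (value, count) pairs."""
    return list(zip(flat_counts[0::2], flat_counts[1::2]))


# Made-up excerpt of a Solr JSON response with facet.field=source.
response = {"facet_counts": {"facet_fields": {"source": ["site", 120, "rss", 84]}}}

for value, count in facet_pairs(response["facet_counts"]["facet_fields"]["source"]):
    print(f"{value} ({count})")
```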
We only want the documents for source “site”
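
One way to picture a preset that is pinned to a single source is as a named bundle of fixed filter queries added to every search. The `Preset` class below is purely hypothetical and is not Searchbox's data model; only the `fq` parameter is standard Solr.

```python
from dataclasses import dataclass, field


@dataclass
class Preset:
    """Hypothetical preset: a label plus filter queries applied to every search."""

    label: str
    filters: list = field(default_factory=list)

    def params(self, user_query):
        # The user's query is combined with the preset's fixed filters.
        return [("q", user_query)] + [("fq", fq) for fq in self.filters]


site_only = Preset(label="Site", filters=['source:"site"'])
print(site_only.params("solr"))
```

Because the filter already fixes the source to one value, a facet on that field would offer only a single choice.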
We renamed the preset
We no longer have the facets
THIS WAS A PRETTY
SIMPLE SEARCHBOX
NOW LET’S LOOK AT
  SOME SAMPLES
             Demos can be found on
http://www.searchbox.com/resources/online-demos/
“Sort by”
“Clickable tags”
Range facets with histogram
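
The data behind a range-facet histogram can be requested from Solr with its range faceting parameters, as in the minimal sketch below. The numeric field "price" and the bounds are assumptions; the demo's actual field is not named in the slides.

```python
from urllib.parse import urlencode


def range_facet_params(field_name, start, end, gap):
    """Return Solr parameters that bucket a numeric field into fixed-width ranges."""
    prefix = f"f.{field_name}.facet.range"
    return {
        "facet": "true",
        "facet.range": field_name,
        f"{prefix}.start": start,
        f"{prefix}.end": end,
        f"{prefix}.gap": gap,  # bucket width: one histogram bar per gap
    }


range_params = range_facet_params("price", 0, 1000, 100)
print(urlencode(range_params))
```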
Semantically related content
Basic dynamic template
6 Presets with distinct parameters
Left template column with meta
Related content from a different data source
WHAT’S NEXT

Check our online documentation

  http://help.searchbox.com

Check our website

  http://www.searchbox.com

Sign up for a free trial

  http://www.searchbox.com/free-trial/
