Elasticsearch
Scalable Full-Text Search Engine

Thursday, February 27, 14
Goals for this talk

Outline
• What’s full-text search and why do we use it?

• What can you do with Elasticsearch?
• Why is Elasticsearch different?
• DEMO TIME!
Text Search
do I really need to explain it?

%LIKE%
• In the beginning there was:
SELECT * FROM tweets WHERE content
LIKE ‘%zuckerberg%’
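A minimal sketch of the query on this slide, using Python's built-in sqlite3 module. The table contents are made up for illustration; only the `tweets` table and `content` column come from the slide. The point is that a pattern with a leading wildcard forces the database to test every row.

```python
import sqlite3

# Hypothetical in-memory table mirroring the slide's `tweets` example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (id INTEGER PRIMARY KEY, content TEXT)")
conn.executemany(
    "INSERT INTO tweets (content) VALUES (?)",
    [
        ("Mark Zuckerberg sells Facebook",),
        ("Facebook buys WhatsApp",),
        ("Zuckerberg: no comment",),
    ],
)

# A leading '%' means an ordinary index on `content` cannot narrow the search,
# so every row is scanned. SQLite's LIKE is case-insensitive for ASCII letters.
rows = conn.execute(
    "SELECT id, content FROM tweets WHERE content LIKE ? ORDER BY id",
    ("%zuckerberg%",),
).fetchall()

for row in rows:
    print(row)
```

This works for one text column, but it does not scale and it says nothing about who wrote the tweet, when, or where.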

But that’s not what you usually search for!

• You want:
Search by author
Search by time
Search by sentiment
Search by location
Search by everything!

That’s a lot of metadata!

• You can’t search through all that on the fly
if you want realtime results

• You need to index it first!

Inverted Index
• Some documents:
1: ‘Mark Zuckerberg sells Facebook’ [Monday]
2: ‘Facebook buys WhatsApp’ [Tuesday]
3: ‘Mark’s Facebook buys Instagram’ [Monday]

• Inverted index for them:
Facebook: {1, 2, 3}
Mark: {1, 3}
Instagram: {3}
WhatsApp: {2}
[Monday]: {1, 3}
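The idea on this slide can be sketched in a few lines of Python. This is an illustration only, not how Elasticsearch stores its index: the `tokenize` helper is a deliberately naive assumption (whitespace split, possessive stripped), and the day is indexed as a bracketed term to mirror the slide.

```python
from collections import defaultdict

# The three documents from the slide: id -> (text, day).
documents = {
    1: ("Mark Zuckerberg sells Facebook", "Monday"),
    2: ("Facebook buys WhatsApp", "Tuesday"),
    3: ("Mark's Facebook buys Instagram", "Monday"),
}

def tokenize(text):
    """Naive analysis: split on whitespace and strip a trailing possessive."""
    tokens = []
    for word in text.split():
        if word.endswith("'s"):
            word = word[:-2]
        tokens.append(word)
    return tokens

def build_inverted_index(docs):
    """Map every term (and every metadata value) to the set of document ids."""
    index = defaultdict(set)
    for doc_id, (text, day) in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
        index[f"[{day}]"].add(doc_id)  # metadata is indexed like any other term
    return index

index = build_inverted_index(documents)

# A lookup is a dictionary access; combining conditions is a set intersection.
print(sorted(index["Facebook"]))
print(sorted(index["buys"] & index["[Monday]"]))
```

Searching content and metadata together then costs a few set operations instead of a scan over every document.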

Ok, now that we have data, we also want some
numbers behind it!

• In our previous example:
• Facebook is mentioned 3 times
• There are 2 posts on [Monday]
• The most frequent words are
Facebook and Mark
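The numbers above can be reproduced with a simple counting sketch. It assumes the same three example documents and a naive whitespace tokenizer; Elasticsearch computes such figures from its index rather than by rescanning documents.

```python
from collections import Counter

# (text, day) pairs for the three example documents.
documents = [
    ("Mark Zuckerberg sells Facebook", "Monday"),
    ("Facebook buys WhatsApp", "Tuesday"),
    ("Mark's Facebook buys Instagram", "Monday"),
]

def tokenize(text):
    """Split on whitespace and strip a trailing possessive."""
    return [w[:-2] if w.endswith("'s") else w for w in text.split()]

# How often each term occurs across all documents.
term_counts = Counter(t for text, _ in documents for t in tokenize(text))

# How many posts were made on each day.
posts_per_day = Counter(day for _, day in documents)

print(term_counts["Facebook"], posts_per_day["Monday"])
```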

All 3 put together
Elasticsearch
=
Search(Content & Metadata) + Analytics
(oversimplified)

Let’s look at some
search features of
Elasticsearch

Features: Complex Queries

• Boolean Operators:

(apple OR pumpkin) AND pie

• Wildcards:
app*: apple, apples, appliance
appl?: apple, apply

• Fuzzy:
back~: back, pack, black, bank

• Ranged:
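The wildcard and fuzzy examples on this slide can be approximated locally in Python. This is only a sketch of the matching semantics, not of how Elasticsearch evaluates such queries: wildcards are handled with the standard library's `fnmatchcase`, and fuzzy matching assumes a maximum edit distance of 1, chosen here because it reproduces the slide's examples.

```python
from fnmatch import fnmatchcase

def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def wildcard_match(pattern, terms):
    """'*' matches any run of characters, '?' matches exactly one."""
    return [t for t in terms if fnmatchcase(t, pattern)]

def fuzzy_match(word, terms, max_edits=1):
    """Terms within max_edits single-character edits of word."""
    return [t for t in terms if edit_distance(word, t) <= max_edits]

terms = ["apple", "apples", "appliance", "apply",
         "back", "pack", "black", "bank", "pie"]

print(wildcard_match("appl?", terms))
print(fuzzy_match("back", terms))
```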
Features: Complex Queries

• Attribute filtering:
apple AND pie AND location:california

• Range filtering:
apple AND published:[1393100055 TO 1393427055]
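Query strings like these can be sent to Elasticsearch inside a `query_string` query. The sketch below only builds the JSON request bodies and sends nothing; the helper name `query_string_body` is made up for illustration, and the `location` and `published` fields are taken from the slide and assumed to exist in the index mapping.

```python
import json

def query_string_body(query):
    """Wrap a Lucene-style query string in a search request body."""
    return {"query": {"query_string": {"query": query}}}

# Attribute filtering: full-text terms plus an exact field value.
attribute_filter = query_string_body("apple AND pie AND location:california")

# Range filtering: full-text term plus an inclusive range on a numeric field.
range_filter = query_string_body(
    "apple AND published:[1393100055 TO 1393427055]"
)

print(json.dumps(range_filter, indent=2))
```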

Features: Geo Queries
Bounding Box Queries
Distance Range Queries

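What the two geo query types test can be sketched in plain Python. This is an illustration of the geometry only, under stated assumptions: points are (latitude, longitude) pairs in degrees, distance uses the haversine formula on a spherical Earth, and the bounding box does not cross the antimeridian. The coordinates are example values, not from the talk.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def in_bounding_box(point, top_left, bottom_right):
    """True if point lies inside the box spanned by the two corners."""
    lat, lon = point
    return (bottom_right[0] <= lat <= top_left[0]
            and top_left[1] <= lon <= bottom_right[1])

def in_distance_range(point, center, min_km, max_km):
    """True if point is between min_km and max_km away from center."""
    return min_km <= haversine_km(*center, *point) <= max_km

bucharest = (44.4268, 26.1025)
cluj = (46.7712, 23.6236)

print(in_bounding_box(bucharest, top_left=(48.3, 20.2), bottom_right=(43.6, 29.7)))
print(in_distance_range(cluj, bucharest, min_km=100, max_km=400))
```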
Feature: Built-in analytics

Feature: Built-in tag cloud

What’s special about
Elasticsearch?

Distributed

• Distributing data across multiple servers is easy
and abstracted away from the developer
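One way to picture what is being abstracted away: each document is routed to a shard by hashing its id modulo the number of primary shards. The sketch below shows that idea only. CRC32 is a stand-in chosen because it is deterministic and in the standard library (Elasticsearch uses its own hash function), and the function name and document ids are hypothetical.

```python
import zlib

def shard_for(doc_id, num_primary_shards):
    """Pick a shard by hashing the document id (CRC32 as a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards

NUM_SHARDS = 5  # example value

# The same id always lands on the same shard, so any node can work out
# where a document lives without a central lookup table.
placement = {}
for doc_id in ["tweet-1", "tweet-2", "tweet-3", "tweet-4"]:
    placement.setdefault(shard_for(doc_id, NUM_SHARDS), []).append(doc_id)

print(placement)
```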

Performance/Scalability

• Add and remove nodes on the fly without ever
stopping the search service

Performance/Scalability

• Indexing and searching can be scaled
independently

Performance/Scalability

• With a few nodes you can run complex
queries on billions of documents

• 3 nodes: 20 million documents with 2 replicas
each
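A quick back-of-the-envelope calculation for the 3-node example, assuming "2 replicas each" means every document exists as one primary copy plus two replica copies, spread evenly over the nodes. The function name is made up for illustration.

```python
def copies_per_node(num_docs, replicas, num_nodes):
    """Document copies each node holds if copies are spread evenly."""
    total_copies = num_docs * (1 + replicas)  # one primary plus its replicas
    return total_copies // num_nodes

per_node = copies_per_node(num_docs=20_000_000, replicas=2, num_nodes=3)
print(per_node)
```

Under these assumptions there are 60 million copies in total and every node holds as many copies as there are documents, so the data would remain available if a node were lost.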

Easy to back up
• Elasticsearch has a built-in backup solution,
so you don’t have to worry about
implementing one
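The slide does not name the mechanism; assuming it refers to the snapshot and restore API, a backup boils down to three REST calls. The sketch below only describes those requests as data and sends nothing. The repository name, snapshot name, and filesystem location are placeholders.

```python
import json

REPO = "my_backup"       # placeholder repository name
SNAPSHOT = "snapshot_1"  # placeholder snapshot name

# 1. Register a shared-filesystem repository to hold snapshots.
register_repository = (
    "PUT",
    f"/_snapshot/{REPO}",
    {"type": "fs", "settings": {"location": "/mount/backups/my_backup"}},
)

# 2. Take a snapshot and wait until it has finished.
create_snapshot = (
    "PUT",
    f"/_snapshot/{REPO}/{SNAPSHOT}?wait_for_completion=true",
    None,
)

# 3. Restore from that snapshot.
restore_snapshot = ("POST", f"/_snapshot/{REPO}/{SNAPSHOT}/_restore", None)

for method, path, body in (register_repository, create_snapshot, restore_snapshot):
    print(method, path, json.dumps(body) if body is not None else "")
```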

Demo time!

Intro to Elasticsearch - Elasticsearch Bucharest Group @ Softbinator
