Faceted Search – the 120 Million Documents Story

Sourcesense
Who am I?
Who are Sourcesense?
Committers and Contributors
Who is the customer?
Their story?
The Solution: Apache Solr
How Solr Works
How Solr Works [diagram, built up over seven slides. Labels: Index, Index Snapshot, Active Index Reader, Searches, New Content, Active Index Writer, commit. Searches are served by the active index reader from an index snapshot while new content goes to the active index writer. After a commit, a second index snapshot and index reader appear alongside the active ones, and the final slide shows a single snapshot and reader again.]
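The commit cycle in the diagram can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Solr's or Lucene's implementation: the class names (IndexSnapshot, IndexReader, Index) are hypothetical and only mirror the diagram labels. It shows the one property the slides illustrate, namely that searches see only committed content and that a commit swaps in a new reader over a new snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexSnapshot:
    """Immutable point-in-time view of the committed documents (hypothetical)."""
    docs: tuple


class IndexReader:
    """Serves searches from exactly one snapshot and never sees later writes."""

    def __init__(self, snapshot):
        self.snapshot = snapshot

    def search(self, term):
        return [doc for doc in self.snapshot.docs if term in doc]


class Index:
    """Toy index with one writer buffer and one active reader."""

    def __init__(self):
        self._pending = []      # written by the "index writer", not yet committed
        self._committed = ()
        self.active_reader = IndexReader(IndexSnapshot(()))

    def add(self, doc):
        self._pending.append(doc)

    def commit(self):
        # Build a new snapshot and reader, then make the new reader active.
        self._committed = self._committed + tuple(self._pending)
        self._pending = []
        self.active_reader = IndexReader(IndexSnapshot(self._committed))

    def search(self, term):
        return self.active_reader.search(term)


index = Index()
index.add("solr faceted search")
before_commit = index.search("solr")   # new content is not visible yet
index.commit()
after_commit = index.search("solr")    # served by the new active reader
```

Keeping the snapshot immutable is what lets the old reader keep answering searches while the new one is being prepared.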
How Solr Distributes
Solr Host Configuration [diagram, built up over five slides: searches go to shards 1, 2 and 3; a co-ordinator is placed in front of the shards; a load balancer is placed in front of the co-ordinator; a second row with its own co-ordinator and shards 1 to 3 is added behind the load balancer.]
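As a rough illustration of the co-ordinator's job in this layout, the Python sketch below builds a distributed request using Solr's `shards` request parameter (a comma-separated list of shard addresses) and merges per-shard facet counts by summing them. The host names, the `coordinator_url` and `merge_facet_counts` helpers, and the sample counts are all assumptions made for the example. Solr performs the merge itself when `shards` is supplied, so the merge function only shows the idea.

```python
from collections import Counter
from urllib.parse import urlencode

# Hypothetical shard addresses for one row; real host names are not in the slides.
SHARDS = [
    "shard1.example:8983/solr",
    "shard2.example:8983/solr",
    "shard3.example:8983/solr",
]


def coordinator_url(coordinator, query, facet_field, shards):
    """Build a faceted search URL that asks the co-ordinator to fan out to shards."""
    params = {
        "q": query,
        "facet": "true",
        "facet.field": facet_field,
        "shards": ",".join(shards),
    }
    return f"http://{coordinator}/select?{urlencode(params)}"


def merge_facet_counts(shard_counts):
    """Sum the per-value facet counts returned by each shard."""
    merged = Counter()
    for counts in shard_counts:
        merged.update(counts)
    return merged


url = coordinator_url("coordinator.example:8983/solr", "solr", "sourceCountryCS", SHARDS)

# Made-up per-shard counts for the facet field, one dict per shard.
merged = merge_facet_counts([
    {"UK": 3, "IT": 1},
    {"UK": 2},
    {"IT": 4, "US": 1},
])
```

The load balancer in the diagram would sit in front of several such co-ordinators, one per row.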
Solr at The Customer
Oops. OutOfMemoryError
Solr: a Java web application
How Solr Works [the earlier diagram with caches added, built up over five slides. A cache is attached to the active index reader; around the commit, two index snapshots and two index readers are shown, each with its own cache; the final slide shows one snapshot, one reader and one cache.]
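The cache slides can be read as follows: each reader has its own cache, so a freshly opened reader starts cold, and around a commit two readers and two caches exist at once. The Python sketch below is a simplified model of that idea, not Solr's cache code; `CachingReader` and `open_warmed_reader` are hypothetical names. It warms the new reader by replaying the old reader's cached queries before the new reader is put into service.

```python
class CachingReader:
    """Toy reader that caches results per search term (hypothetical)."""

    def __init__(self, docs):
        self.docs = tuple(docs)
        self.cache = {}
        self.misses = 0     # number of searches that had to be computed

    def search(self, term):
        if term not in self.cache:
            self.misses += 1
            self.cache[term] = [doc for doc in self.docs if term in doc]
        return self.cache[term]


def open_warmed_reader(old_reader, new_docs, warm_queries=()):
    """Open a reader on the new snapshot and warm its cache before use.

    While this runs, both the old and the new cache are held in memory.
    """
    new_reader = CachingReader(new_docs)
    for term in list(old_reader.cache) + list(warm_queries):
        new_reader.search(term)
    return new_reader


old_reader = CachingReader(["solr one"])
old_reader.search("solr")

new_reader = open_warmed_reader(old_reader, ["solr one", "solr two"])
misses_after_warming = new_reader.misses
first_live_result = new_reader.search("solr")   # answered from the warm cache
```

The overlap period is worth keeping in mind when sizing the heap, since the memory used by caches is temporarily doubled.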
Optimisation #1: autowarm

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">solr</str>
          <str name="relf">4</str>
          <str name="facet.field">sourceCountryCS</str>
          <str name="facet.field">entityCSPerson</str>
          <str name="facet.field">entityCSCompany</str>
          <str name="facet.field">entityCSProduct</str>
          <str name="facet.field">sourceCS</str>
          <str name="facet.field">authorCS</str>
          <str name="facet.field">stockTickerCS</str>
          <str name="facet.field">feedClassCS</str>
          <str name="facet.field">entityCSOrganization</str>
          <str name="facet.field">platformCS</str>
          <str name="facet.field">eventOrFactCS</str>
          <str name="facet.field">sourceRank</str>
          <str name="facet">true</str>
          <str name="facet.date">harvestDate</str>
          <str name="facet.date.start">NOW-1MONTH</str>
          <str name="facet.date.end">NOW</str>
          <str name="facet.date.gap">+24HOURS</str>
          <str name="qt">/duplicate</str>
          <str name="duplicateOrder">latest</str>
          <str name="collapseFields">duplicateGroup titleForDuplicates</str>
        </lst>
      </arr>
    </listener>
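To time a warming query by hand, the same parameters can be sent as an ordinary HTTP request. The Python sketch below only builds the URL, using a subset of the facet fields from the slide. The host name and the `warming_query_url` helper are assumptions, and the customer-specific parameters (`relf`, `qt`, `duplicateOrder`, `collapseFields`) are left out because their handlers are not described in the slides.

```python
from urllib.parse import urlencode

# A subset of the facet fields listed in the listener configuration.
FACET_FIELDS = ["sourceCountryCS", "authorCS", "sourceCS"]


def warming_query_url(base_url, facet_fields):
    """Build a faceted query URL equivalent to part of the warming query."""
    params = [("q", "solr"), ("facet", "true")]
    # A list of pairs lets the facet.field key repeat once per field.
    params += [("facet.field", name) for name in facet_fields]
    params += [
        ("facet.date", "harvestDate"),
        ("facet.date.start", "NOW-1MONTH"),
        ("facet.date.end", "NOW"),
        ("facet.date.gap", "+24HOURS"),
    ]
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical host; substitute the address of a real shard or co-ordinator.
url = warming_query_url("http://solr.example:8983/solr", FACET_FIELDS)
```

Note that `urlencode` escapes the leading `+` of the date gap as `%2B`; sent unescaped in a URL, the plus sign would be decoded as a space.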
#2: Garbage collection
#3: Profiling
Managing So Many Hosts
Solr Host Configuration [diagram, repeated from the earlier slides: a load balancer in front of two rows, each with a co-ordinator and shards 1 to 3. Annotations added on later slides: "35Gb" three times, apparently one per shard, and "Entire row: 40 minutes".]
Content Archiving
Being Dynamic
Solr Host Configuration [diagram, built up over five slides: a load balancer in front of two rows, each with a co-ordinator and shards 1 to 3; a third row with its own co-ordinator and shards is added and labelled "archive ingestion"; the final slide shows two rows again.]
Conclusion
Thank you