Oslo Solr Community   March 20th 2012




                What is new in Solr 4.0ß
                     Jan Høydahl




Agenda




–   Solr/Lucene 4 ß, what, when?
–   Near-Realtime-Search
–   SolrCloud
–   Better Spellchecker
–   Flex – smaller index
–   Pluggable Ranking
–   Sort by Function
–   Result field aliasing and pseudo fields
–   Pivot facets
–   Join query
–   New Admin GUI

– And what about Solr 3.6?
4.0 beta?




–   Never released a public beta before
–   So many changes, it makes sense
–   Time frame??
–   Stability
Near-Realtime-Search




– Before:
      • Add, add, add, add (not searchable)
      • Commit (new segment written → searchable)
– 4.0:
      •    In-memory index
      •    Add
      •    Soft-commit-(within/auto)
      •    Real-time GET:
<!-- realtime get handler, guaranteed to return the latest stored fields of
          any document, without the need to commit or open a new searcher.   The
          current implementation relies on the updateLog feature being enabled.
-->
 <requestHandler name="/get" class="solr.RealTimeGetHandler">
      <lst name="defaults">
          <str name="omitHeader">true</str>
      </lst>
 </requestHandler>
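
As a rough sketch of how a client might use these two features, the Python below builds (but does not send) an update URL with `commitWithin` and a real-time get URL for the `/get` handler configured above. The host, port, core name `web` and document id are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical base URL; adjust host, port and core name to your installation.
BASE = "http://localhost:8983/solr/web"


def update_url(commit_within_ms: int) -> str:
    """URL for posting documents that should become searchable within the given time."""
    return f"{BASE}/update?{urlencode({'commitWithin': commit_within_ms})}"


def realtime_get_url(doc_id: str) -> str:
    """URL for the /get handler, which returns the latest stored fields of a document."""
    return f"{BASE}/get?{urlencode({'id': doc_id})}"


print(update_url(1000))
print(realtime_get_url("doc-42"))
```

The point of the second URL is that the document can be fetched by id right after the add, before any commit has opened a new searcher.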
Solr Cloud




         – Solr Cloud is the popular name for an initiative to make Solr
           more easily scalable and manageable in a distributed world
         – Enables centralized configuration and cluster status
           monitoring
         – Solr 4.0ß contains the first features
               •   Apache ZooKeeper support, including built-in ZK
               •   Support for auto distributed/LB query (by means of ZK)
               •   Fault tolerant indexing and recovery
               •   Add a new node and let it discover its role and sync up
         – Expected features to come
               • Tools to manage the config in ZK
               • Re-balancing of shards



http://wiki.apache.org/solr/SolrCloud
Solr Cloud...

– New concepts:
   • Collection: Cores making up one data set
   • ZooKeeper: Central coordination server
– Easier distributed search:
   • /solr/web/select?q=*:*&distrib=true
       – This queries all cores in same "collection"
– Easier distributed indexing:
   • http://<any.server>/solr/web/update...
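
Since any node can receive a request, a client could simply rotate over the nodes it knows about. The sketch below builds distributed select URLs in round-robin order; the node addresses are made up, and a real client would more likely discover live nodes through ZooKeeper than hard-code them.

```python
from itertools import cycle
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical node addresses, hard-coded here only for illustration.
NODES = ["http://solr1:8983", "http://solr2:8983", "http://solr3:8983"]
_next_node = cycle(NODES)


def distributed_select_url(query: str, collection: str = "web") -> str:
    """Build a select URL with distrib=true against the next node in round-robin order."""
    params = urlencode({"q": query, "distrib": "true"})
    return f"{next(_next_node)}/solr/{collection}/select?{params}"


urls = [distributed_select_url("*:*") for _ in range(4)]
for url in urls:
    print(url)
```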
Solr Cloud on the index side...




http://wiki.apache.org/solr/SolrCloud
Better spellchecker




– Direct SpellChecker
– Automaton based
  (no extra lucene-index)
– No long build times
– Better performance
– Better accuracy (?)
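
The idea of suggesting straight from the terms already in the main index, with no separate spellcheck index to build, can be sketched as follows. Note that this is a plain edit-distance scan over a made-up term dictionary, not the automaton-based approach the slide refers to; it only illustrates the input and output of such a spellchecker.

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, term_freqs, max_edits=2):
    """Return indexed terms within max_edits, closest first, then most frequent."""
    hits = [(edit_distance(word, term), -freq, term)
            for term, freq in term_freqs.items() if term != word]
    return [term for dist, _, term in sorted(hits) if dist <= max_edits]


# Made-up term dictionary: term -> document frequency.
terms = {"solr": 120, "solar": 15, "lucene": 90, "sold": 4}
print(suggest("sorl", terms))
```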
Flex – smaller index




– Lucene's Flex APIs
– Lets you plug in your own
  codecs
– Greater flexibility in how
  you can represent the
  binary index
– Opens up for many new
  features
   •   DocValues
   •   Pluggable ranking
   •   TEXT index
   •   Store as UTF-8
   •   Or other encoding for
       space saving for Chinese
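
To see why the choice of term encoding matters for Chinese text, the snippet below compares byte counts for a few sample strings. It only measures encoded sizes in Python and says nothing about how a particular Lucene codec lays out its files; the sample words are arbitrary.

```python
samples = {"english": "search", "norwegian": "søk", "chinese": "搜索引擎"}

for name, text in samples.items():
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))  # -le avoids counting a byte-order mark
    print(f"{name:10} chars={len(text)} utf-8={utf8}B utf-16={utf16}B")
```

Common Chinese characters take three bytes each in UTF-8 but two in UTF-16, while ASCII text shows the opposite trade-off.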
Pluggable Ranking




– Lucene uses TF/IDF and VSM
– Now support for BM25




– Plug your own!
– Hopefully attracts researchers
– Also, pluggable Similarity class per field
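
For readers unfamiliar with BM25, here is a minimal sketch of the textbook Okapi BM25 score for one query term in one document. The parameter defaults `k1=1.2` and `b=0.75` are the commonly cited ones, the collection statistics are made up, and Lucene's own implementation may differ in details such as how document length is stored.

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, num_docs, doc_freq, k1=1.2, b=0.75):
    """Score contribution of a single query term in a single document."""
    idf = math.log(1 + (num_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / (tf + norm)


# Made-up collection statistics: 1000 documents, term occurs in 50 of them.
for tf in (1, 2, 5, 20):
    score = bm25_term_score(tf, doc_len=100, avg_doc_len=100,
                            num_docs=1000, doc_freq=50)
    print(tf, round(score, 3))
```

Unlike classic TF/IDF, the term-frequency part saturates: each extra occurrence adds less, and the score never exceeds `idf * (k1 + 1)`.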
Sort by Function




– q=foo&sort=sub(price,discount) desc
– q=foo&sort=dist(2, x, y, 0, 0) asc
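
What the two sort expressions compute can be mimicked in a few lines of Python. This is only an emulation over made-up documents, assuming `sub(a,b)` means `a - b` and `dist(2, x, y, 0, 0)` means the Euclidean distance from the point (x, y) to the origin.

```python
def sub(a, b):
    """Equivalent of sub(a,b): a minus b."""
    return a - b


def dist(power, *coords):
    """p-norm distance between two points given as x1, y1, ..., x2, y2, ... (power >= 1)."""
    half = len(coords) // 2
    p1, p2 = coords[:half], coords[half:]
    return sum(abs(u - v) ** power for u, v in zip(p1, p2)) ** (1 / power)


# Made-up documents with the fields used in the examples above.
docs = [
    {"id": "a", "price": 100, "discount": 10, "x": 3, "y": 4},
    {"id": "b", "price": 80, "discount": 0, "x": 1, "y": 1},
    {"id": "c", "price": 120, "discount": 50, "x": 6, "y": 8},
]

by_net_price_desc = sorted(docs, key=lambda d: sub(d["price"], d["discount"]), reverse=True)
by_distance_asc = sorted(docs, key=lambda d: dist(2, d["x"], d["y"], 0, 0))

print([d["id"] for d in by_net_price_desc])
print([d["id"] for d in by_distance_asc])
```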
Result field aliasing and pseudo fields




– Aliasing:
   • q=foo&fl=score,tittel:title,rabattpris:sub(price,discount)
– Field name globbing:
   • q=foo&fl=score,t*
– Pseudo fields:
   • q=foo&fl=score,[explain],[docid],[shard],[value v=42 t=int]
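
The effect of aliasing and globbing on a returned document can be illustrated with a small emulation. The document and score below are made up; `tittel` and `rabattpris` are simply the Norwegian alias names from the example ("title" and "discount price").

```python
from fnmatch import fnmatchcase


def apply_aliases(doc, score):
    """Mimic fl=score,tittel:title,rabattpris:sub(price,discount) for one document."""
    return {
        "score": score,
        "tittel": doc["title"],                        # alias for a stored field
        "rabattpris": doc["price"] - doc["discount"],  # alias for a function result
    }


def select_fields(doc, patterns):
    """Mimic field name globbing: keep fields whose names match any pattern."""
    return {k: v for k, v in doc.items() if any(fnmatchcase(k, p) for p in patterns)}


doc = {"id": "1", "title": "iPod", "price": 100.0, "discount": 15.0}
print(apply_aliases(doc, score=1.7))
print(select_fields(doc, ["t*"]))
```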
Pivot facets




– Multi dimensional facets
   • &facet.pivot=cat,popularity
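
A pivot facet is essentially a nested count: documents per value of the first field, and within each of those, documents per value of the second field. The sketch below computes this for a handful of made-up documents; the shape of the returned structure is only an approximation of what a search response might look like.

```python
from collections import Counter, defaultdict


def pivot_counts(docs, outer, inner):
    """Count docs per outer value, and per inner value within each outer value."""
    outer_counts = Counter()
    inner_counts = defaultdict(Counter)
    for doc in docs:
        outer_counts[doc[outer]] += 1
        inner_counts[doc[outer]][doc[inner]] += 1
    return [
        {"field": outer, "value": value, "count": count,
         "pivot": [{"field": inner, "value": v, "count": c}
                   for v, c in inner_counts[value].most_common()]}
        for value, count in outer_counts.most_common()
    ]


# Made-up documents using the field names from the example above.
docs = [
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 1},
    {"cat": "music", "popularity": 5},
]
for entry in pivot_counts(docs, "cat", "popularity"):
    print(entry)
```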
Join query




– Simple Join feature (inner join)
– &q={!join from=manu_id to=id}ipod
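
The semantics of the join example can be emulated over an in-memory list: find the documents matching `ipod`, collect their `manu_id` values, and return the documents whose `id` is among them. Only documents from the "to" side come back, roughly like SQL's `WHERE id IN (SELECT manu_id ...)`. The documents and the substring match are made up for illustration.

```python
def join(docs, from_field, to_field, matches):
    """Return docs whose to_field equals the from_field of some doc accepted by matches."""
    keys = {d[from_field] for d in docs if from_field in d and matches(d)}
    return [d for d in docs if d.get(to_field) in keys]


# Made-up index holding both manufacturers and products.
docs = [
    {"id": "apple", "name": "Apple Inc."},
    {"id": "belkin", "name": "Belkin"},
    {"id": "samsung", "name": "Samsung"},
    {"id": "p1", "name": "ipod nano", "manu_id": "apple"},
    {"id": "p2", "name": "ipod charger", "manu_id": "belkin"},
    {"id": "p3", "name": "galaxy phone", "manu_id": "samsung"},
]

result = join(docs, "manu_id", "id", lambda d: "ipod" in d["name"].lower())
print([d["id"] for d in result])
```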
New Admin GUI
Solr 3.6




– SOLR-2764*: NorwegianLightStemmer,
              NorwegianMinimalStemmer
– SOLR-2202*: Money/Currency FieldType
– SOLR-2826*: URLClassify Update Processor
– SOLR-3056: Japanese field type in schema.xml
– SOLR-3026*: eDismax user fields
– SOLR-3140*: omitNorms default for all numeric field types
– SOLR-2901*: Upgrade Solr to Tika 1.0
– SOLR-1709: Distributed Date and Range Faceting
– SOLR-2487*: Do not include slf4j-jdk14 jar in WAR
– SOLR-2509*: spellcheck StringIndexOutOfBoundsException

* Committed by Jan Høydahl

Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Oslo Solr MeetUp March 2012 - Solr4 alpha

  • 1. Oslo Solr Community, March 20th 2012. What is new in Solr 4.0ß. Jan Høydahl
  • 2. Agenda
      – Solr/Lucene 4 ß, what, when?
      – Near-Realtime-Search
      – SolrCloud
      – Better Spellchecker
      – Flex – smaller index
      – Pluggable Ranking
      – Sort by Function
      – Result field aliasing and pseudo fields
      – Pivot facets
      – Join query
      – New Admin GUI
      – And what about Solr 3.6?
  • 3. 4.0 beta?
      – Never released a public beta before
      – So many changes, it makes sense
      – Time frame??
      – Stability
  • 4. Near-Realtime-Search
      – Before:
          • Add, add, add, add (not searchable)
          • Commit (new segment written → searchable)
      – 4.0:
          • In-memory index
          • Add
          • Soft-commit (within/auto)
          • Real-time GET:
        <!-- realtime get handler, guaranteed to return the latest stored fields of
             any document, without the need to commit or open a new searcher. The
             current implementation relies on the updateLog feature being enabled.
        -->
        <requestHandler name="/get" class="solr.RealTimeGetHandler">
          <lst name="defaults">
            <str name="omitHeader">true</str>
          </lst>
        </requestHandler>
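The soft-commit and real-time GET flow above can be sketched as a pair of URL builders. This is a minimal illustration, not the talk's own code: the base URL and the helper names `update_url` and `realtime_get_url` are assumptions, while `softCommit`, `commitWithin` and the `/get?id=...` lookup are standard Solr request parameters.

```python
from urllib.parse import urlencode

# Assumed local Solr instance; host, port and path are not from the slides.
BASE = "http://localhost:8983/solr"


def update_url(soft_commit=False, commit_within_ms=None):
    """Build an /update URL that asks Solr for a soft commit and/or commitWithin."""
    params = {}
    if soft_commit:
        params["softCommit"] = "true"
    if commit_within_ms is not None:
        params["commitWithin"] = str(commit_within_ms)
    query = urlencode(params)
    return f"{BASE}/update" + (f"?{query}" if query else "")


def realtime_get_url(doc_id):
    """Build a /get URL that fetches the latest stored fields of one document."""
    return f"{BASE}/get?" + urlencode({"id": doc_id})
```

A client would POST documents to `update_url(commit_within_ms=1000)` and could read them back right away through `realtime_get_url(...)`, without waiting for a new searcher to open.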
  • 5. Solr Cloud
      – Solr Cloud is the popular name for an initiative to make Solr more easily scalable and manageable in a distributed world
      – Enables centralized configuration and cluster status monitoring
      – Solr 4.0ß contains the first features
          • Apache ZooKeeper support, including built-in ZK
          • Support for auto distributed/LB query (by means of ZK)
          • Fault tolerant indexing and recovery
          • Add a new node and let it discover its role and sync up
      – Expected features to come
          • Tools to manage the config in ZK
          • Re-balancing of shards
      http://wiki.apache.org/solr/SolrCloud
  • 6. Solr Cloud...
      – New concepts:
          • Collection: Cores making up one data set
          • ZooKeeper: Central coordination server
      – Easier distributed search:
          • /solr/web/select?q=*:*&distrib=true
      – This queries all cores in the same "collection"
      – Easier distributed indexing:
          • http://<any.server>/solr/web/update...
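Because any server in the collection can accept the request, a client may simply rotate through the nodes it knows about. The sketch below shows that idea for the distributed select URL from the slide. The node list and the function name are hypothetical, and the client-side round robin is only an illustration; it is not the ZooKeeper-based load balancing that Solr Cloud itself performs.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical node addresses; a real client would learn these from its environment.
NODES = ["node1:8983", "node2:8983"]
_next_node = cycle(NODES)


def distributed_select_url(collection, query="*:*"):
    """Build a distrib=true select URL against the next node in the rotation."""
    host = next(_next_node)
    # Keep '*' and ':' unescaped so the match-all query stays readable.
    params = urlencode({"q": query, "distrib": "true"}, safe="*:")
    return f"http://{host}/solr/{collection}/select?{params}"
```

Each call returns the same query against a different node, which matches the slide's point that the request fans out to every core in the collection whichever server receives it.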
  • 7. Solr Cloud on the index side...
      http://wiki.apache.org/solr/SolrCloud
  • 8. Better spellchecker
      – Direct SpellChecker
      – Automaton based (no extra Lucene index)
      – No long build times
      – Better performance
      – Better accuracy (?)
  • 9. Flex – smaller index
      – Lucene's Flex APIs
      – Lets you plug in your own codecs
      – Greater flexibility in how you can represent the binary index
      – Opens the door to many new features
          • DocValues
          • Pluggable ranking
          • TEXT index
          • Store as UTF-8[]
          • Or other encoding for space saving for Chinese
  • 10. Pluggable Ranking
      – Lucene uses TF/IDF and VSM
      – Now support for BM25
      – Plug your own!
      – Hopefully attracts researchers
      – Also, pluggable Similarity class per field
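To give a feel for what a BM25 similarity computes, here is a minimal sketch of the textbook Okapi BM25 formula over tokenized documents. It is an assumption-laden illustration rather than Lucene's implementation: the function name, the `k1` and `b` defaults and the smoothed IDF variant are choices made for this example.

```python
import math


def bm25_score(query_terms, doc_terms, corpus, k1=1.2, b=0.75):
    """Score one tokenized document against a query with textbook BM25."""
    n_docs = len(corpus)
    avg_len = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        tf = doc_terms.count(term)
        if tf == 0:
            continue  # terms absent from the document contribute nothing
        df = sum(1 for d in corpus if term in d)
        # Smoothed IDF that stays positive even for very common terms.
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        # Length normalization: longer documents need more hits for the same score.
        norm = tf + k1 * (1 - b + b * len(doc_terms) / avg_len)
        score += idf * tf * (k1 + 1) / norm
    return score
```

Unlike plain TF/IDF, the term frequency contribution saturates as `tf` grows, and `b` controls how strongly document length is penalized.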
  • 11. Sort by Function
      – q=foo&sort=sub(price,discount) desc
      – q=foo&sort=dist(2, x, y, 0, 0) asc
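The two sort expressions above order results by a computed value instead of a stored field: `sub(price,discount)` is the price minus the discount, and `dist(2, x, y, 0, 0)` is the Euclidean distance from the point (x, y) to the origin. The sketch below reproduces that ordering locally on made-up documents; the sample data and helper names are assumptions for illustration only.

```python
import math

# Made-up documents carrying the fields used in the slide's sort expressions.
docs = [
    {"id": "a", "price": 100.0, "discount": 10.0, "x": 3.0, "y": 4.0},
    {"id": "b", "price": 80.0, "discount": 0.0, "x": 1.0, "y": 1.0},
    {"id": "c", "price": 120.0, "discount": 50.0, "x": 6.0, "y": 8.0},
]


def net_price(doc):
    """Local equivalent of sub(price,discount)."""
    return doc["price"] - doc["discount"]


def distance_to_origin(doc):
    """Local equivalent of dist(2, x, y, 0, 0): Euclidean distance to (0, 0)."""
    return math.hypot(doc["x"] - 0.0, doc["y"] - 0.0)


# sort=sub(price,discount) desc
by_net_price_desc = [d["id"] for d in sorted(docs, key=net_price, reverse=True)]
# sort=dist(2, x, y, 0, 0) asc
by_distance_asc = [d["id"] for d in sorted(docs, key=distance_to_origin)]
```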
  • 12. Result field aliasing and pseudo fields
      – Aliasing:
          • q=foo&fl=score,tittel:title,rabattpris:sub(price,discount)
      – Field name globbing:
          • q=foo&fl=score,t*
      – Pseudo fields:
          • q=foo&fl=score,[explain],[docid],[shard],[value v=42 t=int]
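The `fl` values above follow a simple pattern: plain field names, `alias:expression` pairs, and pseudo fields wrapped in square brackets. A small helper can assemble such a value. The helper name `build_fl` and its arguments are hypothetical; the alias names `tittel` and `rabattpris` come from the slide and are Norwegian for "title" and "discount price".

```python
def build_fl(plain=(), aliases=None, pseudo=()):
    """Assemble an fl parameter value from plain fields, aliases and pseudo fields."""
    parts = list(plain)
    # alias:expression renames a field or a function result in the response.
    parts += [f"{alias}:{expr}" for alias, expr in (aliases or {}).items()]
    # Pseudo fields (document transformers) are written in square brackets.
    parts += [f"[{name}]" for name in pseudo]
    return ",".join(parts)
```

For example, `build_fl(["score"], {"tittel": "title"})` yields a value that returns the `title` field under the name `tittel` alongside the score.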
  • 13. Pivot facets
      – Multi dimensional facets
          • &facet.pivot=cat,popularity
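A pivot facet nests one facet inside another: for every value of the first field it reports counts broken down by the second. The following sketch computes the same shape of result in memory for `cat,popularity`. The function name and sample documents are assumptions, and Solr's actual response format is more elaborate than this nested dictionary.

```python
from collections import Counter, defaultdict


def pivot_counts(docs, outer, inner):
    """Count documents per (outer value, inner value) pair as a nested dict."""
    pivots = defaultdict(Counter)
    for doc in docs:
        pivots[doc[outer]][doc[inner]] += 1
    return {value: dict(counts) for value, counts in pivots.items()}


# Made-up sample documents for illustration.
sample_docs = [
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 1},
    {"cat": "memory", "popularity": 5},
]
```

Calling `pivot_counts(sample_docs, "cat", "popularity")` groups first by category and then by popularity within each category.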
  • 14. Join query
      – Simple Join feature (inner join)
      – &q={!join from=manu_id to=id}ipod
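The join query above reads as: find the documents matching `ipod`, collect their `manu_id` values, and return the documents whose `id` is one of those values. The sketch below mimics that two-step lookup over an in-memory list. The function name, the matching predicate and the sample documents are all assumptions for illustration.

```python
def join(docs, from_field, to_field, matches):
    """Return docs whose to_field equals the from_field of any matching doc."""
    # Step 1: collect join keys from the documents that match the inner query.
    keys = {d[from_field] for d in docs if from_field in d and matches(d)}
    # Step 2: return the documents on the "to" side that carry one of those keys.
    return [d for d in docs if d.get(to_field) in keys]


# Made-up products and manufacturers sharing one index.
catalog = [
    {"id": "apple", "name": "Apple Computer Inc."},
    {"id": "belkin", "name": "Belkin"},
    {"id": "maxtor", "name": "Maxtor"},
    {"id": "IW-02", "name": "iPod cable", "manu_id": "belkin"},
    {"id": "MA147", "name": "Apple iPod", "manu_id": "apple"},
]

manufacturers = join(
    catalog, "manu_id", "id", lambda d: "ipod" in d["name"].lower()
)
```

Note that the result contains the manufacturer documents, not the products that matched `ipod`; only the "to" side of the join is returned.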
  • 16. Solr 3.6
      – SOLR-2764*: NorwegianLightStemmer, NorwegianMinimalStemmer
      – SOLR-2202*: Money/Currency FieldType
      – SOLR-2826*: URLClassify Update Processor
      – SOLR-3056: Japanese field type in schema.xml
      – SOLR-3026*: eDismax user fields
      – SOLR-3140*: omitNorms default for all numeric field types
      – SOLR-2901*: Upgrade Solr to Tika 1.0
      – SOLR-1709: Distributed Date and Range Faceting
      – SOLR-2487*: Do not include slf4j-jdk14 jar in WAR
      – SOLR-2509*: spellcheck StringIndexOutOfBoundsException
      * Committed by Jan Høydahl