Oslo Solr MeetUp March 2012 - Solr4 alpha
A short talk highlighting what to expect in the upcoming Solr 4.0 alpha/beta release.


    Presentation Transcript

    • Oslo Solr Community, March 20th 2012: What is new in Solr 4.0ß (Jan Høydahl)
    • Agenda
      – Solr/Lucene 4 ß: what, when?
      – Near-Realtime-Search
      – SolrCloud
      – Better Spellchecker
      – Flex – smaller index
      – Pluggable Ranking
      – Sort by Function
      – Result field aliasing and pseudo fields
      – Pivot facets
      – Join query
      – New Admin GUI
      – And what about Solr 3.6?
    • 4.0 beta?
      – Never released a public beta before
      – So many changes, it makes sense
      – Time frame??
      – Stability
    • Near-Realtime-Search
      – Before:
        • Add, add, add, add (not searchable)
        • Commit (new segment written → searchable)
      – 4.0:
        • In-memory index
        • Add
        • Soft-commit-(within/auto)
        • Real-time GET:

          <!-- realtime get handler, guaranteed to return the latest stored fields
               of any document, without the need to commit or open a new searcher.
               The current implementation relies on the updateLog feature being
               enabled. -->
          <requestHandler name="/get" class="solr.RealTimeGetHandler">
            <lst name="defaults">
              <str name="omitHeader">true</str>
            </lst>
          </requestHandler>
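As a rough illustration of the 4.0 flow on the slide above, the following Python sketch only builds the request URLs a client might use for a soft commit, a commit-within add, and a real-time GET. The base URL `http://localhost:8983/solr` and the helper names are assumptions made for this example; the `/get` path matches the handler name in the config shown on the slide, and real-time GET additionally needs the updateLog enabled, as that config comment notes.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust to your own setup.
SOLR_BASE = "http://localhost:8983/solr"


def soft_commit_url(base=SOLR_BASE):
    """URL asking the update handler to perform a soft commit."""
    return f"{base}/update?" + urlencode({"softCommit": "true"})


def commit_within_url(millis, base=SOLR_BASE):
    """URL for an update whose documents should be committed within `millis` ms."""
    return f"{base}/update?" + urlencode({"commitWithin": millis})


def realtime_get_url(doc_id, base=SOLR_BASE):
    """URL fetching the latest stored fields of one document via /get."""
    return f"{base}/get?" + urlencode({"id": doc_id})


if __name__ == "__main__":
    print(soft_commit_url())
    print(commit_within_url(1000))
    print(realtime_get_url("mydoc"))
```

The helpers do not send anything; they only show which parameters distinguish a soft commit from a lookup through the real-time GET handler.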
    • Solr Cloud
      – Solr Cloud is the popular name for an initiative to make Solr more easily scalable and manageable in a distributed world
      – Enables centralized configuration and cluster status monitoring
      – Solr 4.0ß contains the first features
        • Apache ZooKeeper support, including built-in ZK
        • Support for automatically distributed/load-balanced queries (by means of ZK)
        • Fault-tolerant indexing and recovery
        • Add a new node and let it discover its role and sync up
      – Expected features to come
        • Tools to manage the config in ZK
        • Re-balancing of shards
      – http://wiki.apache.org/solr/SolrCloud
    • Solr Cloud...
      – New concepts:
        • Collection: Cores making up one data set
        • ZooKeeper: Central coordination server
      – Easier distributed search:
        • /solr/web/select?q=*:*&distrib=true
        • This queries all cores in the same "collection"
      – Easier distributed indexing:
        • http://<any.server>/solr/web/update...
    • Solr Cloud on the index side...
      – http://wiki.apache.org/solr/SolrCloud
    • Better spellchecker
      – Direct SpellChecker
      – Automaton based (no extra Lucene index)
      – No long build times
      – Better performance
      – Better accuracy (?)
    • Flex – smaller index
      – Lucene's Flex APIs
      – Lets you plug in your own codecs
      – Greater flexibility in how you can represent the binary index
      – Opens up for many new features
        • DocValues
        • Pluggable ranking
        • TEXT index
        • Store as UTF-8[]
        • Or other encoding for space saving for Chinese
    • Pluggable Ranking
      – Lucene uses TF/IDF and VSM
      – Now support for BM25
      – Plug your own!
      – Hopefully attracts researchers
      – Also, pluggable Similarity class per field
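To give a feel for what BM25 changes compared with plain TF/IDF, here is a minimal Python sketch of the textbook Okapi BM25 score for a single term. It is not Lucene's implementation; the function name, the parameter names, and the defaults `k1=1.2` and `b=0.75` are assumptions chosen for illustration (they are commonly quoted defaults).

```python
import math


def bm25_term_score(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    """Textbook BM25 score of one term in one document (illustrative only).

    tf          -- occurrences of the term in the document
    doc_len     -- length of the document in tokens
    avg_doc_len -- average document length in the collection
    n_docs      -- number of documents in the collection
    doc_freq    -- number of documents containing the term
    """
    # Rare terms get a higher inverse document frequency.
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalisation: longer-than-average documents are penalised.
    length_norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates instead of growing without bound.
    return idf * tf * (k1 + 1) / (tf + length_norm)


if __name__ == "__main__":
    for tf in (1, 2, 10, 100):
        print(tf, round(bm25_term_score(tf, 100, 100, 1000, 10), 3))
```

Running the loop shows the saturation effect: each additional occurrence of the term adds less to the score than the previous one.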
    • Sort by Function
      – q=foo&sort=sub(price,discount) desc
      – q=foo&sort=dist(2, x, y, 0, 0) asc
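The two sort expressions above can be read as "order by price minus discount, highest first" and "order by Euclidean distance from the point (0, 0), nearest first". The Python sketch below emulates that ordering locally on a few made-up documents; the sample data and helper names are assumptions for illustration, and nothing here talks to Solr.

```python
import math

# Hypothetical documents with the fields used in the slide's examples.
docs = [
    {"id": "a", "price": 100.0, "discount": 30.0, "x": 3.0, "y": 4.0},
    {"id": "b", "price": 80.0, "discount": 5.0, "x": 1.0, "y": 1.0},
    {"id": "c", "price": 120.0, "discount": 60.0, "x": 0.5, "y": 0.5},
]


def net_price(doc):
    """Local equivalent of sub(price,discount)."""
    return doc["price"] - doc["discount"]


def distance_from_origin(doc):
    """Local equivalent of dist(2, x, y, 0, 0): Euclidean (power 2) distance."""
    return math.hypot(doc["x"] - 0.0, doc["y"] - 0.0)


# sort=sub(price,discount) desc
by_net_price_desc = [d["id"] for d in sorted(docs, key=net_price, reverse=True)]

# sort=dist(2, x, y, 0, 0) asc
by_distance_asc = [d["id"] for d in sorted(docs, key=distance_from_origin)]

if __name__ == "__main__":
    print(by_net_price_desc)
    print(by_distance_asc)
```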
    • Result field aliasing and pseudo fields
      – Aliasing:
        • q=foo&fl=score,tittel:title,rabattpris:sub(price,discount)
      – Field name globbing:
        • q=foo&fl=score,t*
      – Pseudo fields:
        • q=foo&fl=score,[explain],[docid],[shard],[value v=42 t=int]
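A client that wants aliased fields has to assemble the `fl` parameter in the `alias:expression` form shown in the aliasing example above. The sketch below is one way to do that in Python; `build_fl` is a hypothetical helper written for this example, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_fl(plain_fields, aliases):
    """Compose an fl value from plain field names plus alias -> expression pairs."""
    parts = list(plain_fields)
    parts += [f"{alias}:{expr}" for alias, expr in aliases.items()]
    return ",".join(parts)


# Same aliases as on the slide: 'tittel' for title, 'rabattpris' for the
# computed discounted price.
fl = build_fl(
    ["score"],
    {"tittel": "title", "rabattpris": "sub(price,discount)"},
)

# URL-encoded query string, ready to append to a /select request.
query_string = urlencode({"q": "foo", "fl": fl})

if __name__ == "__main__":
    print(fl)
    print(query_string)
```

The printed `fl` value matches the aliasing example on the slide; the query string is the same request with reserved characters percent-encoded.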
    • Pivot facets
      – Multi-dimensional facets
        • &facet.pivot=cat,popularity
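Conceptually, `facet.pivot=cat,popularity` counts documents per `cat` value and, within each of those, per `popularity` value. The following Python sketch computes the same kind of nested counts over a few made-up documents; the sample data and the `pivot_counts` helper are assumptions for illustration, and the shape of the result is simplified compared with an actual Solr response.

```python
from collections import Counter, defaultdict

# Hypothetical documents carrying the two pivoted fields.
docs = [
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 6},
    {"cat": "electronics", "popularity": 1},
    {"cat": "memory", "popularity": 5},
]


def pivot_counts(documents, outer, inner):
    """Count documents per outer field value, broken down by inner field value."""
    pivot = defaultdict(Counter)
    for doc in documents:
        pivot[doc[outer]][doc[inner]] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {key: dict(counter) for key, counter in pivot.items()}


result = pivot_counts(docs, "cat", "popularity")

if __name__ == "__main__":
    print(result)
```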
    • Join query
      – Simple Join feature (inner join)
      – &q={!join from=manu_id to=id}ipod
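The join example above can be read as: find the documents matching `ipod`, collect their `manu_id` values, and return the documents whose `id` is one of those values. The Python sketch below emulates that reading on a small in-memory list; the sample documents and the `join` helper are assumptions made for illustration and say nothing about how Solr executes the join internally.

```python
# Hypothetical index mixing manufacturer and product documents.
docs = [
    {"id": "apple", "name": "Apple Inc."},
    {"id": "belkin", "name": "Belkin"},
    {"id": "maxtor", "name": "Maxtor Corp."},
    {"id": "P1", "name": "ipod video", "manu_id": "apple"},
    {"id": "P2", "name": "ipod cable", "manu_id": "belkin"},
    {"id": "P3", "name": "hard drive", "manu_id": "maxtor"},
]


def join(documents, matches, from_field, to_field):
    """Emulate {!join from=... to=...}: map matching docs to the docs they point at."""
    # Step 1: run the inner query and collect the 'from' field values.
    keys = {d[from_field] for d in documents if from_field in d and matches(d)}
    # Step 2: return the documents whose 'to' field holds one of those values.
    return [d for d in documents if d.get(to_field) in keys]


# Local stand-in for &q={!join from=manu_id to=id}ipod
manufacturers = join(docs, lambda d: "ipod" in d["name"], "manu_id", "id")
manufacturer_ids = [d["id"] for d in manufacturers]

if __name__ == "__main__":
    print(manufacturer_ids)
```

Note that the result contains only the manufacturer documents, not the products that matched `ipod`.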
    • New Admin GUI
    • Solr 3.6
      – SOLR-2764*: NorwegianLightStemmer, NorwegianMinimalStemmer
      – SOLR-2202*: Money/Currency FieldType
      – SOLR-2826*: URLClassify Update Processor
      – SOLR-3056: Japanese field type in schema.xml
      – SOLR-3026*: eDismax user fields
      – SOLR-3140*: omitNorms default for all numeric field types
      – SOLR-2901*: Upgrade Solr to Tika 1.0
      – SOLR-1709: Distributed Date and Range Faceting
      – SOLR-2487*: Do not include slf4j-jdk14 jar in WAR
      – SOLR-2509*: spellcheck StringIndexOutOfBoundsException
      * Committed by Jan Høydahl