Lucene/Solr 8: The next major release
Steve Rowe
Senior Software Developer, Lucidworks
@steven_a_rowe
#Activate18 #ActivateSearch
Agenda
• Recent release cadence
• 7.X
• 8.0
• 8.X
Recent release cadence [release timeline figure with "you are here" marker]
• 6.X average: 10 weeks
• 7.X average: 11 weeks
7.X
1. Metrics
2. Autoscaling
3. CDCR
4. Time Routed Aliases
5. Replica types
6. Streaming expressions
7. JSON facet API
8. Configset / schema
9. Text Analysis / ML
10. Collections API
11. Queries
12. Large index segment merging
13. Replication / recovery / rolling upgrades
14. Block-join / nested docs
15. Miscellaneous
7.X: Metrics
• Continuation of 6.X work to support Autoscaling efforts
• 7.0: - Aggregated metrics collected in overseer
       - solrconfig.xml <jmx> ➞ solr.xml <metrics><reporter>
• 7.1: Prometheus metrics exporter contrib
• 7.4: /admin/metrics/history API: basic long-term key metric time series aggregation
  • Fixed-width windows at several resolutions
  • Not yet in Admin UI: SOLR-12426
7.X: Autoscaling
• 7.0: - Preferences and policy DSL: flexible replica placement
         [ { minimize: cores }, { maximize: freedisk } ]
         { replica: "<2", shard: "#EACH", node: "#ANY" }
       - Diagnostics API: return sorted nodes, policy violations
• 7.1: - autoAddReplicas ported to autoscaling framework
       - Add/remove/suspend/resume triggers and listeners
       - Triggers for added and lost nodes
       - ComputePlanAction / ExecutePlanAction
       - /autoscaling/history API: cluster events and actions
• 7.2: - Search rate trigger
       - /autoscaling/suggestions API
       - UTILIZENODE collections API command
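The preferences and policy rule on this slide are plain JSON, so they can be assembled programmatically. A minimal sketch, assuming the `set-cluster-preferences` and `set-cluster-policy` commands of the Solr 7 autoscaling write API; the helper name `build_autoscaling_commands` is hypothetical, and nothing is sent to a server:

```python
import json


def build_autoscaling_commands():
    """Return the preferences and policy from the slide as command dicts."""
    # Preferences: sort nodes by fewest cores first, then by most free disk.
    preferences = {
        "set-cluster-preferences": [
            {"minimize": "cores"},
            {"maximize": "freedisk"},
        ]
    }
    # Policy: for each shard, keep fewer than 2 replicas on any single node.
    policy = {
        "set-cluster-policy": [
            {"replica": "<2", "shard": "#EACH", "node": "#ANY"},
        ]
    }
    return preferences, policy


if __name__ == "__main__":
    for command in build_autoscaling_commands():
        # Serialize each command; sending it is left out of this sketch.
        print(json.dumps(command))
```

Unlike the abbreviated slide notation, the keys and values here are quoted strings, which strict JSON requires.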
7.X: Autoscaling
• 7.3: - Simulation framework
       - Arbitrary metric threshold trigger
       - Scheduled trigger
       - Admin UI to display and execute suggestions
7.X: Autoscaling
• 7.4: - Periodic house-keeping task: cleans up inactive shards
       - Index size trigger: document count or size in bytes
• 7.5: - Policy replica attribute: #ALL, #EQUAL, percentage, range, and floating point values
       - Policy cores attribute: #EQUAL, percentage, range, and floating point values
       - Percentage in freedisk policy attribute
       - Simulation framework: test scaling up to 1 billion docs
7.X: Cross Data Center Replication
• 7.2: Support bi-directional syncing of CDCR clusters
  This is not active-active, but rather passive-active or active-passive: only one active cluster at a time.
7.X: Time Routed Aliases
• 7.3: - Specialization of Solr’s collection alias feature
       - Support time series data, e.g. logs / sensor data
       - Maintain performance under continuous indexing
       - CREATEALIAS: start, interval, retention policy
       - Automatically create new collections
       - Automatically delete old collections (optional)
       - Route updates based on timestamp
       - Search against all aliased collections*
• 7.5: Preemptively create the next collection when updates are near the latest collection’s end date (optional)

* Pending optimization: minimize queried collections (SOLR-9562)
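The "route updates based on timestamp" step can be pictured as simple interval arithmetic. This is only a sketch of the idea, not Solr's implementation: the function name `route_to_collection` and the `<alias>_<slice start>` naming are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def route_to_collection(alias, start, interval, timestamp):
    """Pick the time-sliced collection that should receive a document.

    Collections are assumed to cover the half-open ranges
    [start + k*interval, start + (k+1)*interval).
    """
    if timestamp < start:
        raise ValueError("timestamp precedes the alias start time")
    # Number of whole intervals between the alias start and the timestamp.
    slices = (timestamp - start) // interval
    slice_start = start + slices * interval
    # Illustrative naming only: alias name plus the slice's start date.
    return f"{alias}_{slice_start:%Y-%m-%d}"


if __name__ == "__main__":
    start = datetime(2018, 10, 1, tzinfo=timezone.utc)
    doc_time = datetime(2018, 10, 3, 15, 30, tzinfo=timezone.utc)
    print(route_to_collection("logs", start, timedelta(days=1), doc_time))
    # prints "logs_2018-10-03"
```

With a one-day interval, every document stamped on 3 October lands in the same slice, which is what keeps each underlying collection bounded in size under continuous indexing.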
7.X: Replica types
• 7.0:

Type | Indexes locally | Supports soft commit & RTG | Pulls segments from leader | Writes to TLog | Can become shard leader | Queryable
NRT | ✅ | ✅ | - | ✅ | ✅ | ✅
TLOG leader | ✅ | ✅ | - | ✅ | ✅ | ✅
TLOG | - | - | ✅ | ✅ | ✅ | ✅
PULL | - | - | ✅ | - | - | ✅

• 7.4: Query param to prioritize replicas by type, e.g.
  shards.preference=replica.type:PULL,replica.type:TLOG
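As a rough illustration of the 7.4 `shards.preference` parameter, the snippet below builds a query URL that lists replica types in order of preference. The helper name `select_url`, the host, and the collection name are placeholders; only the parameter name and the `replica.type:` syntax come from the slide.

```python
from urllib.parse import urlencode


def select_url(base_url, collection, query, preferred_types):
    """Build a /select URL that asks Solr to prefer the given replica types."""
    # shards.preference takes a comma-separated list, most preferred first.
    preference = ",".join(f"replica.type:{t}" for t in preferred_types)
    params = urlencode({"q": query, "shards.preference": preference})
    return f"{base_url}/{collection}/select?{params}"


if __name__ == "__main__":
    # Prefer PULL replicas, then TLOG replicas, for this query.
    print(select_url("http://localhost:8983/solr", "books",
                     "title:solr", ["PULL", "TLOG"]))
```

`urlencode` percent-encodes the colons and commas, so the value reaches the server intact.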
7.X: Streaming expressions
• Parallel computation function suite
• Some use cases: MapReduce, aggregations, parallel SQL, pub/sub messaging, graph traversal, machine learning, statistical programming
• Each 7.X release has added many new functions
• 7.5: Ref guide: Math Expressions User Guide
7.X: JSON Facet API
• 7.0: Terms facets: added optional refinement support
• 7.4: Semantic Knowledge Graph support via new relatedness() aggregate function
  • Finds ad-hoc relationships by scoring documents relative to foreground and background document sets
• 7.5: Heatmap facet support
7.X: Configsets / schema
• 7.0: - _default configset
       - Data-driven schema: auto-guessed text fields indexed 2 ways:
         • tokenized for search
         • strings for sorting/faceting: "*_str" string field, max 256 chars
       - Turn off data-driven schema functionality:
         curl http://host:8983/solr/mycollection/config \
           -d "{ set-user-property: { update.autoCreateFields: false }}"
• 7.5: Disable configset upload: -Dconfigset.upload.enabled=false
7.X: Text analysis / machine learning
• 7.1: Bengali normalizer and stemmer
• 7.2: Enable off-ZooKeeper storage of large (>1MB) LTR models
• 7.3: OpenNLP integration: tokenization, POS tagging, phrase chunking, lemmatization, NER, language detection
• 7.4: - ProtectedTermFilterFactory: don’t filter protected terms
       - TaggerRequestHandler (a.k.a. SolrTextTagger): NER
• 7.5: - "nori" Korean morphological text analysis: "*_txt_ko"
       - PhrasesIdentificationComponent: identify and score candidate query phrases based on index statistics
       - UIMA integration removed
7.X: Collections API
• 7.3: Add collection level properties similar to cluster properties
• 7.4: Cluster-wide defaults for numShards, nrtReplicas, tlogReplicas, pullReplicas
• 7.5: - Support co-locating replicas of two or more collections together in a node via the withCollection parameter to the CREATE and MODIFYCOLLECTION commands
       - SPLITSHARD: New split method using hard links: splitMethod=link
         • 3-5 times faster than the original splitMethod=rewrite
         • Slows down replication
         • Increases disk usage on replica nodes
7.X: Queries
• 7.1: JSON query DSL

curl http://localhost:8983/solr/books/query -d '
{
  query: {
    bool: {
      must: [
        "title:solr",
        {lucene: {df: content, query: "lucene solr"}}
      ],
      must_not: [
        {frange: {u: 3.0, query: ranking}}
      ]
    }
  }
}'
7.X: Queries
• 7.2: New synonymQueryStyle field type option: enable generation of appropriate queries for hierarchical relations between overlapping terms
  • as_same_term (default): SynonymQuery(bird,robin)
  • pick_best: Dismax(bird,robin)
  • as_distinct_terms: (bird OR robin)
• 7.4: JSON query DSL: Enable query/filter tagging,
  e.g. { "#colorfilt" : "color:blue" }
  equivalent to local-param {!tag=colorfilt}color:blue
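One common reason to tag a filter is multi-select faceting, where a facet ignores the filter applied to its own field. The sketch below assembles such a request body; it assumes the `filter` and `facet` keys of Solr's JSON Request API and the JSON Facet `domain.excludeTags` option, and the helper name `tagged_filter_request` and the field names are made up for the example.

```python
import json


def tagged_filter_request(query, tag, filter_query, facet_field):
    """Build a JSON request whose filter is tagged and excluded from one facet."""
    return {
        "query": query,
        # "#<tag>" as the key tags the filter, like {!tag=<tag>} in local-params.
        "filter": [{f"#{tag}": filter_query}],
        "facet": {
            facet_field: {
                "type": "terms",
                "field": facet_field,
                # Ignore the tagged filter when counting this facet.
                "domain": {"excludeTags": tag},
            }
        },
    }


if __name__ == "__main__":
    body = tagged_filter_request("*:*", "colorfilt", "color:blue", "color")
    print(json.dumps(body, indent=2))
```

The body would be sent as the POST payload of a query request, in the same way as the curl example for the 7.1 JSON query DSL.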

7.X: Large index segment merging
• Problem: Overly large segments (e.g. as a result of force-merge/optimize) stop being eligible for merging, and can start accumulating >50% deleted documents, wasting space and skewing index stats.
• 7.5: - TieredMergePolicy now respects maxSegmentSizeMB by default when executing force-merge/optimize and expunge-deletes
       - TieredMergePolicy’s reclaimDeletesWeight has been replaced with a new deletesPctAllowed setting to control how aggressively deletes should be reclaimed
7.X: Replication/recovery/rolling upgrades
• 7.3: The old Leader-Initiated-Recovery (LIR) implementation is deprecated and replaced
  • To perform a rolling upgrade to Solr 8, you must be on Solr 7.3 or higher
• 7.4: - IndexFetcher now skips fetching identical files
       - Buffering updates are written to a separate TLog
       - Parallel replay of buffering TLogs
7.X: Block-join / nested documents
• 7.3: Added filters and excludeTags local-params for {!parent} and {!child} query parsers, usable for multi-select faceting
• 7.5: WIP: Allow Solr to more faithfully represent deeply nested document relationships, rather than requiring reconstruction based on the flattened list of child docs returned by Solr
7.X: Miscellaneous
• 7.3: add-distinct atomic updates
• 7.4: - Ignore large document URP
       - TLog: maxSize auto hard-commit setting (in addition to maxDocs & maxTime)
• 7.5: Custom cluster properties allowed with ext. prefix
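The `add-distinct` atomic update modifier from 7.3 adds values to a multi-valued field only when they are not already present. The function below is a local model of that behaviour for illustration, not Solr code, and the `id` and `tags` field names are invented for the example.

```python
def add_distinct(existing, new_values):
    """Append each new value only if the field does not already contain it."""
    result = list(existing)
    for value in new_values:
        if value not in result:
            result.append(value)
    return result


# Shape of an atomic update document using the add-distinct modifier.
# The "id" and "tags" field names are made up for this example.
update_doc = {"id": "book-1", "tags": {"add-distinct": ["search", "lucene"]}}

if __name__ == "__main__":
    current_tags = ["solr", "search"]
    print(add_distinct(current_tags, update_doc["tags"]["add-distinct"]))
    # prints ['solr', 'search', 'lucene']
```

By contrast, the plain `add` modifier appends values without checking for duplicates.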
8.0
• Autoscaling
• Index upgrades
• HTTP/2
• Miscellaneous
8.0: Autoscaling
• Suggestions API: rebalance options even if no violations
• Suggestions API: add-replica for lost replicas
• maxOps limit for index size trigger
• Autoscaling policy framework will be the default replica placement strategy
8.0: Index upgrades
• 7.0: Lucene indexes record the major Lucene version that created the index, and the minimum Lucene version that contributed to segments.
• 8.0: Version N-2 or older indexes will now fail to open, even if they have been merged into an N-1 index.
  • IndexUpgrader will not upgrade 6.X or earlier indexes
  • Re-indexing will be required to upgrade
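The 8.0 rule comes down to a comparison against the major version that originally created the index, regardless of later merges. A minimal model of that check follows; the function name `can_open_index` is hypothetical, and this is not Lucene's actual code.

```python
def can_open_index(created_major, current_major):
    """Model of the rule: only indexes created by version N or N-1 open under N.

    created_major is the major version recorded when the index was first
    created; merging or upgrading segments later does not change it.
    """
    return current_major - 1 <= created_major <= current_major


if __name__ == "__main__":
    for created in (6, 7, 8):
        status = "opens" if can_open_index(created, 8) else "must be re-indexed"
        print(f"index created by {created}.X under 8.0: {status}")
```

An index first created under 6.X and later rewritten by 7.X still reports 6 as its creation version, which is why it falls outside the allowed range under 8.0.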
8.0: HTTP/2
• May 2018: Mark Miller announced his Star Burst effort: many cleanups and performance enhancements
• July 2018: Cao Manh Dat took up the HTTP/2 aspects: SOLR-12639
• Indexing test: 33M docs, 1 shard, 2 replicas (SOLR-12642)
  • Garbage: Leader: 26% less; replica: 76% less
  • Indexing throughput: 54% higher
  • CPU time: Leader: 39% higher; replica: 76% lower
• Ready to merge back to master, pending release of Jetty 9.4.13, containing SPNEGO HTTP/2 implementation
8.0: Miscellaneous
• Lucene: scores must be non-negative
• Function(Score)Query-s convert negative scores to zero
• TODO: remove deprecations
• Trie fields? Removal effectively blocked by:
  • SOLR-12074: Add numeric equivalent to StrField
  • SOLR-11127: Mechanism to migrate the .system collection (a.k.a. blob store) schema from Trie (pre-7.0) to Points (7.0+)
8.X
• Lucene/Solr minimum JDK
• Luke: Lucene Toolbox
• New Lucene features
8.X: Lucene/Solr minimum JDK
• Oracle will end free JDK 8 support in January 2019
• Both JDK 9 & 10 are already EOL, no more Oracle support
• JDK 11 will very likely be next minimum supported JDK, no schedule yet
• Under JDK 9+, Solr’s Hadoop-related functionality has problems, including with Kerberos
• Uwe Schindler’s Jenkins server tests Lucene/Solr on Oracle 9+10+11+12 JDKs
• All have higher Solr test failure rates than on JDK 8
8.X: Luke: UI framework & licensing
• Andrzej Bialecki: Initial implementation: Thinlet, GPL
• Mark Harwood: GWT
• Mark Miller: Apache Pivot
• Dmitry Kan and Tomoko Uchida took ownership on Github
• Tomoko Uchida: JavaFX (bundled w/JDK 8)
• LUCENE-2562: Make Luke a Lucene/Solr Module
• JavaFX/OpenJFX unbundled from Java 11 JDK, GPL+CPE
• Tomoko Uchida: Swing (7.5 release available)
8.X: New Lucene features
• Index impacts, Block-Max WAND, similarity cleanups
• Some queries (especially term queries and disjunctions) are much faster when number of hits is not required
• FeatureField: incorporate static relevance signals, e.g. PageRank
• Soft deletes
• Merge policy retains deleted docs according to policy
• Enables document history, e.g. for time-travel indexes
• RAMDirectory replaced by ByteBuffersDirectory
Questions?
Thank you!
Steve Rowe
Senior Software Engineer, Lucidworks
@steven_a_rowe
#Activate18 #ActivateSearch

 
Hand Rolled Applicative User Validation Code Kata
Hand Rolled Applicative User ValidationCode KataHand Rolled Applicative User ValidationCode Kata
Hand Rolled Applicative User Validation Code Kata
 
E-commerce Application Development Company.pdf
E-commerce Application Development Company.pdfE-commerce Application Development Company.pdf
E-commerce Application Development Company.pdf
 

Lucene/Solr 8: The next major release

  • 1. Lucene/Solr 8: 
 The next major release Steve Rowe Senior Software Developer, Lucidworks @steven_a_rowe #Activate18 #ActivateSearch
  • 2. Agenda • Recent release cadence • 7.X • 8.0 • 8.X YOU
 ARE HERE
  • 3. 7.X average: 11 weeks • 6.X average: 10 weeks
  • 4. 7.X 1. Metrics 2. Autoscaling 3. CDCR 4. Time Routed Aliases 5. Replica types 6. Streaming expressions 7. JSON facet API 8. Configset / schema 9. Text Analysis / ML 10. Collections API 11. Queries 12. Large index segment merging 13. Replication / recovery / rolling updates 14. Block-join / nested docs 15. Miscellaneous
  • 5. 7.X: Metrics • Continuation of 6.X work to support Autoscaling efforts
 • 7.0: - Aggregated metrics collected in overseer
 - solrconfig.xml <jmx> ➞ solr.xml <metrics><reporter>
 • 7.1: Prometheus metrics exporter contrib
 • 7.4: /admin/metrics/history API: basic long-term key metric time series aggregation
 - Fixed-width windows at several resolutions
 - Not yet in Admin UI: SOLR-12426
  • 6. 7.X: Autoscaling
 • 7.0: - Preferences and policy DSL: flexible replica placement
 [ { minimize: cores }, { maximize: freedisk } ]
 { replica: "<2", shard: "#EACH", node: "#ANY" }
 - Diagnostics API: return sorted nodes, policy violations
 • 7.1: - autoAddReplicas ported to autoscaling framework
 - Add/remove/suspend/resume triggers and listeners
 - Triggers for added and lost nodes
 - ComputePlanAction / ExecutePlanAction
 - /autoscaling/history API: cluster events and actions
 • 7.2: - Search rate trigger
 - /autoscaling/suggestions API
 - UTILIZENODE collections API command
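The preferences and policy rule on slide 6 are written in relaxed JSON. A minimal sketch of the same rules as strict JSON payloads, assuming the `set-cluster-preferences` and `set-cluster-policy` commands of the autoscaling write API. The host, port, endpoint path and helper names below are illustrative assumptions, and nothing is sent over the network:

```python
import json

# Assumed endpoint of a local Solr 7.x node; adjust for a real cluster.
AUTOSCALING_URL = "http://localhost:8983/solr/admin/autoscaling"


def preferences_command():
    """Cluster preferences from the slide: fewest cores first, then most free disk."""
    return {"set-cluster-preferences": [{"minimize": "cores"},
                                        {"maximize": "freedisk"}]}


def policy_command():
    """Policy rule from the slide: fewer than 2 replicas of each shard on any node."""
    return {"set-cluster-policy": [{"replica": "<2",
                                    "shard": "#EACH",
                                    "node": "#ANY"}]}


def to_json(command):
    """Serialize a command as strict JSON (quoted keys and string values)."""
    return json.dumps(command)


def as_curl(command):
    """Render the equivalent curl invocation as a string (hypothetical helper)."""
    return ("curl -X POST -H 'Content-type:application/json' "
            f"-d '{to_json(command)}' {AUTOSCALING_URL}")


if __name__ == "__main__":
    print(as_curl(preferences_command()))
    print(as_curl(policy_command()))
```

Printing the command rather than executing it keeps the sketch safe to run without a cluster.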
  • 7. 7.X: Autoscaling • 7.3: - Simulation framework
 - Arbitrary metric threshold trigger
 - Scheduled trigger
 - Admin UI to display and execute suggestions
  • 8. 7.X: Autoscaling
 • 7.4: - Periodic house-keeping task: cleans up inactive shards
 - Index size trigger: document count or size in bytes
 • 7.5: - Policy replica attribute: #ALL, #EQUAL, percentage, range, and floating point values
 - Policy cores attribute: #EQUAL, percentage, range, and floating point values
 - Percentage in freedisk policy attribute
 - Simulation framework: test scaling up to 1 billion docs
  • 9. 7.X: Cross Data Center Replication • 7.2: Support bi-directional syncing of CDCR clusters. This is not active-active, but rather passive-active or active-passive: only one active cluster at a time.
  • 10. 7.X: Time Routed Aliases
 • 7.3: - Specialization of Solr’s collection alias feature
 - Support time series data, e.g. logs / sensor data
 - Maintain performance under continuous indexing
 - CREATEALIAS: start, interval, retention policy
 - Automatically create new collections
 - Automatically delete old collections (optional)
 - Route updates based on timestamp
 - Search against all aliased collections*
 • 7.5: Preemptively create the next collection when updates are near the latest collection’s end date (optional)
 * Pending optimization: minimize queried collections (SOLR-9562)
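The "route updates based on timestamp" idea can be illustrated with a small sketch. This is not Solr's implementation: it assumes a fixed-length interval (Solr expresses intervals as date math), and the collection naming scheme and the `route_collection` helper are hypothetical:

```python
from datetime import datetime, timedelta, timezone


def route_collection(alias, start, interval, doc_time):
    """Return the name of the time slice (collection) that contains doc_time.

    alias:    alias name used as the collection name prefix (assumed scheme)
    start:    start of the first time slice
    interval: fixed length of every slice
    doc_time: timestamp taken from the document's routing field
    """
    if doc_time < start:
        raise ValueError("timestamp precedes the start of the alias")
    slices_elapsed = (doc_time - start) // interval   # whole intervals since start
    slice_start = start + slices_elapsed * interval
    return f"{alias}_{slice_start:%Y-%m-%d}"


if __name__ == "__main__":
    start = datetime(2018, 10, 1, tzinfo=timezone.utc)
    doc_time = datetime(2018, 10, 19, 12, 0, tzinfo=timezone.utc)
    # Weekly slices: 2018-10-19 falls in the slice starting 2018-10-15.
    print(route_collection("logs", start, timedelta(days=7), doc_time))
```

Because each update lands in exactly one slice, old slices stop receiving writes and can be dropped whole once they fall outside the retention policy.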
  • 11. 7.X: Replica types
 • 7.0:
 Type | Indexes locally | Supports soft commit & RTG | Pulls segments from leader | Writes to TLog | Can become shard leader | Queryable
 NRT | ✅ | ✅ | - | ✅ | ✅ | ✅
 TLOG leader | ✅ | ✅ | - | ✅ | ✅ | ✅
 TLOG | - | - | ✅ | ✅ | ✅ | ✅
 PULL | - | - | ✅ | - | - | ✅
 • 7.4: Query param to prioritize replicas by type, e.g. shards.preference=replica.type:PULL,replica.type:TLOG
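A minimal sketch of how the shards.preference parameter from the 7.4 bullet could be attached to a query URL. The host, port, collection name `mycollection` and the `select_url` helper are assumptions for illustration, and no request is actually sent:

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr node.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(collection, query, preferred_types):
    """Build a /select URL that prefers replicas of the given types, in order."""
    preference = ",".join(f"replica.type:{t}" for t in preferred_types)
    params = {"q": query, "shards.preference": preference}
    return f"{SOLR_BASE}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Prefer PULL replicas, then TLOG replicas, keeping query load off NRT replicas.
    print(select_url("mycollection", "*:*", ["PULL", "TLOG"]))
```

The order of the list matters: earlier entries are the more preferred replica types.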
  • 12. 7.X: Streaming expressions • Parallel computation function suite • Some use cases: MapReduce, aggregations, parallel SQL, pub/ sub messaging, graph traversal, machine learning, statistical programming • Each 7.X release has added
 many new functions • 7.5: Ref guide:
 Math Expressions User Guide
  • 13. 7.X: JSON Facet API • 7.0: Terms facets: added optional refinement support • 7.4: Semantic Knowledge Graph support via new 
 relatedness() aggregate function • Finds ad-hoc relationships by scoring documents relative to foreground and background document sets • 7.5: Heatmap facet support
  • 14. 7.X: Configsets / schema
 • 7.0: - _default configset
 - Data-driven schema: auto-guessed text fields indexed 2 ways:
 • tokenized for search
 • strings for sorting/faceting: "*_str" string field, max 256 chars
 - Turn off data-driven schema functionality:
 curl http://host:8983/solr/mycollection/config -d "{ set-user-property: { update.autoCreateFields: false }}"
 • 7.5: Disable configset upload: -Dconfigset.upload.enabled=false
  • 15. 7.X: Text analysis / machine learning
 • 7.1: Bengali normalizer and stemmer
 • 7.2: Enable off-ZooKeeper storage of large (>1MB) LTR models
 • 7.3: OpenNLP integration: tokenization, POS tagging, phrase chunking, lemmatization, NER, language detection
 • 7.4: - ProtectedTermFilterFactory: don’t filter protected terms
 - TaggerRequestHandler (a.k.a. SolrTextTagger): NER
 • 7.5: - "nori" Korean morphological text analysis: "*_txt_ko"
 - PhrasesIdentificationComponent: identify and score candidate query phrases based on index statistics
 - UIMA integration removed
  • 16. 7.X: Collections API
 • 7.3: Add collection level properties similar to cluster properties
 • 7.4: Cluster-wide defaults for numShards, nrtReplicas, tlogReplicas, pullReplicas
 • 7.5: - Support co-locating replicas of two or more collections together in a node via the withCollection parameter to the CREATE and MODIFYCOLLECTION commands
 - SPLITSHARD: New split method using hard links: splitMethod=link
 • 3-5 times faster than the original splitMethod=rewrite
 • Slows down replication
 • Increases disk usage on replica nodes
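A minimal sketch of a SPLITSHARD request that selects the split method named on the slide. The host, port, collection and shard names, and the `splitshard_url` helper are illustrative assumptions; the URL is only built, not requested:

```python
from urllib.parse import urlencode

# Assumed Collections API endpoint of a local Solr node.
COLLECTIONS_API = "http://localhost:8983/solr/admin/collections"


def splitshard_url(collection, shard, split_method="link"):
    """Build a SPLITSHARD URL; split_method is 'link' or 'rewrite' per the slide."""
    if split_method not in ("link", "rewrite"):
        raise ValueError("split_method must be 'link' or 'rewrite'")
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        "splitMethod": split_method,
    }
    return f"{COLLECTIONS_API}?{urlencode(params)}"


if __name__ == "__main__":
    print(splitshard_url("mycollection", "shard1"))
```

Given the trade-offs listed on the slide, `link` trades a faster split for slower replication and more disk use on replica nodes, so the choice depends on which cost matters more for the cluster.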
  • 17. 7.X: Queries • 7.1: JSON query DSL
 curl http://localhost:8983/solr/books/query -d ' { query: { bool: { must: [ "title:solr", {lucene: {df: content, query: "lucene solr"}} ], must_not: [ {frange: {u: 3.0, query: ranking}} ]}}}'
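The curl example above uses relaxed JSON with unquoted keys. A minimal sketch of the same query built as a Python dictionary and serialized as strict JSON, which can be easier to generate programmatically. The `build_query` helper is hypothetical, and the request body is only printed, not sent:

```python
import json


def build_query():
    """Boolean query from the slide: two required clauses and one excluded clause."""
    return {
        "query": {
            "bool": {
                "must": [
                    "title:solr",
                    {"lucene": {"df": "content", "query": "lucene solr"}},
                ],
                "must_not": [
                    # frange with upper bound u=3.0 over the 'ranking' function query
                    {"frange": {"u": 3.0, "query": "ranking"}},
                ],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_query(), indent=2))
```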
  • 18. 7.X: Queries
 • 7.2: New synonymQueryStyle field type option: enable generation of appropriate queries for hierarchical relations between overlapping terms
 - as_same_term (default): SynonymQuery(bird,robin)
 - pick_best: Dismax(bird,robin)
 - as_distinct_terms: (bird OR robin)
 • 7.4: JSON query DSL: Enable query/filter tagging, e.g. { "#colorfilt" : "color:blue" } equivalent to local-param {!tag=colorfilt}color:blue

  • 19. 7.X: Large index segment merging
 • Problem: Overly large segments (e.g. as a result of force-merge/optimize) stop being eligible for merging, and can start accumulating >50% deleted documents, wasting space and skewing index stats.
 • 7.5: - TieredMergePolicy now respects maxSegmentSizeMB by default when executing force-merge/optimize and expunge-deletes
 - TieredMergePolicy’s reclaimDeletesWeight has been replaced with a new deletesPctAllowed setting to control how aggressively deletes should be reclaimed
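To make the deleted-documents problem concrete, here is a small arithmetic sketch. It is not TieredMergePolicy's algorithm: it only computes the share of deleted documents across a set of segments and compares it with a threshold. The segment numbers and the threshold of 33 are illustrative assumptions:

```python
def deleted_pct(segments):
    """Percentage of deleted docs across segments.

    segments: list of (max_doc, deleted_docs) pairs, one per segment.
    """
    total = sum(max_doc for max_doc, _ in segments)
    deleted = sum(deleted_docs for _, deleted_docs in segments)
    return 100.0 * deleted / total if total else 0.0


def exceeds_allowed(segments, deletes_pct_allowed=33.0):
    """True when the index holds a larger share of deletes than the threshold."""
    return deleted_pct(segments) > deletes_pct_allowed


if __name__ == "__main__":
    # One oversized segment that accumulated deletes, plus a small fresh segment.
    index = [(1000, 600), (200, 10)]
    print(round(deleted_pct(index), 2), exceeds_allowed(index))
```

In this example 610 of 1200 documents are deleted, so the oversized segment alone pushes the index past the threshold, which is the situation the 7.5 changes are meant to address.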
  • 20. 7.X: Replication/recovery/rolling upgrades
 • 7.3: The old Leader-Initiated-Recovery (LIR) implementation is deprecated and replaced
 - To perform a rolling upgrade to Solr 8, you must be on Solr 7.3 or higher
 • 7.4: - IndexFetcher now skips fetching identical files
 - Buffering updates are written to a separate TLog
 - Parallel replay of buffering TLogs
  • 21. 7.X: Block-join / nested documents • 7.3: Added filters and excludeTags local-params for
 {!parent} and {!child} query parsers, usable for
 multi-select faceting • 7.5: WIP: Allow Solr to more faithfully represent deeply
 nested document relationships, rather than requiring
 reconstruction based on the flattened list of child docs
 returned by Solr
  • 22. 7.X: Miscellaneous • 7.3: add-distinct atomic updates • 7.4: - Ignore large document URP
 - TLog: maxSize auto hard-commit setting
 (in addition to maxDocs & maxTime) • 7.5: Custom cluster properties allowed with ext. prefix
  • 23. 8.0 • Autoscaling • Index upgrades • HTTP/2 • Miscellaneous
  • 24. 8.0: Autoscaling • Suggestions API: rebalance options even if no violations • Suggestions API: add-replica for lost replicas • maxOps limit for index size trigger • Autoscaling policy framework will be the default replica placement strategy
  • 25. 8.0: Index upgrades
 • 7.0: Lucene indexes record the major Lucene version that created the index, and the minimum Lucene version that contributed to segments.
 • 8.0: Version N-2 or older indexes will now fail to open, even if they have been merged into an N-1 index.
 - IndexUpgrader will not upgrade 6.X or earlier indexes
 - Re-indexing will be required to upgrade
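The N-2 rule can be expressed as a one-line check on the recorded creation version. This is a sketch of the rule as stated on the slide, not Lucene's code, and the function and parameter names are hypothetical:

```python
def can_open(index_created_major, lucene_major):
    """Apply the slide's rule: an index opens only if it was created by
    major version N or N-1, where N is the running Lucene major version.

    index_created_major: major version recorded when the index was first
    created, which merging into newer segments does not change.
    """
    return index_created_major >= lucene_major - 1


if __name__ == "__main__":
    for created in (6, 7, 8):
        print(f"created by {created}.x, opened by 8.x:", can_open(created, 8))
```

The check uses the creation version rather than the version of the newest segment, which is why merging or running IndexUpgrader on a 6.X index under 7.X does not make it openable in 8.0.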
  • 26. 8.0: HTTP/2
 • May 2018: Mark Miller announced his Star Burst effort: many cleanups and performance enhancements
 • July 2018: Cao Manh Dat took up the HTTP/2 aspects: SOLR-12639
 • Indexing test: 33M docs, 1 shard, 2 replicas (SOLR-12642)
 - Garbage: Leader: 26% less; replica: 76% less
 - Indexing throughput: 54% higher
 - CPU time: Leader: 39% higher; replica: 76% lower
 • Ready to merge back to master, pending release of Jetty 9.4.13, containing SPNEGO HTTP/2 implementation
  • 27. 8.0: Miscellaneous • Lucene: scores must be non-negative • Function(Score)Query-s convert negative scores to zero • TODO: remove deprecations • Trie fields? Removal effectively blocked by: • SOLR-12074: Add numeric equivalent to StrField • SOLR-11127: Mechanism to migrate schema for .system collection (a.k.a. blob store) schema from Trie (pre-7.0) to Points (7.0+)
  • 28. 8.X • Lucene/Solr minimum JDK • Luke: Lucene Toolbox • New Lucene features
  • 29. 8.X: Lucene/Solr minimum JDK • Oracle will end free JDK 8 support in January 2019 • Both JDK 9 & 10 are already EOL, no more Oracle support • JDK 11 will very likely be next minimum supported JDK, no schedule yet • Under JDK 9+, Solr’s Hadoop-related functionality has problems, including with Kerberos • Uwe Schindler’s Jenkins server tests Lucene/Solr on Oracle 9+10+11+12 JDKs • All have higher Solr test failure rates than on JDK 8
  • 30. 8.X: Luke: UI framework & licensing • Andrzej Bialecki: Initial implementation: Thinlet, GPL • Mark Harwood: GWT • Mark Miller: Apache Pivot • Dmitry Kan and Tomoko Uchida took ownership on Github • Tomoko Uchida: JavaFX (bundled w/JDK 8) • LUCENE-2562: Make Luke a Lucene/Solr Module • JavaFX/OpenJFX unbundled from Java 11 JDK, GPL+CPE • Tomoko Uchida: Swing (7.5 release available)
  • 31. 8.X: New Lucene features • Index impacts, Block-Max WAND, similarity cleanups • Some queries (especially term queries and disjunctions) are much faster when number of hits is not required • FeatureField: incorporate static relevance signals, e.g. PageRank • Soft deletes • Merge policy retains deleted docs according to policy • Enables document history, e.g. for time-travel indexes • RAMDirectory replaced by ByteBuffersDirectory
  • 33. Thank you! Steve Rowe Senior Software Engineer, Lucidworks @steven_a_rowe #Activate18 #ActivateSearch