Batch indexing & near real time, keeping things fast.

May 15, 2013

Presented by Marc Sturlese, Architect, Backend engineer, Trovit

In this talk I will explain how we combine a mixed architecture: Hadoop for batch indexing, and Storm, HBase and ZooKeeper to keep our indexes updated in near real time. I will talk about why we didn't choose just a default SolrCloud and its real-time feature (mainly to avoid hitting merges while serving queries on the slaves), and about the advantages and complexities of having a mixed architecture. Both parts of the infrastructure, and how they are coordinated, will be explained in detail. Finally, I will mention future directions, including how we plan to use Lucene's real-time feature.

Presentation Transcript