What you get by replicating Lucene indexes on the Infinispan Data Grid (Berlin Buzzwords 2012)

At its core, Infinispan is an in-memory data grid with options for multi-node replication or distribution. It can index and search stored data using Apache Lucene or, inverting the roles, let Lucene store its indexes in the grid. This talk illustrates how to get started and what benefits (or drawbacks!) you get by storing your Lucene indexes in Infinispan.

http://berlinbuzzwords.de/sessions/what-you-get-replicating-lucene-indexes-infinispan-data-grid

Published in: Technology

  1. What you get by replicating Lucene indexes on the Infinispan Data Grid
     Sanne Grinovero, Red Hat, 4 June 2012
  2. Who is that guy?
     • Sanne Grinovero
       • From this planet
     • Team Hibernate
       • Hibernate Search
       • Hibernate OGM
     • Team Infinispan
       • Infinispan Core
       • Infinispan Query
     • Apache Lucene, Netty, HotSpot, ANTLR, JGroups, Byteman, The Jokre
  3. What are we talking about?
     • Apache Lucene
     • Infinispan
       • Integrations with Lucene
         ● Infinispan Lucene Directory
  4. Apache Lucene ?
  5. • An in-memory datagrid
       • Memory of multiple nodes
       • Cluster modes
       • CacheLoaders
       • Integrations with Lucene
       • Lucene Directory
  6. Infinispan API?
     • Map-like key/value store
       • JSR 107 javax.cache.Cache interface
       • JSR 347 ??
     • Asynchronous API
  7. In practice:
       cache.put( "user-34", userInstance );
       cache.get( "user-34" );
       cache.remove( "user-34" );
       cache.putIfAbsent( "user-38", other );
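The four calls on slide 7 are ordinary `java.util.concurrent.ConcurrentMap` operations, which the Infinispan `Cache` interface extends. As a minimal sketch that runs without a cluster, the example below uses a plain `ConcurrentHashMap` as a local stand-in for the cache. That substitution is an assumption made for illustration, as are the class name `CacheApiSketch` and the string values used in place of the slide's `userInstance` and `other` objects.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class CacheApiSketch {

    // Runs the slide's four operations against a local map and returns it.
    static ConcurrentMap<String, String> demo() {
        // Assumption: a local ConcurrentHashMap stands in for a clustered cache.
        ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

        cache.put("user-34", "Alice");          // store a value under a key
        String found = cache.get("user-34");    // read it back
        if (!"Alice".equals(found)) {
            throw new IllegalStateException("unexpected value: " + found);
        }
        cache.remove("user-34");                // delete the entry

        cache.putIfAbsent("user-38", "Bob");    // key is absent, so "Bob" is stored
        cache.putIfAbsent("user-38", "Carol");  // key is present, so "Bob" is kept
        return cache;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Against a real grid the same four lines would be written the same way; only the way the `cache` instance is obtained changes.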
  8. Distributed Data
  9. Connected via JGroups
     A Toolkit for Reliable Multicast Communication
     http://jgroups.org
  10. Or remote clients via:
     • Memcached
     • REST
     • Hot Rod (Ruby, Python, C, C#, ...)
       • Netty
  11. Consistent Hashing: DIST
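To make the idea behind slide 11 concrete, here is a small sketch of generic consistent hashing: nodes are placed on a hash ring, and each key is owned by the first few distinct nodes found clockwise from the key's hash. This is a textbook illustration, not Infinispan's actual implementation. The class `HashRingSketch`, the CRC32 hash, the virtual-point count, and the choice of two owners per key are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.zip.CRC32;

class HashRingSketch {
    // Ring position -> node name; each node occupies several virtual points.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    void addNode(String node, int virtualPoints) {
        for (int i = 0; i < virtualPoints; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // Returns up to numOwners distinct nodes, walking clockwise from the key's hash.
    List<String> ownersOf(String key, int numOwners) {
        List<String> owners = new ArrayList<>();
        long h = hash(key);
        List<String> clockwise = new ArrayList<>(ring.tailMap(h, true).values());
        clockwise.addAll(ring.headMap(h, false).values()); // wrap around the ring
        for (String node : clockwise) {
            if (owners.size() == numOwners) {
                break;
            }
            if (!owners.contains(node)) {
                owners.add(node);
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch();
        for (String n : new String[] {"A", "B", "C", "D"}) {
            ring.addNode(n, 50);
        }
        System.out.println("owners of user-34: " + ring.ownersOf("user-34", 2));
    }
}
```

The useful property is that removing a node that does not own a key leaves that key's owners unchanged, so a topology change only moves the data that lived on the affected node.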
  12. Transactions!
  13. JBoss AS7 core component
     • Cluster nodes autodiscovery
     • Session replication / failover
     • Hibernate second level cache
     • mod_cluster integration
  14. In-memory volatile?
     Cache Stores: durability, warm caches, more capacity...
     • Cassandra
     • HBase
     • JDBC
     • Clouds (S3, ...)
     • Plain Old Files
     • Many more + custom
  15. Back on Lucene: Single Writer lock
  16. Queue-based clustering (filesystem index)
  17. Lucene index storage
  18. Index stored in Infinispan
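Storing an index in a key/value grid means mapping index files onto cache entries, and the later mentions of segment chunking and chunk sizes suggest that files are split into chunks. The sketch below shows that general idea only: a file's bytes are cut into fixed-size chunks, each stored under a key built from the file name and chunk number. The class `ChunkedIndexFileSketch`, the key format, and the use of local maps in place of a grid are assumptions for illustration and do not reflect the real Infinispan Lucene Directory's key types or metadata layout.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class ChunkedIndexFileSketch {
    // Assumption: local maps stand in for the distributed cache.
    private final Map<String, byte[]> chunks = new ConcurrentHashMap<>();
    private final Map<String, Integer> chunkCounts = new ConcurrentHashMap<>();
    private final int chunkSize;

    ChunkedIndexFileSketch(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    // Hypothetical key format: file name, separator, chunk number.
    private static String chunkKey(String fileName, int chunkNumber) {
        return fileName + "|" + chunkNumber;
    }

    // Splits the content into chunks of at most chunkSize bytes and stores each one.
    void writeFile(String fileName, byte[] content) {
        int count = (content.length + chunkSize - 1) / chunkSize;
        for (int i = 0; i < count; i++) {
            int from = i * chunkSize;
            int to = Math.min(from + chunkSize, content.length);
            chunks.put(chunkKey(fileName, i), Arrays.copyOfRange(content, from, to));
        }
        chunkCounts.put(fileName, count);
    }

    // Reassembles the file by reading its chunks in order.
    byte[] readFile(String fileName) {
        Integer count = chunkCounts.get(fileName);
        if (count == null) {
            throw new IllegalArgumentException("unknown file: " + fileName);
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < count; i++) {
            byte[] chunk = chunks.get(chunkKey(fileName, i));
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    int chunkCount(String fileName) {
        return chunkCounts.getOrDefault(fileName, 0);
    }
}
```

With a chunk size of 4, a 10-byte file is stored as three entries (4, 4 and 2 bytes), so a reader that needs only part of a large segment can fetch just the chunks it touches.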
  19. Example architecture: JIRA / Scarlet
  20. Hints
     • Some tuning options might have different effects than what you're used to
     • Network is orders of magnitude faster than disk (YMMV)
       • But data locality helps
       • Balance resources
     • Get mergers to avoid segment chunking, or readlocks will engage
  21. "benchmarks", stats and more lies
     [Two bar charts, Write ops/sec (axis 0 to 400) and Queries/sec (axis 0 to 25000), comparing RAMDirectory, Infinispan 0, Infinispan D4, Infinispan D40, FSDirectory and Infinispan Local]
  22. It's not about the figures
     [Same two bar charts as on slide 21]
  23. What's next?
     • Infinispan (core) 5.2 and 6
     • Lucene 4.x
     • Dynamic chunk sizes
     • Ad-hoc "Lucene native" CacheStore
       • NIO byte buffers?
  24. Conclusions
     • Quick index replication
     • Transactions
     • Not a replacement for shards
     • Cloud-friendly
       • Delegates to any storage
  25. Q&A
     http://infinispan.org @Infinispan
     http://in.relation.to @Hibernate
     http://jboss.org @SanneGrinovero