Your SlideShare is downloading. ×
Infinispan,Lucene,Hibername OGM
Upcoming SlideShare
Loading in...5

Thanks for flagging this SlideShare!

Oops! An error has occurred.

Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Infinispan,Lucene,Hibername OGM


Published on

Sanne Grinovero - JBug Milano

Sanne Grinovero - JBug Milano

Published in: Technology, Business
  • Be the first to comment

  • Be the first to like this

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

No notes for slide


  • 1. Padova, InfoCamere JBoss User Group 12 Aprile 2012
  • 2. Chi sono?• Team Hibernate Sanne Grinovero– Hibernate Search Italiano, Olandese, Newcastle Red Hat: JBoss, Engineering– Hibernate OGM• Infinispan– Infinispan Core– Infinispan Query– JGroups• Apache Lucene
  • 3. Infinispan• Cache distribuita• Datagrid scalabile e transazionale: performance estreme e cloud• NoSQL “DataBase”: key-value store– Come si interroga un data grid ? SELECT * FROM GRID
  • 4. Interrogareuna “Grid”Object v =cache.get(“c7”);
  • 5. Senza chiave, non puoiottenere il valore.
  • 6. É pratico il soloaccesso per chiave?
  • 7. Test sulla mia libreria• Dové Hibernate Search in Action?• Mi passi ISBN 978-1- 933988-17-7 ?• Prendi i libri su Gaudí ?
  • 8. Come implementare queste funzioni su un Key/Value store?• Dové Hibernate Search in Action?• Mi passi ISBN 978-1-933988-17-7 ?• Trovi i libri su Gaudí ?
  • 9. document based NoSQL: Map/ReduceInfinispan non é propriamente document based maoffre Map/Reduce.Eppure non é escluso luso di JSON, XML, YAML, Java: public class Book implements Serializable { final String title; final String author; final String editor; public Book(String title, String author, String editor) { this.title = title; = author; this.editor = editor; } }
  • 10. Iterate & collectclass TitleBookSearcher implements Mapper<String, Book, String, Book> { final String title; public TitleBookSearcher(String t) { title = t; } public void map(String key, Book value, Collector collector){ if ( title.equals( value.title ) ) collector.emit( key, value ); }class BookReducer implements Reducer<String, Book> { public Book reduce(String reducedKey, Iterator<Book> iter) { return; }}
  • 11. Implementare queste semplici funzioni:✔ Trova “Hibernate Search in Action”?✔ Trova per codice “ISBN 978-1-933988-17-7” ?✗ Quanti libri a proposito di“Shakespeare” ? • Per uno score corretto in ricerche fulltext servono le frequenze dei frammenti di testo relative al corpus. • Il Pre-tagging é poco pratico e limitante
  • 12. Apache Lucene• Progetto open source Apache™• Integrato in innumerevoli progetti• .. tra cui Hibernate via Hibernate Search• Clusterizzabile via Infinispan– Performance– Real time– High availability
  • 13. Cosa offre Lucene?• Ricerche per Similarity score• Analisi del testo– Sinonyms, Stopwords, Stemming, ...• Reusable declarative Filters• TermVectors• MoreLikeThis• Faceted Search• Veloce!
  • 14. Lucene: Stopwordsa, able, about, across, after, all, almost, also, am,among, an, and, any, are, as, at, be, because, been,but, by, can, cannot, could, dear, did, do, does,either, else, ever, every, for, from, get, got, had,has, have, he, her, hers, him, his, how, however, i, if,in, into, is, it, its, just, least, let, like, likely,may, me, might, most, must, my, neither, no, nor, not,of, off, often, on, only, or, other, our, own, rather,said, say, says, she, should, since, so, some, than,that, the, their, them, then, there, these, they, this,tis, to, too, twas, us, wants, was, we, were, what,when, where, which, while, who, whom, why, will, with,would, yet, you, your
  • 15. Filters
  • 16. Faceted Search
  • 17. Facciamo un bel motore diricerca che restituisce irisultati in ordinealfabetico?
  • 18. Chi usa Lucene?Nexus
  • 19. Dové la fregatura?• Necessita di un indice: risorse fisiche e di amministrazione.– in memory– on filesystem– in Infinispan• Sostanzialmente immutable segments– Ottimizzato per data mining / query, non per updates.• Un mondo di stringhe e vettori di frequenze
  • 20. Infinispan Query quickstart • Abilita indexing=true nella configurazione • Aggiungi il modulo infinispan- query.jar al classpath • Annota i POJO inseriti nella cache per le modalitá di indicizzazione <dependency> <groupId>org.infinispan</groupId> <artifactId>infinispan-query</artifactId> <version>5.1.3.FINAL</version> </dependency>
  • 21. Configurazione tramite codiceConfiguration c = new Configuration() .fluent() .indexing() .addProperty("","ram") .build();CacheManager manager = new DefaultCacheManager(c);
  • 22. Configurazione / XML<?xml version="1.0" encoding="UTF-8"?><infinispan xmlns:xsi="" xsi:schemaLocation="urn:infinispan:config:5.0" xmlns="urn:infinispan:config:5.0"><default> <indexing enabled="true" indexLocalOnly="true"> <properties> <property name="" value="..." /> <property name="" value="..." /> </properties> </indexing></default>
  • 23. Annotazioni sul modello@ProvidedId @Indexedpublic class Book implements Serializable { @Field String title; @Field String author; @Field String editor; public Book(String title, String author, String editor) { this.title = title; = author; this.editor = editor; }}
  • 24. Esecuzione di QuerySearchManager sm = Search.getSearchManager(cache);Query query = sm.buildQueryBuilderForClass(Book.class) .get() .phrase() .onField("title") .sentence("in action") .createQuery();List<Object> list = sm.getQuery(query).list();
  • 25. Architettura• Integra Hibernate Search (engine)– Listener a eventi Hibernate & transazioni • Eventi Infinispan & transazioni– Mappa tipi Java e grafi del modello a Documents di Lucene– Thin-layer design
  • 26. Index mapping
  • 27. Tests per Infinispan Query
  • 28. luceneQuery = queryBuilder.phrase() .onField( "description" ) .andField( "title" ) .sentence( "a book on highly scalable query engines" ) .enableFullTextFilter( “ready-for-shipping” ) .createQuery();CacheQuery cacheQuery = searchManager.getQuery( luceneQuery, Book.class);List<Book> objectList = cacheQuery.list();
  • 29. Architettura: Infinispan Query
  • 30. Problemi di scalabilitá• Writer locks globali• Sharing su NFS molto problematico
  • 31. Queue-based clustering (filesystem)
  • 32. Index stored in Infinispan
  • 33. Quickstart Hibernate Search • Aggiungi la dipendenza ad hibernate- search:<dependency>   <groupId>org.hibernate</groupId>   <artifactId>hibernate­search­orm</artifactId>   <version>4.1.0.Final</version></dependency>
  • 34. Quickstart Hibernate Search• Tutto il resto é opzionale:– Come gestire gli indici– Moduli di estensione, Analyzer custom– Performance tuning– Mapping custom dei tipi– Clustering • JGroups • Infinispan • JMS
  • 35. Quickstart Hibernate@Entity Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 36. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 37. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 38. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 39. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne @IndexedEmbedded    public Author getAuthor() { return author; }...
  • 40. Un secondo esempio@Entity @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; private String title; private String name; } @OneToMany private Set<Book>books;}
  • 41. Struttura dellindice@Entity @Indexed @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; @Field(store=Store.YES) private String title;@Field(store=Store.YES) } private String name; @OneToMany @IndexedEmbedded private Set<Book>books;}
  • 42. QueryString[] productFields = {"summary", ""};Query luceneQuery = // query builder or any Lucene QueryFullTextEntityManager ftEm =   Search.getFullTextEntityManager(entityManager);FullTextQuery query =   ftEm.createFullTextQuery( luceneQuery, Product.class );List<Product> items =   query.setMaxResults(100).getResultList();int totalNbrOfResults = query.getResultSize(); TotalNbrOfResults= 8.320.000 (0.002 seconds)
  • 43. Uso della DSL
  • 44. Sui risultati:• Managed POJO: modifiche alle entitá applicati sia a Lucene che al database• Paginazione JPA, familiari (standard):– .setMaxResults( 20 ).setFirstResult( 100 );• Restrizioni sul tipo, query fulltext polimorifiche:– .createQuery( luceneQuery, A.class, B.class, ..);• Projection• Result mapping
  • 45. FiltersFullTextQuery ftQuery = s // s is a FullTextSession   .createFullTextQuery( query, Product.class )   .enableFullTextFilter( "filtroMinori" )   .enableFullTextFilter( "offertaDelGiorno" )      .setParameter( "day", “20120412” )   .enableFullTextFilter( "inStockA" )      .setParameter( "location", "Padova" );List<Product> results = ftQuery.list();
  • 46. Uso di Infinispan per ladistribuzione degli indici
  • 47. Clustering di un uso Lucene “diretto”• Usando org.apache.lucene– Tradizionalmente difficile da distribuire su nodi multipli– Su qualsiasi cloud
  • 48. Nodo singolo idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  • 49. Nodi multipli idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  • 50. Le scritture non scalano?
  • 51. Suggerimenti per performance ottimali• Calibra il chunk_size per luso effettivo del vostro indice (evita i read lock evitando la frammentazione)• Verifica la dimensione dei pacchetti network: blob size, JGroups packets, network interface and hardware.• Scegli e configura un CacheLoader adatto
  • 52. Requisiti di memoria• RAMDirectory: tutto lindice (e piú) in RAM.• FSDirectory: un buon OS sa fare un ottimo lavoro di caching di IO – spesso meglio di RAMDirectory.• Infinispan: configurabile, fino alla memoria condivisa tra nodi– Flexible– Fast– Network vs. disk
  • 53. Moduli per cloud deployment scalabili One Infinispan to rule them all– Store Lucene indexes– Hibernate second level cache– Application managed cache– Datagrid– EJB, session replication in AS7– As a JPA “store” via Hibernate OGM
  • 54. Ingredienti per la cloud• JGroups DISCOVERY protocol– MPING– TCP_PING– JDBC_PING– S3_PING• Scegli un CacheLoader– Database based, Jclouds, Cassandra, ...
  • 55. Futuro prossimo• Semplificare la scalabilitá in scrittura• Auto-tuning dei parametri di clustering – ergonomics!• Parallel searching: multi/core + multi/node• A component of–
  • 56. JPA for NoSQL
  • 57. NoSQL: la flessibilitá costa• Programming model • one per product :-(• no schema => app driven schema• query (Map Reduce, specific DSL, ...)• data structure transpires• Transaction• durability / consistency
  • 58. Esempio: InfinispanDistributed Key/Value store (or Replicated, local only efficient cache, • invalidating cache)Each node is equal Just start more nodes, or kill some •No bottlenecks by design •Cloud-network friendly JGroups • And “cloud storage” friendly too! •
  • 59. ABC di Infinispanmap.put( “user-34”,userInstance );map.get( “user-34” );map.remove( “user-34” );
  • 60. É una ConcurrentMap !map.put( “user-34”, userInstance );map.get( “user-34” );map.remove( “user-34” );map.putIfAbsent( “user-38”,another );
  • 61. Qualche altro dettaglio su Infinispan ● Support for Transactions (XA) ● CacheLoaders ● Cassandra, JDBC, Amazon S3 (jclouds),... ● Tree API for JBossCache compatibility ● Lucene integration ● Two-fold ● Some Hibernate integrations ● Second level cache ● Hibernate Search indexing backend
  • 62. Obiettivi di Hibernate OGM Encourage new data usage patterns • Familiar environment • Ease of use • easy to jump in • easy to jump out • Push NoSQL exploration in enterprises • “PaaS for existing API” initiative •
  • 63. Cosé• JPA front end to key/value stores • Object CRUD (incl polymorphism and associations) • OO queries (JP-QL)• Reuses • Hibernate Core • Hibernate Search (and Lucene) • Infinispan• Is not a silver bullet • not for all NoSQL use cases
  • 64. Entitá come blob serializzati?• Serialize objects into the (key) value • store the whole graph?• maintain consistency with duplicated objects • guaranteed identity a == b • concurrency / latency • structure change and (de)serialization, class definition changes
  • 65. OGM’s approach to schema• Keep what’s best from relational model • as much as possible • tables / columns / pks• Decorrelate object structure from data structure• Data stored as (self-described) tuples• Core types limited • portability
  • 66. Query• Hibernate Search indexes entities• Store Lucene indexes in Infinispan• JP-QL to Lucene query transformation• Works for simple queries • Lucene is not a relational SQL engine
  • 67. E ora?• MongoDB• EHCache / Terracotta• Redis• Voldemort• Neo4J• Dynamo• ... Git? Spreadsheet? ...CapeDwarf?
  • 68. Q&A @Infinispan @Hibernate @SanneGrinovero