Padova, InfoCamere  JBoss User Group     12 Aprile 2012
Chi sono?•   Team Hibernate    Sanne Grinovero– Hibernate Search    Italiano, Olandese, Newcastle                      Red...
Infinispan•   Cache distribuita•   Datagrid scalabile e transazionale:    performance estreme e cloud•   NoSQL “DataBase”:...
Interrogareuna “Grid”Object v =cache.get(“c7”);
Senza chiave, non puoiottenere il valore.
É pratico il soloaccesso per chiave?
Test sulla mia   libreria• Dové Hibernate    Search in Action?•   Mi passi    ISBN 978-1-    933988-17-7 ?•   Prendi i lib...
Come implementare    queste funzioni su un      Key/Value store?•   Dové Hibernate Search in Action?•   Mi passi ISBN 978-...
document based NoSQL:        Map/ReduceInfinispan non é propriamente document based maoffre Map/Reduce.Eppure non é esclus...
Iterate & collectclass TitleBookSearcher implements                        Mapper<String, Book, String, Book> {  final Str...
Implementare queste         semplici funzioni:✔ Trova “Hibernate Search in Action”?✔ Trova per codice “ISBN 978-1-933988-1...
Apache Lucene•   Progetto open source Apache™•   Integrato in innumerevoli progetti•   .. tra cui Hibernate via Hibernate ...
Cosa offre Lucene?•   Ricerche per Similarity score•   Analisi del testo– Sinonyms, Stopwords, Stemming, ...•   Reusable d...
Lucene: Stopwordsa, able, about, across, after, all, almost, also, am,among, an, and, any, are, as, at, be, because, been,...
Filters
Faceted Search
Facciamo un bel motore diricerca che restituisce irisultati in ordinealfabetico?
Chi usa Lucene?Nexus
Dové la fregatura?•   Necessita di un indice: risorse fisiche e di    amministrazione.– in memory– on filesystem– in Infin...
Infinispan Query quickstart  • Abilita indexing=true nella       configurazione   •   Aggiungi il modulo infinispan-      ...
Configurazione tramite          codiceConfiguration c = new Configuration()   .fluent()      .indexing()      .addProperty...
Configurazione / XML<?xml version="1.0" encoding="UTF-8"?><infinispan            xmlns:xsi="http://www.w3.org/2001/XMLSche...
Annotazioni sul modello@ProvidedId @Indexedpublic class Book implements Serializable {    @Field String title;    @Field S...
Esecuzione di QuerySearchManager sm = Search.getSearchManager(cache);Query query = sm.buildQueryBuilderForClass(Book.class...
Architettura•    Integra Hibernate Search (engine)– Listener a eventi Hibernate &  transazioni    • Eventi Infinispan & tr...
Index mapping
Tests per     Infinispan Queryhttps://github.com/infinispan/infinispan
org.apache.lucene.search.Query luceneQuery =    queryBuilder.phrase()          .onField( "description" )          .andFiel...
Architettura: Infinispan         Query
Problemi di scalabilitá•   Writer locks globali•   Sharing su NFS    molto problematico
Queue-based clustering     (filesystem)
Index stored in   Infinispan
Quickstart Hibernate           Search •    Aggiungi la dipendenza ad hibernate-      search:<dependency>   <groupId>org.hi...
Quickstart Hibernate                       Search•    Tutto il resto é opzionale:– Come gestire gli indici– Moduli di este...
Quickstart Hibernate@Entity          Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String...
Quickstart Hibernate@Entity @Indexed                 Searchpublic class Essay {   @Id   public Long getId() { return id; }...
Quickstart Hibernate@Entity @Indexed                 Searchpublic class Essay {   @Id   public Long getId() { return id; }...
Quickstart Hibernate@Entity @Indexed                 Searchpublic class Essay {   @Id   public Long getId() { return id; }...
Quickstart Hibernate@Entity @Indexed                 Searchpublic class Essay {   @Id   public Long getId() { return id; }...
Un secondo esempio@Entity                        @Entitypublic class Author {          public class Book {        @Id @Gen...
Struttura dellindice@Entity @Indexed              @Entitypublic class Author {         public class Book {        @Id @Gen...
QueryString[] productFields = {"summary", "author.name"};Query luceneQuery = // query builder or any Lucene QueryFullTextE...
Uso della DSL
Sui risultati:•   Managed POJO: modifiche alle entitá applicati sia a    Lucene che al database•   Paginazione JPA, famili...
FiltersFullTextQuery ftQuery = s // s is a FullTextSession   .createFullTextQuery( query, Product.class )   .enableFullTex...
Uso di Infinispan per ladistribuzione degli indici
Clustering di un uso      Lucene “diretto”•   Usando org.apache.lucene– Tradizionalmente difficile da distribuire  su nodi...
Nodo singolo         idea di performance                       Write ops/sec                                              ...
Nodi multipli         idea di performance                       Write ops/sec                                             ...
Le scritture non    scalano?
Suggerimenti per    performance ottimali•   Calibra il chunk_size per luso effettivo    del vostro indice (evita i read lo...
Requisiti di memoria•   RAMDirectory: tutto lindice (e piú) in RAM.•   FSDirectory: un buon OS sa fare un ottimo    lavoro...
Moduli per cloud deployment scalabili   One Infinispan to rule them all– Store Lucene indexes– Hibernate second level cach...
Ingredienti per la cloud• JGroups DISCOVERY protocol– MPING– TCP_PING– JDBC_PING– S3_PING•   Scegli un CacheLoader– Databa...
Futuro prossimo•   Semplificare la scalabilitá in scrittura•   Auto-tuning dei parametri di    clustering – ergonomics!•  ...
JPA for NoSQL
NoSQL:   la flessibilitá costa• Programming model  • one per product :-(• no schema => app driven schema• query (Map Reduc...
Esempio: InfinispanDistributed Key/Value store         (or Replicated, local only efficient cache,      •      invalidatin...
ABC di Infinispanmap.put( “user-34”,userInstance );map.get( “user-34” );map.remove( “user-34” );
É una ConcurrentMap !map.put( “user-34”, userInstance );map.get( “user-34” );map.remove( “user-34” );map.putIfAbsent( “use...
Qualche altro dettaglio su       Infinispan ●   Support for Transactions (XA) ●   CacheLoaders     ●   Cassandra, JDBC, Am...
Obiettivi di Hibernate         OGM      Encourage new data usage patterns  •      Familiar environment  •      Ease of use...
Cosé• JPA front end to key/value stores   • Object CRUD (incl polymorphism and      associations)   • OO queries (JP-QL)• ...
Entitá come blob       serializzati?• Serialize objects into the (key) value  • store the whole graph?• maintain consisten...
OGM’s approach to      schema• Keep what’s best from relational model  • as much as possible  • tables / columns / pks• De...
Query• Hibernate Search indexes entities• Store Lucene indexes in Infinispan• JP-QL to Lucene query transformation• Works ...
E ora?•   MongoDB•   EHCache / Terracotta•   Redis•   Voldemort•   Neo4J•   Dynamo•   ... Git? Spreadsheet? ...CapeDwarf?
Q&Ahttp://infinispan.org   @Infinispanhttp://in.relation.to   @Hibernatehttp://jboss.org        @SanneGrinovero
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Infinispan,Lucene,Hibername OGM
Upcoming SlideShare
Loading in...5
×

Infinispan,Lucene,Hibername OGM

1,323

Published on

Sanne Grinovero - JBug Milano

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
1,323
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
31
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Infinispan,Lucene,Hibername OGM

  1. 1. Padova, InfoCamere JBoss User Group 12 Aprile 2012
  2. 2. Chi sono?• Team Hibernate Sanne Grinovero– Hibernate Search Italiano, Olandese, Newcastle Red Hat: JBoss, Engineering– Hibernate OGM• Infinispan– Infinispan Core– Infinispan Query– JGroups• Apache Lucene
  3. 3. Infinispan• Cache distribuita• Datagrid scalabile e transazionale: performance estreme e cloud• NoSQL “DataBase”: key-value store– Come si interroga un data grid ? SELECT * FROM GRID
  4. 4. Interrogareuna “Grid”Object v =cache.get(“c7”);
  5. 5. Senza chiave, non puoiottenere il valore.
  6. 6. É pratico il soloaccesso per chiave?
  7. 7. Test sulla mia libreria• Dové Hibernate Search in Action?• Mi passi ISBN 978-1- 933988-17-7 ?• Prendi i libri su Gaudí ?
  8. 8. Come implementare queste funzioni su un Key/Value store?• Dové Hibernate Search in Action?• Mi passi ISBN 978-1-933988-17-7 ?• Trovi i libri su Gaudí ?
  9. 9. document based NoSQL: Map/ReduceInfinispan non é propriamente document based maoffre Map/Reduce.Eppure non é escluso luso di JSON, XML, YAML, Java: public class Book implements Serializable { final String title; final String author; final String editor; public Book(String title, String author, String editor) { this.title = title; this.author = author; this.editor = editor; } }
  10. 10. Iterate & collectclass TitleBookSearcher implements Mapper<String, Book, String, Book> { final String title; public TitleBookSearcher(String t) { title = t; } public void map(String key, Book value, Collector collector){ if ( title.equals( value.title ) ) collector.emit( key, value ); }class BookReducer implements Reducer<String, Book> { public Book reduce(String reducedKey, Iterator<Book> iter) { return iter.next(); }}
  11. 11. Implementare queste semplici funzioni:✔ Trova “Hibernate Search in Action”?✔ Trova per codice “ISBN 978-1-933988-17-7” ?✗ Quanti libri a proposito di“Shakespeare” ? • Per uno score corretto in ricerche fulltext servono le frequenze dei frammenti di testo relative al corpus. • Il Pre-tagging é poco pratico e limitante
  12. 12. Apache Lucene• Progetto open source Apache™• Integrato in innumerevoli progetti• .. tra cui Hibernate via Hibernate Search• Clusterizzabile via Infinispan– Performance– Real time– High availability
  13. 13. Cosa offre Lucene?• Ricerche per Similarity score• Analisi del testo– Sinonyms, Stopwords, Stemming, ...• Reusable declarative Filters• TermVectors• MoreLikeThis• Faceted Search• Veloce!
  14. 14. Lucene: Stopwordsa, able, about, across, after, all, almost, also, am,among, an, and, any, are, as, at, be, because, been,but, by, can, cannot, could, dear, did, do, does,either, else, ever, every, for, from, get, got, had,has, have, he, her, hers, him, his, how, however, i, if,in, into, is, it, its, just, least, let, like, likely,may, me, might, most, must, my, neither, no, nor, not,of, off, often, on, only, or, other, our, own, rather,said, say, says, she, should, since, so, some, than,that, the, their, them, then, there, these, they, this,tis, to, too, twas, us, wants, was, we, were, what,when, where, which, while, who, whom, why, will, with,would, yet, you, your
  15. 15. Filters
  16. 16. Faceted Search
  17. 17. Facciamo un bel motore diricerca che restituisce irisultati in ordinealfabetico?
  18. 18. Chi usa Lucene?Nexus
  19. 19. Dové la fregatura?• Necessita di un indice: risorse fisiche e di amministrazione.– in memory– on filesystem– in Infinispan• Sostanzialmente immutable segments– Ottimizzato per data mining / query, non per updates.• Un mondo di stringhe e vettori di frequenze
  20. 20. Infinispan Query quickstart • Abilita indexing=true nella configurazione • Aggiungi il modulo infinispan- query.jar al classpath • Annota i POJO inseriti nella cache per le modalitá di indicizzazione <dependency> <groupId>org.infinispan</groupId> <artifactId>infinispan-query</artifactId> <version>5.1.3.FINAL</version> </dependency>
  21. 21. Configurazione tramite codiceConfiguration c = new Configuration() .fluent() .indexing() .addProperty("hibernate.search.default.directory_provider","ram") .build();CacheManager manager = new DefaultCacheManager(c);
  22. 22. Configurazione / XML<?xml version="1.0" encoding="UTF-8"?><infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:infinispan:config:5.0http://www.infinispan.org/schemas/infinispan-config-5.0.xsd" xmlns="urn:infinispan:config:5.0"><default> <indexing enabled="true" indexLocalOnly="true"> <properties> <property name="hibernate.search.option1" value="..." /> <property name="hibernate.search.option2" value="..." /> </properties> </indexing></default>
  23. 23. Annotazioni sul modello@ProvidedId @Indexedpublic class Book implements Serializable { @Field String title; @Field String author; @Field String editor; public Book(String title, String author, String editor) { this.title = title; this.author = author; this.editor = editor; }}
  24. 24. Esecuzione di QuerySearchManager sm = Search.getSearchManager(cache);Query query = sm.buildQueryBuilderForClass(Book.class) .get() .phrase() .onField("title") .sentence("in action") .createQuery();List<Object> list = sm.getQuery(query).list();
  25. 25. Architettura• Integra Hibernate Search (engine)– Listener a eventi Hibernate & transazioni • Eventi Infinispan & transazioni– Mappa tipi Java e grafi del modello a Documents di Lucene– Thin-layer design
  26. 26. Index mapping
  27. 27. Tests per Infinispan Queryhttps://github.com/infinispan/infinispan
  28. 28. org.apache.lucene.search.Query luceneQuery = queryBuilder.phrase() .onField( "description" ) .andField( "title" ) .sentence( "a book on highly scalable query engines" ) .enableFullTextFilter( “ready-for-shipping” ) .createQuery();CacheQuery cacheQuery = searchManager.getQuery( luceneQuery, Book.class);List<Book> objectList = cacheQuery.list();
  29. 29. Architettura: Infinispan Query
  30. 30. Problemi di scalabilitá• Writer locks globali• Sharing su NFS molto problematico
  31. 31. Queue-based clustering (filesystem)
  32. 32. Index stored in Infinispan
  33. 33. Quickstart Hibernate Search • Aggiungi la dipendenza ad hibernate- search:<dependency>   <groupId>org.hibernate</groupId>   <artifactId>hibernate­search­orm</artifactId>   <version>4.1.0.Final</version></dependency>
  34. 34. Quickstart Hibernate Search• Tutto il resto é opzionale:– Come gestire gli indici– Moduli di estensione, Analyzer custom– Performance tuning– Mapping custom dei tipi– Clustering • JGroups • Infinispan • JMS
  35. 35. Quickstart Hibernate@Entity Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  36. 36. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  37. 37. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  38. 38. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  39. 39. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne @IndexedEmbedded    public Author getAuthor() { return author; }...
  40. 40. Un secondo esempio@Entity @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; private String title; private String name; } @OneToMany private Set<Book>books;}
  41. 41. Struttura dellindice@Entity @Indexed @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; @Field(store=Store.YES) private String title;@Field(store=Store.YES) } private String name; @OneToMany @IndexedEmbedded private Set<Book>books;}
  42. 42. QueryString[] productFields = {"summary", "author.name"};Query luceneQuery = // query builder or any Lucene QueryFullTextEntityManager ftEm =   Search.getFullTextEntityManager(entityManager);FullTextQuery query =   ftEm.createFullTextQuery( luceneQuery, Product.class );List<Product> items =   query.setMaxResults(100).getResultList();int totalNbrOfResults = query.getResultSize(); TotalNbrOfResults= 8.320.000 (0.002 seconds)
  43. 43. Uso della DSL
  44. 44. Sui risultati:• Managed POJO: modifiche alle entitá applicati sia a Lucene che al database• Paginazione JPA, familiari (standard):– .setMaxResults( 20 ).setFirstResult( 100 );• Restrizioni sul tipo, query fulltext polimorifiche:– .createQuery( luceneQuery, A.class, B.class, ..);• Projection• Result mapping
  45. 45. FiltersFullTextQuery ftQuery = s // s is a FullTextSession   .createFullTextQuery( query, Product.class )   .enableFullTextFilter( "filtroMinori" )   .enableFullTextFilter( "offertaDelGiorno" )      .setParameter( "day", “20120412” )   .enableFullTextFilter( "inStockA" )      .setParameter( "location", "Padova" );List<Product> results = ftQuery.list();
  46. 46. Uso di Infinispan per ladistribuzione degli indici
  47. 47. Clustering di un uso Lucene “diretto”• Usando org.apache.lucene– Tradizionalmente difficile da distribuire su nodi multipli– Su qualsiasi cloud
  48. 48. Nodo singolo idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  49. 49. Nodi multipli idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  50. 50. Le scritture non scalano?
  51. 51. Suggerimenti per performance ottimali• Calibra il chunk_size per luso effettivo del vostro indice (evita i read lock evitando la frammentazione)• Verifica la dimensione dei pacchetti network: blob size, JGroups packets, network interface and hardware.• Scegli e configura un CacheLoader adatto
  52. 52. Requisiti di memoria• RAMDirectory: tutto lindice (e piú) in RAM.• FSDirectory: un buon OS sa fare un ottimo lavoro di caching di IO – spesso meglio di RAMDirectory.• Infinispan: configurabile, fino alla memoria condivisa tra nodi– Flexible– Fast– Network vs. disk
  53. 53. Moduli per cloud deployment scalabili One Infinispan to rule them all– Store Lucene indexes– Hibernate second level cache– Application managed cache– Datagrid– EJB, session replication in AS7– As a JPA “store” via Hibernate OGM
  54. 54. Ingredienti per la cloud• JGroups DISCOVERY protocol– MPING– TCP_PING– JDBC_PING– S3_PING• Scegli un CacheLoader– Database based, Jclouds, Cassandra, ...
  55. 55. Futuro prossimo• Semplificare la scalabilitá in scrittura• Auto-tuning dei parametri di clustering – ergonomics!• Parallel searching: multi/core + multi/node• A component of– http://www.cloudtm.eu
  56. 56. JPA for NoSQL
  57. 57. NoSQL: la flessibilitá costa• Programming model • one per product :-(• no schema => app driven schema• query (Map Reduce, specific DSL, ...)• data structure transpires• Transaction• durability / consistency
  58. 58. Esempio: InfinispanDistributed Key/Value store (or Replicated, local only efficient cache, • invalidating cache)Each node is equal Just start more nodes, or kill some •No bottlenecks by design •Cloud-network friendly JGroups • And “cloud storage” friendly too! •
  59. 59. ABC di Infinispanmap.put( “user-34”,userInstance );map.get( “user-34” );map.remove( “user-34” );
  60. 60. É una ConcurrentMap !map.put( “user-34”, userInstance );map.get( “user-34” );map.remove( “user-34” );map.putIfAbsent( “user-38”,another );
  61. 61. Qualche altro dettaglio su Infinispan ● Support for Transactions (XA) ● CacheLoaders ● Cassandra, JDBC, Amazon S3 (jclouds),... ● Tree API for JBossCache compatibility ● Lucene integration ● Two-fold ● Some Hibernate integrations ● Second level cache ● Hibernate Search indexing backend
  62. 62. Obiettivi di Hibernate OGM Encourage new data usage patterns • Familiar environment • Ease of use • easy to jump in • easy to jump out • Push NoSQL exploration in enterprises • “PaaS for existing API” initiative •
  63. 63. Cosé• JPA front end to key/value stores • Object CRUD (incl polymorphism and associations) • OO queries (JP-QL)• Reuses • Hibernate Core • Hibernate Search (and Lucene) • Infinispan• Is not a silver bullet • not for all NoSQL use cases
  64. 64. Entitá come blob serializzati?• Serialize objects into the (key) value • store the whole graph?• maintain consistency with duplicated objects • guaranteed identity a == b • concurrency / latency • structure change and (de)serialization, class definition changes
  65. 65. OGM’s approach to schema• Keep what’s best from relational model • as much as possible • tables / columns / pks• Decorrelate object structure from data structure• Data stored as (self-described) tuples• Core types limited • portability
  66. 66. Query• Hibernate Search indexes entities• Store Lucene indexes in Infinispan• JP-QL to Lucene query transformation• Works for simple queries • Lucene is not a relational SQL engine
  67. 67. E ora?• MongoDB• EHCache / Terracotta• Redis• Voldemort• Neo4J• Dynamo• ... Git? Spreadsheet? ...CapeDwarf?
  68. 68. Q&Ahttp://infinispan.org @Infinispanhttp://in.relation.to @Hibernatehttp://jboss.org @SanneGrinovero
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×