Your SlideShare is downloading. ×
Infinispan,Lucene,Hibername OGM
Upcoming SlideShare
Loading in...5
×

Thanks for flagging this SlideShare!

Oops! An error has occurred.

×
Saving this for later? Get the SlideShare app to save on your phone or tablet. Read anywhere, anytime – even offline.
Text the download link to your phone
Standard text messaging rates apply

Infinispan,Lucene,Hibername OGM

1,237

Published on

Sanne Grinovero - JBug Milano

Sanne Grinovero - JBug Milano

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
1,237
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
29
Comments
0
Likes
0
Embeds 0
No embeds

Report content
Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
No notes for slide

Transcript

  • 1. Padova, InfoCamere JBoss User Group 12 Aprile 2012
  • 2. Chi sono?• Team Hibernate Sanne Grinovero– Hibernate Search Italiano, Olandese, Newcastle Red Hat: JBoss, Engineering– Hibernate OGM• Infinispan– Infinispan Core– Infinispan Query– JGroups• Apache Lucene
  • 3. Infinispan• Cache distribuita• Datagrid scalabile e transazionale: performance estreme e cloud• NoSQL “DataBase”: key-value store– Come si interroga un data grid ? SELECT * FROM GRID
  • 4. Interrogareuna “Grid”Object v =cache.get(“c7”);
  • 5. Senza chiave, non puoiottenere il valore.
  • 6. É pratico il soloaccesso per chiave?
  • 7. Test sulla mia libreria• Dové Hibernate Search in Action?• Mi passi ISBN 978-1- 933988-17-7 ?• Prendi i libri su Gaudí ?
  • 8. Come implementare queste funzioni su un Key/Value store?• Dové Hibernate Search in Action?• Mi passi ISBN 978-1-933988-17-7 ?• Trovi i libri su Gaudí ?
  • 9. document based NoSQL: Map/ReduceInfinispan non é propriamente document based maoffre Map/Reduce.Eppure non é escluso luso di JSON, XML, YAML, Java: public class Book implements Serializable { final String title; final String author; final String editor; public Book(String title, String author, String editor) { this.title = title; this.author = author; this.editor = editor; } }
  • 10. Iterate & collectclass TitleBookSearcher implements Mapper<String, Book, String, Book> { final String title; public TitleBookSearcher(String t) { title = t; } public void map(String key, Book value, Collector collector){ if ( title.equals( value.title ) ) collector.emit( key, value ); }class BookReducer implements Reducer<String, Book> { public Book reduce(String reducedKey, Iterator<Book> iter) { return iter.next(); }}
  • 11. Implementare queste semplici funzioni:✔ Trova “Hibernate Search in Action”?✔ Trova per codice “ISBN 978-1-933988-17-7” ?✗ Quanti libri a proposito di“Shakespeare” ? • Per uno score corretto in ricerche fulltext servono le frequenze dei frammenti di testo relative al corpus. • Il Pre-tagging é poco pratico e limitante
  • 12. Apache Lucene• Progetto open source Apache™• Integrato in innumerevoli progetti• .. tra cui Hibernate via Hibernate Search• Clusterizzabile via Infinispan– Performance– Real time– High availability
  • 13. Cosa offre Lucene?• Ricerche per Similarity score• Analisi del testo– Sinonyms, Stopwords, Stemming, ...• Reusable declarative Filters• TermVectors• MoreLikeThis• Faceted Search• Veloce!
  • 14. Lucene: Stopwordsa, able, about, across, after, all, almost, also, am,among, an, and, any, are, as, at, be, because, been,but, by, can, cannot, could, dear, did, do, does,either, else, ever, every, for, from, get, got, had,has, have, he, her, hers, him, his, how, however, i, if,in, into, is, it, its, just, least, let, like, likely,may, me, might, most, must, my, neither, no, nor, not,of, off, often, on, only, or, other, our, own, rather,said, say, says, she, should, since, so, some, than,that, the, their, them, then, there, these, they, this,tis, to, too, twas, us, wants, was, we, were, what,when, where, which, while, who, whom, why, will, with,would, yet, you, your
  • 15. Filters
  • 16. Faceted Search
  • 17. Facciamo un bel motore diricerca che restituisce irisultati in ordinealfabetico?
  • 18. Chi usa Lucene?Nexus
  • 19. Dové la fregatura?• Necessita di un indice: risorse fisiche e di amministrazione.– in memory– on filesystem– in Infinispan• Sostanzialmente immutable segments– Ottimizzato per data mining / query, non per updates.• Un mondo di stringhe e vettori di frequenze
  • 20. Infinispan Query quickstart • Abilita indexing=true nella configurazione • Aggiungi il modulo infinispan- query.jar al classpath • Annota i POJO inseriti nella cache per le modalitá di indicizzazione <dependency> <groupId>org.infinispan</groupId> <artifactId>infinispan-query</artifactId> <version>5.1.3.FINAL</version> </dependency>
  • 21. Configurazione tramite codiceConfiguration c = new Configuration() .fluent() .indexing() .addProperty("hibernate.search.default.directory_provider","ram") .build();CacheManager manager = new DefaultCacheManager(c);
  • 22. Configurazione / XML<?xml version="1.0" encoding="UTF-8"?><infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:infinispan:config:5.0http://www.infinispan.org/schemas/infinispan-config-5.0.xsd" xmlns="urn:infinispan:config:5.0"><default> <indexing enabled="true" indexLocalOnly="true"> <properties> <property name="hibernate.search.option1" value="..." /> <property name="hibernate.search.option2" value="..." /> </properties> </indexing></default>
  • 23. Annotazioni sul modello@ProvidedId @Indexedpublic class Book implements Serializable { @Field String title; @Field String author; @Field String editor; public Book(String title, String author, String editor) { this.title = title; this.author = author; this.editor = editor; }}
  • 24. Esecuzione di QuerySearchManager sm = Search.getSearchManager(cache);Query query = sm.buildQueryBuilderForClass(Book.class) .get() .phrase() .onField("title") .sentence("in action") .createQuery();List<Object> list = sm.getQuery(query).list();
  • 25. Architettura• Integra Hibernate Search (engine)– Listener a eventi Hibernate & transazioni • Eventi Infinispan & transazioni– Mappa tipi Java e grafi del modello a Documents di Lucene– Thin-layer design
  • 26. Index mapping
  • 27. Tests per Infinispan Queryhttps://github.com/infinispan/infinispan
  • 28. org.apache.lucene.search.Query luceneQuery = queryBuilder.phrase() .onField( "description" ) .andField( "title" ) .sentence( "a book on highly scalable query engines" ) .enableFullTextFilter( “ready-for-shipping” ) .createQuery();CacheQuery cacheQuery = searchManager.getQuery( luceneQuery, Book.class);List<Book> objectList = cacheQuery.list();
  • 29. Architettura: Infinispan Query
  • 30. Problemi di scalabilitá• Writer locks globali• Sharing su NFS molto problematico
  • 31. Queue-based clustering (filesystem)
  • 32. Index stored in Infinispan
  • 33. Quickstart Hibernate Search • Aggiungi la dipendenza ad hibernate- search:<dependency>   <groupId>org.hibernate</groupId>   <artifactId>hibernate­search­orm</artifactId>   <version>4.1.0.Final</version></dependency>
  • 34. Quickstart Hibernate Search• Tutto il resto é opzionale:– Come gestire gli indici– Moduli di estensione, Analyzer custom– Performance tuning– Mapping custom dei tipi– Clustering • JGroups • Infinispan • JMS
  • 35. Quickstart Hibernate@Entity Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 36. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 37. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob    public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 38. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne    public Author getAuthor() { return author; }...
  • 39. Quickstart Hibernate@Entity @Indexed Searchpublic class Essay {   @Id   public Long getId() { return id; }   @Field   public String getSummary() { return summary; }   @Lob @Field @Boost(0.8)   public String getText() { return text; }   @ManyToOne @IndexedEmbedded    public Author getAuthor() { return author; }...
  • 40. Un secondo esempio@Entity @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; private String title; private String name; } @OneToMany private Set<Book>books;}
  • 41. Struttura dellindice@Entity @Indexed @Entitypublic class Author { public class Book { @Id @GeneratedValue private Integer id; private Integer id; @Field(store=Store.YES) private String title;@Field(store=Store.YES) } private String name; @OneToMany @IndexedEmbedded private Set<Book>books;}
  • 42. QueryString[] productFields = {"summary", "author.name"};Query luceneQuery = // query builder or any Lucene QueryFullTextEntityManager ftEm =   Search.getFullTextEntityManager(entityManager);FullTextQuery query =   ftEm.createFullTextQuery( luceneQuery, Product.class );List<Product> items =   query.setMaxResults(100).getResultList();int totalNbrOfResults = query.getResultSize(); TotalNbrOfResults= 8.320.000 (0.002 seconds)
  • 43. Uso della DSL
  • 44. Sui risultati:• Managed POJO: modifiche alle entitá applicati sia a Lucene che al database• Paginazione JPA, familiari (standard):– .setMaxResults( 20 ).setFirstResult( 100 );• Restrizioni sul tipo, query fulltext polimorifiche:– .createQuery( luceneQuery, A.class, B.class, ..);• Projection• Result mapping
  • 45. FiltersFullTextQuery ftQuery = s // s is a FullTextSession   .createFullTextQuery( query, Product.class )   .enableFullTextFilter( "filtroMinori" )   .enableFullTextFilter( "offertaDelGiorno" )      .setParameter( "day", “20120412” )   .enableFullTextFilter( "inStockA" )      .setParameter( "location", "Padova" );List<Product> results = ftQuery.list();
  • 46. Uso di Infinispan per ladistribuzione degli indici
  • 47. Clustering di un uso Lucene “diretto”• Usando org.apache.lucene– Tradizionalmente difficile da distribuire su nodi multipli– Su qualsiasi cloud
  • 48. Nodo singolo idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  • 49. Nodi multipli idea di performance Write ops/sec Queries/sec RAMDirectory RAMDirectory Infinispan 0 Infinispan 0 queries per second Infinispan D4 Infinispan D4 Infinispan D40 Infinispan D40 FSDirectory FSDirectoryInfinispan Local Infinispan Local 0 50 100 150 200 250 300 350 400 0 5000 10000 15000 20000 25000
  • 50. Le scritture non scalano?
  • 51. Suggerimenti per performance ottimali• Calibra il chunk_size per luso effettivo del vostro indice (evita i read lock evitando la frammentazione)• Verifica la dimensione dei pacchetti network: blob size, JGroups packets, network interface and hardware.• Scegli e configura un CacheLoader adatto
  • 52. Requisiti di memoria• RAMDirectory: tutto lindice (e piú) in RAM.• FSDirectory: un buon OS sa fare un ottimo lavoro di caching di IO – spesso meglio di RAMDirectory.• Infinispan: configurabile, fino alla memoria condivisa tra nodi– Flexible– Fast– Network vs. disk
  • 53. Moduli per cloud deployment scalabili One Infinispan to rule them all– Store Lucene indexes– Hibernate second level cache– Application managed cache– Datagrid– EJB, session replication in AS7– As a JPA “store” via Hibernate OGM
  • 54. Ingredienti per la cloud• JGroups DISCOVERY protocol– MPING– TCP_PING– JDBC_PING– S3_PING• Scegli un CacheLoader– Database based, Jclouds, Cassandra, ...
  • 55. Futuro prossimo• Semplificare la scalabilitá in scrittura• Auto-tuning dei parametri di clustering – ergonomics!• Parallel searching: multi/core + multi/node• A component of– http://www.cloudtm.eu
  • 56. JPA for NoSQL
  • 57. NoSQL: la flessibilitá costa• Programming model • one per product :-(• no schema => app driven schema• query (Map Reduce, specific DSL, ...)• data structure transpires• Transaction• durability / consistency
  • 58. Esempio: InfinispanDistributed Key/Value store (or Replicated, local only efficient cache, • invalidating cache)Each node is equal Just start more nodes, or kill some •No bottlenecks by design •Cloud-network friendly JGroups • And “cloud storage” friendly too! •
  • 59. ABC di Infinispanmap.put( “user-34”,userInstance );map.get( “user-34” );map.remove( “user-34” );
  • 60. É una ConcurrentMap !map.put( “user-34”, userInstance );map.get( “user-34” );map.remove( “user-34” );map.putIfAbsent( “user-38”,another );
  • 61. Qualche altro dettaglio su Infinispan ● Support for Transactions (XA) ● CacheLoaders ● Cassandra, JDBC, Amazon S3 (jclouds),... ● Tree API for JBossCache compatibility ● Lucene integration ● Two-fold ● Some Hibernate integrations ● Second level cache ● Hibernate Search indexing backend
  • 62. Obiettivi di Hibernate OGM Encourage new data usage patterns • Familiar environment • Ease of use • easy to jump in • easy to jump out • Push NoSQL exploration in enterprises • “PaaS for existing API” initiative •
  • 63. Cosé• JPA front end to key/value stores • Object CRUD (incl polymorphism and associations) • OO queries (JP-QL)• Reuses • Hibernate Core • Hibernate Search (and Lucene) • Infinispan• Is not a silver bullet • not for all NoSQL use cases
  • 64. Entitá come blob serializzati?• Serialize objects into the (key) value • store the whole graph?• maintain consistency with duplicated objects • guaranteed identity a == b • concurrency / latency • structure change and (de)serialization, class definition changes
  • 65. OGM’s approach to schema• Keep what’s best from relational model • as much as possible • tables / columns / pks• Decorrelate object structure from data structure• Data stored as (self-described) tuples• Core types limited • portability
  • 66. Query• Hibernate Search indexes entities• Store Lucene indexes in Infinispan• JP-QL to Lucene query transformation• Works for simple queries • Lucene is not a relational SQL engine
  • 67. E ora?• MongoDB• EHCache / Terracotta• Redis• Voldemort• Neo4J• Dynamo• ... Git? Spreadsheet? ...CapeDwarf?
  • 68. Q&Ahttp://infinispan.org @Infinispanhttp://in.relation.to @Hibernatehttp://jboss.org @SanneGrinovero

×