Advanced Queries on the Infinispan Data Grid
Navin Surtani
13th May 2015
GeeCon, Krakow
Who is Navin?
• Worked on Red Hat projects since 2008
• Infinispan
• Hibernate Search
• WildFly/JBoss EAP
Tweet your questions
@navssurtani
#advancedqueries
What are we talking about?
• What is Infinispan?
• The Query module
• Backend tech → Hibernate Search & Apache Lucene
• Setup and configuration
• Demo and code walkthrough
What is Infinispan?
• Distributed in-memory key/value data store
• Extension of java.util.Map
• Modes
• Library → Embed into EE/SE application
• Server → Connect remotely
Some features
• Fully transactional (JTA, XA)
• Hibernate 2nd level caching
• Full-text querying
• Non-JVM clients for server mode
How do I use it?
• Cache → Sit in front of your NoSQL data store
• In-memory DB → Primary data store is in memory
• Clusterability → Manage state that is distributed
… but we have a problem here
• How do I find my data?
• I don’t want to give out keys
• I might not know what I need to find
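The problem above can be sketched with nothing but the JDK. This is a minimal illustration, assuming a plain java.util.Map as a stand-in for the cache (no Infinispan API is used, and the class and method names are hypothetical): a lookup by key is direct, but finding an entry by its value means visiting every entry.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A plain java.util.Map stands in for a cache here; no Infinispan API is used.
final class ValueScan {

    // Without a query facility, searching by value means walking every entry.
    static List<Integer> keysWhereValueContains(Map<Integer, String> cache, String needle) {
        List<Integer> keys = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : cache.entrySet()) {
            if (entry.getValue().contains(needle)) {
                keys.add(entry.getKey());
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        Map<Integer, String> cache = new HashMap<>();
        cache.put(1, "Navin Surtani");
        cache.put(2, "Jane Doe");

        // Direct lookup only works if the caller already knows the key.
        System.out.println(cache.get(1));
        // Searching by content requires a full scan of the values.
        System.out.println(keysWhereValueContains(cache, "Doe"));
    }
}
```

An index over the values is what removes the need for this scan.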
Query module to the rescue
• Allows searching of values in the cache
• Original project: JBoss Cache Searchable in 2008
• Integration between Infinispan and Hibernate Search
• Became Query module in 2009
Full-text search
• Library example:
• Is author name: Surname, Name?
• Name, Surname?
• How do I deal with …
• Special characters?
• Typos?
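One way to picture how the name-order and special-character questions are handled is to normalise text into tokens before comparing. The sketch below is JDK-only and is not how Hibernate Search or Lucene analyzers are implemented; the class name NameTokens and the exact normalisation steps are assumptions chosen for illustration.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

final class NameTokens {

    // Lower-case the text, strip accents, and split on anything that is
    // not a letter or digit, so word order and punctuation stop mattering.
    static Set<String> tokens(String text) {
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        String withoutAccents = decomposed.replaceAll("\\p{M}", ""); // drop combining marks
        Set<String> result = new TreeSet<>();
        for (String token : withoutAccents.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Both spellings of the author name reduce to the same token set.
        System.out.println(tokens("Surtani, Navin").equals(tokens("Navin Surtani")));
    }
}
```

Typos need a different technique, since a misspelt token is simply a different token after normalisation.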
Lucene
• Scalable high-performance indexing
• Small RAM requirement → ~ 1MB heap
• Index size → ~ 20-30% size of data
• 100% open source and written in Java
• Apache Licensing
• Ports to other languages exist
Lucene
• Optimised for searching and querying
• Rich feature-set for query types
• Typo-tolerant searches
• Similar keywords
• Document structure
• Unstructured data
• Documents stored in-memory or on disk
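Typo-tolerant matching is commonly built on edit distance: two terms match if one can be turned into the other with only a few single-character edits. The following is a minimal JDK-only sketch of that idea, not Lucene's implementation (which is far more optimised); the class name and the maxEdits parameter are hypothetical.

```java
final class EditDistance {

    // Classic Levenshtein distance using two rolling rows of the DP table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    // A term "fuzzily" matches a candidate if they are within maxEdits edits.
    static boolean fuzzyMatch(String term, String candidate, int maxEdits) {
        return levenshtein(term, candidate) <= maxEdits;
    }
}
```

For example, fuzzyMatch("surtani", "surtan", 1) accepts the candidate because a single insertion repairs the typo.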
Two features we will look at
Facets
• Obtain counts, or frequencies of a result
• O(1) to obtain counts
• Example: the per-category counts shown next to eBay search results
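To make the shape of a facet result concrete, here is a JDK-only sketch that groups a list of category labels and counts each one. It is an illustration of the output only: it walks the whole list, so it does not reproduce the index-backed, constant-time counts described above, and the class name and sample categories are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

final class FacetCounts {

    // For each distinct facet value, count how many results carry it.
    static Map<String, Long> countBy(List<String> facetValues) {
        return facetValues.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> categories = Arrays.asList("Books", "Music", "Books", "Games", "Books");
        System.out.println(countBy(categories)); // prints {Books=3, Games=1, Music=1}
    }
}
```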
Filters
• Filters are:
• Declarative
• Stacking
• Reusable
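The three filter properties can be mimicked with java.util.function.Predicate. This is a JDK-only analogy, not the Hibernate Search filter API; the nested Person type, its fields, and the filter names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class StackedFilters {

    // Hypothetical value type, for illustration only.
    static final class Person {
        final String name;
        final String city;
        final int age;

        Person(String name, String city, int age) {
            this.name = name;
            this.city = city;
            this.age = age;
        }
    }

    // Declarative: each filter states what to keep, not how to iterate.
    static Predicate<Person> livesIn(String city) {
        return p -> p.city.equals(city);
    }

    static Predicate<Person> atLeast(int age) {
        return p -> p.age >= age;
    }

    static List<String> names(List<Person> people, Predicate<Person> filter) {
        return people.stream().filter(filter).map(p -> p.name).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
                new Person("Ana", "Krakow", 34),
                new Person("Ben", "Krakow", 17),
                new Person("Cy", "London", 52));
        // Stacking: two filters combined with and().
        System.out.println(names(people, livesIn("Krakow").and(atLeast(18))));
        // Reusable: the same age filter applied on its own.
        System.out.println(names(people, atLeast(18)));
    }
}
```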
How it all fits together
XML Configuration
<local-cache name="Votes">
    <transaction mode="NONE"/>
    <indexing index="ALL">
        <property name="default.directory_provider">ram</property>
    </indexing>
</local-cache>
Programmatic Configuration
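// Index properties; "ram" matches the directory provider in the XML configuration
Properties properties = new Properties();
properties.put("default.directory_provider", "ram");

ConfigurationBuilder cb = new ConfigurationBuilder();
cb.indexing()
    .enable()
    .indexLocalOnly(true) // Will only index local node
    .withProperties(properties);
EmbeddedCacheManager cm = new DefaultCacheManager(cb.build());
// My key is an Integer (generics cannot use the primitive int) and value is of type Person
Cache<Integer, Person> cache = cm.getCache();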
Annotations required
• @Indexed
• @Field
• @IndexedEmbedded
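As a rough sketch of how the three annotations are placed, the mapping below uses the Hibernate Search annotations from the org.hibernate.search.annotations package. The Person and Address classes and all fields other than "name" are hypothetical, and the code needs Hibernate Search on the classpath to compile.

```java
import org.hibernate.search.annotations.Field;
import org.hibernate.search.annotations.Indexed;
import org.hibernate.search.annotations.IndexedEmbedded;

// @Indexed marks the type whose instances should be written to the index.
@Indexed
public class Person {

    // @Field makes this property searchable, e.g. onField("name").
    @Field
    private String name;

    // @IndexedEmbedded pulls the fields of the associated object into
    // this entry's index document (here, "address.city").
    @IndexedEmbedded
    private Address address;

    public Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }
}

// Hypothetical embedded type; it is indexed only as part of Person.
class Address {

    @Field
    private String city;

    Address(String city) {
        this.city = city;
    }
}
```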
Running queries
// I have a cache instance which is not empty
SearchManager sm = Search.getSearchManager(cache);
QueryBuilder qb = sm.buildQueryBuilderForClass(Person.class)
    .get();
Query q = qb.keyword().onField("name").matching("Surtani")
    .createQuery();
CacheQuery cq = sm.getQuery(q, Person.class);
// Run the query; each hit is a matching value from the cache
List<Object> results = cq.list();
How it all ties together …
• Web-application using Infinispan running on WildFly 9 CR
• App-server ships with Query module
• Use a web-form to vote in an ‘election’
• One vote for governor
• One vote for senator
Flow I: Query ‘warm-up’
• Story: ‘We don’t know who is running in the election’
• WebSocket endpoint to delegate to Worker object
• Worker object executes on CandidateCacheDao
• Returns results through WebSocket endpoint
Flow II: Voting form
• Story: ‘This is our ballot paper’
• Front-end creates JSON to go to WebSocket endpoint
• JSON gets parsed by BallotWorker object
• BallotWorker puts parsed JSON into Cache through VotingCacheDao
Flow III: Faceted search
• Story: ‘We want to know who has won the election’
• Front-end asks for the result of an election (governor or senator)
• ElectionResultWorker object runs a query through the VotingCacheDao
• Result passed back to web-page as JSON
Flow IV: Faceted search with Filter
• Story: ‘We would like to know who has received the most votes in a particular region’
• Essentially the same workflow as Flow III, but we also pass a Filter to our query
• The query code is the same; the Filter only narrows the results.
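The two faceted-search flows can be pictured as one tally routine that takes an optional restriction. The sketch below is JDK-only and is not the demo's code: the Vote type, its fields, the candidate names, and the tally and leader methods are all hypothetical, and a Predicate stands in for the Filter.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class ElectionTally {

    // Hypothetical vote record, for illustration only.
    static final class Vote {
        final String office;
        final String candidate;
        final String region;

        Vote(String office, String candidate, String region) {
            this.office = office;
            this.candidate = candidate;
            this.region = region;
        }
    }

    // Count votes per candidate for one office, keeping only votes the filter accepts.
    static Map<String, Long> tally(List<Vote> votes, String office, Predicate<Vote> filter) {
        return votes.stream()
                .filter(v -> v.office.equals(office))
                .filter(filter)
                .collect(Collectors.groupingBy(v -> v.candidate, TreeMap::new, Collectors.counting()));
    }

    // The candidate with the highest count, if any votes were counted.
    static Optional<String> leader(Map<String, Long> counts) {
        return counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        List<Vote> votes = Arrays.asList(
                new Vote("governor", "Alice", "North"),
                new Vote("governor", "Alice", "North"),
                new Vote("governor", "Bob", "South"),
                new Vote("governor", "Bob", "South"),
                new Vote("governor", "Bob", "South"),
                new Vote("senator", "Carol", "North"));
        // Without a filter: overall result for governor.
        System.out.println(tally(votes, "governor", v -> true));
        // With a filter: the same tally code, restricted to one region.
        System.out.println(tally(votes, "governor", v -> v.region.equals("North")));
    }
}
```

Passing the restriction as a parameter keeps the counting code identical in both cases, which mirrors the point made on the slide.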
Demo time
Roadmap
• API:
• JDK 8 integration
• FunctionalCache interface
• Query:
• Query on Non-Indexed fields
• Continuous querying
Summary
• Query module 101
• Configuration
• Demo
• Basic query on multiple fields
• Faceted search with and without filter
Get in touch
Twitter:
• @navssurtani
• @infinispan
• @c2b2consulting
IRC:
• #infinispan on FreeNode
Blogs:
• navssurtani.blogspot.com
• blog.infinispan.org
• blog.c2b2.co.uk
Demo:
• github.com/navssurtani/query-demo
Q&A
#thankyougeecon
