Infinispan: Data Grids, NoSQL, Cloud Storage & JSR-347
Manik Surtani, Founder and Project Lead, Infinispan; Red Hat, Inc.

Who is Manik?
• Hacker@JBoss, Red Hat’s middleware division
• Founder and Project Lead, Infinispan
• Spec lead, JSR 347
  • Data Grids for Java
• EG representative, JSR 107
  • Temporary Caching for Java
• http://blog.infinispan.org
• http://twitter.com/maniksurtani

Agenda
• A brief introduction to Infinispan
• Understanding Data Grids
• ... and NoSQL
• Their role in Cloud Storage
• JSR 347 and related standards

What is Infinispan?
• An open source data grid platform
• Written in Java and Scala
  • Not just for the JVM though
• Distributed key/value store
  • Transactional (JTA)
  • Low-latency (in-memory)
  • Optionally persisted to disk
  • Feature-rich

Client/Server Architecture
Supported Protocols
• REST
• Memcached
• Hot Rod

WTF is Hot Rod?
• Wire protocol for client server communications
• Open
• Language independent
• Built-in failover and load balancing
• Smart routing

Server Endpoint Comparison
| Endpoint | Protocol | Client Libraries | Clustered? | Smart Routing | Load Balancing/Failover |
|---|---|---|---|---|---|
| REST | Text | N/A | Yes | No | Any HTTP load balancer |
| Memcached | Text | Plenty | Yes | No | Only with predefined server list |
| Hot Rod | Binary | Java, Python, Ruby | Yes | Yes | Dynamic |

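
The "smart routing" idea in the comparison above can be pictured with a small sketch. It assumes that a client holds a view of the cluster and uses the same hash function as the servers, so it can send each request straight to the node that owns the key. This is an illustration of the general technique (a consistent-hash ring), not the Hot Rod wire protocol or its actual hashing scheme; the class `RoutingClientSketch` and the node names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Hypothetical client-side router: a consistent-hash ring over server names. */
class RoutingClientSketch {
    // Ring position -> server name. Each server appears at several positions.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    RoutingClientSketch(List<String> servers, int virtualNodes) {
        for (String server : servers) {
            for (int i = 0; i < virtualNodes; i++) {
                ring.put(hash(server + "#" + i), server);
            }
        }
    }

    /** Owner of a key: the first ring position at or after the key's hash, wrapping around. */
    String ownerOf(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers in the client's view");
        }
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue();
    }

    /** Simulates failover: drop a server from the view so its keys route elsewhere. */
    void remove(String server) {
        ring.values().removeIf(server::equals);
    }

    // CRC32 is used only because it ships with the JDK; it is an assumption of this sketch.
    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }

    public static void main(String[] args) {
        RoutingClientSketch client =
                new RoutingClientSketch(List.of("node-a", "node-b", "node-c"), 16);
        String owner = client.ownerOf("user:42");
        System.out.println("user:42 is owned by " + owner);
        client.remove(owner); // pretend that server just failed
        System.out.println("after failover: " + client.ownerOf("user:42"));
    }
}
```

Only keys owned by the removed server move; every other key keeps its owner, which is why this style of routing copes well with servers joining and leaving.
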
Data Grids. What Are They?
An evolution of distributed caches

Why use distributed caches?
• Cache data that is expensive to retrieve/calculate
  • E.g., from a database
• The need for fast, low-latency data access
  • Performance or time-sensitive applications
• Very commonly used in:
  • Financial Services industry
  • Telcos
  • Highly scalable e-commerce

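
The first point, caching data that is expensive to retrieve, can be sketched in a few lines. This is a single-JVM illustration using only the JDK, not Infinispan's API; `ReadThroughCacheSketch` and the loader function are hypothetical stand-ins for a cache in front of a database query.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical read-through cache in front of an expensive loader. */
class ReadThroughCacheSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // e.g. a database lookup

    ReadThroughCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only on a miss. */
    V get(K key) {
        return entries.computeIfAbsent(key, loader);
    }

    /** Drops an entry so the next read goes back to the source. */
    void invalidate(K key) {
        entries.remove(key);
    }

    public static void main(String[] args) {
        AtomicInteger loads = new AtomicInteger();
        ReadThroughCacheSketch<String, String> cache =
                new ReadThroughCacheSketch<>(key -> {
                    loads.incrementAndGet(); // count trips to the "database"
                    return key.toUpperCase();
                });
        cache.get("account-1");
        cache.get("account-1"); // served from memory
        System.out.println("loader invocations: " + loads.get());
    }
}
```
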
Data grids as clustering toolkits
• To introduce high availability and failover to frameworks
  • Commercial and open source frameworks
  • In-house frameworks and reusable architectures
• Delegate all state management to the data grid
  • Framework becomes stateless and hence elastic

But Data Grids > Distributed Caches
• Querying
• Task execution and map/reduce
• Control over data co-location

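
To make "task execution and map/reduce" concrete, here is a minimal sketch that assumes each grid node holds one partition of the data. The map step runs against each partition where it lives, and only the small partial results are combined. Partitions are simulated as in-memory maps; `MapReduceSketch` and its `execute` method are hypothetical names, not an Infinispan or JSR 347 API.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical map/reduce over partitions, each standing in for one node's local data. */
class MapReduceSketch {

    /**
     * Applies the mapper to every partition, then folds the partial results.
     * The reducer must be associative and the identity neutral, because
     * partitions are processed in parallel.
     */
    static <K, V, R> R execute(List<Map<K, V>> partitions,
                               Function<Map<K, V>, R> mapper,
                               R identity,
                               BinaryOperator<R> reducer) {
        return partitions.parallelStream()
                .map(mapper)
                .reduce(identity, reducer);
    }

    public static void main(String[] args) {
        // Order totals, spread over two "nodes".
        List<Map<String, Integer>> partitions = List.of(
                Map.of("order-1", 10, "order-2", 20),
                Map.of("order-3", 30));

        Integer total = execute(
                partitions,
                local -> local.values().stream().mapToInt(Integer::intValue).sum(),
                0,
                Integer::sum);

        System.out.println("total across the grid: " + total);
    }
}
```
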
What is NoSQL?
• An alternative form of typically disk-based data storage
• Free from relational structure
  • Usually key/value or document-based
• Allows for greater scalability and easier clustering/distribution

NoSQL and Consistency
• BASE not ACID
  • Relax consistency in exchange for high availability and partition tolerance
• Usually eventually consistent
  • Which means applications need to be designed with this in mind

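
What "eventually consistent" means for application code can be shown with a small sketch. It assumes two replicas that accept writes independently and reconcile later with a last-write-wins rule; this is one common strategy among several, not a description of any particular NoSQL engine. The class `EventualReplicaSketch` and the caller-supplied logical timestamps are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replica that reconciles with peers using last-write-wins. */
class EventualReplicaSketch {

    /** A value tagged with the logical timestamp of the write that produced it. */
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();

    void write(String key, String value, long timestamp) {
        store.merge(key, new Versioned(value, timestamp), EventualReplicaSketch::newer);
    }

    /** May return stale data until this replica has synced with its peers. */
    String read(String key) {
        Versioned v = store.get(key);
        return v == null ? null : v.value;
    }

    /** Anti-entropy step: pull the peer's entries and keep the newer write per key. */
    void syncFrom(EventualReplicaSketch peer) {
        peer.store.forEach((key, theirs) ->
                store.merge(key, theirs, EventualReplicaSketch::newer));
    }

    // Higher timestamp wins; ties are broken by value so all replicas pick the same winner.
    private static Versioned newer(Versioned a, Versioned b) {
        if (a.timestamp != b.timestamp) {
            return a.timestamp > b.timestamp ? a : b;
        }
        return a.value.compareTo(b.value) >= 0 ? a : b;
    }

    public static void main(String[] args) {
        EventualReplicaSketch a = new EventualReplicaSketch();
        EventualReplicaSketch b = new EventualReplicaSketch();
        a.write("cart", "1 item", 1);
        b.write("cart", "2 items", 2);
        System.out.println("before sync: a=" + a.read("cart") + ", b=" + b.read("cart"));
        a.syncFrom(b);
        b.syncFrom(a);
        System.out.println("after sync:  a=" + a.read("cart") + ", b=" + b.read("cart"));
    }
}
```

Between the writes and the sync, the two replicas give different answers for the same key; that window is what applications have to be designed around.
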
Cloud Storage
• Traditional mechanisms (RDBMSs and file systems) are hard to deal with
• Clouds are ephemeral
• All cloud components are expected to be:
  • elastic
  • highly available

Cloud Storage
• Data grids and NoSQL win over traditional storage mechanisms in the cloud
• Data grids and NoSQL are fast converging in feature sets
  • E.g., Data grids can write through to disk; many NoSQL engines would also cache in memory

JSR 347 Data Grids for the Java Platform
• A new JSR for proposed inclusion in Java EE 8
  • to make enterprise Java more cloud-friendly
• Standardize data grid APIs and behavior for the Java platform
• Does not define NoSQL
  • Data grids primarily used from within a JVM
  • NoSQL primarily used via client connectors over a socket
  • Standardizing wire protocols beyond the scope of the JCP

JSR 347 Data Grids for the Java Platform
• Extends JSR 107 (Temporary Caching for Java)
• Adds:
  • Asynchronous, non-blocking API
  • Grouping API to control co-location
  • Distributed code execution and Map/Reduce APIs
  • Eventually consistent API
  • Possibly more
• Still very much work in progress
  • Participate!

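
As a rough picture of what an "asynchronous, non-blocking API" could look like, the sketch below wraps a map so that every operation returns a future instead of blocking the caller. The slides describe JSR 347 as work in progress, so none of this is the JSR's API: `AsyncCacheSketch`, `putAsync` and `getAsync` are hypothetical names, and `CompletableFuture` is simply the JDK's standard future type.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical cache whose operations return futures rather than blocking. */
class AsyncCacheSketch<K, V> implements AutoCloseable {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    // Stands in for the network I/O threads a real grid client would use.
    private final ExecutorService executor = Executors.newFixedThreadPool(2);

    /** Completes with the previous value for the key, or null if there was none. */
    CompletableFuture<V> putAsync(K key, V value) {
        return CompletableFuture.supplyAsync(() -> entries.put(key, value), executor);
    }

    /** Completes with the current value for the key, or null if absent. */
    CompletableFuture<V> getAsync(K key) {
        return CompletableFuture.supplyAsync(() -> entries.get(key), executor);
    }

    @Override
    public void close() {
        executor.shutdown();
    }

    public static void main(String[] args) {
        try (AsyncCacheSketch<String, String> cache = new AsyncCacheSketch<>()) {
            // Chain the read after the write without blocking in between.
            String value = cache.putAsync("greeting", "hello")
                    .thenCompose(previous -> cache.getAsync("greeting"))
                    .join(); // block only at the very end, for the demo
            System.out.println("read back: " + value);
        }
    }
}
```
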
Related standards and efforts
• JSR 107
  • A temporary caching API that defines:
    • Basic interaction
    • JTA compatibility
    • Persistence: write-through and write-behind
    • Listeners

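
The difference between write-through and write-behind persistence is easy to miss in a bullet, so here is a minimal sketch. Write-through updates the backing store as part of the `put`; write-behind records the key as dirty and persists it later. This is not the JSR 107 API: `PersistentCacheSketch` is hypothetical, the backing store is a plain map standing in for a disk or database, and `flush()` is called by hand where a real implementation would use a background thread.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical cache with a pluggable persistence mode (single-threaded for clarity). */
class PersistentCacheSketch {
    private final Map<String, String> memory = new HashMap<>();
    private final Map<String, String> store;   // stands in for a disk or database
    private final boolean writeBehind;
    private final Queue<String> dirtyKeys = new ArrayDeque<>();

    PersistentCacheSketch(Map<String, String> store, boolean writeBehind) {
        this.store = store;
        this.writeBehind = writeBehind;
    }

    void put(String key, String value) {
        memory.put(key, value);
        if (writeBehind) {
            dirtyKeys.add(key);        // persist later; the caller does not wait
        } else {
            store.put(key, value);     // write-through: persist before returning
        }
    }

    String get(String key) {
        return memory.get(key);
    }

    /** Persists all pending write-behind entries using their latest in-memory value. */
    void flush() {
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            store.put(key, memory.get(key));
        }
    }

    public static void main(String[] args) {
        Map<String, String> disk = new HashMap<>();
        PersistentCacheSketch cache = new PersistentCacheSketch(disk, true);
        cache.put("k", "v");
        System.out.println("on disk before flush: " + disk.containsKey("k"));
        cache.flush();
        System.out.println("on disk after flush:  " + disk.containsKey("k"));
    }
}
```

Write-behind trades durability for latency: entries written since the last flush are lost if the node fails first.
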
Related standards and efforts
• Hibernate OGM
  • JPA for key/value stores!
  • Common and familiar paradigm for persisting data
  • Except persistence is made to a data grid or NoSQL store

Related standards and efforts
• Contexts and Dependency Injection
  • Interaction with caches defined in JSR 107
  • Familiar and well proven programming model
  • Works well with JPA and hence Hibernate OGM
  • Works well even for direct access to key/value data grids

Where does Infinispan fit in?
• Will implement JSR 107
  • Currently implements most of this at least in concept
• Will implement JSR 347
  • Currently serves as a “donor” for most of JSR 347 features and API
• Is already the reference backend for Hibernate OGM
• Already supports CDI integration

To Summarize
• What data grids and distributed caches are
• Where NoSQL came from and main differences between NoSQL and data grids
• Cloud storage challenges
• JSR 347: Data Grids for the Java Platform
• Infinispan and where it sits in all this