NoSQL Landscape and a Solution to Polyglot Persistence
© Impetus Technologies

Impetus webcast ‘NoSQL Landscape and a Solution to Polyglot Persistence’ available at http://bit.ly/1hfz4Tn

Published in: Technology

  1. NoSQL Landscape and a Solution to Polyglot Persistence
  2. Agenda
     • Big Data Problems
     • Transition from RDBMS to NoSQL
     • NoSQL Landscape
     • Challenges in transition
     • Tools for NoSQL
     • Kundera – an open source polyglot solution
     Recorded version available at http://bit.ly/1hfz4Tn
  3. Big Data Problem
  4. Why not RDBMS?
     • Scalability
        • Data volume in zettabytes, yottabytes
        • Horizontal scaling would be expensive
     • Data format
        • Data format can be static or dynamic
        • Relational / non-relational
     • High availability
        • Data locality
        • No single point of failure
  5. Non-RDBMS way
     • Scale out, not scale up
     • Dynamic schema, not static schema
     • Decentralized, not centralized
  6. Introduction to NoSQL
     “An approach to storing and retrieving data with horizontal scaling, simple design and high availability”
     • Data-format-driven processing
     • Distributed, with no single point of failure (SPOF)
     • Thinking outside the SQL box
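Horizontal scaling with no single point of failure comes down to spreading keys over several nodes. The block below is a minimal sketch of that idea, assuming simple modulo hashing; `ShardRouter` and the node names are hypothetical and do not come from any datastore's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of scale-out: each key is owned by one of several nodes. */
final class ShardRouter {
    private final List<String> nodes;

    ShardRouter(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    /** Picks the owning node; floorMod keeps the index non-negative for negative hash codes. */
    String nodeFor(String key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    public static void main(String[] args) {
        ShardRouter router = new ShardRouter(Arrays.asList("node-a", "node-b", "node-c"));
        for (String userId : Arrays.asList("user-1", "user-2", "user-3", "user-4")) {
            System.out.println(userId + " -> " + router.nodeFor(userId));
        }
    }
}
```

Modulo hashing is the simplest possible choice and remaps most keys whenever a node is added or removed; distributed stores such as Cassandra use consistent hashing to limit that movement.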
  7. NoSQL: A Pragmatic Solution?
     With NoSQL, data can be consistent, highly available and with no SPOF!
     But not 100%!
  8. CAP Theorem
     • Consistency
     • Availability
     • Partition Tolerance
     • All three together: N/A
  9. Thinking NoSQL?
     Large Data:
     • Size
     • Format
     • Velocity
     • Filtering
  10. Size
      • High data growth! Is scalability an issue?
      • Traditional RDBMS-based solutions will not work!
  11. Velocity
      • Near real time / Big Data analytics
      • Parallel processing, ready-for-read design is required
      • Traditional RDBMS solutions are not fast enough to meet the SLAs!
  12. Filtering
      • Fraud detection
      • Risk management analysis
      • Traditional RDBMS may work on a small scale but not with large data!
  13. Format
      • Non-relational data format
      • Different nature of data set: graph based, key-value based access
      • Traditional database is limited to static tables!
  14. NoSQL Landscape
  15. Transition to NoSQL
      • Datastore selection
      • API exploration
      • Landscape understanding
      • Implementation
  16. Selecting a NoSQL Datastore
      • Graph: Neo4j, Titan, Objectivity, OrientDB, VertexDB
      • Columnar: Cassandra, HBase, Hypertable, BigTable
      • Key-value: Oraclekv, Redis, CouchDB, Riak
      • Document: MongoDB, Couchbase
  17. High Level APIs
      • Kundera
      • Kundera
      • Kundera
      • Hector
      • Easy Cassandra
      • Datastax java driver
      • Astyanax
      • Morphia
      • Data Nucleus
      • Jongo
      • Spring data
      • Spring data
      • Neo4j
      • Hibernate OGM
      • Data nucleus
      • HBase API
      • Spring data
      • Kundera
  18. Hybrid Design
      • Cassandra, HBase
      • RDBMS
      • Redis
      • MongoDB, Couchbase
      • Neo4J, Titan
      • Hadoop, Spark
  19. Bumpy Ride!
      • Unlearn and learn new APIs!
      • Index-based retrieval over multiple NoSQL data stores
      • Atomic operations
      • NoSQL world is still evolving; you may need to explore among data stores
      • Migration of existing production applications
      • and many more…
  20. One Stop Solution
      Master key, possible? Let’s explore!
  21. Polyglot Way
      • Migrating existing solutions
      • Guarantee atomicity
      • Switch databases
  22. High Level Polyglot API
      • Spring data
      • Kundera
      Let’s implement it the JPA way!
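The idea behind a polyglot API, one contract in front of several datastores, can be sketched without any real datastore. The block below is a minimal illustration under stated assumptions, not how Kundera or Spring Data is implemented: `KeyValueStore`, `InMemoryStore` and `PolyglotRegistry` are hypothetical names, and in-memory maps stand in for real datastore clients.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: application code talks to one contract, stores are bound per entity. */
final class PolyglotDemo {

    /** Store-agnostic contract; the only type the calling code depends on. */
    interface KeyValueStore {
        void save(String id, String value);
        Optional<String> find(String id);
    }

    /** In-memory stand-in for a real datastore client. */
    static final class InMemoryStore implements KeyValueStore {
        private final Map<String, String> rows = new HashMap<>();

        @Override
        public void save(String id, String value) {
            rows.put(id, value);
        }

        @Override
        public Optional<String> find(String id) {
            return Optional.ofNullable(rows.get(id));
        }
    }

    /** Routes each entity type to the store currently configured for it. */
    static final class PolyglotRegistry {
        private final Map<String, KeyValueStore> storeByEntity = new HashMap<>();

        void bind(String entityName, KeyValueStore store) {
            storeByEntity.put(entityName, store);
        }

        KeyValueStore storeFor(String entityName) {
            KeyValueStore store = storeByEntity.get(entityName);
            if (store == null) {
                throw new IllegalArgumentException("no store bound for " + entityName);
            }
            return store;
        }
    }

    public static void main(String[] args) {
        PolyglotRegistry registry = new PolyglotRegistry();
        registry.bind("User", new InMemoryStore());      // stand-in for an RDBMS
        registry.bind("UserLogs", new InMemoryStore());  // stand-in for a columnar store

        registry.storeFor("User").save("u1", "Alice");
        registry.storeFor("UserLogs").save("l1", "logged in");

        // Switching databases is a binding change; the calls below it stay the same.
        registry.bind("UserLogs", new InMemoryStore());  // stand-in for a document store
        registry.storeFor("UserLogs").save("l2", "logged out");
        System.out.println(registry.storeFor("UserLogs").find("l2").orElse("missing"));
    }
}
```

The design choice worth noting is that the binding between entity and store lives in configuration rather than in the calling code, which is the same role the persistence unit plays in a JPA-based approach.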
  23. Kundera to the Rescue!!
      • Supports 8 data stores – Cassandra, HBase, MongoDB, Redis, Neo4j, Oracle NoSQL, CouchDB and any RDBMS
      • CRUD / Strong Query Support
      • Object Relationships Handling
      • Datastore-Optimized Persistence and Query Approach
      • Interceptors / Events / Caching
      • Connection Pool / Fallback (Lucene) Indexing Flexibility
  24. Getting Started
  25. User Logs Sample App

      User Entity

          import java.util.Set;
          import javax.persistence.*;
          // Kundera index annotations
          import com.impetus.kundera.index.Index;
          import com.impetus.kundera.index.IndexCollection;

          @Entity
          @Table(name = "user")
          @IndexCollection(columns = { @Index(name = "emailId") })
          public class User {
              @Id
              @Column(name = "user_id")
              private String userId;

              @Column(name = "first_name")
              private String firstName;

              @Column(name = "last_name")
              private String lastName;

              @Column(name = "emailId")
              private String emailId;

              // One user owns many log entries, loaded lazily.
              @OneToMany(cascade = CascadeType.ALL, fetch = FetchType.LAZY)
              @JoinColumn(name = "user_id")
              private Set<UserLogs> logs;

              @Embedded
              private PersonalDetail personalDetail;

              public User() {
                  // Default constructor.
              }

              // Setters and Getters
          }

      UserLogs Entity

          import java.util.Date;
          import javax.persistence.*;
          import com.impetus.kundera.index.Index;

          @Entity
          @Table(name = "logs")
          @Index(columns = { "body", "created_at" }, index = true)
          public class UserLogs {
              @Id
              @Column(name = "log_id")
              private String logId;

              @Column(name = "body")
              private String body;

              @Column(name = "created_at")
              @Temporal(TemporalType.DATE)
              private Date createdDate;

              public UserLogs() {
                  // Default constructor.
              }

              // Setters and Getters
          }
  26. User Logs Sample App

      Configuration : Persistence.xml

          <!-- Persistence unit for Cassandra persistence -->
          <persistence-unit name="logCassandra">
              <provider>com.impetus.kundera.KunderaPersistence</provider>
              <class>com.impetus.kvapps.entities.UserLogs</class>
              <exclude-unlisted-classes>true</exclude-unlisted-classes>
              <properties>
                  <property name="kundera.nodes" value="localhost" />
                  <property name="kundera.port" value="9160" />
                  <property name="kundera.keyspace" value="userstore" />
                  <property name="kundera.dialect" value="cassandra" />
                  <property name="kundera.client.lookup.class" value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />
                  <property name="kundera.ddl.auto.prepare" value="create" />
                  <property name="index.home.dir" value="lucene" />
              </properties>
          </persistence-unit>

          <!-- Persistence unit for MySQL persistence -->
          <persistence-unit name="logRdbms">
              <provider>com.impetus.kundera.KunderaPersistence</provider>
              <class>com.impetus.kvapps.entities.User</class>
              <exclude-unlisted-classes>true</exclude-unlisted-classes>
              <properties>
                  <property name="kundera.client.lookup.class" value="com.impetus.client.rdbms.RDBMSClientFactory" />
                  <property name="hibernate.hbm2ddl.auto" value="create" />
                  <property name="hibernate.show_sql" value="false" />
                  <property name="hibernate.format_sql" value="false" />
                  <property name="hibernate.dialect" value="org.hibernate.dialect.MySQL5Dialect" />
                  <property name="hibernate.connection.driver_class" value="com.mysql.jdbc.Driver" />
                  <property name="hibernate.connection.url" value="jdbc:mysql://localhost:3306/userstore" />
                  <property name="hibernate.connection.username" value="root" />
                  <property name="hibernate.connection.password" value="root" />
              </properties>
          </persistence-unit>
  27. Switching Data stores

      Configuration : Persistence.xml

          <!-- Persistence unit for Cassandra persistence -->
          <persistence-unit name="logCassandra">
              <provider>com.impetus.kundera.KunderaPersistence</provider>
              <class>com.impetus.kvapps.entities.UserLogs</class>
              <exclude-unlisted-classes>true</exclude-unlisted-classes>
              <properties>
                  <property name="kundera.nodes" value="localhost" />
                  <property name="kundera.port" value="9160" />
                  <property name="kundera.keyspace" value="userstore" />
                  <property name="kundera.dialect" value="cassandra" />
                  <property name="kundera.client.lookup.class" value="com.impetus.client.cassandra.thrift.ThriftClientFactory" />
                  <property name="kundera.ddl.auto.prepare" value="create" />
                  <property name="index.home.dir" value="lucene" />
              </properties>
          </persistence-unit>

          <!-- Persistence unit for Mongo persistence -->
          <persistence-unit name="logMongo">
              <provider>com.impetus.kundera.KunderaPersistence</provider>
              <class>com.impetus.kvapps.entities.User</class>
              <exclude-unlisted-classes>true</exclude-unlisted-classes>
              <properties>
                  <property name="kundera.nodes" value="localhost" />
                  <property name="kundera.port" value="27017" />
                  <property name="kundera.keyspace" value="userstore" />
                  <property name="kundera.dialect" value="mongodb" />
                  <property name="kundera.client.lookup.class" value="com.impetus.client.mongodb.MongoDBClientFactory" />
                  <property name="kundera.ddl.auto.prepare" value="create" />
              </properties>
          </persistence-unit>

      Persist Data

          // Create the entity manager factory for both persistence units.
          // "properties" is a Map of overrides whose definition is not shown on the slide.
          EntityManagerFactory emf = Persistence.createEntityManagerFactory("logCassandra,logMongo", properties);
          EntityManager em = emf.createEntityManager();
          // ...
          em.persist(user);
  28. Performance & Benchmarks
  29. Technical Challenges Addressed!
      • Distributed indexing over multiple NoSQL databases, e.g. Solr, Elasticsearch
         • Plug in the Kundera-powered ES or Lucene indexer
         • Build your own library and simply plug it in
      • Unlearn and learn new APIs!
         • Based on the popular JPA 2.0 specification
      • Atomicity guarantee and transaction management
         • Built-in support for JPA/JTA transactions and batch operations
      • NoSQL world is evolving, plan to switch databases?
         • Since it’s a JPA-powered solution, reuse the same code with almost no changes
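Index-based retrieval matters because a key-value style store can usually fetch a row only by its primary key. The block below is a minimal sketch of the general technique, an external index that maps a column value back to primary keys; it is not Kundera's, Lucene's or Elasticsearch's implementation, and `SecondaryIndex` and the sample data are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of an external secondary index kept next to a primary-key-only store. */
final class SecondaryIndexDemo {

    /** Maps "field=value" terms to the primary keys of the rows that contain them. */
    static final class SecondaryIndex {
        private final Map<String, Set<String>> idsByTerm = new HashMap<>();

        void index(String field, String value, String primaryKey) {
            idsByTerm.computeIfAbsent(field + "=" + value, term -> new LinkedHashSet<>())
                     .add(primaryKey);
        }

        Set<String> lookup(String field, String value) {
            return idsByTerm.getOrDefault(field + "=" + value, Collections.emptySet());
        }
    }

    public static void main(String[] args) {
        // Stand-in for a store that can only be read by primary key.
        Map<String, String> emailByUserId = new HashMap<>();
        SecondaryIndex index = new SecondaryIndex();

        // Every write updates both the store and the index.
        emailByUserId.put("u1", "alice@example.com");
        index.index("emailId", "alice@example.com", "u1");
        emailByUserId.put("u2", "bob@example.com");
        index.index("emailId", "bob@example.com", "u2");

        // A query by emailId first resolves primary keys, then reads the store by key.
        for (String userId : index.lookup("emailId", "alice@example.com")) {
            System.out.println(userId + " -> " + emailByUserId.get(userId));
        }
    }
}
```

Because the index lives outside the datastore, the same lookup path can serve rows held in different stores, at the cost of keeping index and store in step on every write.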
  30. Q&A
      Big Data Solutions and Services partner for Enterprises
      bigdata@impetus.com
  31. Thank You!
      • Meet us at
         • Hadoop Summit, San Jose
         • CIO Big Data Summit, Texas
         • Strata Conference + Hadoop World, New York
         • Gartner Symposium, Orlando
      • Try / Recommend Kundera
         • https://github.com/impetus-opensource/Kundera
      • @impetustech
