Preparing your data for the cloud

  • 1,542 views
Uploaded on

More and more Enterprises are moving their IT infrastructure to Cloud platforms. Out of the entire components, Data Storage still remains a tricky part of the puzzle. I would like to present an …

More and more Enterprises are moving their IT infrastructure to Cloud platforms. Out of the entire components, Data Storage still remains a tricky part of the puzzle. I would like to present an overview of the choices, their advantages and limitations, we as Software Developers have currently. Based upon the choices, we may need to think about the design and architecture of the data-manipulation components of the application, we plan to put on Cloud. Following is an overview of the proposed agenda:

Existing “Cloud Capable” and “Cloud Native” Relational DBMS
Existing “Cloud Capable” and “Cloud Native” Non-Relational DBMS
Main differences between Relational and Non-Relational DBMS’s
Advantages and Limitations of Relational DBMS on Cloud Platforms
Advantages and Limitations of Non-Relational DBMS on Cloud Platforms
Design Patterns while using Non-Relational DBMS in the application
Code Walk-through showing Integration of “Cloud Capable” and “Cloud Native” Non-Relational DBMS with a Web-Application

More in: Technology
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
No Downloads

Views

Total Views
1,542
On Slideshare
0
From Embeds
0
Number of Embeds
2

Actions

Shares
Downloads
34
Comments
0
Likes
1

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Preparing Your Data For Cloud Narinder Kumar Inphina Technologies 1
  • 2. Agenda  Relational DBMS's : Pros & Cons  Non-Relational DBMS's : Pros & Cons  Types of Non-Relational DBMS's  Current Market State  Applicability of Different Data-Bases in different environments 2
  • 3. Relational DBMS - Pros  Data Integrity  ACID Capabilities  High Level Query Model  Data Normalization  Data Independence 3
  • 4. Relational DBMS - Cons  Scaling Issues  By Duplication (Master-Slave)  By Sharding/Division (Not transparent)  Fixed Schema  Mostly disk-oriented (Performance)  May fair poorly with large data 4
  • 5. Non-Relational DBMS - Pros  Scalability  Replication / Availability  Performance  Deployment Flexibility  Modelling Flexibility  Faster Development (?) 5
  • 6. Non-Relational DBMS - Cons  Lack of Transactional Support  Data Integrity is Application's responsibility  Data Duplication / Application Dependent  Eventually Consistent (mostly)  No Standardization  New Technology 6
  • 7. RDBMS's and Cloud 7
  • 8. Cloud Capable RDBMS s Almost every RDBMS can run in a IAAS Cloud Platform 8
  • 9. Cloud Native RDBMS 9
  • 10. Types of Non-Relational DBMS  Key Value Stores  Document Stores  Column Stores  Graph Stores 10
  • 11. Key Value Data-Bases  Object is completely Opaque to DB  Mostly GET, PUT & DELETE operations are supported  There may be limits on size of Objects Inspired by Amazon Dynamo Paper 11
  • 12. Key Value DataBases & Cloud Project Voldemort MemCachedDB Tokyo Tyrant 12
  • 13. Document Data-Bases  Object is not completely opaque to DB  Every Object has it's own schema FirstName="Bob", Address="5 Oak St.", Hobby="sailing". FirstName="Jonathan", Children=("Michael,10", "Jennifer,8")  Can perform queries based on Object's attributes  Possible to describe relationships between Objects  Joins and Transactions are not supported  Good for XML or JSON objects 13
  • 14. Document Data-Bases & Cloud 14
  • 15. Column-Store Data-Bases  Richer than Document Stores  Multi-Dimensional Map  Tables  Row  Column  Time-Stamp  Supports Multiple Data Types  Usually use an Underlying DFS Inspired by Google Big Table Paper 15
  • 16. Column-Store Data-Bases & Cloud 16
  • 17. Key Factors while Making a Choice  Application Architecture Requirements  Platform choices  Non-Functional Requirements  Consistency  Availability  Partition  Security  Data Redemption 17
  • 18. Different Requirements = Different Solutions 18
  • 19. Scenario 1  Feature First  Corporate Data  Consistency Requirements  Business Intelligence  Legacy Application RDBMS on Amazon Cloud, RackSpace (IaaS) or Microsoft Azure/Amazon RDS (PaaS) 19
  • 20. Scenario 2  Consumer Facing Application  Big Files (Images, BLOB's, Files)  Geographically Distributed  Mostly writes  Not heavy requirement on Rich Queries Key-Value Data Stores (Amazon S3, Project Voldemort, Redis) 20
  • 21. Scenario 3  Hundreds Of Government Documents with different schemas  Need to serve on Web  Data Mining Document Data-Stores (Amazon SimpleDB, Apache Couche DB, MongoDB) 21
  • 22. Scenario 4  Scale First  Huge Data-Set  Analytical Requirements  Consumer Facing  High Availability over Consistency Column Data-Stores (Google App Engine, Hbase, Cassandra) 22
  • 23. Mix & Match of Earlier Scenarios  Polyglot Persistence  RDBMS for low-volume and high value  Key-Value DB for large files with little queries  Memcached DB for short- lived Data  Column DB for Analytics 23
  • 24. CAP Theorem 24
  • 25. Conclusions  One Size does not Fit all  Many choices  No-SQL DB's providing Alternatives  RDBMS serve useful purpose 25
  • 26. nkumar@inphina.com www.inphina.com http://thoughts.inphina.com 26
  • 27. References  http://nosql-database.org/  http://www.drdobbs.com/database/218900502  http://perspectives.mvdirona.com/2009/11/03/OneSizeDoesNotFitAll.aspx  http://blog.nahurst.com/visual-guide-to-nosql-systems  http://blog.heroku.com/archives/2010/7/20/nosql/  http://www.vineetgupta.com/2010/01/nosql-databases-part-1-landscape.html  http://project-voldemort.com/  http://code.google.com/p/redis/  http://memcachedb.org/  http://aws.amazon.com/simpledb/  http://couchdb.apache.org/  http://www.mongodb.org  http://riak.basho.com/ 27