Is NoSQL The Future of Data Storage?

The relational database model was designed to solve the problems of yesterday’s data storage requirements. The massively connected world of today presents different problems and new challenges. We’ll explore the NoSQL philosophy, before comparing and contrasting the strengths and weaknesses of the relational model versus the NoSQL model. While stepping through real-world scenarios, we’ll discuss the reasons for choosing one solution over the other.
To complete this session, we’ll demonstrate our findings with an application written with a NoSQL storage layer and explain the advantages that accrue from that decision. By looking at the new challenges we face with our data storage needs, we’ll examine why the principles behind NoSQL make it a better candidate solution than yesterday’s relational model.

Published in: Technology

Is NoSQL The Future of Data Storage?

  1. Is NoSQL the Future of Data Storage? By Gary Short, Developer Express
  2. Introduction
      • Gary Short
      • Technical Evangelist for Developer Express
      • C# MVP
      • garys@devexpress.com
      • www.garyshort.org
      • @garyshort
  3. What About You Guys?
  4. Breadth First Look @ NoSQL
  5. Be Doing 3 Things
      1. Define NoSQL databases
      2. Look at scenarios where you can use NoSQL
      3. Drill into a specific use case
  6. (no text)
  7. Where Does NoSQL Originate?
      • 1998
        – Open source (OS) relational database
          • Created by Carlo Strozzi
          • Didn’t expose an SQL interface
          • Called NoSQL
          • The author said of the later NoSQL movement:
            • “departs from the relational model altogether...”
            • “...should have been called ‘NoREL’”
  8. More Recently...
      • Eric Evans reintroduced the term in 2009
        – Johan Oskarsson (last.fm)
          • Event to discuss OS distributed databases
      • The label covers a growing number of datastores
        – Open source
        – Non-relational
        – Distributed
        – (Often) don’t guarantee ACID
  9. Atlanta 2009
      • No:sql(east) conference
      • Billed as “conference of no-rel datastores”
      • Worst tag line ever
        – SELECT fun, profit FROM real_world WHERE rel=false
  10. Not Anti-RDBMS
  11. Let’s Talk a Bit About What NoSQL DBs Look Like...
  12. Key Attributes of NoSQL Databases
      • Don’t require fixed table schemas
      • Non-relational
      • (Usually) avoid join operations
      • Scale horizontally
        – Adding more nodes to a storage system
  13. What Does the Taxonomy Look Like?
  14. Document Store
      • RavenDB
      • Apache Jackrabbit
      • CouchDB
      • MongoDB
      • SimpleDB
      • XML Databases
        – MarkLogic Server
        – eXist
  15. Document What?
  16. Graph Storage
      • Trinity
      • AllegroGraph
      • Core Data
      • Neo4j
      • DEX
      • FlockDB
  17. Which Means?
      • Graph consists of
        – Nodes (‘stations’ of the graph)
        – Edges (lines between them)
      • FlockDB
        – Created by the Twitter folks
        – Nodes = Users
        – Edges = Nature of relationship between nodes
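
The node-and-edge model on this slide can be sketched in a few lines of Python. This is an illustration only, not FlockDB's API: the `SocialGraph` class and the relationship labels `FOLLOWS` and `BLOCKS` are hypothetical names chosen for the example.

```python
from collections import defaultdict

# Hypothetical relationship labels; the slide only says that an edge
# carries the "nature of relationship" between two users.
FOLLOWS = "follows"
BLOCKS = "blocks"


class SocialGraph:
    """Directed graph: nodes are user ids, edges carry a relationship type."""

    def __init__(self):
        # (source user, relationship) -> set of destination users
        self._edges = defaultdict(set)

    def add_edge(self, source, relationship, destination):
        self._edges[(source, relationship)].add(destination)

    def neighbours(self, source, relationship):
        # Return a copy so callers cannot mutate the stored edge set.
        return set(self._edges.get((source, relationship), set()))


graph = SocialGraph()
graph.add_edge("alice", FOLLOWS, "bob")
graph.add_edge("alice", FOLLOWS, "carol")
graph.add_edge("bob", BLOCKS, "carol")
```

Keying the edge sets by (source, relationship) keeps "who does alice follow" a single lookup, which is the kind of query a social graph answers most often.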
  18. Social Graph
  19. Key/Value Stores
      • On disk
      • Cache in RAM
      • Eventually Consistent
        – Weak Definition
          • “If no updates occur for a period, eventually all updates will propagate through the system and all replicas will be consistent”
        – Strong Definition
          • “for a given update and a given replica eventually either the update reaches the replica or the replica retires”
      • Ordered
        – Distributed Hash Table allows lexicographical processing
  20. Object Databases
      • Db4o
      • GemStone/S
      • InterSystems Caché
      • Objectivity/DB
      • ZODB
  21. How the &*$% do You Index That?!
  22. Okay got it, Now Let’s Compare Some Real World Scenarios
  23. You Need Constant Consistency
      • You’re dealing with financial transactions
      • You’re dealing with medical records
      • You’re dealing with bonded goods
      • Best you use an RDBMS ☺
  24. You Need Horizontal Scalability
      • You’re working across defined geographic regions
      • You’re working with large quantities of data
      • Game server sharding
      • Use NoSQL
        – Something like Cassandra
  25. Up in the Clouds Baby
  26. (no text)
  27. Frequently Written Rarely Read
      • Think web counters and the like
      • Every time a user comes to a page = ctr++
      • But it’s only read when the report is run
      • Use NoSQL (key-value storage/memcache)
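
A minimal sketch of the write-heavy counter described on this slide, assuming an in-memory `Counter` as a stand-in for a real key-value store. The `PageCounterStore` class and its method names are hypothetical.

```python
from collections import Counter


class PageCounterStore:
    """In-memory stand-in for a key-value store holding page hit counts."""

    def __init__(self):
        self._counts = Counter()

    def record_hit(self, page):
        # The hot path: one cheap increment per page view (the slide's ctr++).
        self._counts[page] += 1

    def report(self):
        # The cold path: read only when the report is run.
        return dict(self._counts)


store = PageCounterStore()
for page in ["/home", "/home", "/about", "/home"]:
    store.record_hit(page)
```

The write path touches a single key and needs no joins or schema, which is why a key-value store fits this workload.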
  28. I Got Big Data!
  29. Binary Baby!
      • If you are YouTube
      • Flickr
      • Twitpic
      • Spotify
      • NoSQL (Amazon S3)
  30. Here Today Gone Tomorrow
      • Transient data like...
        – Web Sessions
        – Locks
        – Short Term Stats
          • Shopping cart contents
      • Use NoSQL (Memcache)
  31. Data Replication
      • Same data in two or more locations
        – Music Library
          • Web browser
          • iPhone App
      • NoSQL (CouchDB)
  32. Hit me Baby One More Time!
      • High Availability
        – High number of important transactions
          • Online gambling
          • Pay Per View
            – Ahem!
          • Online Auction
      • NoSQL (Cassandra – automatic clustering)
  33. Give me a Real World Example
      • Twitter
        – The challenges
          • Needs to store many graphs
            – Who you are following
            – Who’s following you
            – Who you receive phone notifications from etc
          • To deliver a tweet requires rapid paging of followers
          • Heavy write load as followers are added and removed
          • Set arithmetic for @mentions (intersection of users)
  34. What Did They Try?
      • Relational Databases
      • Key-Value storage of denormalized lists
  35. Did it Work?
  36. What Did They Need?
      • Simplest possible thing that would work
      • Allow for horizontal partitioning
      • Allow write operations to
        – Arrive out of order
        – Or be processed more than once
      • Failures should result in redundant work
        – Not lost work!
  37. The Result was FlockDB
      • Stores graph data
      • Not optimised for graph traversal operations
      • Optimised for large adjacency lists
        – List of all edges in a graph
          • Each entry is a set of end points (or tuple if directed)
      • Optimised for fast read and write
      • Optimised for page-able set arithmetic
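
"Page-able set arithmetic" can be illustrated with sorted adjacency lists: intersect two follower lists, then hand the result back one page at a time using a cursor. This is a sketch under assumptions, not FlockDB's implementation; `intersect_sorted`, `page` and the sample ids are hypothetical.

```python
import bisect


def intersect_sorted(a, b):
    """Merge-style intersection of two ascending lists of user ids."""
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return common


def page(ids, cursor, count):
    """Return up to `count` ids greater than `cursor`, plus the next cursor.

    `cursor` is None for the first page; the returned cursor is None
    once the list is exhausted.
    """
    start = 0 if cursor is None else bisect.bisect_right(ids, cursor)
    chunk = ids[start:start + count]
    next_cursor = chunk[-1] if start + count < len(ids) else None
    return chunk, next_cursor


followers_of_a = [1, 3, 5, 7, 9]
followers_of_b = [3, 4, 5, 9, 10]
common = intersect_sorted(followers_of_a, followers_of_b)
first_page, cursor = page(common, None, 2)
second_page, end = page(common, cursor, 2)
```

Using the last id seen as the cursor, rather than an offset, means a page boundary stays valid even if earlier entries are added or removed between requests.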
  38. How Does it Work?
      • Stores graphs as sets of edges between nodes
      • Data is partitioned by node
        – All queries can be answered by a single partition
      • Write operations are idempotent
        – Can be applied multiple times without changing the result
      • And commutative
        – Changing the order of operands doesn’t change the result
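
Partitioning by node can be sketched as hashing the source node id to pick a shard, so every edge leaving that node lives in one place. The shard count, the use of CRC32 and the function names below are assumptions made for illustration, not details taken from FlockDB.

```python
import zlib

NUM_PARTITIONS = 4  # assumed shard count, for illustration only

# One dict per partition: source node -> set of destination nodes.
partitions = [dict() for _ in range(NUM_PARTITIONS)]


def partition_for(node_id):
    # CRC32 is stable across processes, unlike Python's salted hash() for str.
    return zlib.crc32(node_id.encode("utf-8")) % NUM_PARTITIONS


def add_follow(source, destination):
    shard = partitions[partition_for(source)]
    shard.setdefault(source, set()).add(destination)


def following(source):
    # A single partition holds every edge leaving `source`.
    return partitions[partition_for(source)].get(source, set())


add_follow("alice", "bob")
add_follow("alice", "carol")
```

Answering the reverse question ("who follows bob?") from a single partition would need the edge stored a second time, keyed by destination. That is a design consequence of this sketch, not something stated on the slide.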
  39. A Little More About Idempotency
      • Applied several times with no change to the result
      • An operation ‘O’ on set S is called idempotent if, for all x in S, x O x = x
      • Set union
        – A ∪ B = {x : x ∈ A or x ∈ B}
      • Set intersection
        – A ∩ B = {x : x ∈ A and x ∈ B}
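
The definition x O x = x can be spot-checked in Python. Checking a handful of samples is not a proof, only an illustration; `is_idempotent` is a hypothetical helper name.

```python
def is_idempotent(op, samples):
    """Check x op x == x for every sample (a spot check, not a proof)."""
    return all(op(x, x) == x for x in samples)


sample_sets = [frozenset(), frozenset({1}), frozenset({1, 2, 3})]

union_ok = is_idempotent(frozenset.union, sample_sets)
intersection_ok = is_idempotent(frozenset.intersection, sample_sets)

# Addition is the counter-example: 1 + 1 != 1.
addition_ok = is_idempotent(lambda a, b: a + b, [0, 1, 2])
```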
  40. A Little More About Commutativity
      • Changing the order of operands doesn’t change the result: 3 + 2 = 2 + 3 = 5
      • Can be combined with idempotency
      • Let’s look at the follow command in Twitter
        – Let X = follow person X
        – Let Y = follow person Y
        – Then 3X + 2Y = 2Y + 3X
        – And 2X + 3Y = 3X + 2Y
      • Note: it’s only true for the same operation
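
The follow example can be played out with a plain set, which is both idempotent (adding twice changes nothing) and commutative (order does not matter). This is a sketch of the property, not of how Twitter stores follows; `apply_follows` is a hypothetical name.

```python
def apply_follows(operations):
    """Apply a sequence of 'follow <person>' operations to an empty set."""
    following = set()
    for person in operations:
        following.add(person)  # adding an existing member is a no-op
    return following


three_x_two_y = apply_follows(["X", "X", "X", "Y", "Y"])
two_y_three_x = apply_follows(["Y", "Y", "X", "X", "X"])
two_x_three_y = apply_follows(["X", "X", "Y", "Y", "Y"])
```

All three sequences end in the same state, {X, Y}. The slide's closing note still applies: once unfollow operations are mixed in, a naive set like this is no longer order-independent.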
  41. Commutative Writes Help Bring up Partitions
      • Partition can receive write traffic immediately
      • Receive dump of data in the background
      • Live for read as soon as the dump is complete
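
One way to see why this works: if every write carries a timestamp and the store keeps only the newest record per edge, then live writes and the background dump can arrive in either order and the partition converges to the same state. The timestamped last-write-wins rule, the record layout and the function names are all assumptions made for this sketch; the slides do not say how FlockDB resolves conflicts.

```python
def merge_edge(store, edge, state, timestamp):
    """Keep whichever record for `edge` has the newest timestamp."""
    current = store.get(edge)
    if current is None or timestamp > current[1]:
        store[edge] = (state, timestamp)


def apply_all(records):
    store = {}
    for edge, state, timestamp in records:
        merge_edge(store, edge, state, timestamp)
    return store


# Assumed sample data: the dump holds older records, live traffic newer ones.
dump = [
    (("alice", "bob"), "follow", 1),
    (("alice", "carol"), "follow", 2),
]
live = [
    (("alice", "carol"), "unfollow", 5),
    (("alice", "dave"), "follow", 6),
]

dump_first = apply_all(dump + live)
live_first = apply_all(live + dump)
```

Both orders yield the same store, so the partition can accept writes before the dump finishes. The sketch assumes timestamps are unique per edge; with ties, the first record applied would win and the order would matter again.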
  42. Performance?
      • Currently store 13 billion edges
      • 20K writes / second
      • 100K reads / second
  43. Punchline?
      • Under all the bells and whistles...
        – It’s MySQL ☺
  44. So is this the Future?
      • Yes!
      • And No!
  45. What?! How Can That be?!