
The return of big iron?


  1. Welcome
  2. Remember when it looked like this? They were all pretty much alike.
  3. It used to be easy…
  4. Now it’s quite c0nfuZ1nG!
  5. But it’s also quite exciting!
  6. We Lose: Joe Hellerstein (Berkeley), 2001: “Databases are commoditised and cornered to slow-moving, evolving, structure-intensive applications that require schema evolution.“ … “The internet companies are lost and we will remain in the doldrums of the enterprise space.” … “As databases are black boxes which require a lot of coaxing to get maximum performance”
  7. What Happened?
  8. The Web
  9. With a lot of users.
  10. we changed scale
  11. we changed tack
  12. New Approach to Data Access: •  Simple •  Pragmatic •  Solved an insoluble problem •  Unencumbered by tradition (good & bad)
  13. With this came a Different Focus. Tradition: •  Global consistency •  Schema driven •  Reliable network •  Highly structured. NoSQL: •  Local consistency •  Schemaless •  Unreliable network •  Semi-structured/unstructured. NoSQL / Big Data technologies really focus on load and volume problems by avoiding the complexities associated with traditional transactional storage. (A minimal sketch contrasting the two styles appears after this list.)
  14. The ‘Relational Camp’ had been busy too: realisation that the traditional architecture was insufficient for various modern workloads
  15. “End of an Era” paper (2007): “Because RDBMSs can be beaten by more than an order of magnitude on the standard OLTP benchmark, then there is no market where they are competitive. As such, they should be considered as legacy technology more than a quarter of a century in age, for which a complete redesign and re-architecting is the appropriate next step.” – Michael Stonebraker
  16. No Longer a One-Size-Fits-All
  17. There is a new and impressive breed: •  Products ~5 years old •  Shared nothing (sharded) •  Designed for SSDs & 10GbE •  Large address spaces (256GB+) •  No indexes (column oriented) •  Dropping traditional tenets (referential integrity etc.) •  Surprisingly quick for big queries when compared with incumbent technologies. (See the sharding and column-layout sketch after this list.)
  18. Both types of solution have clear value
  19. …and it’s not really a question of size (chart: database size in TB, from 0 to 10,000). The majority of us live in the overlap region.
  20. More a question of utility, which tends to lead to composite offerings
  21. Compose Solutions
  22. So what does this mean for the enterprise?
  23. 80% of enterprise databases are < 1TB (this reference is getting pretty old now, sorry: 2009)
  24. Yet we often have a lot of them
  25. Communication is Store & Forward (diagram: messages flowing to and from the outside world; a store-and-forward sketch appears after this list)
  26. Sometimes we’re a bit more organized!
  27. But most of our data is not that accessible (diagram: core / operational / exposed data)
  28. …and sharing is often an afterthought (diagram: core / operational / exposed data)
  29. Services can help
  30. But data is getting bigger and heavier…
  31. …which can make it hard to join together
  32. So we often turn to some form of Enterprise Data Warehouse (or maybe data virtualization)
  33. Big data tech sometimes provides a composite solution (or ETL)
  34. “Ability to model data is much more of a gating factor than raw size” – Dave Campbell (Microsoft, VLDB keynote 2012)
  35. Importing data into a standard model is a slow and painful process
  36. An alternative is to use a Late Bound Schema
  37. Combining structured & unstructured approaches in a layered fashion makes the process more nimble (diagram layers: raw data, late bound schema, standardisation layer, structured; a schema-on-read sketch appears after this list)
  38. We take this kind of approach: •  Grid of machines •  Late bound schema •  Sharded, immutable data •  Low latency (real time) and high throughput (grid) use cases •  All data is observable (an event) •  Interfaces: Standardised (safe) or Raw (use at your own risk)
  39. Both Raw & Standardised data is available (diagram: operational, relational (real time / MR), analytics and Object/SQL access over a standardisation layer on top of raw data)
  40. This helps to loosen the grip of the single schema, whilst also providing a more iterative approach to standardisation
  41. Support for both a single standardised model and many bespoke models in the same technology (diagram: raw facts from different systems feeding a standardised model; a sketch of standardised and bespoke views over the same raw facts appears after this list)
  42. Next step: centralise common processing tasks (diagram: standardised risk model calculation)
  43. Are we back to the mainframe?
  44. Thanks – http://www.benstopford.com
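
A minimal sketch of the contrast on slide 13, written in Python purely for illustration: the schema-driven side uses SQLite (from the standard library), the schemaless side is just a list of dicts. The table, field names and trade values are made up, not taken from the deck.

```python
import json
import sqlite3

# Schema-driven: the table definition must exist (and be migrated) up front.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, qty INTEGER)")
conn.execute("INSERT INTO trades VALUES (?, ?, ?)", (1, "VOD.L", 100))

# Schemaless: each record carries whatever fields it has; a new field needs no ALTER TABLE.
doc_store = []
doc_store.append({"id": 1, "symbol": "VOD.L", "qty": 100})
doc_store.append({"id": 2, "symbol": "BARC.L", "qty": 50, "venue": "LSE"})  # extra field, no migration

print(conn.execute("SELECT * FROM trades").fetchall())
print(json.dumps(doc_store, indent=2))
```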
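A sketch of two of the ideas on slide 17: routing rows to shared-nothing shards by hashing their key, and laying data out by column so an aggregate scans a single column with no index. The shard count, keys and amounts are invented for illustration.

```python
import zlib

NUM_SHARDS = 4

def shard_for(key: str) -> int:
    # A stable hash so every node routes the same key to the same shard, with no shared state.
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS

rows = [("acct-1", 120.0), ("acct-2", 75.5), ("acct-3", 210.0), ("acct-4", 42.0)]

# Shared-nothing: each shard owns its own slice of the data.
shards = {i: [] for i in range(NUM_SHARDS)}
for key, amount in rows:
    shards[shard_for(key)].append((key, amount))

# Column orientation: store each column contiguously so an aggregate scans only
# the column it needs, rather than reconstructing whole rows or using an index.
columns = {"key": [k for k, _ in rows], "amount": [a for _, a in rows]}
total_amount = sum(columns["amount"])

print(shards)
print(total_amount)
```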
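A toy sketch of the store & forward pattern on slide 25, with an in-memory queue standing in for a durable broker; the message shape is made up.

```python
from queue import Queue

# The queue stands in for a real broker: the sender stores the message and moves on;
# the receiver picks it up later, whenever it is ready.
outbox = Queue()

def send(message: dict) -> None:
    outbox.put(message)        # store

def receive() -> dict:
    return outbox.get()        # forward, on the receiver's schedule

send({"type": "trade", "symbol": "VOD.L", "qty": 100})
print(receive())
```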
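A sketch of the late bound schema idea from slides 36–37: raw events stay exactly as they arrived, and a standardised view is projected over them at read time (schema on read). The event fields and the mapping are illustrative only.

```python
import json

# Raw, immutable events as they arrived from two different upstream systems.
raw_events = [
    '{"trade_id": 1, "sym": "VOD.L", "quantity": "100"}',
    '{"id": 2, "symbol": "BARC.L", "qty": 50, "venue": "LSE"}',
]

def standardise(raw: str) -> dict:
    """Project a raw event into the standardised model at read time.
    The raw record is never rewritten; only this mapping evolves."""
    e = json.loads(raw)
    return {
        "id": e.get("trade_id", e.get("id")),
        "symbol": e.get("sym", e.get("symbol")),
        "qty": int(e.get("quantity", e.get("qty", 0))),
    }

print([standardise(r) for r in raw_events])
```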
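A sketch of slide 41's point that one standardised model and many bespoke models can sit over the same raw facts; the "risk view" and all field names here are hypothetical.

```python
# Raw facts from different systems, each with its own field names.
raw_facts = [
    {"system": "A", "sym": "VOD.L", "qty": 100, "px": 1.02},
    {"system": "B", "symbol": "VOD.L", "quantity": 50, "price": 1.03},
]

def standardised(fact: dict) -> dict:
    # The single shared model most consumers read.
    return {
        "symbol": fact.get("sym", fact.get("symbol")),
        "qty": fact.get("qty", fact.get("quantity")),
        "price": fact.get("px", fact.get("price")),
    }

def risk_view(fact: dict) -> dict:
    # A bespoke view for one consumer, derived from the same raw facts.
    s = standardised(fact)
    return {"symbol": s["symbol"], "exposure": s["qty"] * s["price"]}

print([standardised(f) for f in raw_facts])
print([risk_view(f) for f in raw_facts])
```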
