Googol Data
Presentation Transcript

  • Googol records (with MySQL) IPC | October 2008 | Alex Aulbach
  • Definition: Googol = 10^100, or 10 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000, or an “imaginably big number”. „Googol records“ © MAYFLOWER GmbH 2008
  • Overview: What will the future bring for databases? Is the principal way we access data the best one? Patterns (or suggestions), and a demonstration of how they could work with MySQL. Discuss!
  • The (performance) future of the web: Only 10-20 % of the world population is “on the Internet”. What will it be like at 80 %?
  • The (performance) future of the web: The world population is growing, and people are getting older.
  • The (performance) future of the web: More specialized databases. More ways to access them. Much easier access. Sharing knowledge vs. closed knowledge: who wins? Services become more dependent on each other. The web grows faster than Moore's Law! (Moore's Law: “only” a factor of 1000 in 20 years.)
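The “factor of 1000 in 20 years” figure can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the commonly quoted doubling period of two years; the slide itself does not state one, and the function names are made up for illustration.

```python
def growth_factor(years, doubling_period_years=2.0):
    """Total growth after `years` if capacity doubles every `doubling_period_years`.

    The two-year default is an assumption, not a figure from the slides.
    """
    return 2.0 ** (years / doubling_period_years)


def annual_rate(total_factor, years):
    """Yearly multiplier that compounds to `total_factor` over `years`."""
    return total_factor ** (1.0 / years)


moore_factor = growth_factor(20)     # 2**10 = 1024, i.e. roughly "factor 1000"
break_even = annual_rate(1000, 20)   # about 1.41, i.e. roughly 41 % growth per year
```

Under that assumption, anything that grows by more than about 41 % per year over the same period outpaces the hardware.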
  • What does this mean for us? We will surely run into problems, but we cannot say when, where and why. No boss: the data belongs to everyone. It's like new roads. It's “real life”!
  • Consequences of growth: New hardware will no longer solve speed problems. Even a new database will not. Even a rewrite of the application won't. We need to rethink the problems from scratch!
  • Of course... … we need splitting, sharding, partitioning, clustering etc. … we need to plan for growth from the beginning of the project. … hardware resources can no longer be planned. … data must be distinguished by importance. … we estimate instead of being exact.
  • But ... is this enough?
  • Patterns (or better: suggestions): 1. Brain storage engine. 2. Reading differs from writing. 3. Redundancy and specialization. 4. The storage itself can keep the information. 5. Time (and sleep). 6. The journey is the reward.
  • 1 :: Brain storage engine :: 1 :: Short-term memory (working memory): Unsorted, unfiltered, any data. Fast reads. Very many fast updates/changes. Remembers which data is changed or invalid. Limited in size.
  • 1 :: Brain storage engine :: 2 :: Long-term memory: Presorted, well-filtered data. Unlimited (well, more or less). Extremely fast read access (sometimes). Updates/inserts happen by repetition in working memory. Sleep helps to store things better.
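The two memory tiers described above could be sketched roughly as follows. This is a toy illustration in plain Python, not the MySQL setup shown in the talk: the class name `TwoTierStore`, the capacity limit, and the rule that a full working memory forces a consolidation are all assumptions made for the example.

```python
import bisect


class TwoTierStore:
    """Toy model: small unsorted working memory in front of a sorted long-term store."""

    def __init__(self, working_capacity=4):
        self.working_capacity = working_capacity  # "limited" short-term memory
        self.working = {}    # unsorted, takes every write immediately
        self.long_keys = []  # long-term keys, always kept sorted
        self.long_vals = []  # long-term values, parallel to long_keys

    def write(self, key, value):
        # All changes land in working memory first.
        self.working[key] = value
        if len(self.working) > self.working_capacity:
            self.sleep()  # assumption: overflow forces a consolidation

    def read(self, key):
        # Working memory knows which entries are newer than long-term memory.
        if key in self.working:
            return self.working[key]
        i = bisect.bisect_left(self.long_keys, key)
        if i < len(self.long_keys) and self.long_keys[i] == key:
            return self.long_vals[i]
        return None

    def sleep(self):
        # Consolidate: merge working memory into the presorted long-term store.
        for key, value in self.working.items():
            i = bisect.bisect_left(self.long_keys, key)
            if i < len(self.long_keys) and self.long_keys[i] == key:
                self.long_vals[i] = value
            else:
                self.long_keys.insert(i, key)
                self.long_vals.insert(i, value)
        self.working.clear()
```

Writes stay cheap because they never touch the sorted structure directly; the cost of keeping long-term memory sorted is paid in batches during `sleep()`.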
  • How does that model fit into real life? Nobody expects to find old things fast. Telephone books. 90/10 problems.
  • Show: Searching in long-term memory. Scaling of working/long-term memory vs. one table with inserts/updates/deletes.
  • 2 :: Reading differs from writing :: Look at the physical processes. Reading with the fingertips: no reading and writing at the same time. Handling reading and writing as different aspects of the same thing is a compromise. Only specialization enables good optimization.
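One way to picture reading and writing as separate, specialized components is sketched below. The diagrams from the talk are not part of this transcript, so the `Writer` and `Reader` classes and the log-based hand-over between them are assumptions for illustration, not the layout that was presented.

```python
class Writer:
    """Accepts changes only: appends them to a log and never serves reads."""

    def __init__(self):
        self.log = []  # append-only list of (key, value) changes

    def write(self, key, value):
        self.log.append((key, value))


class Reader:
    """Serves reads only, from a copy that is refreshed from the writer's log."""

    def __init__(self, writer):
        self.writer = writer
        self.snapshot = {}  # read-optimized copy of the data
        self.applied = 0    # number of log entries already applied

    def refresh(self):
        # Apply only the changes that arrived since the last refresh.
        for key, value in self.writer.log[self.applied:]:
            self.snapshot[key] = value
        self.applied = len(self.writer.log)

    def read(self, key):
        return self.snapshot.get(key)
```

The price of the split is that a reader may return slightly stale data until its next `refresh()`, which matches the earlier point about estimating instead of being exact.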
  • Reader/Writer: Simplest layout
  • The web as storage?
  • Web can work like this
  • Recursive definition of the catalog
  • Scaling, setup as “black box”
  • Share everything
  • Comments: How does this scale? What doesn't work with this?
  • 3 :: Redundancy and specialization :: 1 :: We cannot back up a googol. Nobody needs a backup, but everybody needs to restore.
  • 3 :: Redundancy and specialization :: 2 :: Redundancy: Store the information in many places. Store more important information in more places. Specialization: “Materialized views”. EAV modeling and pivoting. Take ideas from data warehouses and repositories.
  • 3 :: Redundancy and specialization :: 3 :: The wheel comes full circle: More important: more access. More access: more need for redundancy. More redundancy: more speed and reliability. More speed and reliability: more important.
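The rule “store more important information in more places” could be expressed as a replica count that grows with access frequency. The sketch below is only one possible reading: the rule of one extra copy per order of magnitude of accesses, the cap of eight copies, and the node names are invented for the example.

```python
import zlib
from typing import List


def replica_count(access_count: int, min_replicas: int = 1, max_replicas: int = 8) -> int:
    """Hypothetical rule: one extra copy per order of magnitude of accesses."""
    # Number of decimal digits minus one, i.e. floor(log10(n)) without floats.
    extra = len(str(access_count)) - 1 if access_count > 0 else 0
    return max(min_replicas, min(max_replicas, min_replicas + extra))


def place(key: str, access_count: int, nodes: List[str]) -> List[str]:
    """Pick distinct nodes for a record, more of them for hotter records."""
    n = min(replica_count(access_count), len(nodes))
    start = zlib.crc32(key.encode("utf-8")) % len(nodes)  # stable starting node
    return [nodes[(start + i) % len(nodes)] for i in range(n)]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]  # made-up node names
cold = place("old-invoice", 5, nodes)      # rarely read: a single copy
hot = place("front-page", 5000, nodes)     # read often: several copies
```

Because hot records live on more nodes, reads for them can be spread out and a single node failure matters less, which is the feedback loop the slide describes.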
  • Implementation with Reader/Writer
  • 4 :: The storage itself can keep the information :: “A storage has always physical limitations. A logical information of data which belongs together doesn't have any physical limitations.” (Alex Aulbach, Sept. 2008)
  • The index is the problem! The googol universe is limited. The index can take up “half of the galaxies”. Only the “rest” can be used for the data. Less index means: faster search in the “needed” index; less time to write data and index; less time to warm up; more space for the records.
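The idea of searching only the “needed” index can be illustrated by splitting records into key ranges, each with its own small sorted index. This is a plain-Python sketch under assumed names (`PartitionedTable`, `boundaries`); it does not use MySQL's own partitioning and is not the demo from the talk.

```python
import bisect


class PartitionedTable:
    """Records split by key range; each part keeps its own small sorted index."""

    def __init__(self, boundaries):
        # boundaries [100, 200] give three parts: below 100, 100..199, 200 and up
        self.boundaries = sorted(boundaries)
        self.parts = [{"keys": [], "rows": []} for _ in range(len(self.boundaries) + 1)]

    def _part_for(self, key):
        # Choosing the part is a search over a handful of boundaries only.
        return self.parts[bisect.bisect_right(self.boundaries, key)]

    def insert(self, key, row):
        part = self._part_for(key)
        i = bisect.bisect_left(part["keys"], key)
        part["keys"].insert(i, key)  # only this part's index is modified
        part["rows"].insert(i, row)

    def find(self, key):
        part = self._part_for(key)
        i = bisect.bisect_left(part["keys"], key)
        if i < len(part["keys"]) and part["keys"][i] == key:
            return part["rows"][i]
        return None
```

Each lookup or insert touches one small index instead of one large one, so less of the index has to be kept warm for any given range of keys.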
  • Show: Access the full table or split the data into several parts. Index size. Write. Presorted tables.
  • 5 :: Time (and sleep) :: 1 :: Human brain: only three bits per second! We have all been babies. Trust! Just wait and see. Developers (and customers) need to think in decades, not in days until the end of the project.
  • 5 :: Sleep (and time) :: 2 :: Again the human brain: it learns while sleeping! Why not apply this to databases? Premise: redundancy! Dolphins sleep with only one hemisphere at a time. The wheel comes full circle: Redundancy. Distinct read and write.
  • Show: Well, I can't show this, because it takes … time.
  • 6 :: The journey is the reward :: Future: it is not so important how to search, but where. Store, step by step, where to find the result, not the result itself. You can find faster ways only by trying a shortcut. It comes full circle: search in many different ways and take the fastest. While sleeping, try out new things (dreaming).
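“Search in many different ways and take the fastest” could look roughly like the sketch below. Everything here is an assumption made for illustration: the two search strategies, the step counter used as a stand-in for time, and the `PathFinder` class that remembers which way was cheapest rather than the result itself.

```python
def linear_scan(data, target):
    """Check items one by one; returns (found, steps)."""
    steps = 0
    for item in data:
        steps += 1
        if item == target:
            return True, steps
    return False, steps


def binary_search(data, target):
    """Halve the search range each step; assumes sorted data; returns (found, steps)."""
    lo, hi, steps = 0, len(data), 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if data[mid] == target:
            return True, steps
        if data[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    return False, steps


class PathFinder:
    """Remembers, per target, which way of searching was cheapest."""

    def __init__(self, strategies):
        self.strategies = strategies  # name -> search function
        self.best_path = {}           # target -> name of cheapest strategy seen

    def explore(self, data, target):
        # "Dreaming": try every way and keep a note of the cheapest one.
        costs = {name: fn(data, target)[1] for name, fn in self.strategies.items()}
        self.best_path[target] = min(costs, key=costs.get)
        return costs

    def find(self, data, target):
        if target not in self.best_path:
            self.explore(data, target)
        return self.strategies[self.best_path[target]](data, target)[0]
```

Note that the stored value is the name of a strategy, not a cached result: the data can change and the remembered path still leads to the current answer.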
  • Conclusion: Dreams may come true while sleeping. We must invent the tools now to solve the problems of the future. Speed is not a matter of hardware but of how things are done. Never take speed as a given: in a googol universe, wormholes exist! Moore's Law may help, but do not trust it.
  • Thank you! Alex Aulbach Mayflower GmbH Pleichertorstr. 2 97070 Würzburg, Germany +49 (931) 35 9 65 - 0 alex.aulbach@mayflower.de