2011.06.13 hotstorage-panel


Published in: Technology, Business

  1. 1. What is Big Data? Andy Twigg, Acunu and Oxford. HotStorage11 Panel
  2. 2. Claim: “Big Data” is not: scale-out, web-scale, NoSQL, etc. Some of the biggest big data problems involve MySQL. Some of the best-known NoSQL DBs don’t scale out.
  3. 3.’s about surfing.
  4. 4. Storage is changing: 3 fundamental shifts
  5. 5. 1: storage technology. Two devices at $100 each: 60 GB, 40,000 IOPS, 150 MB/s vs. 2000 GB, 100 IOPS, 130 MB/s
  6. 6. Can we remain here? The B-tree operates here (even with a LFS) ~device capacity
  7. 7. 2: know your workloads! Ingest: high-entropy updates. Analyze: range queries. Serve: random reads. • small entries (e.g. cookies, ticks) • little value (you don’t care about losing 1%) • high-entropy, high-rate updates (>100,000/s) • range queries (analytics)
  8. 8. 3: commodity h/w. Lots of people rely on EC2. How do you get reliability there? Enterprises like support and reliability. But what is the easiest way to get 5 9’s reliability, assuming s/w works? Scale-out is unavoidable. The challenge is getting to grips with the problems scale-out brings - CAP, vector clocks, etc. Read the literature! Don’t reinvent the wheel!
  9. 9., Apache Cassandra, Cassandra, Hadoop, and the eye and elephant logos are trademarks of the Apache Software Foundation.
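
The arithmetic behind slides 5 and 8 can be sketched in a few lines of Python. This is only an illustration, not something from the talk: the labels `device_a` and `device_b`, the function names, the 1 GB = 1000 MB conversion, and the 99% per-node availability are all assumptions. The replica calculation further assumes that nodes fail independently and that one live replica is enough to serve a request.

```python
# Figures from slide 5: two devices at the same $100 price point.
# The labels device_a / device_b are hypothetical; the slide names neither.
DEVICES = {
    "device_a": {"price_usd": 100, "capacity_gb": 60, "iops": 40_000, "mb_per_s": 150},
    "device_b": {"price_usd": 100, "capacity_gb": 2000, "iops": 100, "mb_per_s": 130},
}


def per_dollar(device):
    """Return capacity, random I/O rate, and bandwidth bought per dollar."""
    price = device["price_usd"]
    return {
        "gb_per_usd": device["capacity_gb"] / price,
        "iops_per_usd": device["iops"] / price,
        "mb_per_s_per_usd": device["mb_per_s"] / price,
    }


def full_scan_seconds(device):
    """Seconds to read the whole device sequentially (assumes 1 GB = 1000 MB)."""
    return device["capacity_gb"] * 1000 / device["mb_per_s"]


def replicas_for_availability(node_availability, target):
    """Smallest n such that 1 - (1 - a)^n >= target.

    Assumes independent node failures and that a single live replica
    is enough to serve a request.
    """
    if not 0 < node_availability <= 1 or not 0 < target < 1:
        raise ValueError("need 0 < node_availability <= 1 and 0 < target < 1")
    n = 1
    while 1 - (1 - node_availability) ** n < target:
        n += 1
    return n


if __name__ == "__main__":
    for name, device in DEVICES.items():
        print(name, per_dollar(device), round(full_scan_seconds(device)), "s full scan")
    # Assumed 99% availability per node; target is "five nines".
    print(replicas_for_availability(0.99, 0.99999), "replicas")
```

With the slide's numbers, the first device buys 400 IOPS per dollar against 1 for the second, while the second buys 20 GB per dollar against 0.6. Under the assumed 99% node availability, three replicas are enough to reach five nines, which is one way to read the slide's point that scale-out is the easy route to reliability on commodity hardware.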