SPONSORED WORKSHOP by Cleversafe from Structure:Data 2012

Sponsored workshop from Cleversafe.
#dataconf
More at http://event.gigaom.com/structuredata/

Published in: Technology, Business

  1. Store and Analyze Big Data Without Limits
     March 23, 2012
     Friday, July 27, 2012
  2. Big Data Challenges
     • From 800 exabytes in 2008 to 35,000 exabytes in 2020
     • 90% of data is in unstructured format, and 89% of growth in storage is in unstructured format
     • 75% of data is generated by individuals, and enterprises have liability for 80% of data generated
     • Concern for data security and reliability in the Cloud
     • Public Cloud deployments and content depots are projected to grow to $7.4B by 2014 to accommodate capacity
     “Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.” – IDC, Extracting Value from Chaos, May 2011
     Copyright © 2012 Cleversafe, Inc. All rights reserved.
  3. Data Storage is Transforming
     [Chart: Data (0 to 5000) by Year (2002 to 2012)]
     • Capacity-Optimized storage growing 63% annually*
  4. Data Storage is Transforming
     [Chart: Data (0 to 5000) by Year (2002 to 2012)]
     • Capacity-Optimized storage growing 63% annually*
     • Traditional Data: numbers, text, databases
  5. Data Storage is Transforming
     [Chart: Data (0 to 5000) by Year (2002 to 2012)]
     • Capacity-Optimized storage growing 63% annually*
     • Traditional Data: numbers, text, databases
     • New Data: images, scans, audio files, videos, hi-res videos
  6. Data Storage is Transforming
     [Chart: Data (0 to 5000) by Year (2002 to 2012)]
     • Capacity-Optimized storage growing 63% annually*
     • Traditional Data: numbers, text, databases
     • New Data: images, scans, audio files, videos, hi-res videos
     • Growing 100X every 10 years
     • Required new methods
  7. Practical Applications for a 10 Exabyte Data Storage System
     • Understand certain IP traffic patterns for tracking fraudulent activity
     • Determine online purchasing patterns for a retailer or merchandiser to help launch a new product or service
     • Identify hot new trends in entertainment, sports, gaming, etc.
     • In this election year, understand the appeal of a political message and more directly target potential voters
  8. RAID Can’t Effectively Scale
     • RAID is not ideal for storing large amounts (PB) of digital content.
     • RAID does not allow configurable reliability to be established.
     • The increasing amount of stored data is raising the risk of data loss and corruption.
     • Spindle size is increasing faster than IO performance, causing longer rebuild times and longer exposure to data loss.
     • Spindle size is now comparable to the amount of data read per Unrecoverable Read Error (URE), so reading a full disk is likely to hit an error, causing silent data corruption.
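The rebuild-time and URE points above can be put into rough numbers. This is a back-of-the-envelope sketch, not a figure from the deck: the URE rate of 1 error per 10^14 bits read and the 100 MB/s sustained rebuild throughput are assumptions chosen for illustration, and errors are treated as independent per bit. The 3 TB drive size is the one used later in the deck.

```python
import math

ASSUMED_URE_PER_BIT = 1e-14   # assumption: a commonly quoted drive spec, not from the deck
ASSUMED_REBUILD_MB_S = 100.0  # assumption: sustained sequential throughput during rebuild


def p_ure_full_read(drive_tb: float, ure_per_bit: float = ASSUMED_URE_PER_BIT) -> float:
    """Probability of at least one URE while reading the whole drive once."""
    bits = drive_tb * 1e12 * 8
    # 1 - (1 - rate)^bits, computed stably for tiny rates
    return -math.expm1(bits * math.log1p(-ure_per_bit))


def rebuild_hours(drive_tb: float, mb_per_s: float = ASSUMED_REBUILD_MB_S) -> float:
    """Best-case time to read or rewrite one full drive at a constant rate."""
    seconds = drive_tb * 1e12 / (mb_per_s * 1e6)
    return seconds / 3600


if __name__ == "__main__":
    print(f"P(URE on full 3 TB read) = {p_ure_full_read(3):.3f}")
    print(f"Best-case 3 TB rebuild   = {rebuild_hours(3):.1f} hours")
```

Under these assumptions a single full read of a 3 TB drive has roughly a one-in-five chance of hitting a URE, and both numbers grow with drive capacity while throughput stays roughly flat.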
  9. How Dispersed Storage Technology Works
     [Diagram: DATA to Cleversafe IDA to slices at SITE 1, SITE 2, SITE 3, SITE 4 to Cleversafe IDA to DATA]
     (1) Data is expanded, virtualized, transformed, sliced and dispersed using Information Dispersal Algorithms.
     (2) Slices are distributed to separate disks, storage nodes and geographic locations.
     (3) Even with individual servers or entire sites down, real-time, bit-perfect data is retrieved from a subset of slices.
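The slice-and-recover idea above can be sketched with a toy, Rabin-style information dispersal scheme. This is not Cleversafe's IDA, whose details the deck does not give; it is a minimal illustration in which every group of `threshold` bytes becomes the coefficients of a polynomial over the prime field GF(257), and slice `i` stores that polynomial's value at `x = i + 1`. The function names, the field choice, and the zero padding are all assumptions made for the example.

```python
P = 257  # smallest prime above 255, so every byte value is a field element


def disperse(data: bytes, width: int, threshold: int) -> list[list[int]]:
    """Split data into `width` slices; any `threshold` of them can rebuild it."""
    padded = list(data) + [0] * (-len(data) % threshold)
    slices: list[list[int]] = [[] for _ in range(width)]
    for start in range(0, len(padded), threshold):
        coeffs = padded[start:start + threshold]
        for i in range(width):
            x, y = i + 1, 0
            for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
                y = (y * x + c) % P
            slices[i].append(y)
    return slices


def _solve(matrix: list[list[int]], rhs: list[int]) -> list[int]:
    """Gauss-Jordan elimination modulo P for a square, invertible system."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)
        aug[col] = [v * inv % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [(a - f * b) % P for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def recover(available: dict[int, list[int]], threshold: int, length: int) -> bytes:
    """Rebuild the original bytes from any `threshold` surviving slices."""
    idxs = sorted(available)[:threshold]
    if len(idxs) < threshold:
        raise ValueError("not enough slices to recover the data")
    vandermonde = [[pow(i + 1, j, P) for j in range(threshold)] for i in idxs]
    out: list[int] = []
    for pos in range(len(available[idxs[0]])):
        out.extend(_solve(vandermonde, [available[i][pos] for i in idxs]))
    return bytes(out[:length])


if __name__ == "__main__":
    message = b"dispersed storage"
    slices = disperse(message, width=8, threshold=5)
    survivors = {i: s for i, s in enumerate(slices) if i not in (0, 3, 6)}  # three slices lost
    print(recover(survivors, threshold=5, length=len(message)))
```

Each slice holds about 1/threshold of the data, so the total stored is width/threshold times the original size, and the data survives the loss of any width minus threshold slices. Slice values can reach 256, which is why the sketch keeps them as integers rather than bytes.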
  10. What Does a Limitless Scale Storage System Look Like?
     • Single instance of data with guaranteed reliability and availability – not RAID and copy based
     • Built-in geographic distribution for high availability and site failure tolerance
     • Data concurrency with multiple simultaneous readers and writers
     • Continuous data availability through upgrade cycles and storage node replacement
     • Flat namespace with highly efficient metadata management and no database or master name node
     • Architecture delivers independent scaling of storage capacity and performance
     • Takes advantage of the largest-capacity, most power-efficient disk drives available in the industry
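To give a feel for how a single dispersed instance can compete with copy-based protection on availability, here is a small binomial sketch. It assumes each slice (or copy) is independently reachable with the same probability, which real systems only approximate, and the per-slice availability of 0.99 is a made-up illustrative value. The width 32 and threshold 22 are the IDA parameters quoted later in the deck; everything else is hypothetical.

```python
from math import comb


def availability(width: int, threshold: int, p_slice: float) -> float:
    """P(at least `threshold` of `width` independent slices are reachable)."""
    return sum(
        comb(width, k) * p_slice**k * (1 - p_slice) ** (width - k)
        for k in range(threshold, width + 1)
    )


if __name__ == "__main__":
    p = 0.99  # assumption: illustrative per-slice availability
    dispersed = availability(32, 22, p)  # any 22 of 32 slices suffice
    three_copies = availability(3, 1, p)  # any 1 of 3 full copies suffices
    print(f"dispersed unavailability: {1 - dispersed:.2e} at {32 / 22:.2f}x storage")
    print(f"3-copy unavailability:    {1 - three_copies:.2e} at 3.00x storage")
```

The point of the comparison is the ratio of protection to storage overhead: under these assumptions the dispersed layout tolerates ten missing slices while storing less than half of what three full copies would.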
  11. 10 Exabyte Data Storage System Configuration
     [Figure: Portable Datacenter (PD)]
     • Data integrity and availability provided without the overhead of replication
     • Deployed across multiple sites for site failure tolerance and high availability
     • High bandwidth network between sites
     • Utilize a portable datacenter (PD) container model for rapid setup and mobility
     • Each PD houses multiple racks for storage and a single rack for network connectivity
     • Flat architecture with no centralized database or management node
     • Hundreds of simultaneous readers/writers with instantaneous access to billions of objects
  12. System Configuration
     • 16 sites across the US
     • High bandwidth WAN
     • IDA W32, T22, 1.45 expansion
     • Massively parallel distributed readers/writers
     • Filter capability with ingest
     • Access embedded in application
     • 35 PDs per site (560 total)
     • 21 Racks / PD (11,760 total)
     • 189 Storage Nodes / PD (105,840 total)
     • 45 3TB drives per storage node (4.7M total)
     • ~15 EB raw, ~10 EB usable
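The totals on this slide follow from simple multiplication, which the sketch below reproduces. It assumes decimal units (1 EB = 10^6 TB) and reads "W32, T22" as an IDA width of 32 slices with a threshold of 22, so that usable capacity is raw capacity divided by the expansion factor 32/22; the constant and function names are invented for the example.

```python
SITES = 16
PDS_PER_SITE = 35
RACKS_PER_PD = 21
NODES_PER_PD = 189
DRIVES_PER_NODE = 45
DRIVE_TB = 3
IDA_WIDTH = 32       # assumption: "W32" means 32 slices written per object
IDA_THRESHOLD = 22   # assumption: "T22" means any 22 slices can rebuild it


def system_totals() -> dict[str, float]:
    """Derive system-wide counts and capacities from the per-unit figures."""
    pds = SITES * PDS_PER_SITE
    nodes = pds * NODES_PER_PD
    drives = nodes * DRIVES_PER_NODE
    raw_eb = drives * DRIVE_TB / 1e6  # decimal units: 1 EB = 1e6 TB
    expansion = IDA_WIDTH / IDA_THRESHOLD
    return {
        "pds": pds,
        "racks": pds * RACKS_PER_PD,
        "nodes": nodes,
        "drives": drives,
        "raw_eb": raw_eb,
        "expansion": expansion,
        "usable_eb": raw_eb / expansion,
    }


if __name__ == "__main__":
    for name, value in system_totals().items():
        print(f"{name:>10}: {value:,.2f}")
```

With these inputs the raw capacity comes to about 14.3 EB and the usable capacity to about 9.8 EB, which the slide rounds to ~15 EB and ~10 EB.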
  13. System Architecture
     [Diagram, top to bottom: Very Big Data Sources; Near Real-time Parallel Data Analyzers (and filters); Multiple Simultaneous Writers (Data & Indexes); Very Large Object Storage Cloud (Metadata); Multiple Simultaneous Readers and Writers (Analysis & Results); Secondary (Parallel) Data Analyzers]
     Very Large Object Storage Cloud:
     • Deployed across multiple sites
     • Using container-based (POD) model
     • Flat architecture, no central database
  14. Use Case: Store and Analyze 6 months of Internet traffic
     [Chart: Total Global Monthly Internet Traffic, Growing 32% Annually; 80 Exabytes per month in Dec. 2015]
     IP Traffic    North America Monthly    Worldwide Monthly
     2012          12 EB                    37 EB
     2015          23 EB                    80 EB
     Source: Cisco VNI, 2010
  15. Use Case: Store and Analyze 6 months of Internet traffic
     North America    Monthly    Rolling 6 Months*
     2012             12 EB      96 EB
     Source: Cisco VNI, 2010
     Very Large Scale Processing Requirements
     • Ingest/Filter: 4.6 TB per sec
     • Analyze/Index: ~0.5 TB per sec (assuming a 10:1 filter of IP traffic)
     Very Large Scale Storage Requirements
     • Store 10 EB, grow to 1,000 EB
     • ~900 GB/sec of data ingest
     • Growing 32% per year
     Potential Solutions:
     • Traditional data storage systems: not capable of this scale
     • Massively parallel, distributed, pioneered by Google, Yahoo, etc. (Cleversafe Focus)
     * Rolling 6 months requires capacity to store 8 months' worth of data in order to safely capture the next month before deleting the oldest month’s worth of data
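The ingest and capacity figures on this slide can be checked with a few lines of arithmetic. The sketch assumes a 30-day month and decimal units (1 EB = 10^6 TB); the 12 EB per month, the 10:1 filter, and the 8 months of retained capacity come from the slide, while the names are invented for the example.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # assumption: a 30-day month
MONTHLY_TRAFFIC_EB = 12             # North America, 2012 (from the slide)
FILTER_RATIO = 10                   # 10:1 filter of IP traffic (from the slide)
MONTHS_OF_CAPACITY = 8              # capacity held for a rolling 6 months (from the slide)


def tb_per_second(eb_per_month: float) -> float:
    """Average sustained rate needed to absorb a monthly volume."""
    return eb_per_month * 1e6 / SECONDS_PER_MONTH


def rolling_capacity_eb(eb_per_month: float, months: int = MONTHS_OF_CAPACITY) -> float:
    """Capacity needed to hold `months` of traffic at a constant monthly volume."""
    return eb_per_month * months


if __name__ == "__main__":
    ingest = tb_per_second(MONTHLY_TRAFFIC_EB)
    print(f"ingest/filter rate: {ingest:.2f} TB/s")
    print(f"after 10:1 filter:  {ingest / FILTER_RATIO:.2f} TB/s")
    print(f"rolling capacity:   {rolling_capacity_eb(MONTHLY_TRAFFIC_EB):.0f} EB")
```

This gives about 4.6 TB per second at ingest, about 0.46 TB per second after filtering (the slide's ~0.5), and 96 EB of capacity for the rolling window.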
  16. Key Takeaways
     • RAID can’t effectively scale to multi-petabytes and beyond
     • A limitless scale data storage system requires:
        – Single instance of data with guaranteed reliability and availability – not RAID and copy based
        – Built-in geographic distribution for high availability and site failure tolerance
        – Data concurrency with multiple simultaneous readers and writers
        – Flat namespace with highly efficient metadata management and no database or master name node
  18. Sponsored Workshop
