Data Warehousing Infrastructure on Cloud

TDWI India Chapter, 2011 Feb 05, hosted at Intel. Presentation by Praveen Hebbagodi, Director, Akamai Technologies

Published in: Technology, Business

Data Warehousing Infrastructure on Cloud

  1. Data warehousing infrastructure on Cloud
     Praveen Hebbagodi
     Director of Engineering, Akamai Technologies
     Akamai Confidential
  2. Agenda
     • Introduction to Akamai
     • Akamai BI Solutions
     • Data warehousing platform
       • Features
       • Architecture
       • Operations
     • Conclusions
     Akamai Confidential. Powering a Better Internet. ©2011 Akamai
  3. The Akamai Network: a large-scale on-demand distributed computing platform
     - 88,000+ Servers
     - 1,100+ Networks
     - 1,600+ Locations
     - 650+ Cities
     - 71 Countries
     Accelerating daily traffic of:
     • 3+ Tbps
     • 11+ million hits per second
     • 10+ million concurrent streams
     • 800+ billion deliveries/day
     • 30+ petabytes/day
     Connecting:
     • 465 million unique IP addresses
     • From 234 countries
     Deflecting attack traffic:
     • From 198 countries
     • Targeting 10,000 unique ports
  4. Major services provided by Akamai
     Content delivery:
     • HTTP/S (15-30% of total HTTP traffic!)
     • Live and On-Demand Streaming
     Application delivery:
     • Web Application Acceleration
     • Dynamic Site Acceleration
     • EdgeComputing
     • IP Application Acceleration
     Example applications:
     • Online commerce, media delivery, B2B/B2C applications, software downloads, social networking sites, …
     • You likely use many of our services each day
  5. Akamai BI Solutions
     Help our customers get deeper insights into their audience and content usage in the context of their business
     Example solutions:
     Media Analytics
     • A comprehensive solution for content and audience intelligence for broadband media
     • Features
       • Dashboards for Engagement Overview, Ad Optimizations and Content Usage
       • Standard reports with detailed engagement and audience information
       • Custom dimensions and reports to suit business-specific needs
  6. Akamai BI Solutions: Examples
     QOS Monitor
     • Real-time quality of service monitoring solution for online media delivery
     • Features
       • Set thresholds for breaches, find root causes and resolve issues using “Notifications”
       • Live real-time monitoring console with data aggregation as fine as 30 sec.
       • 20 standard reports & dashboards for historical diagnostics & debugging
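Slide 6 mentions thresholds for breaches and aggregation as fine as 30 seconds. The sketch below shows one way that combination could work: samples are averaged into 30-second buckets and buckets above a threshold are flagged. The function names, the metric (a rebuffering percentage) and the sample values are all hypothetical; the slides do not describe how QOS Monitor is implemented.

```python
from collections import defaultdict

BUCKET_SECONDS = 30  # finest aggregation interval mentioned on the slide


def aggregate(samples, bucket_seconds=BUCKET_SECONDS):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [total, count]
    for ts, value in samples:
        bucket = int(ts // bucket_seconds) * bucket_seconds
        sums[bucket][0] += value
        sums[bucket][1] += 1
    return {b: total / n for b, (total, n) in sorted(sums.items())}


def breaches(averages, threshold):
    """Return the start times of buckets whose average exceeds the threshold."""
    return [b for b, avg in averages.items() if avg > threshold]


# Hypothetical rebuffering percentages, keyed by seconds since start.
samples = [(0, 1.0), (10, 2.0), (29, 3.0), (30, 8.0), (45, 10.0), (61, 1.0)]
averages = aggregate(samples)
alerts = breaches(averages, threshold=5.0)
```

A notification step would then act on `alerts`; here only the 30-second bucket starting at t=30 crosses the assumed threshold.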
  7. Features
     • Intuitive dashboard & report builder UI with advanced visualizations
     • Over 50 standard dimensions & metrics
     • Support for any customer-specific dimensions
       • Regular expression extraction
       • Plug-in API
     • Ad-hoc query, drill-down
     • Lookup tables
     • Dashboards & reports can be provisioned on the fly
     • Real-time notifications
     • Data access via web interface, SOAP API, email & download (CSV, PDF, HTML)
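Slide 7 lists regular expression extraction as a way to define customer-specific dimensions. A minimal sketch of the idea, assuming a customer wants `show` and `episode` dimensions pulled out of a stream URL; the URL layout, the rule and the function name are illustrative and not taken from the presentation.

```python
import re

# Hypothetical extraction rule: named groups become dimension names.
RULE = re.compile(r"/shows/(?P<show>[^/]+)/ep(?P<episode>\d+)\.")


def extract_dimensions(url, rule=RULE):
    """Return the custom dimensions found in a URL, or an empty dict."""
    match = rule.search(url)
    return match.groupdict() if match else {}


dims = extract_dimensions("http://media.example.com/shows/cooking/ep12.flv")
```

Using named groups keeps the rule self-describing: adding a dimension means adding a group, with no change to the extraction code.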
  8. Data warehousing Platform
     [Diagram: Reporting, Data Storage, Data Processing, Data Collection, Data Sources; stages labeled "xml"]
     Analytics workflow programmable via portal
     • Data sources, filters, metrics, dimensions, reports, dashboards configured via xml metadata
     Distributed data collection in the Cloud
     • Data sources: end user machines (beacons), edge server logs, agents
     • Filtering and partial aggregation at the source and in collection layers
     • Facilitates scalability and better utilization of resources
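Slide 8 describes filtering and partial aggregation at the source and in the collection layers. The sketch below illustrates why that helps: each source ships a small table of counts instead of raw events, and a collection node only has to merge those tables. The event fields (`country`, `status`) and function names are assumptions made for the example.

```python
from collections import Counter


def aggregate_at_source(events, keep):
    """Filter raw events and count hits per country where they were observed."""
    return Counter(e["country"] for e in events if keep(e))


def merge(partials):
    """Combine partial counts from several sources in a collection-layer node."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


def is_success(event):
    """Hypothetical filter: keep only successful responses."""
    return event["status"] == 200


# Hypothetical log records from two edge servers.
edge_a = [
    {"country": "IN", "status": 200},
    {"country": "US", "status": 200},
    {"country": "IN", "status": 404},
]
edge_b = [{"country": "IN", "status": 200}]

totals = merge([
    aggregate_at_source(edge_a, is_success),
    aggregate_at_source(edge_b, is_success),
])
```

Because addition is associative, the same `merge` step can be repeated at every collection layer without changing the final totals.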
  9. Data warehousing Platform
     [Diagram: Reporting, Data Storage, Data Processing, Data Collection, Data Sources; stages labeled "xml"]
     Data Processing is a flexible map-reduce framework
     • Dataflow graph of map-reduce operations
     • Enhancements for better latencies, scheduling optimizations
     • Faster message passing interfaces (network, in-memory)
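Slide 9 describes data processing as a dataflow graph of map-reduce operations. As a rough, single-process illustration (not Akamai's framework), the sketch below chains two map-reduce stages so that the output of the first feeds the second. The record fields and stage functions are hypothetical.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one map-reduce stage in memory: map, group by key, then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values) for key, values in sorted(groups.items())]


# Stage 1: count plays per (title, viewer) pair.
def plays_map(rec):
    yield (rec["title"], rec["viewer"]), 1


def plays_reduce(key, values):
    return {"title": key[0], "viewer": key[1], "plays": sum(values)}


# Stage 2: per title, count unique viewers and total plays.
def title_map(rec):
    yield rec["title"], rec["plays"]


def title_reduce(key, values):
    return (key, len(values), sum(values))


log = [
    {"title": "a", "viewer": "v1"},
    {"title": "a", "viewer": "v1"},
    {"title": "a", "viewer": "v2"},
    {"title": "b", "viewer": "v1"},
]
stage1 = map_reduce(log, plays_map, plays_reduce)
result = map_reduce(stage1, title_map, title_reduce)
```

Each tuple in `result` is (title, unique viewers, total plays). In a distributed setting the grouping step would be a shuffle across machines; the slide's "faster message passing" point concerns exactly that hand-off.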
  10. Data-warehousing Platform
      [Diagram: Processing, Data Storage, Reporting, labeled "xml"; a data cube with axes including Artist Name and Time]
      Data abstraction is a set of data cubes
      • Supports fast slice-and-dice, drill-down operations, …
      Data cubes are physically realized in distributed columnar DB
      • SQL interface, column compression, bitmap indexes
      • In-situ updates, write-optimized store
      • Sharding and cluster management
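Slide 10 mentions bitmap indexes over columnar storage as the basis for fast slice-and-dice. A toy sketch of the technique, using Python integers as bitmaps: each distinct column value gets a bitmap of matching rows, and a slice on two dimensions becomes a bitwise AND. The column names echo the cube axes in the slide's diagram; the data and class are invented for illustration.

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap per distinct column value; bit i is set if row i has that value."""

    def __init__(self, column):
        self.bitmaps = defaultdict(int)
        for row, value in enumerate(column):
            self.bitmaps[value] |= 1 << row

    def rows_equal(self, value):
        """Return the bitmap of rows matching value (0 if the value is absent)."""
        return self.bitmaps.get(value, 0)


def select(bitmap, measure):
    """Pick the measure values for rows whose bit is set in the bitmap."""
    return [m for row, m in enumerate(measure) if (bitmap >> row) & 1]


# Hypothetical columnar table: two dimension columns and one measure column.
artist = ["A", "B", "A", "C", "A"]
day = ["mon", "mon", "tue", "tue", "tue"]
plays = [10, 20, 30, 40, 50]

artist_idx = BitmapIndex(artist)
day_idx = BitmapIndex(day)

# Slice: artist == "A" AND day == "tue" is a single bitwise AND.
hits = artist_idx.rows_equal("A") & day_idx.rows_equal("tue")
total = sum(select(hits, plays))
```

Only the measure column is touched after the bitmaps are combined, which is the property that makes the columnar layout and the index work well together.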
  11. Approach to Operations
      • Treat failures as normal
      • Build in layers of redundancy
        • At all levels: geo/network, within a cluster
        • Multi-path communications
      • Weaker data consistency models
      • Zoning
        • Dynamic configuration
        • Software installs
      • Design systems that run themselves
        • Autonomic response where appropriate
  12. Conclusions
      • Being on the cloud facilitates building a highly scalable platform for “big data” applications
      • Design for failures
        • Build redundant systems at all levels
        • Multiple levels of fault-tolerance
      • Automation, autonomics, more automation…
        • Avoid “manual changes”
        • They will happen, so have a good process to minimize and track them
      • Deterministic software and config management system
        • Converges to a consistent state & has built-in safe roll-back
      • Good tools for understanding system behavior and data quality
      • Sophisticated tools for capacity management and performance monitoring
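Slide 12 calls for a deterministic config management system that converges to a consistent state and has built-in safe roll-back. The sketch below shows the general pattern under simple assumptions: state is a dict of component versions, a key missing from the desired state means "remove it", and a caller-supplied health check decides whether to keep the result. All names are hypothetical; the presentation does not describe the actual system.

```python
def apply_change(state, key, value):
    """Set a key to its desired value, or remove it when no value is desired."""
    if value is None:
        state.pop(key, None)
    else:
        state[key] = value


def converge(current, desired, apply_change, healthy):
    """Move current toward desired; restore the snapshot if the health check fails."""
    snapshot = dict(current)
    for key in set(current) | set(desired):
        if current.get(key) != desired.get(key):
            apply_change(current, key, desired.get(key))
    if not healthy(current):
        current.clear()
        current.update(snapshot)  # safe roll-back to the pre-change state
        return False
    return True


# Hypothetical node state and target configuration.
node = {"collector": "1.0", "legacy": "0.9"}
target = {"collector": "1.1"}
converged = converge(node, target, apply_change, healthy=lambda state: True)
```

Running `converge` again with the same target changes nothing, which is the deterministic, idempotent behavior the slide asks for.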
  13. To find out more about Akamai…
      More info: www.akamai.com
      Contacting me: Praveen Hebbagodi, phebbagodi@akamai.com
      Technical publications: http://www.akamai.com/html/perspectives/techpubs.html
      Jobs: http://www.akamai.com/html/careers/index.html, http://twitter.com/akamaijobsindia
      Questions?
  14. Thank you
