
Hadoop in the Cloud: Real World Lessons from Enterprise Customers

Published in: Technology

  1. Analytics: Azure HDInsight, R Server. Storage: Local (HDFS) or Cloud (Azure Blob/Azure Data Lake Store).
  2. Azure HDInsight: Hadoop Meets the Cloud
     • Microsoft’s managed Hadoop as a Service
     • 100% open source Apache Hadoop
     • Built on the latest releases across Hadoop (2.7)
     • Up and running in minutes with no hardware to deploy
     • Run on Windows or Linux
     • Supported by Microsoft
  3. Rockwell Automation is partnered with one of the six oil and gas super majors to build unmanned internet-connected gas dispensers. Each dispenser emits real-time management metrics allowing them to detect anomalies and predict when proactive maintenance needs to occur.
     • Store sensor data every 5 minutes
       • Temperature, pressure, vibration, etc.
       • Tens of thousands of data points / second
     Diagram labels: Azure Blobs, Azure HDInsight, Hive, Pig, Azure SQL DB, Power BI for O365, Mobile Notification Hub, Mobile Device, Real-time notification
  4. JustGiving wanted to harness the power of their data by using network science to map people’s connections and relationships so that they could connect people with the causes they care about. Based on 15 years of data, the JustGiving GiveGraph is the world’s largest ecosystem of giving behavior. It contains more than 81 million person nodes, thousands of causes and 285 million connections and is the engine that drives JustGiving’s social platform, enabling levels of personalization and engagement that a traditional infrastructure would be unable to deliver.
     Diagram labels: SQL Server, On-premises, Agent, Azure Blobs, Azure HDInsight, Give Graph, Azure Tables, Web API, Website + Event store, Service Bus, Serves results, Azure Cache, Activity Feeds
  6. *A pending IDC study found that, on a per-TB basis, Microsoft customers using cloud-based Hadoop in Data Lake have a 63% lower TCO than on-premises deployments.
  7. Always on cluster
     • Storage choice: Local HDFS, Azure Blob, Azure Data Lake Store
     • Job scheduling: Oozie
     • Data persistence after cluster deletion: N/A
     • Metadata persistence after cluster deletion: N/A
     Cluster as a service
     • Storage choice: Azure Blob, Azure Data Lake Store
     • Job scheduling: Azure Data Factory
     • Data persistence after cluster deletion: Azure Blob, Azure Data Lake Store
     • Metadata persistence after cluster deletion: Azure SQL
  8. Partition 1: 2014-10.part0, 2014-10.part1, 2014-10.part2, 2014-10.part3
     Partition 2: 2014-11.part0, 2014-11.part1, 2014-11.part2, 2014-11.part3
     Partition 3: 2014-12.part0, 2014-12.part1, 2014-12.part2, 2014-12.part3
  10. Partition 1: 2014-10.part0, 2014-11.part0, 2014-12.part0
      Partition 2: 2014-10.part1, 2014-11.part1, 2014-12.part1
      Partition 3: 2014-10.part2, 2014-11.part2, 2014-12.part2
      Partition 4: 2014-10.part3, 2014-11.part3, 2014-12.part3
  11. • No limits on file sizes
      • Analytics scale on demand
      • No code rewrites as you increase size of data stored
      • Optimized for massive throughput
      • Optimized for IoT with high volume of small writes
      Diagram labels: PB, TB, GB, PB, TB
  12. Goals
  14. Azure Security: Encryption At Rest
      Azure Blob Storage (In Preview)
      • Encryption at rest using Microsoft-managed keys
      • Customers can use Azure Storage configuration to manage encryption. No HDInsight changes required.
  15. RBAC: Securing HDInsight with BlueTalon (ISV)
  16. • Deep integration to Visual Studio
      • Easy for novices to write simple queries
      • Robust environment for experts to also be productive
      • Integrated with Pig, Hive, and Storm
      • Playback that visualizes performance to identify bottlenecks and areas for optimization
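The partition slides above show the same twelve files (`2014-10.part0` through `2014-12.part3`) arranged two ways: three partitions, then four. The sketch below is one reading of those diagrams, assuming the first layout groups files by month and the second groups them by part index. The function names (`by_month`, `by_part`, `layout`, `partitions_touched`) are hypothetical and only illustrate the mapping; they are not part of any Azure or Hadoop API.

```python
from collections import defaultdict

# File names as they appear on the slides.
MONTHS = ["2014-10", "2014-11", "2014-12"]
PARTS_PER_MONTH = 4


def file_names():
    """Return the twelve file names, month by month."""
    return [f"{m}.part{p}" for m in MONTHS for p in range(PARTS_PER_MONTH)]


def by_month(name):
    """Assumed first layout: all files of one month share a partition."""
    month = name.split(".part")[0]
    return MONTHS.index(month) + 1


def by_part(name):
    """Assumed second layout: part N of every month goes to partition N + 1."""
    return int(name.split(".part")[1]) + 1


def layout(assign):
    """Group the file names by the partition number that `assign` returns."""
    partitions = defaultdict(list)
    for name in file_names():
        partitions[assign(name)].append(name)
    return dict(partitions)


def partitions_touched(layout_map, month):
    """List the partitions that hold at least one file for `month`."""
    return sorted(
        p for p, files in layout_map.items()
        if any(f.startswith(month) for f in files)
    )


if __name__ == "__main__":
    for label, assign in (("by month", by_month), ("by part", by_part)):
        result = layout(assign)
        print(label, "-> 2014-10 lives in partitions",
              partitions_touched(result, "2014-10"))
```

Under these assumptions, reading one month of data touches a single partition in the first layout and all four partitions in the second.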
