Hadoop in the cloud –
The what, why and how from the experts
Nishant Thacker
Technical Product Manager – Big Data
Microsoft
@nishantthacker
Hadoop in the Cloud
Traditional Hadoop Clusters
Challenges with implementing Hadoop
Hadoop Clusters in the Cloud
Why Hadoop in the cloud?
Hadoop Clusters

Distributed Storage
• Files split across storage
• Files replicated
• Nearest node responds
• Abstracted administration

Extensible
• APIs to extend functionality
• Add new capabilities
• Allow for inclusion in custom environments

Automated Failover
• Unmonitored failover to replicated data
• Built for resiliency
• Metadata stored for later retrieval

Hyper-Scale
• Add resources as desired
• Built to include commodity configs
• Direct correlation of performance and resources

Distributed Compute
• Distributed processing
• Resource utilization
• Cost-efficient method calls
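The "files split across storage" and "files replicated" points above can be sketched as a toy block-placement routine. The block size and replication factor mirror common HDFS defaults; the function names are illustrative, not actual Hadoop APIs:

```python
# Toy sketch of HDFS-style block placement: a file is split into
# fixed-size blocks, and each block is replicated onto distinct nodes.
# Defaults mirror common HDFS settings (128 MB blocks, 3 replicas).
from itertools import islice, cycle

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, a common HDFS default
REPLICATION = 3

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return (block_index, length) pairs covering the file."""
    blocks = []
    offset = 0
    while offset < file_size:
        length = min(block_size, file_size - offset)
        blocks.append((len(blocks), length))
        offset += length
    return blocks

def place_replicas(blocks, nodes, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin."""
    placement = {}
    node_cycle = cycle(nodes)
    for block_id, _ in blocks:
        placement[block_id] = list(islice(node_cycle, replication))
    return placement

blocks = split_into_blocks(300 * 1024 * 1024)  # a 300 MB file
print(len(blocks))                             # 3 blocks: 128 + 128 + 44 MB
print(place_replicas(blocks, ["n1", "n2", "n3", "n4"]))
```

With replicas spread across nodes like this, the "nearest node responds" and "automated failover" behaviors follow: any holder of a replica can serve a read, and losing one node never loses a block.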
Cloud

The cloud exhibits the same five characteristics: distributed storage, extensibility, automated failover, hyper-scale, and distributed compute.
Hadoop in the Cloud

Hadoop clusters and the cloud share all five characteristics, which is what makes the cloud a natural home for Hadoop.
Hadoop in the Cloud - Options
Scenarios for deploying Hadoop as a hybrid solution
Traditional Hadoop Clusters – On Prem

[Diagram: a client submits a job (JAR file) to the master node; the master distributes tasks to worker nodes, each running a Task Tracker over local HDFS storage, with data replicated across workers.]
Hadoop Clusters in the Cloud
HDInsight Cluster Architecture

[Diagram: HTTPS traffic (ODBC/JDBC, WebHCatalog, Oozie, Ambari) enters through a secure gateway that provides authentication (AuthN) and HTTP proxying; behind it, highly available head nodes and worker nodes run inside an Azure VNet, with storage in WASB.]
Decoupling Compute from Storage

Moving storage out of the cluster and across the network raises three questions: what about latency, consistency, and bandwidth?
Decoupling Compute from Storage

• HDD-like latency
• 50 Tb+ aggregate bandwidth [1]
• Strong consistency

[1] Azure Flat Network Architecture
Decoupling - Benefits
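One concrete benefit, per the speaker notes: with compute decoupled from storage, the cluster is billed only while it runs, while the data persists (and is billed) independently in blob storage. A minimal sketch of that trade-off, using entirely hypothetical prices:

```python
# Sketch of the cost benefit of decoupling compute from storage:
# compute is billed per node-hour only while the cluster exists,
# storage is billed per TB-month regardless. Prices are hypothetical.
def monthly_cost(cluster_hours, node_count, data_tb,
                 node_hour_price=0.50, storage_tb_price=20.0):
    compute = cluster_hours * node_count * node_hour_price
    storage = data_tb * storage_tb_price
    return compute + storage

# Always-on 8-node cluster vs. the same cluster run 4 hours a day:
always_on = monthly_cost(cluster_hours=730, node_count=8, data_tb=10)
on_demand = monthly_cost(cluster_hours=4 * 30, node_count=8, data_tb=10)
print(always_on, on_demand)  # 3120.0 680.0
```

Because the data outlives the cluster, you can "spin up a Hadoop cluster, analyze your data, then shut it down to stop the meter" without losing anything.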
Blob storage concepts
Account → Container → Blob → Pages/Blocks
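HDInsight reaches this hierarchy through the WASB scheme, where the account and container from the hierarchy above become parts of the URI. A small sketch (the helper function is illustrative; the URI shape follows the documented `wasb[s]://<container>@<account>.blob.core.windows.net/<path>` convention):

```python
# Sketch: how HDInsight addresses blob storage via the WASB scheme.
# The hierarchy Account -> Container -> Blob maps onto a URI of the form
#   wasbs://<container>@<account>.blob.core.windows.net/<path>
def wasb_uri(account, container, path, secure=True):
    scheme = "wasbs" if secure else "wasb"
    return (f"{scheme}://{container}@{account}"
            f".blob.core.windows.net/{path.lstrip('/')}")

print(wasb_uri("mystore", "mycluster", "/example/data/sample.log"))
# -> wasbs://mycluster@mystore.blob.core.windows.net/example/data/sample.log
```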
The Azure Data Lake Approach

• Ingest all data, regardless of requirements
• Store all data in native format, without schema definition
• Do analysis using analytic engines like Hadoop

Serving scenarios: interactive queries, batch queries, machine learning, data warehouse, real-time analytics, devices.
HDInsight cluster provisioning states

Ready for deployment → Accepted → Cluster storage provisioned → Azure VM configuration → Configuring HDInsight → Customize cluster? (Yes: cluster customization, custom script running) → Cluster operational → Running. Failures surface as Timed Out or Error.

Cluster customization options

• Ad hoc: RDP to the cluster and update config files (non-durable)
• Hive/Oozie metastore
• Storage accounts & VNets
• ScriptAction (via Azure portal or via scripting / SDK): config values, JAR file placement in the cluster
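The provisioning flow above can be sketched as a simple state list. The state names paraphrase the slide, not an official SDK enumeration, and the "skip customization" branch models the Customize cluster? decision:

```python
# Sketch of the HDInsight provisioning flow as a linear state machine;
# "Cluster customization" is entered only when a custom script is attached.
PROVISIONING_FLOW = [
    "Accepted",
    "Cluster storage provisioned",
    "Azure VM configuration",
    "Configuring HDInsight",
    "Cluster customization",   # only when a ScriptAction is attached
    "Cluster operational",
    "Running",
]
TERMINAL_FAILURES = {"Timed Out", "Error"}

def next_state(state, customize=False):
    """Return the state that follows `state`, skipping the
    customization step when no custom script is configured."""
    i = PROVISIONING_FLOW.index(state)
    nxt = PROVISIONING_FLOW[i + 1]
    if nxt == "Cluster customization" and not customize:
        nxt = PROVISIONING_FLOW[i + 2]
    return nxt

print(next_state("Configuring HDInsight"))                 # Cluster operational
print(next_state("Configuring HDInsight", customize=True)) # Cluster customization
```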
Cluster integration options
Each cluster surfaces a REST endpoint for integration, secured via basic authN over SSL:

• /thrift – ODBC & JDBC
• /templeton – job submission, metadata management
• /ambari – cluster health, monitoring
• /oozie – job orchestration, scheduling
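A small sketch of addressing that REST surface. It assumes the usual HDInsight host pattern `https://<cluster>.azurehdinsight.net`; the helper and the purpose keys are illustrative names, not an official API:

```python
# Sketch: building URLs for a cluster's REST integration endpoints.
# Paths follow the slide above; all are secured with basic auth over SSL.
ENDPOINTS = {
    "odbc_jdbc": "/thrift",   # ODBC & JDBC
    "jobs": "/templeton",     # job submission, metadata management
    "health": "/ambari",      # cluster health, monitoring
    "workflow": "/oozie",     # job orchestration, scheduling
}

def endpoint_url(cluster_name, purpose):
    """Return the HTTPS URL for one of the cluster's REST endpoints."""
    return f"https://{cluster_name}.azurehdinsight.net{ENDPOINTS[purpose]}"

print(endpoint_url("mycluster", "jobs"))
# -> https://mycluster.azurehdinsight.net/templeton
```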
Cloud Deployments for Big Data
Microsoft Big Data Solution

[Architecture diagram, arranged as a pipeline:]

• Data generation – big data sources (raw, unstructured): log files, Azure Websites, devices; plus datamarts and other transactional systems
• Event processing (hot path for data) – Azure Event Hubs, Storm on Azure HDInsight, Azure Stream Analytics
• Storage (cold path for data) – Azure Blob Storage, Azure DocumentDB, Azure SQL DB, HBase on Azure HDInsight, data warehouse
• Data processing – Azure HDInsight (Hadoop) with Hive, Pig, Mahout, Oozie, Sqoop; Azure Machine Learning; SQL Server Analysis Services
• Data usage – SSRS, SharePoint BI, Excel BI, Power BI, Azure Marketplace
• Data integration (spanning the pipeline) – Azure Data Factory
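The hot/cold split in the architecture can be sketched as a routing rule: every event lands in the cold path (durable blob storage) for later batch analysis, while recent events are also pushed down the hot path for real-time analytics. The function, event shape, and threshold are illustrative:

```python
# Sketch of hot/cold path routing: everything goes to cold storage;
# fresh events are additionally streamed to the hot path.
# The 60-second freshness threshold is a made-up illustration.
def route_event(event, hot_path, cold_path, max_age_seconds=60):
    """Append every event to the cold path; also to the hot path if recent."""
    cold_path.append(event)                      # cold path: durable store
    if event["age_seconds"] <= max_age_seconds:  # hot path: real-time analytics
        hot_path.append(event)

hot, cold = [], []
for e in [{"id": 1, "age_seconds": 5}, {"id": 2, "age_seconds": 600}]:
    route_event(e, hot, cold)
print(len(hot), len(cold))  # 1 2
```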
Summary
• For more information visit: http://azure.com/hdinsight
• Questions, feedback or follow-up: nishant.thacker@microsoft.com
Editor's Notes

  • #6 Challenges: hardware acquisition (CapEx up front); scale constrained by on-premises procurement (resource and capacity planning); need for skilled Hadoop expertise; tuning and maintenance.
  • #8 Why Hadoop in the cloud? You can deploy Hadoop in a traditional on-site datacenter. Some companies, including Microsoft, also offer Hadoop as a cloud-based service. One obvious question is: why use Hadoop in the cloud? Here's why a growing number of organizations are choosing this option. The cloud saves time and money: open source doesn't mean free. Deploying Hadoop on-premises still requires servers and skilled Hadoop experts to set up, tune, and maintain them. A cloud service lets you spin up a Hadoop cluster in minutes without up-front costs. See how Virginia Tech is using Microsoft's cloud instead of spending millions of dollars to establish its own supercomputing center. The cloud is flexible and scales fast: in the Microsoft Azure cloud, you pay only for the compute and storage you use, when you use it. Spin up a Hadoop cluster, analyze your data, then shut it down to stop the meter. "We quickly spun up the Azure HDInsight cluster and processed six years' worth of data in just a few hours, and then we shut it down… processing the data in the cloud made it very affordable." – Paul Henderson, National Health Service (U.K.) The cloud makes you nimble: create a Hadoop cluster in minutes and add nodes on demand. The cloud offers organizations immediate time to value. "It was simply so much faster to do this in the cloud with Windows Azure. We were able to implement the solution and start working with data in less than a week." – Morten Meldgaard, Chr. Hansen
  • #9 This topic explores how you can get data into your Big Data solution. It describes several different but typical data ingestion techniques that are generally applicable to any Big Data solution. These techniques include ways to handle streaming data and for automating the ingestion process. While the focus is primarily on Microsoft Azure HDInsight, many of the techniques described here are equally relevant to solutions built on other Big Data frameworks and platforms. The figure shows an overview of the techniques and technologies covered in this section of the guide.
  • #21 Given that Azure HDInsight implements Hadoop MapReduce on top of Azure Blobs, the concept of blob storage is important. The Blob service provides storage for entities such as binary files and text files. The REST API for the Blob service exposes two resources: containers and blobs. A container is a set of blobs; every blob must belong to a container. The Blob service defines two types of blobs: block blobs, which are optimized for streaming, and page blobs, which are optimized for random read/write operations and provide the ability to write to a range of bytes in a blob. Blobs can be read by calling the Get Blob operation; a client may read the entire blob or an arbitrary range of bytes. Block blobs less than or equal to 64 MB in size can be uploaded by calling the Put Blob operation; block blobs larger than 64 MB must be uploaded as a set of blocks, each of which must be less than or equal to 4 MB in size. Page blobs are created and initialized with a maximum size with a call to Put Blob; to write content to a page blob, you call the Put Page operation. The maximum size currently supported for a page blob is 1 TB. Codeplex tools like the Azure Storage Explorer make managing blobs easy, and there is also a rich API built to manage storage with PowerShell via the REST-based API. Note: HDInsight currently only supports block blobs.
Key points: the Blob service defines two types of blobs, block blobs and page blobs; it is accessible via REST APIs, the Azure Storage Client library, or Azure drives; it stores large amounts of unstructured text or binary data with fast read performance; and it is a highly scalable, durable, and available file system. References: Get Blob: http://msdn.microsoft.com/en-us/library/dd179440.aspx; Put Blob: http://msdn.microsoft.com/en-us/library/dd179451.aspx; Put Page: http://msdn.microsoft.com/en-us/library/ee691975.aspx; Data Management and Business Analytics: http://azure.microsoft.com/en-us/documentation/articles/fundamentals-data-management-business-analytics/#blob
  • #22 The data lake, on the other hand, leverages a bottom-up approach. A data lake is an enterprise-wide repository of every type of data collected in a single place. Data of all types can be stored in the data lake prior to any formal definition of requirements or schema, for the purposes of operational and exploratory analytics. Advanced analytics can be done using Hadoop or machine learning tools, or the lake can act as a lower-cost data preparation location prior to moving curated data into a data warehouse. In these cases, customers load data into the data lake before defining any transformation logic. This is bottom-up because data is collected first, and the data itself gives you the insight and helps derive conclusions or predictive models.