Beyond Mission Critical: Virtualizing Big-Data,
Hadoop, HPC, Cloud-scale Apps
Chris Greer, FedEx
Richard McDougall, VMware
VAPP5402
#VAPP5402
© 2013 VMware Inc. All rights reserved
Beyond Mission Critical:
Virtualizing Big-Data, Hadoop and Cloud Apps
Richard McDougall
CTO, Storage and Application Services
Chris Greer,
Enterprise Architect, FedEx
3
Virtualize Everything: Next Generation Apps
Traditional Applications
• Traditional enterprise storage
• HW-based resiliency, QoS
Next Gen Cloud Apps
• Scale out, flash, DAS
• Application specific storage
[Diagram labels: vSphere; Virtual Storage Arrays; SAN/NAS; Object / BLOB; All SSD Array; Server-side Flash]
4
The complexity enterprise IT and developers face today
• An idea for a cool app
• Spec a server config
• Justify server costs
• Procurement process
• Wait for HW to arrive
• Wait for IT ops to image the server
• Install a database
• LOB architecture approval
• Central IT architectural approval
• Justify more servers for scale testing
• Wait for more HW
• Configure ACLs and LBs
New infrastructures
New languages and frameworks
New devices and domains
New data types and requirements
5
Cloud Foundry – Announced Today on vSphere
[Diagram labels: Data Services; Msg Services; Other Services; .js; Micro Clouds; Public Clouds; Private Clouds]
6
Big Data - Not Just for the Web Giants – Now the Intelligent Enterprise
7
Real-time analysis allows instant understanding of market dynamics.
Retailers can have an intimate understanding of their customers' needs and use direct targeted marketing.
Market Segment Analysis → Personalized Customer Targeting
8
The Emerging Pattern of Big Data Systems: Retail Example
• Real-Time Streams
• Exa-scale Data Store
• Parallel Data Processing
• Real-Time Processing
• Machine Learning
• Data Science
Cloud Infrastructure
9
Storage: Plan for Peta-scale Data Storage and Processing
[Chart: PB of data (log scale, 0.01 to 1000) by year (2000 to 2015), with series for Online Apps and Analytics]
Analytics Rapidly Outgrows Traditional Data Size by 100x
10
Unprecedented Scale
“Data transparency, amplified by Social Networks generates data at a scale never seen before”
- The Human Face of Big Data
We are creating an Exabyte of data every minute in 2013
Yottabyte by 2030
11
A single GE jet engine produces 10 Terabytes of data in one hour – 90 Petabytes per year.
This enables early detection of faults and common-mode failures, and provides product engineering feedback.
Post Mortem → Proactively Maintained Connected Product
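The yearly figure on this slide can be sanity-checked with a short calculation. The sketch below is only an illustration: it assumes the engine runs continuously for a 365-day year and that decimal units apply (1 PB = 1000 TB), neither of which the slide states. The function name is hypothetical.

```python
# Rate quoted on the slide; the remaining constants are assumptions for illustration.
TB_PER_HOUR = 10            # data produced by one engine per hour (from the slide)
HOURS_PER_YEAR = 24 * 365   # assumes continuous operation over a non-leap year
TB_PER_PB = 1000            # assumes decimal units


def yearly_petabytes(tb_per_hour: float, hours: int = HOURS_PER_YEAR) -> float:
    """Convert an hourly data rate in TB into a yearly volume in PB."""
    return tb_per_hour * hours / TB_PER_PB


print(yearly_petabytes(TB_PER_HOUR))
```

Under these assumptions the result is 87.6 PB, which is consistent with the rounded figure of 90 Petabytes per year.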
12
Cloud Infrastructure Supports Mixed Big Data Workloads
Workloads: Machine Learning, Hadoop, Real-Time Analytics
Cloud Infrastructure: Management, Network/Security, Storage/Availability, Compute
13
Cloud Infrastructure Supports Multiple Tenants
Tenants: Web User Analytics, Financial Analysis, Historical Customer Behavior
Cloud Infrastructure: Management, Network/Security, Storage/Availability, Compute
14
Software-defined Datacenter: Compute
The Core Values of Virtualization Apply to Big Data
1. Agility / Rapid deployment
2. Lower Capex
3. Isolation for resource control and security
4. Operational efficiency
Management
Network/Security
Storage/Availability
Compute
15
Strong Isolation between Workloads is Key
• Hungry Workload 1
• Reckless Workload 2
• Nosy Workload 3
Cloud Infrastructure
16
Virtualizing Hadoop
Elasticity & Multi-tenancy
• Shrink and expand cluster on demand
• Independent scaling of compute and data
• Strong multi-tenancy
High Availability
• High availability for entire Hadoop stack
• One click to set up
• Battle-tested
Operational Simplicity
• Rapid deployment
• One-stop command center
• Easy to configure/reconfigure
17
Big Data Extensions: Core Components
Serengeti
• Core is open source
• Tool to simplify virtualized Hadoop deployment & operations
Hadoop Virtualization Extensions (HVE)
• Virtualization changes for core Hadoop
• Contributed back to Apache Hadoop
Virtual Hadoop Manager (VHM)
• Advanced resource management on vSphere
18
Big Data Family of Frameworks
Compute layer
• Hadoop: batch analysis
• HBase: real-time queries
• NoSQL: Cassandra, Mongo, etc.
• Big SQL: Impala, Pivotal HawQ
• Other: Spark, Shark, Solr, Platfora, etc.
File System/Data Store
Virtualization
Hosts
19
Traditional Hadoop vs. Elastic Hadoop
Traditional Hadoop: Converged Compute/Storage
Elastic Hadoop: Elastic Compute + Scale-out Network Storage
20
Software-defined Datacenter: Storage
Requirements of Next Generation Storage
1. 10x lower cost of storage
2. Handle explosive data growth
3. Support a variety of application types
4. Solve the privacy and security issues
Management
Network/Security
Storage/Availability
Compute
21
HDFS Model
[Diagram: three ESX hosts under a VirtualCenter Management Server with DRS and VHM. Labels: Hadoop Compute VMs; Hadoop HDFS VMs (an HDFS or MAPR VM on each host); Local Disks; SAN/NAS; Non-Hadoop VMs]
JT: JobTracker
TT: TaskTracker
NN: NameNode
VHM: Virtual Hadoop Manager
22
Big-Data using Local Disks
[Diagram: hosts connected to a top-of-rack switch]
Servers with Local Disks
• 16-24 core server
• 12-24 SATA 2-4TB disks
• 10 GbE adapter
• iSCSI/NFS for shared storage for vMotion, etc.
High-performance 10 GbE switch per rack
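The disk counts and sizes on this slide translate into a rough per-host capacity range. The sketch below is an illustration only: it assumes data is stored in HDFS with its default replication factor of 3, and it ignores space reserved for temporary data, the operating system, and file system overhead. The function name is hypothetical.

```python
def usable_tb(disks: int, tb_per_disk: float, replication: int = 3) -> float:
    """Raw capacity of one host divided by the replication factor.

    replication=3 is the HDFS default; the slide does not state which
    value was used, so treat it as an assumption.
    """
    return disks * tb_per_disk / replication


low = usable_tb(12, 2)    # smallest configuration listed on the slide
high = usable_tb(24, 4)   # largest configuration listed on the slide
print(low, high)
```

With these assumptions a host contributes roughly 8 to 32 TB of usable capacity, from 24 to 96 TB raw.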
23
Scale-out Storage for Big Data
[Chart: Cost per GB ($0 to $5.50) versus Petabytes Deployed (0.5 to 128)]
Traditional SAN/NAS
Distributed Object Storage: HDFS, MAPR, CEPH
Scale-out NAS: Isilon, NTAP
24
Big Data Storage
Scale-out Network Storage
Elastic Compute
Scale-out Network Storage
• Hadoop Protocol
• Snapshots
• POSIX Apps
• Full NFS Access
• Replication
• Erasure Coding
25
Big Data with Scale-out NAS
[Diagram: Big-Data using Scale-out NAS. Two racks of hosts, each behind a top-of-rack switch, attached to scale-out NAS]
Temp Data: local disk or SSD in each host, for transient data
Shared Data: Isilon scale-out NAS
26
Chris Greer, FedEx Services
27
Breakthrough Use Cases
Web Log Analysis
• Initial exploration was around detection of mobile devices accessing the website.
• Analysis of 570 billion web server log entries took approximately 9 minutes to complete on a small cluster.
ZIP Code Analysis
• Analysis of data to determine which ZIP codes are the highest source or destination for shipments.
Shipment Analysis
• Analysis of shipment information to determine patterns that may delay a package.
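The web log figures above imply an aggregate processing rate. The sketch below simply divides the two numbers quoted on the slide, taking them at face value. The cluster size is not given, so no per-node rate can be derived, and the function name is hypothetical.

```python
LOG_ENTRIES = 570e9     # web server log entries analyzed (from the slide)
ELAPSED_MINUTES = 9     # approximate run time (from the slide)


def entries_per_second(entries: float, minutes: float) -> float:
    """Aggregate throughput implied by a total count and an elapsed time."""
    return entries / (minutes * 60)


rate = entries_per_second(LOG_ENTRIES, ELAPSED_MINUTES)
print(f"{rate:.3e} entries/s")
```

The quoted figures work out to a little over one billion entries per second across the whole cluster.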
28
Agile Big Data at FedEx
Security
• Trusted isolation
• Well-known auditable platform
Agility
• Deploy in minutes
• Optimize for shift in workload characteristics
Elasticity
• Create true multi-tenancy
• Mixed workloads
29
Hadoop Service at FedEx: vSphere + Isilon Storage
Scale-out Isilon Cluster
- Shared Data
- NAS + Hadoop
Elastic vSphere Cluster
- Mixed Workloads
- vSphere
- Existing Rack Mount Servers
30
Agility: Automation of Hadoop Cluster Management
Deploy
Resize
Elastic scaling
Customize
Incorporate best practices
Manage
Tune configuration
Run
Execute jobs
Access HDFS
31
Agility: Ease of Management Due to Consolidation
[Diagram: two stacks, each showing Monitoring; Cluster setup and provisioning; HW procurement and sizing]
32
Elasticity: Mixed Workloads on a Shared Platform
Dept A: Marketing (Production, Test, Experimentation)
Dept B: Operations (Production, Test, Experimentation)
Data: Log files, Social data, Transaction data, Historical data
Common Infrastructure
• Common infrastructure can be shared by multiple logical Hadoop clusters and prioritized with VMware resource pools.
Data Segregation
• Data that should not be shared can be kept separate, using VMware security controls for isolation.
33
Security
Known Security Model
• VMs provide the required levels of isolation for different workloads
Trusted Auditable Platform
• Leverage virtualization as the platform
• Known to auditors
• Accepted as a valid deployment model
34
Summary
35
Customers Winning from Consolidated Big Data Platforms
“Dedicated hardware makes no sense”
“Software-defined Datacenter enables rapid deployment multiple tenants and labs”
“Our mixed workloads include Hadoop, Database, ETL and App-servers”
“Any performance penalties are minor”
Management
Network/Security
Storage/Availability
Compute
36
Q&A
37
Other VMware Activities Related to This Session
• HOL-SDC-1309 - vSphere Big Data Extensions
• VAPP5484 – Big Data Extensions Advanced Features
• VAPP5626 – Big Data Panel
THANK YOU

VMworld 2013: Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cloud-scale Apps

  • 1. Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cloud-scale Apps Chris Greer, FedEx Richard McDougall, VMware VAPP5402 #VAPP5402
  • 2. © 2013 VMware Inc. All rights reserved Beyond Mission Critical: Virtualizing Big-Data, Hadoop and Cloud Apps Richard McDougall CTO, Storage and Application Services Chris Greer, Enterprise Architect, FedEx
  • 3. Virtualize Everything: Next Generation Apps Virtual Storage Arrays vSphere SAN/NAS Object / BLOB Traditional Applications • Traditional enterprise storage • HW-based resiliency, QoS Next Gen Cloud Apps • Scale out, flash, DAS • Application-specific storage All SSD Array Server-side Flash
  • 4. The complexity enterprise IT and developers face today An Idea for a cool app Spec a server config Justify server costs Procurement process Wait for HW to arrive Wait for IT ops to Image the server Install a Database LOB Architecture approval Central IT Architectural approval Justify more server for scale testing Wait for more HW Configure ACLs and LBs New infrastructures New Languages and Frameworks New Devices and Domains New Data types and requirements
  • 5. Micro Clouds Cloud Foundry – Announced Today on vSphere Data Services Other Services Msg Services .js Public Clouds Private Clouds
  • 6. Big Data - Not Just for the Web Giants – Now the Intelligent Enterprise
  • 7. Real-time analysis allows instant understanding of market dynamics. Retailers can have intimate understanding of their customers' needs and use direct targeted marketing. Market Segment Analysis → Personalized Customer Targeting
  • 8. The Emerging Pattern of Big Data Systems: Retail Example Real-Time Streams Exa-scale Data Store Parallel Data Processing Real-Time Processing Machine Learning Data Science Cloud Infrastructure
  • 9. Storage: Plan for Peta-scale Data Storage and Processing. [Chart: PB of Data (log scale, 0.01 to 1000) by year, 2000 to 2015; series: Online Apps, Analytics] Analytics Rapidly Outgrows Traditional Data Size by 100x
  • 10. Unprecedented Scale “Data transparency, amplified by Social Networks generates data at a scale never seen before” - The Human Face of Big Data. We are creating an Exabyte of data every minute in 2013. Yottabyte by 2030.
  • 11. A single GE Jet Engine produces 10 Terabytes of data in one hour – 90 Petabytes per year. Enabling early detection of faults, common mode failures, product engineering feedback. Post Mortem → Proactively Maintained Connected Product
  • 12. Cloud Infrastructure Supports Mixed Big Data Workloads Machine Learning Hadoop Real-Time Analytics Cloud Infrastructure Management Network/Security Storage/Availability Compute
  • 13. Cloud Infrastructure Supports Multiple Tenants Cloud Infrastructure Management Network/Security Storage/Availability Compute Web User Analytics Financial Analysis Historical Customer Behavior
  • 14. Software-defined Datacenter: Compute. The Core Values of Virtualization Apply to Big Data: (1) Agility / Rapid deployment (2) Lower Capex (3) Isolation for resource control and security (4) Operational efficiency. Management Network/Security Storage/Availability Compute
  • 15. Strong Isolation between Workloads is Key Hungry Workload 1 Reckless Workload 2 Nosy Workload 3 Cloud Infrastructure
  • 16. Virtualizing Hadoop. Elasticity & Multi-tenancy: • Shrink and expand cluster on demand • Independent scaling of compute and data • Strong multi-tenancy. High Availability: • High availability for entire Hadoop stack • One click to set up • Battle-tested. Operational Simplicity: • Rapid deployment • One-stop command center • Easy to configure/reconfigure
  • 17. Big Data Extensions: Core Components. Serengeti: • Core is Open Source • Tool to simplify virtualized Hadoop deployment & operations. Hadoop Virtualization Extensions (HVE): • Virtualization changes for core Hadoop • Contributed back to Apache Hadoop. Virtual Hadoop Manager (VHM): • Advanced resource management on vSphere
  • 18. Big Data Family of Frameworks. Compute layer: Hadoop (batch analysis); HBase (real-time queries); NoSQL (Cassandra, Mongo, etc.); Big SQL (Impala, Pivotal HawQ); Other (Spark, Shark, Solr, Platfora, etc.). File System/Data Store. Virtualization. Hosts.
  • 19. Traditional Hadoop vs. Elastic Hadoop Scale-out Network Storage Traditional Hadoop: Converged Compute/Storage Elastic Compute Scale-out Network Storage
  • 20. Software-defined Datacenter: Storage. Requirements of Next Generation Storage: (1) 10x lower cost of storage (2) Handle explosive data growth (3) Support a variety of application types (4) Solve the privacy and security issues. Management Network/Security Storage/Availability Compute
  • 21. HDFS Model. [Diagram elements: three ESX hosts, each with an HDFS or MAPR VM; Local Disks; SAN/NAS; Non-Hadoop VMs; Hadoop Compute VMs; Hadoop HDFS VMs; VirtualCenter Management Server; DRS; VHM] Legend: JT: JobTracker, TT: TaskTracker, NN: NameNode, VHM: Virtual Hadoop Manager
  • 22. Big-Data using Local Disks. Hosts connected to a Top of Rack Switch. Servers with Local Disks: 16-24 core server; 12-24 SATA 2-4 TB disks; 10 GbE adapter; iSCSI/NFS for shared storage for vMotion, etc. High Performance 10 GbE Switch per Rack
  • 23. Scale-out Storage for Big Data. [Chart: Cost per GB ($0 to $5.50) versus Petabytes Deployed (0.5 to 128); labels: Traditional SAN/NAS, Distributed Object Storage, HDFS, MAPR, CEPH, Scale-out NAS (Isilon, NTAP)]
  • 24. Big Data Storage Scale-out Network Storage Elastic Compute Scale-out Network Storage • Hadoop Protocol • Snapshots • POSIX Apps • Full NFS Access • Replication • Erasure Coding
  • 25. Big Data with Scale-out NAS. [Diagram: two racks of hosts, each with a Top of Rack Switch and Scale-out NAS] Shared Data: Isilon Scale-out NAS. Temp Data: Local Disk or SSD in each host, for transient data
  • 27. Breakthrough Use Cases • Web Log Analysis: Initial exploration was around detection of mobile devices accessing the website. Analysis of 570 billion web server log entries took approximately 9 minutes to complete on a small cluster. • ZIP Code Analysis: Analysis of data to determine which ZIP codes are the highest source or destination for shipments. • Shipment Analysis: Analysis of shipment information to determine patterns that may delay a package.
  • 28. Agile Big Data at FedEx. Security: • Trusted isolation • Well-known auditable platform. Agility: • Deploy in minutes • Optimize for shift in workload characteristics. Elasticity: • Create true multi-tenancy • Mixed workloads
  • 29. Hadoop Service at FedEx: vSphere + Isilon Storage Scale-out Isilon Cluster - Shared Data - NAS + Hadoop Elastic vSphere Cluster - Mixed Workloads - vSphere - Existing Rack Mount Servers
  • 30. Agility: Automation of Hadoop Cluster Management Deploy Resize Elastic scaling Customize Incorporate best practices Manage Tune configuration Run Execute jobs Access HDFS
  • 31. Agility: Ease of Management Due to Consolidation. [Diagram: two stacks, each showing Monitoring, Cluster setup and provisioning, HW procurement and sizing]
  • 32. Elasticity: Mixed Workloads on a Shared Platform Dept A: Marketing Production Test Experimentation Dept B: Operations Production Test Experimentation Log files Social data Transaction data Historical data • Common Infrastructure: Common infrastructure can be shared by multiple logical Hadoop clusters and prioritized with VMware resource pools. • Data Segregation: Data that should not be shared can be kept separate, leveraging VMware security controls for isolation.
  • 33. Security. Known Security Model: • VMs provide the required levels of isolation for different workloads. Trusted Auditable Platform: • Leverage virtualization as the platform • Known to auditors • Accepted as a valid deployment model
  • 35. Customers Winning from Consolidated Big Data Platforms “Dedicated hardware makes no sense” “Software-defined Datacenter enables rapid deployment multiple tenants and labs” “Our mixed workloads include Hadoop, Database, ETL and App-servers” “Any performance penalties are minor” Management Network/Security Storage/Availability Compute
  • 37. Other VMware Activities Related to This Session • HOL-SDC-1309 - vSphere Big Data Extensions • VAPP5484 – Big Data Extensions Advanced Features • VAPP5626 – Big Data Panel
  • 40. Beyond Mission Critical: Virtualizing Big-Data, Hadoop, HPC, Cloud-scale Apps Chris Greer, FedEx Richard McDougall, VMware VAPP5402 #VAPP5402
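
The scale figures quoted on slides 11 and 27 can be sanity-checked with a little arithmetic. The sketch below is illustrative only and rests on assumptions the slides do not state: the jet engine runs continuously (24 hours a day, 365 days a year), units are decimal (1 PB = 1000 TB), and the log analysis took exactly 9 minutes. The function names are hypothetical.

```python
# Assumption: continuous operation, which the slides do not state.
HOURS_PER_YEAR = 24 * 365


def yearly_petabytes(tb_per_hour: float) -> float:
    """Convert a TB-per-hour data rate into PB per year (1 PB = 1000 TB)."""
    return tb_per_hour * HOURS_PER_YEAR / 1000


def entries_per_second(total_entries: float, minutes: float) -> float:
    """Average processing rate for a job that handled total_entries in the given minutes."""
    return total_entries / (minutes * 60)


# Slide 11: 10 TB per hour from a single jet engine.
jet_engine_pb = yearly_petabytes(10)

# Slide 27: 570 billion web server log entries in approximately 9 minutes.
log_rate = entries_per_second(570e9, 9)

print(f"Jet engine: {jet_engine_pb:.1f} PB per year")
print(f"Log analysis: {log_rate:.2e} entries per second")
```

Under these assumptions the jet engine figure comes to roughly 88 PB per year, in line with the slide's rounded 90 PB, and the log analysis averages on the order of a billion entries per second.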