Virtualize Big Data to
Make the Elephant
Dance
June Yang, Senior Director of Product Management, VMware
Dan Baskette, Senior Consultant Technologist, Pivotal

© Copyright 2013 EMC Corporation. All rights reserved.

Unstructured Data is exploding… Hadoop is driving growth

Hadoop adoption is ramping
[Pie chart: Hadoop adoption status]
• Evaluating: 53%
• In production: 23%
• Piloting: 18%
• Testing: 2%
• Other: 2%
• Don't know: 2%

Unstructured data driving growth
[Chart: structured vs. unstructured data, 2011 to 2020]
Complex unstructured data forecasted to outpace structured relational data by 10x by 2020

Source: Forrester Survey of 60 CIOs, September 2011

• Unstructured data explosion and Hadoop capabilities causing CIOs to reconsider Enterprise data strategy
• Gartner predicts +800% data growth over next 5 years
• Hadoop’s ability to process raw data at cost presents intriguing value prop for CIOs

Broad Application of Hadoop Technology

Use Cases
• Log Processing / Click Stream Analytics
• Machine Learning / sophisticated data mining
• Web crawling / text processing
• Extract Transform Load (ETL) replacement
• Image / XML message processing
• General archiving / compliance

Vertical Industries
• Financial Services
• Internet Retailer
• Pharmaceutical / Drug Discovery
• Mobile / Telecom
• Scientific Research
• Social Media

Hadoop is a platform that will revolutionize how Enterprises handle data

The Big Data Journey in the Enterprise
[Chart axes: Scale (0 node, 10’s, 100’s) and Integrated]

Stage 3: Cloud Analytics Platform
• Serve many departments
• Often part of mission critical workflow
• Fully integrated with analytics/BI tools

Stage 2: Hadoop Production
• Serve a few departments
• More use cases
• Growing # and size of clusters
• Core Hadoop + components

Stage 1: Hadoop Piloting
• Often start with line of business
• Try 1 or 2 use cases to explore the value of Hadoop
Deploy Hadoop Clusters in Minutes

One click to scale out your cluster on the fly

Customize your Hadoop/HBase Cluster
Customize with Cluster Specification File
Cluster Spec File Details
• Storage configuration
• Choice of shared storage or local disk
• High availability option
• # of Hadoop nodes
• Resource configuration

Cluster Specification File
"groups":[
  { "name":"master",
    "roles":[
      "hadoop_namenode",
      "hadoop_jobtracker"],
    "storage": {
      "type": "SHARED", "sizeGB": 20},
    "instance_type":"MEDIUM",
    "instance_num":1,
    "ha":true},
  { "name":"worker",
    "roles":[
      "hadoop_datanode",
      "hadoop_tasktracker"
    ],
    "instance_type":"SMALL",
    "instance_num":5,
    "ha":false
…
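
As a rough illustration of how a spec of this shape could be read by a tool, the sketch below parses a completed version of the fragment and totals the nodes it requests. The outer braces, the closing brackets, and the helper name `summarize_spec` are assumptions added so the fragment parses as JSON; the field names follow the slide and may not match the actual Serengeti schema.

```python
import json

# Hypothetical completion of the slide's fragment: the outer braces and the
# closing brackets are assumed so that the text is valid JSON.
SPEC_TEXT = """
{
  "groups": [
    {"name": "master",
     "roles": ["hadoop_namenode", "hadoop_jobtracker"],
     "storage": {"type": "SHARED", "sizeGB": 20},
     "instance_type": "MEDIUM",
     "instance_num": 1,
     "ha": true},
    {"name": "worker",
     "roles": ["hadoop_datanode", "hadoop_tasktracker"],
     "instance_type": "SMALL",
     "instance_num": 5,
     "ha": false}
  ]
}
"""


def summarize_spec(spec_text):
    """Return (total node count, names of the groups with HA enabled)."""
    spec = json.loads(spec_text)
    total = sum(group["instance_num"] for group in spec["groups"])
    ha_groups = [group["name"] for group in spec["groups"] if group["ha"]]
    return total, ha_groups


total_nodes, ha_groups = summarize_spec(SPEC_TEXT)
print(total_nodes, ha_groups)
```

Here the master group is the only one with HA turned on, and the worker count is the single number to change when resizing the cluster.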

Your Choice of Hadoop Distributions and Tools
Distributions

Community Projects

• Flexibility to choose and try out major distributions
• Support for multiple projects
• Open architecture to welcome industry participation
• Contributing Hadoop Virtualization Extensions (HVE) to open source community
Proactive monitoring with vCenter Operations (vCOps)
• Proactively monitoring through vCOps
• Gain comprehensive visibility
• Eliminate manual processes with intelligent automation
• Proactively manage operations
• Alternatively, use monitoring tools like Nagios, Ganglia
Beyond day 1 - Automation of Hadoop Cluster lifecycle management
[Diagram: lifecycle steps]
• Deploy
• Customize
• Scaling
• Tune configuration
• Load data
• Execute jobs
• …
The Big Data Journey in the Enterprise
[Chart axes: Scale (0 node, 10’s, 100’s) and Integrated]

Stage 2: Hadoop Production
• Serve a few departments
• More use cases
• Growing # and size of clusters
• Core Hadoop + components

Stage 1: Hadoop Piloting
• Rapid deployment
• On the fly cluster resizing
• Choice of Hadoop distros
• Automation of cluster lifecycle
Achieve HA for the Entire Hadoop Stack
[Diagram: components of the Hadoop stack]
• ZooKeeper (Coordination)
• Pig (Data Flow)
• Hive (SQL)
• HCatalog
• MapReduce (Job Scheduling/Execution System)
• HBase (Key-Value store)
• HDFS (Hadoop Distributed File System)
• Servers: Jobtracker, Namenode, Hive MetaDB, HCatalog MDB, Management Server
• Connected systems: BI Reporting, RDBMS, ETL Tools

• vSphere HA is battle-tested high availability technology
• Single mechanism to achieve HA for the entire Hadoop stack
• One click to enable HA and/or FT
Challenges of Running Hadoop in Enterprises
[Diagram: each department runs its own separate clusters]
• Dept A: recommendation engine (Production, Test, Experimentation)
• Dept B: ad targeting (Production, Test, Experimentation)
• Data sources: Log files, Transaction data, Social data, Historical customer behavior
• On the horizon…: NoSQL, Real time SQL, …

Pain Points:
1. Cluster sprawl
2. Redundant common data in separate clusters
3. Difficult to use the right tool for the right problem
4. Peak compute and I/O resource is limited to the number of nodes in each independent cluster
What if you can…
[Diagram, before: Recommendation engine and Ad targeting, each with separate Production, Test, and Experimentation clusters]

One physical platform to support multiple virtual big data clusters
• Experimentation
• Production recommendation engine
• Test/Dev
• Production Ad Targeting
Bigger is Better
• Hadoop is linearly scalable: more nodes, better performance. The same job will take
– 2 hours to complete on a 50 node cluster
– 1 hour to complete on a 100 node cluster
– 30 min to complete on a 200 node cluster
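
The figures above follow from an idealized model in which the total work is fixed and divides evenly across nodes. A minimal sketch of that arithmetic, assuming perfectly linear scaling (the function name `ideal_runtime_minutes` is hypothetical, and the model ignores coordination overhead):

```python
def ideal_runtime_minutes(base_minutes, base_nodes, nodes):
    """Runtime under perfectly linear scaling.

    The total work (base_minutes * base_nodes node-minutes) is assumed
    constant, so runtime is inversely proportional to the node count.
    """
    return base_minutes * base_nodes / nodes


# Baseline from the slide: 2 hours (120 minutes) on 50 nodes.
for node_count in (50, 100, 200):
    print(node_count, "nodes:", ideal_runtime_minutes(120, 50, node_count), "min")
```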

You may ask
• What about differentiated SLAs?
– For production Hadoop jobs, need to ensure high priority
– Lower priority for experimental Hadoop jobs
• Will I have noisy neighbor problems with a shared infrastructure approach?
VM Containers with Isolation are a Tried and Tested Approach
[Diagram: Hungry Workload 1, Reckless Workload 2, and Noisy Workload 3 contained in VMs on VMware vSphere + Serengeti, spanning seven hosts]
Shared infrastructure: Three big types of Isolation are Required

• Resource Isolation
– Control the greedy noisy neighbor
– Reserve resources to meet needs
• Version Isolation
– Allow concurrent OS, App, Distro versions
• Security Isolation
– Provide privacy between users/groups
– Runtime and data privacy required

[Diagram: VMware vSphere + Serengeti spanning seven hosts]
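
Resource isolation on shared infrastructure can be pictured as a split of host capacity in which each workload first receives a guaranteed reservation and the remainder is divided by relative shares. The sketch below is a simplified model of that idea, loosely based on the reservation and shares controls found in hypervisor resource pools; it is not vSphere's actual scheduler, and the function name, workload names, and numbers are all hypothetical.

```python
def allocate(capacity, workloads):
    """Split capacity among workloads.

    workloads maps a name to (reservation, shares). Each workload is first
    given its reservation; the spare capacity is then divided in proportion
    to shares, which keeps a greedy neighbor from starving the others.
    """
    reserved = sum(reservation for reservation, _ in workloads.values())
    if reserved > capacity:
        raise ValueError("reservations exceed capacity")
    spare = capacity - reserved
    total_shares = sum(shares for _, shares in workloads.values())
    return {
        name: reservation + (spare * shares / total_shares if total_shares else 0)
        for name, (reservation, shares) in workloads.items()
    }


# Hypothetical example: 100 units of CPU, production reserved 40 with 3x shares.
result = allocate(100, {"production": (40, 3), "experimentation": (0, 1)})
print(result)
```

With these inputs, production keeps its reserved 40 units plus three quarters of the spare 60, and experimentation gets the remaining quarter.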
With virtualization, you can have your cake and eat it too
• One physical platform to support multiple virtual big data clusters
– Share data to minimize copying
– Single infrastructure to maintain
– Bigger cluster for better performance
– Share hardware resource to achieve higher utilization
• Virtualization ensures strong isolation between clusters
– Resource isolation
– Failure isolation
– Configuration isolation
– Security isolation

[Diagram: Experimentation, Production recommendation engine, Test/Dev, and Production Ad Targeting clusters, marked Low Priority or High Priority, in a compute layer over a data layer on VMware vSphere + Serengeti]
Elastic Hadoop with Virtualization
[Diagram: three deployment models]

Unmodified Hadoop node in a VM (combined storage/compute)
• VM lifecycle determined by Datanode
• Limited elasticity

Separate Compute from Storage (compute VM and storage VM)
• Separate compute from data
• Stateless compute
• Elastic compute

Separate Virtual Compute Clusters per tenant (compute VMs for tenants T1 and T2, plus storage)
• Separate virtual compute
• Compute cluster per tenant
• Stronger VM-grade security and resource isolation
Scale in/out Hadoop dynamically
• Deploy separate compute clusters for different tenants sharing HDFS.
• Commission/decommission task trackers according to priority and available resources

[Diagram: in the compute layer, an Experimentation cluster and a Production recommendation engine cluster each have their own Job Tracker and a set of Compute VMs drawn from a dynamic resource pool; both sit above the data layer on VMware vSphere + Serengeti]
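
Commissioning and decommissioning task trackers by priority and available resources can be sketched as a two-step plan: grant compute VMs to tenants in priority order, then compare the plan with what is currently running. The code below is a minimal illustration under assumed inputs; the function names, the tenant tuples, and the pool size of eight compute VMs are hypothetical and do not describe Serengeti's actual policy.

```python
def plan_task_trackers(available_vms, tenants):
    """Grant compute VMs to tenants in priority order.

    tenants is a list of (name, priority, wanted_vms); a lower priority
    number means a more important tenant. Returns name -> granted VMs.
    """
    plan = {}
    remaining = available_vms
    for name, _priority, wanted in sorted(tenants, key=lambda tenant: tenant[1]):
        granted = min(wanted, remaining)
        plan[name] = granted
        remaining -= granted
    return plan


def resize_actions(current, target):
    """Return name -> delta: positive to commission, negative to decommission."""
    return {
        name: target.get(name, 0) - current.get(name, 0)
        for name in set(current) | set(target)
    }


# Hypothetical pool of 8 compute VMs shared by two tenants.
target = plan_task_trackers(8, [("experimentation", 1, 4), ("production", 0, 6)])
actions = resize_actions({"production": 4, "experimentation": 4}, target)
print(target, actions)
```

In this example production grows from 4 to 6 task trackers while experimentation shrinks from 4 to 2, since the pool cannot satisfy both requests in full.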
The Big Data Journey in the Enterprise
[Chart axes: Scale (0 node, 10’s, 100’s) and Integrated]

Stage 3: Cloud Analytics Platform
• Serve many departments
• Often part of mission critical workflow
• Fully integrated with analytics/BI tools

Stage 2: Hadoop Production
• High Availability
• Consolidation
• Differentiated SLAs
• Elastic Scaling

Stage 1: Hadoop Piloting
• Rapid deployment
• On the fly cluster resizing
• Choice of Hadoop distros
• Automation of cluster lifecycle
Cloud Analytics Platform
[Diagram: layered platform, top to bottom]
• Business Intelligence, Machine Learning, Real Time Streams
• CETAS, Automated Models, Stream Processing, ETL, Data Visualization, …
• Real Time Structured Database, Data Warehouse, Unstructured and Batch Processing
• HDFS
• Cloud Infrastructure: Compute, Storage, Networking
Big Data Tools and Characteristics

Framework | Scale of data | Scale of cluster | Computable data? | Local disks?
Map-reduce: Hadoop | 100s PB | 10s to 1,000s | Yes | Yes, for cost, bandwidth and availability
Big-SQL: HAWQ, Aster Data, Impala, … | PBs | 10s to 100s | Some | Yes, for cost and bandwidth
No-SQL: Cassandra, HBase, … | Trillions of rows | 10s to 100s | Some | Yes, for cost and availability
In-Memory: Redis, GemFire, Membase, … | Billions of rows | 10s to 100s | Yes | Primarily memory
Choose a platform that…
• Allows users to pick the right tools at the right time
• Puts resources where needed based on SLA policy
In-house Hadoop as a Service – (Hadoop + Hadoop)
[Diagram: compute layer over data layer on VMware vSphere + Serengeti, spanning six hosts]
• Compute layer: Production ETL of log files; Ad hoc data mining; Production recommendation engine
• Data layer: two HDFS instances
Integrated Big Data Production – (Mixed big data workloads)
[Diagram: compute layer over data layer on VMware vSphere + Serengeti, spanning six hosts]
• Hadoop batch analysis
• HBase real-time queries
• NoSQL – Cassandra key-value store
• MPP DBMS – Analysis of structured data
• Data layer: HDFS
Integrated Hadoop and Webapps – (Big Data + Other Workloads)
[Diagram: compute layer over data layer on VMware vSphere + Serengeti, spanning six hosts]
• Short-lived Hadoop compute cluster
• Hadoop compute cluster
• Web servers for ecommerce site
• Data layer: HDFS
The Big Data Journey in the Enterprise
[Chart axes: Scale (0 node, 10’s, 100’s) and Integrated]

Stage 3: Cloud Analytics Platform
• Mixed workloads
• Right tool at the right time
• Flexible and elastic infrastructure

Stage 2: Hadoop Production
• High Availability
• Consolidation
• Differentiated SLAs
• Elastic Scaling

Stage 1: Hadoop Piloting
• Rapid deployment
• On the fly cluster resizing
• Choice of Hadoop distros
• Automation of cluster lifecycle
Learn More
• Download and try Serengeti
– projectserengeti.org
• VMware Hadoop site
– vmware.com/hadoop
• Hadoop performance on vSphere white paper
– http://www.vmware.com/files/pdf/techpaper/hadoop-vsphere51-32hosts.pdf
• Hadoop virtualization extensions (HVE) white paper
– http://www.vmware.com/files/pdf/techpaper/hadoop-vsphere51-32hosts.pdf
Thank You!

June Yang
Senior Director, VMware
juneyang@vmware.com

Dan Baskette
Senior Consultant Technologist
dan.baskette@emc.com
Pivotal Sessions at EMC World

Session | Presenter | Dates/Times
The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications | Josh Klahr | Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action | Noelle Sio | Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench | Clinton Ooi, Bhavin Modi | Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights | SK Krishnamurthy | Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
Pivotal: Big & Fast data – merging real-time data and deep analytics | Michael Crutcher | Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
Pivotal: Virtualize Big Data to Make The Elephant Dance | June Yang, Dan Baskette | Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
Hadoop Design Patterns | Don Miner | Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005
Pivotal: Virtualize Big Data to Make the Elephant Dance

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDEMC
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote EMC
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOEMC
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremioEMC
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereEMC
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History EMC
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewEMC
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeEMC
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic EMC
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityEMC
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeEMC
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015EMC
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesEMC
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsEMC
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookEMC
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS BreachEMC
 

More from EMC (20)

INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUDINDUSTRY-LEADING  TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
INDUSTRY-LEADING TECHNOLOGY FOR LONG TERM RETENTION OF BACKUPS IN THE CLOUD
 
Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote Cloud Foundry Summit Berlin Keynote
Cloud Foundry Summit Berlin Keynote
 
EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX EMC GLOBAL DATA PROTECTION INDEX
EMC GLOBAL DATA PROTECTION INDEX
 
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIOTransforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
Transforming Desktop Virtualization with Citrix XenDesktop and EMC XtremIO
 
Citrix ready-webinar-xtremio
Citrix ready-webinar-xtremioCitrix ready-webinar-xtremio
Citrix ready-webinar-xtremio
 
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
EMC FORUM RESEARCH GLOBAL RESULTS - 10,451 RESPONSES ACROSS 33 COUNTRIES
 
EMC with Mirantis Openstack
EMC with Mirantis OpenstackEMC with Mirantis Openstack
EMC with Mirantis Openstack
 
Force Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop ElsewhereForce Cyber Criminals to Shop Elsewhere
Force Cyber Criminals to Shop Elsewhere
 
Pivotal : Moments in Container History
Pivotal : Moments in Container History Pivotal : Moments in Container History
Pivotal : Moments in Container History
 
Data Lake Protection - A Technical Review
Data Lake Protection - A Technical ReviewData Lake Protection - A Technical Review
Data Lake Protection - A Technical Review
 
Mobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or FoeMobile E-commerce: Friend or Foe
Mobile E-commerce: Friend or Foe
 
Virtualization Myths Infographic
Virtualization Myths Infographic Virtualization Myths Infographic
Virtualization Myths Infographic
 
Intelligence-Driven GRC for Security
Intelligence-Driven GRC for SecurityIntelligence-Driven GRC for Security
Intelligence-Driven GRC for Security
 
The Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure AgeThe Trust Paradox: Access Management and Trust in an Insecure Age
The Trust Paradox: Access Management and Trust in an Insecure Age
 
EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015EMC Technology Day - SRM University 2015
EMC Technology Day - SRM University 2015
 
EMC Academic Summit 2015
EMC Academic Summit 2015EMC Academic Summit 2015
EMC Academic Summit 2015
 
Data Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education ServicesData Science and Big Data Analytics Book from EMC Education Services
Data Science and Big Data Analytics Book from EMC Education Services
 
Using EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere EnvironmentsUsing EMC Symmetrix Storage in VMware vSphere Environments
Using EMC Symmetrix Storage in VMware vSphere Environments
 
Using EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBookUsing EMC VNX storage with VMware vSphereTechBook
Using EMC VNX storage with VMware vSphereTechBook
 
2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach2014 Cybercrime Roundup: The Year of the POS Breach
2014 Cybercrime Roundup: The Year of the POS Breach
 

Recently uploaded

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Recently uploaded (20)

Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Pivotal: Virtualize Big Data to Make the Elephant Dance

  • 1. Virtualize Big Data to Make the Elephant Dance. June Yang, Senior Director of Product Management, VMware; Dan Baskette, Senior Consultant Technologist, Pivotal. © Copyright 2013 EMC Corporation. All rights reserved.
  • 2. Unstructured Data is exploding… Hadoop is driving growth. Hadoop adoption is ramping (pie chart): Evaluating 53%, In production 23%, Piloting 18%, Testing 2%, Other 2%, Don't know 2%. Unstructured data driving growth (chart, 2011 to 2020, structured vs. unstructured): complex unstructured data forecasted to outpace structured relational data by 10x by 2020. Source: Forrester Survey of 60 CIOs, September 2011. Unstructured data explosion and Hadoop capabilities are causing CIOs to reconsider enterprise data strategy: Gartner predicts +800% data growth over the next 5 years; Hadoop's ability to process raw data at cost presents an intriguing value prop for CIOs.
  • 3. Broad Application of Hadoop Technology. Use cases: log processing / click stream analytics; machine learning / sophisticated data mining; web crawling / text processing; Extract Transform Load (ETL) replacement; image / XML message processing; general archiving / compliance. Vertical industries: financial services; internet retailer; pharmaceutical / drug discovery; mobile / telecom; scientific research; social media. Hadoop is a platform that will revolutionize how enterprises handle data.
  • 4. The Big Data Journey in the Enterprise (chart: degree of integration against scale, from 0 nodes through 10s to 100s). Stage 1: Hadoop Piloting: often starts with a line of business; try 1 or 2 use cases to explore the value of Hadoop. Stage 2: Hadoop Production: serves a few departments; more use cases; growing number and size of clusters; core Hadoop + components. Stage 3: Cloud Analytics Platform: serves many departments; often part of a mission-critical workflow; fully integrated with analytics/BI tools.
  • 5. Deploy Hadoop Clusters in Minutes
  • 6. One click to scale out your cluster on the fly
  • 7. Customize your Hadoop/HBase Cluster: customize with a Cluster Specification File
  • 8. Cluster Spec File Details. The file covers: storage configuration (choice of shared storage or local disk); high availability option; number of Hadoop nodes; resource configuration. Cluster Specification File: "groups":[ { "name":"master", "roles":[ "hadoop_namenode", "hadoop_jobtracker" ], "storage":{ "type":"SHARED", "sizeGB":20 }, "instance_type":"MEDIUM", "instance_num":1, "ha":true }, { "name":"worker", "roles":[ "hadoop_datanode", "hadoop_tasktracker" ], "instance_type":"SMALL", "instance_num":5, "ha":false …
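The specification on slide 8 can be read as plain JSON once the quoting is regular. The sketch below is only an illustration: it uses the field names shown on the slide (`instance_type`, `instance_num`, `ha`, `sizeGB`), which may not match the schema of any particular Serengeti release, it quotes `MEDIUM` and `SMALL` so the text parses, and it closes the worker group that the slide leaves elided. The `summarize_spec` helper is hypothetical.

```python
import json

# Hypothetical spec mirroring the fields visible on the slide. The part the
# slide elides ("…") is not reproduced; the structure is simply closed.
SPEC_TEXT = """
{
  "groups": [
    {
      "name": "master",
      "roles": ["hadoop_namenode", "hadoop_jobtracker"],
      "storage": {"type": "SHARED", "sizeGB": 20},
      "instance_type": "MEDIUM",
      "instance_num": 1,
      "ha": true
    },
    {
      "name": "worker",
      "roles": ["hadoop_datanode", "hadoop_tasktracker"],
      "instance_type": "SMALL",
      "instance_num": 5,
      "ha": false
    }
  ]
}
"""


def summarize_spec(spec_text):
    """Return (total node count, HA-enabled group names, declared storage in GB)."""
    groups = json.loads(spec_text)["groups"]
    total_nodes = sum(g["instance_num"] for g in groups)
    ha_groups = [g["name"] for g in groups if g.get("ha", False)]
    # Only groups that declare a storage section contribute to the total.
    storage_gb = sum(
        g["storage"]["sizeGB"] * g["instance_num"] for g in groups if "storage" in g
    )
    return total_nodes, ha_groups, storage_gb


if __name__ == "__main__":
    total, ha, gb = summarize_spec(SPEC_TEXT)
    print(f"nodes={total} ha_groups={ha} declared_storage_gb={gb}")
```

A summary like this is a cheap sanity check before handing a spec to a deployment tool: it shows how many VMs the file asks for and which node groups are marked for high availability.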
  • 9. Your Choice of Hadoop Distributions and Tools (logo panels: Distributions, Community Projects). Flexibility to choose and try out major distributions; support for multiple projects; open architecture to welcome industry participation; contributing Hadoop Virtualization Extensions (HVE) to the open source community.
  • 10. Proactive monitoring with VCOPs: gain comprehensive visibility; eliminate manual processes with intelligent automation; proactively manage operations. Alternatively, use monitoring tools like Nagios or Ganglia.
  • 11. Beyond day 1: automation of Hadoop cluster lifecycle management: deploy, customize, scaling, tune configuration, load data, execute jobs, …
  • 12. The Big Data Journey in the Enterprise (chart: degree of integration against scale, from 0 nodes through 10s to 100s). Stage 1: Hadoop Piloting: rapid deployment; on-the-fly cluster resizing; choice of Hadoop distros; automation of cluster lifecycle. Stage 2: Hadoop Production: serves a few departments; more use cases; growing number and size of clusters; core Hadoop + components.
  • 13. Achieve HA for the Entire Hadoop Stack (stack diagram: ZooKeeper (coordination); Pig (data flow); Hive (SQL); BI reporting; RDBMS; Hive MetaDB; HCatalog; HCatalog MDB; MapReduce (job scheduling/execution system); HBase (key-value store); HDFS (Hadoop Distributed File System); JobTracker; NameNode; management server; ETL tools server). vSphere HA is battle-tested high availability technology; a single mechanism achieves HA for the entire Hadoop stack; one click to enable HA and/or FT.
  • 14. Challenges of Running Hadoop in Enterprises (diagram: Dept A, recommendation engine, and Dept B, ad targeting, each with separate Production, Test and Experimentation clusters; data sources: log files, transaction data, social data, historical customer behavior; on the horizon: NoSQL, real-time SQL, …). Pain points: 1. Cluster sprawl. 2. Redundant common data in separate clusters. 3. Difficult to use the right tool for the right problem. 4. Peak compute and I/O resources are limited to the number of nodes in each independent cluster.
  • 15. What if you can… run one physical platform to support multiple virtual big data clusters (diagram: the separate Production, Test and Experimentation clusters for the recommendation engine and for ad targeting are consolidated into Experimentation, Test/Dev, Production recommendation engine and Production ad targeting clusters on one platform).
  • 16. Bigger is Better. Hadoop is linearly scalable: more nodes give better performance. The same job will take 2 hours to complete on a 50-node cluster, 1 hour on a 100-node cluster, and 30 minutes on a 200-node cluster.
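The figures on slide 16 follow from ideal linear scaling, where runtime is inversely proportional to node count: t(n) = t0 × n0 / n. The short sketch below reproduces that arithmetic; the function name is hypothetical, and real clusters rarely scale perfectly, so treat the result as an upper bound on the speed-up rather than a prediction.

```python
def estimated_runtime_minutes(baseline_minutes, baseline_nodes, nodes):
    """Ideal linear scaling: runtime is inversely proportional to node count."""
    if baseline_nodes <= 0 or nodes <= 0:
        raise ValueError("node counts must be positive")
    return baseline_minutes * baseline_nodes / nodes


if __name__ == "__main__":
    # Baseline taken from the slide: 2 hours (120 minutes) on 50 nodes.
    for n in (50, 100, 200):
        print(n, "nodes:", estimated_runtime_minutes(120, 50, n), "minutes")
```

With the slide's baseline this yields 120, 60 and 30 minutes for 50, 100 and 200 nodes, matching the three bullet points.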
  • 17. You may ask: What about differentiated SLAs? Production Hadoop jobs need to be ensured high priority, and experimental Hadoop jobs a lower priority. Will I have a noisy-neighbor problem with a shared infrastructure approach?
  • 18. VM Containers with Isolation are a Tried and Tested Approach (diagram: Hungry Workload 1, Reckless Workload 2 and Noisy Workload 3 running in VM containers on VMware vSphere + Serengeti across a row of hosts).
  • 19. Shared infrastructure: Three big types of Isolation are Required. Resource isolation: control the greedy noisy neighbor; reserve resources to meet needs. Version isolation: allow concurrent OS, app and distro versions. Security isolation: provide privacy between users/groups; runtime and data privacy required. (Diagram: VMware vSphere + Serengeti across a row of hosts.)
  • 20. With virtualization, you can have your cake and eat it too. One physical platform supports multiple virtual big data clusters (diagram: Experimentation, Test/Dev, Production recommendation engine and Production ad targeting clusters, marked low or high priority, over a compute layer and a data layer on VMware vSphere + Serengeti): share data to minimize copying; single infrastructure to maintain; bigger cluster for better performance; share hardware resources to achieve higher utilization. Virtualization ensures strong isolation between clusters: resource isolation; failure isolation; configuration isolation; security isolation.
  • 21. Elastic Hadoop with Virtualization (three deployment models). Combined storage/compute: unmodified Hadoop node in a VM; VM lifecycle determined by the DataNode; limited elasticity. Separate compute from storage: separate compute from data; stateless compute; elastic compute. Separate virtual compute clusters per tenant (T1, T2): separate virtual compute; compute cluster per tenant; stronger VM-grade security and resource isolation.
  • 22. Scale in/out Hadoop dynamically. Deploy separate compute clusters for different tenants sharing HDFS. Commission/decommission task trackers according to priority and available resources. (Diagram: an Experimentation cluster and a Production recommendation engine cluster, each with its own JobTracker and compute VMs drawn from a dynamic resource pool in the compute layer, above a shared data layer on VMware vSphere + Serengeti.)
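One way to picture "commission/decommission task trackers according to priority and available resources" from slide 22 is a greedy, priority-ordered split of a shared pool of compute VMs. The sketch below is a hypothetical policy written for illustration only; it is not Serengeti's or vSphere's actual scheduler, and the tenant names, priorities and pool size are made up.

```python
def allocate_compute_vms(pool_size, tenants):
    """Grant compute VMs in priority order until the shared pool is exhausted.

    tenants: list of (name, priority, requested_vms); higher priority is served first.
    Returns a dict mapping tenant name to the number of VMs granted.
    """
    remaining = pool_size
    allocation = {}
    for name, _priority, requested in sorted(tenants, key=lambda t: t[1], reverse=True):
        granted = min(requested, remaining)
        allocation[name] = granted
        remaining -= granted
    return allocation


def plan_changes(current, target):
    """Positive value: commission that many task trackers; negative: decommission."""
    names = sorted(set(current) | set(target))
    return {n: target.get(n, 0) - current.get(n, 0) for n in names}


if __name__ == "__main__":
    # Hypothetical situation: 8 compute VMs in the pool, production asks for more.
    current = {"production": 4, "experimentation": 4}
    target = allocate_compute_vms(
        8, [("experimentation", 1, 4), ("production", 2, 6)]
    )
    print("target:", target)
    print("changes:", plan_changes(current, target))
```

In this example the production cluster is served first and grows to six compute VMs, so the experimentation cluster has to give up two. Because the compute VMs are stateless and HDFS lives in a separate data layer, shrinking a tenant this way does not move any data.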
  • 23. The Big Data Journey in the Enterprise (chart: degree of integration against scale, from 0 nodes through 10s to 100s). Stage 1: Hadoop Piloting: rapid deployment; on-the-fly cluster resizing; choice of Hadoop distros; automation of cluster lifecycle. Stage 2: Hadoop Production: high availability; consolidation; differentiated SLAs; elastic scaling. Stage 3: Cloud Analytics Platform: serves many departments; often part of a mission-critical workflow; fully integrated with analytics/BI tools.
  • 24. Cloud Analytics Platform (diagram labels): Business Intelligence, Machine Learning, Real Time Streams, CETAS, Automated Models, Stream Processing, ETL, Data Visualization, …; Real Time, Structured Database, Data Warehouse, Unstructured and Batch Processing, HDFS; Cloud Infrastructure: Compute, Storage, Networking.
  • 25. Big Data Tools and Characteristics (table: framework; scale of data; scale of cluster; computable data?; local disks?)
      – Map-reduce (Hadoop): 100s of PB; 10s to 1,000s of nodes; computable data: yes; local disks: yes, for cost, bandwidth and availability
      – Big-SQL (HawQ, Aster Data, Impala, …): PBs; 10s to 100s of nodes; computable data: some; local disks: yes, for cost and bandwidth
      – No-SQL (Cassandra, HBase, …): trillions of rows; 10s to 100s of nodes; computable data: some; local disks: yes, for cost and availability
      – In-memory (Redis, Gemfire, Membase, …): billions of rows; 10s to 100s of nodes; computable data: yes; local disks: primarily memory
  • 26. Choose a platform that… allows users to pick the right tools at the right time, and puts resources where needed based on SLA policy.
  • 27. In-house Hadoop as a Service (Hadoop + Hadoop). Diagram: compute layer with Production ETL of log files, Ad hoc data mining and Production recommendation engine; data layer with HDFS; all on VMware vSphere + Serengeti across a row of hosts.
  • 28. Integrated Big Data Production (mixed big data workloads). Diagram: Hadoop batch analysis and HBase real-time queries over HDFS; NoSQL (Cassandra key-value store); MPP DBMS (analysis of structured data); all on VMware vSphere + Serengeti across a row of hosts.
  • 29. Integrated Hadoop and Webapps (Big Data + other workloads). Diagram: a short-lived Hadoop compute cluster, a Hadoop compute cluster and web servers for an ecommerce site in the compute layer; HDFS in the data layer; all on VMware vSphere + Serengeti across a row of hosts.
  • 30. The Big Data Journey in the Enterprise (chart: degree of integration against scale, from 0 nodes through 10s to 100s). Stage 1: Hadoop Piloting: rapid deployment; on-the-fly cluster resizing; choice of Hadoop distros; automation of cluster lifecycle. Stage 2: Hadoop Production: high availability; consolidation; differentiated SLAs; elastic scaling. Stage 3: Cloud Analytics Platform: mixed workloads; right tool at the right time; flexible and elastic infrastructure.
  • 31. Learn More
      – Download and try Serengeti: projectserengeti.org
      – VMware Hadoop site: vmware.com/hadoop
      – Hadoop performance on vSphere white paper: http://www.vmware.com/files/pdf/techpaper/hadoop-vsphere51-32hosts.pdf
      – Hadoop Virtualization Extensions (HVE) white paper: http://www.vmware.com/files/pdf/techpaper/hadoop-vsphere51-32hosts.pdf
  • 32. Thank You! June Yang, Senior Director, VMware, juneyang@vmware.com; Dan Baskette, Senior Consultant Technologist, dan.baskette@emc.com
  • 33. Pivotal Sessions at EMC World (table: session; presenter; dates/times)
      – The Pivotal Platform: A Purpose-Built Platform for Big-Data-Driven Applications; Josh Klahr; Tue 5:30 - 6:30, Palazzo E; Wed 11:30 - 12:30, Delfino 4005
      – Pivotal: Data Scientists on the Front Line: Examples of Data Science in Action; Noelle Sio; Tue 10:00 - 11:00, Lando 4205; Thu 8:30 - 9:30, Palazzo F
      – Pivotal: Operationalizing 1000-node Hadoop Cluster – Analytics Workbench; Clinton Ooi, Bhavin Modi; Tue 11:30 - 12:30, Palazzo L; Thu 10:00 - 11:00 am, Delfino 4001A
      – Pivotal: for Powerful Processing of Unstructured Data For Valuable Insights; SK Krishnamurthy; Mon 4:00 - 5:00, Lando 4201 A; Tue 4:00 - 5:00, Palazzo M
      – Pivotal: Big & Fast data – merging real-time data and deep analytics; Michael Crutcher; Mon 1:00 - 2:00, Lando 4201 A; Wed 10:00 - 11:00, Palazzo M
      – Pivotal: Virtualize Big Data to Make The Elephant Dance; June Yang, Dan Baskette; Mon 11:30 - 12:30, Marcello 4401A; Wed 4:00 - 5:00, Palazzo E
      – Hadoop Design Patterns; Don Miner; Mon 2:30 - 3:30, Palazzo F; Wed 8:30 - 9:30, Delfino 4005