Greenplum Analytics Workbench


                                               APURVA DESAI




© Copyright 2012 EMC Corporation. All rights reserved.            1
Overview




What is Hadoop?
 What is Hadoop?
        –    Distributed computing paradigm
        –    File system – HDFS
        –    Processing framework – MapReduce
        –    Languages – Pig, Hive
        –    Key-value store – HBase
 Why is it important?
        – Big Data is everywhere
        – Big Data is mostly unstructured
        – Need affordable, scalable NoSQL processing
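
The MapReduce processing model named above can be illustrated with a minimal single-process sketch. This is not Hadoop code: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen only for illustration, and a real cluster runs these phases in parallel across many hosts over data stored in HDFS.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all values that share a key (done by the framework in Hadoop)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases in one process, purely for illustration."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    print(word_count(["big data is everywhere", "big data is mostly unstructured"]))
```

The shuffle step in the middle is the part that moves data between hosts on a real cluster, which is why it gets its own slide later in the deck.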


Analytics Workbench - Motivation
 Open source
        – Hadoop industry is nascent
        – BIG Data development needs scale


 Greenplum
        – Innovation & Experimentation platform
        – Contribute to the community
        – GPDB & GPHD - Mixed mode environment




Greenplum Vision




Buildout Pre-requisites
 Hardware systems integration


 Hadoop experience


 Program Management


 Partner ecosystem

          Greenplum has in-house expertise

Team Introduction
 System Integration
        – Greg, Eric, Don, Dave, Patrick

 Program Management
        – Mike, Joe

 Hadoop
        – Apurva, Judes, Clinton, Chandra, Ashwin




Partners
 Intel
        – 2,000 Westmere CPUs

 Mellanox
        – 1,000+ NICs
        – 72 IB switches

 Micron
        – 6,000 8GB DRAM modules

 Seagate
        – 12,000 2TB drives

 Supermicro
        – 1,000 chassis/motherboards


Partners
 Switch
        – Hosting facilities

 VMware
        – Operational support
        – Rubicon




Peek @ the Cluster




Cluster Statistics
 Largest cluster for Apache Hadoop validation!

 Number of physical hosts: > 1,000 (> 10,000 with VMs)
 Number of racks: 54 (50 just for the DataNodes)
 Number of processors: > 24,000
 Amount of RAM: > 48TB
 Amount of disk capacity: > 24PB
        – “Equivalent to nearly half of the entire written works of
          mankind from the beginning of recorded history”
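
The headline figures can be cross-checked against the component counts on the Partners slide. The sketch below is only arithmetic on numbers quoted in this deck. It assumes decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB) and treats "> 1,000 hosts" as exactly 1,000 with parts spread evenly across them; the slides state neither assumption, so the per-host values are rough averages rather than a hardware specification.

```python
# Figures quoted on the Partners and Cluster Statistics slides.
HOSTS = 1_000                   # "> 1,000" physical hosts, taken as a lower bound
PROCESSORS = 24_000             # "> 24,000" processors
DIMMS, DIMM_GB = 6_000, 8       # Micron: 6,000 8GB DRAM modules
DRIVES, DRIVE_TB = 12_000, 2    # Seagate: 12,000 2TB drives


def cluster_totals():
    """Derive cluster totals and per-host averages from the component counts."""
    ram_gb = DIMMS * DIMM_GB
    disk_tb = DRIVES * DRIVE_TB
    return {
        "ram_tb": ram_gb / 1_000,            # assumes 1 TB = 1,000 GB
        "disk_pb": disk_tb / 1_000,          # assumes 1 PB = 1,000 TB
        "processors_per_host": PROCESSORS / HOSTS,
        "ram_gb_per_host": ram_gb / HOSTS,
        "drives_per_host": DRIVES / HOSTS,
    }


if __name__ == "__main__":
    for name, value in cluster_totals().items():
        print(f"{name}: {value:g}")
```

Under these assumptions the component counts reproduce the 48TB and 24PB totals, and work out to about 24 processors, 48GB of RAM and 12 drives per host.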



Namenode




Job Tracker




CPU




Use Cases




Hadoop Review




Hadoop Shuffle




Initial Use Cases
 Apache Hadoop Validation
 Mellanox UDA
 Terasort Benchmark




Apache Hadoop Validation
 Purpose
        – Run Apache Hadoop Validation at Scale
        – Validate cluster configuration


 Various Configurations Validated
        – Standard out-of-the-box configs
        – Configs modified for IO-intensive processing




Apache Hadoop Preliminary Results
[Chart: Apache Hadoop-1.0.0 validation. Y-axis: execution time (min), 0 to 1.2. Series: 1000 nodes.]

Apache Hadoop Findings
 Apache BigTop for integration tests
 Functional validation passed as expected


 Next Steps
        – Identify integration cases
        – Contribute back to BigTop
        – Stabilize Hadoop 0.23




Mellanox UDA - Overview
 RDMA in Hadoop shuffle stage
 Register map & reduce task buffers
 Hadoop JobTracker (JT) for task completion
 Copy sorted map task output to reduce input
 Perform in-memory merge at reduce
 Avoid disk spills for large inputs
 Reduce CPU load for sort & merge
 Greenplum + Mellanox collaboration
        – Open-sourcing UDA
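
The in-memory merge step at the reducer can be sketched as a k-way merge of map outputs that are already sorted. This is an illustration of the merge idea only, not Mellanox UDA code: it does not model RDMA or buffer registration, and the function names (merge_map_outputs, reduce_sum) are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_map_outputs(map_outputs):
    """Lazily merge sorted (key, value) runs, one run per map task, into one sorted stream."""
    return heapq.merge(*map_outputs, key=itemgetter(0))


def reduce_sum(merged_pairs):
    """Sum the values of consecutive equal keys in the merged stream."""
    for key, group in groupby(merged_pairs, key=itemgetter(0)):
        yield key, sum(value for _, value in group)


if __name__ == "__main__":
    # Three map tasks, each having already sorted its own output by key.
    map_outputs = [
        [("a", 1), ("c", 2)],
        [("a", 3), ("b", 4)],
        [("b", 1), ("c", 1)],
    ]
    print(list(reduce_sum(merge_map_outputs(map_outputs))))
```

Because each run arrives sorted, the merge only needs to hold the head of each run at a time, which is what makes it possible to avoid spilling intermediate merge passes to disk.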




Mellanox UDA Preliminary Results
 Preliminary UDA results provided by Mellanox
 Show improvement with UDA vs Vanilla Hadoop.
 Better CPU utilization
 Reduced execution time


 Next Steps
        – Run on the Analytics Workbench, scheduled for June 2012
        – Add a configuration option on the workbench to turn UDA on/off




TeraSort Benchmark
 Industry standard benchmark
 Good validation of configuration
 3 Steps
        – TeraGen – Generate 1TB of data
        – TeraSort – Sort generated data
        – TeraValidate – Validate the sort
 Measure time for each step
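
The three steps and their per-step timing can be mimicked at toy scale in a single process. This is a sketch of the workflow, not the Hadoop benchmark itself. The 100-byte row with a 10-byte key follows the usual TeraSort record layout, which the slides do not state; the rows here are plain random bytes rather than real TeraGen output, and the lowercase function names are hypothetical stand-ins for the Hadoop jobs.

```python
import os
import time

KEY_BYTES, ROW_BYTES = 10, 100   # assumed TeraSort-style layout: 10-byte key in a 100-byte row


def teragen(rows):
    """Generate `rows` random fixed-size records (stand-in for the TeraGen job)."""
    return [os.urandom(ROW_BYTES) for _ in range(rows)]


def terasort(records):
    """Sort records by their leading key bytes (stand-in for the TeraSort job)."""
    return sorted(records, key=lambda record: record[:KEY_BYTES])


def teravalidate(records):
    """Check that keys are in non-decreasing order (stand-in for the TeraValidate job)."""
    return all(a[:KEY_BYTES] <= b[:KEY_BYTES] for a, b in zip(records, records[1:]))


def timed(step, *args):
    """Run one step and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = step(*args)
    return result, time.perf_counter() - start


def run_benchmark(rows=10_000):
    """Run generate, sort and validate in order, timing each step separately."""
    data, t_gen = timed(teragen, rows)
    ordered, t_sort = timed(terasort, data)
    ok, t_val = timed(teravalidate, ordered)
    return ok, {"teragen": t_gen, "terasort": t_sort, "teravalidate": t_val}


if __name__ == "__main__":
    ok, timings = run_benchmark()
    print("sorted correctly:", ok)
    for step, seconds in timings.items():
        print(f"{step}: {seconds:.4f} s")
```

Timing each step separately, as the slide suggests, shows whether generation (write-heavy) or sorting (shuffle-heavy) dominates on a given configuration.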




TeraSort Benchmark Preliminary Results
[Chart: Apache Hadoop-1.0.0 validation - TeraSort. Y-axis: execution time in sec, 0 to 9. X-axis: number of TB generated and sorted (1 TB, 10 TB). Series: TeraGen, TeraSort.]




TeraSort Benchmark Findings
 Minimal tuning of configuration
 Results are within expected range.
 Next Steps
        – Tune the cluster for optimal performance
        – Use the benchmark for every new release




Lessons Learnt




Buildout Progress
[Chart: number of nodes (0 to 1,200) by month, Dec '11 to April '12. Series: racked, ready.]




“Real” Hadoop Cluster




Categories
 Racking & Stacking
 Networking
 Non-Hadoop Hosts
 Base OS Setup
 Hadoop Deployment
 Post Deployment
 Process




In Closing




Upcoming work
 Workbench Tasks
        –    Load various data sets
        –    Load GPDB, Hive, HBase, ZooKeeper, etc.
        –    Load Chorus, Command Center, UAP stack
        –    VM provisioning
        –    Various audits
 On-boarding candidates
        –    HD Education
        –    Apache Hadoop Build & Validate
        –    Mellanox UDA
        –    Intel HiBench
        –    Big data benchmarking
        –    High-resolution image processing, etc.



A day in the life @ Switch




Q&A




Other Relevant Greenplum Sessions
Session                                                  Presenter          Times
Unified Analytics Platform Introduction                  Brian Wilson       Tues 10:00-11:00   Thurs 1:00-2:00
Greenplum Database Overview                              Michael Crutcher   Mon 8:30-9:30      Wed 10:00-11:00
Greenplum Hadoop Overview                                Susheel Kaushik    Mon 10:00-11:00    Wed 4:15-5:15
Greenplum DCA Overview                                   Hanxi Chen         Mon 4:00-5:00      Thurs 10:00-11:00
Greenplum Analytics Workbench                            Apurva Desai       Wed 8:30-9:30      Thurs 10:00-11:00
Analytics on Hadoop                                      Don Miner          Tues 11:30-12:30   Thurs 8:30-9:30
Optimizing Greenplum Database on VMware                  Kevin O’Leary      Mon 4:00-5:00      Tues 4:15-5:15
  Virtualized Infrastructure
Big Data Driven Businesses in Action:                    Mike Maxey         Wed 4:15-5:15      Thurs 11:30-12:30
  Creating Real Business Value Using
  Greenplum UAP (Panel w/4 Customers)
Analytics for Business Value: Collaboration              Josh Klahr         Mon 10:00-11:00    Wed 2:45-3:45
Disruptive Data Science: How Data                        Annika Jimenez,    Tues 4:15-5:15     Thurs 11:30-12:30
  Science and Big Data are Transforming                  David Dietrich
  Business, IT and People




Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?

More Related Content

What's hot

SAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego CloudSAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego Cloudaidanshribman
 
Collaborate07kmohiuddin
Collaborate07kmohiuddinCollaborate07kmohiuddin
Collaborate07kmohiuddin
Sal Marcus
 
Avamar 7 2010
Avamar 7 2010Avamar 7 2010
Avamar 7 2010
Phani Kumar
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends
InnoTech
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
Altoros
 
Ugif 04 2011 storage prov-pot_march_2011
Ugif 04 2011   storage prov-pot_march_2011Ugif 04 2011   storage prov-pot_march_2011
Ugif 04 2011 storage prov-pot_march_2011UGIF
 
30a accessing your cluster
30a accessing your cluster30a accessing your cluster
30a accessing your clustermapr-academy
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Impetus Technologies
 
EMC Deduplication Fundamentals
EMC Deduplication FundamentalsEMC Deduplication Fundamentals
EMC Deduplication Fundamentals
emcbaltics
 
Avamar Run Book - 5-14-2015_v3
Avamar Run Book - 5-14-2015_v3Avamar Run Book - 5-14-2015_v3
Avamar Run Book - 5-14-2015_v3Bill Oliver
 
Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performanceDataWorks Summit
 
Presentation deduplication backup software and system
Presentation   deduplication backup software and systemPresentation   deduplication backup software and system
Presentation deduplication backup software and system
xKinAnx
 
Debugging and Configuration Best Practices for Oracle Linux
Debugging and Configuration Best Practices for Oracle LinuxDebugging and Configuration Best Practices for Oracle Linux
Debugging and Configuration Best Practices for Oracle LinuxTerry Wang
 
Database performance with Dell PowerEdge PCIe Express Flash SSDs
Database performance with Dell PowerEdge PCIe Express Flash SSDsDatabase performance with Dell PowerEdge PCIe Express Flash SSDs
Database performance with Dell PowerEdge PCIe Express Flash SSDs
Principled Technologies
 
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
HSA Foundation
 
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.” AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
HSA Foundation
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and Tuning
Adrian Cockcroft
 

What's hot (20)

SAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego CloudSAP Virtualization Week 2012 - The Lego Cloud
SAP Virtualization Week 2012 - The Lego Cloud
 
Collaborate07kmohiuddin
Collaborate07kmohiuddinCollaborate07kmohiuddin
Collaborate07kmohiuddin
 
Avamar 7 2010
Avamar 7 2010Avamar 7 2010
Avamar 7 2010
 
Top Technology Trends
Top Technology Trends Top Technology Trends
Top Technology Trends
 
How to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop ClusterHow to Increase Performance of Your Hadoop Cluster
How to Increase Performance of Your Hadoop Cluster
 
Ugif 04 2011 storage prov-pot_march_2011
Ugif 04 2011   storage prov-pot_march_2011Ugif 04 2011   storage prov-pot_march_2011
Ugif 04 2011 storage prov-pot_march_2011
 
30a accessing your cluster
30a accessing your cluster30a accessing your cluster
30a accessing your cluster
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
 
EMC Deduplication Fundamentals
EMC Deduplication FundamentalsEMC Deduplication Fundamentals
EMC Deduplication Fundamentals
 
Avamar Run Book - 5-14-2015_v3
Avamar Run Book - 5-14-2015_v3Avamar Run Book - 5-14-2015_v3
Avamar Run Book - 5-14-2015_v3
 
B17 Eliminating the database bottleneck
B17 Eliminating the database bottleneckB17 Eliminating the database bottleneck
B17 Eliminating the database bottleneck
 
Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performance
 
Presentation deduplication backup software and system
Presentation   deduplication backup software and systemPresentation   deduplication backup software and system
Presentation deduplication backup software and system
 
Debugging and Configuration Best Practices for Oracle Linux
Debugging and Configuration Best Practices for Oracle LinuxDebugging and Configuration Best Practices for Oracle Linux
Debugging and Configuration Best Practices for Oracle Linux
 
Database performance with Dell PowerEdge PCIe Express Flash SSDs
Database performance with Dell PowerEdge PCIe Express Flash SSDsDatabase performance with Dell PowerEdge PCIe Express Flash SSDs
Database performance with Dell PowerEdge PCIe Express Flash SSDs
 
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
AFDS 2012 Phil Rogers Keynote: THE PROGRAMMER’S GUIDE TO A UNIVERSE OF POSSIB...
 
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.” AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
AFDS 2011 Phil Rogers Keynote: “The Programmer’s Guide to the APU Galaxy.”
 
Commercial track 1_The Power of UDP
Commercial track 1_The Power of UDPCommercial track 1_The Power of UDP
Commercial track 1_The Power of UDP
 
50a volumes
50a volumes50a volumes
50a volumes
 
Solaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and TuningSolaris Linux Performance, Tools and Tuning
Solaris Linux Performance, Tools and Tuning
 

Viewers also liked

White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
 White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian... White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
EMC
 
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
EMC
 
Greenplum Database Overview
Greenplum Database Overview Greenplum Database Overview
Greenplum Database Overview
EMC
 
Actian Vector Whitepaper
 Actian Vector Whitepaper Actian Vector Whitepaper
Actian Vector Whitepaper
Edgar Alejandro Villegas
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
Alessandro Salvatico
 
Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi
Sachin Aggarwal
 
Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientists
datamantra
 
Jump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIJump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIActian Corporation
 
Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview
Actian Corporation
 
Turning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business ValueTurning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business Value
Actian Corporation
 
MySQL Workbench for DFW Unix Users Group
MySQL Workbench for DFW Unix Users GroupMySQL Workbench for DFW Unix Users Group
MySQL Workbench for DFW Unix Users Group
Dave Stokes
 
Iig excel 2010_exercise_vn
Iig excel 2010_exercise_vnIig excel 2010_exercise_vn
Iig excel 2010_exercise_vn
Chi Lê Yến
 
Workbench "Always on the Job!"© software-as-a-service for social collaboration
Workbench "Always on the Job!"© software-as-a-service for social collaborationWorkbench "Always on the Job!"© software-as-a-service for social collaboration
Workbench "Always on the Job!"© software-as-a-service for social collaboration
tom termini
 
Lap+trinh+vba
Lap+trinh+vbaLap+trinh+vba
Lap+trinh+vba
xitrumball
 
greenplum installation guide - 4 node VM
greenplum installation guide - 4 node VM greenplum installation guide - 4 node VM
greenplum installation guide - 4 node VM
seungdon Choi
 
Vba cho ppt
Vba cho pptVba cho ppt
Vba cho ppt
xongdzomuong
 
Bài giảng ACCESS - VBA
Bài giảng ACCESS - VBABài giảng ACCESS - VBA
Bài giảng ACCESS - VBA
hg4ever
 
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
datamantra
 
MySQL Administration and Monitoring
MySQL Administration and MonitoringMySQL Administration and Monitoring
MySQL Administration and MonitoringMark Leith
 

Viewers also liked (20)

White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
 White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian... White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
White Paper: Backup and Recovery of the EMC Greenplum Data Computing Applian...
 
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
White Paper: Monitoring EMC Greenplum DCA with Nagios - EMC Greenplum Data Co...
 
Greenplum Database Overview
Greenplum Database Overview Greenplum Database Overview
Greenplum Database Overview
 
Actian Vector Whitepaper
 Actian Vector Whitepaper Actian Vector Whitepaper
Actian Vector Whitepaper
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
 
Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi Data Science with Spark by Saeed Aghabozorgi
Data Science with Spark by Saeed Aghabozorgi
 
Platform for Data Scientists
Platform for Data ScientistsPlatform for Data Scientists
Platform for Data Scientists
 
Jump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROIJump start your analytics investments and accelerate analytics ROI
Jump start your analytics investments and accelerate analytics ROI
 
Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview Analytics at the Speed of Thought: Actian Express Overview
Analytics at the Speed of Thought: Actian Express Overview
 
Turning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business ValueTurning Your Data Lake into Measurable Business Value
Turning Your Data Lake into Measurable Business Value
 
1. Ms Excel Ung Dung Trong Kinh Te (Phan I)
1. Ms Excel Ung Dung Trong Kinh Te (Phan I)1. Ms Excel Ung Dung Trong Kinh Te (Phan I)
1. Ms Excel Ung Dung Trong Kinh Te (Phan I)
 
MySQL Workbench for DFW Unix Users Group
MySQL Workbench for DFW Unix Users GroupMySQL Workbench for DFW Unix Users Group
MySQL Workbench for DFW Unix Users Group
 
Iig excel 2010_exercise_vn
Iig excel 2010_exercise_vnIig excel 2010_exercise_vn
Iig excel 2010_exercise_vn
 
Workbench "Always on the Job!"© software-as-a-service for social collaboration
Workbench "Always on the Job!"© software-as-a-service for social collaborationWorkbench "Always on the Job!"© software-as-a-service for social collaboration
Workbench "Always on the Job!"© software-as-a-service for social collaboration
 
Lap+trinh+vba
Lap+trinh+vbaLap+trinh+vba
Lap+trinh+vba
 
greenplum installation guide - 4 node VM
greenplum installation guide - 4 node VM greenplum installation guide - 4 node VM
greenplum installation guide - 4 node VM
 
Vba cho ppt
Vba cho pptVba cho ppt
Vba cho ppt
 
Bài giảng ACCESS - VBA
Bài giảng ACCESS - VBABài giảng ACCESS - VBA
Bài giảng ACCESS - VBA
 
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
 
MySQL Administration and Monitoring
MySQL Administration and MonitoringMySQL Administration and Monitoring
MySQL Administration and Monitoring
 

Similar to Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?

Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
EMC
 
Operate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmineOperate your hadoop cluster like a high eff goldmine
Operate your hadoop cluster like a high eff goldmineDataWorks Summit
 
App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)App cap2956v2-121001194956-phpapp01 (1)
App cap2956v2-121001194956-phpapp01 (1)
outstanding59
 
Inside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldInside the Hadoop Machine @ VMworld
Inside the Hadoop Machine @ VMworldRichard McDougall
 
App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)App Cap2956v2 121001194956 Phpapp01 (1)
App Cap2956v2 121001194956 Phpapp01 (1)
outstanding59
 
Hadoop Overview
Hadoop Overview Hadoop Overview
Hadoop Overview
EMC
 
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
November 2014 HUG: Lessons from Hadoop 2+Java8 migration at LinkedIn
Yahoo Developer Network
 
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Information Retrieval, Applied Statistics and Mathematics onBigData - German ...
Information Retrieval, Applied Statistics and Mathematics onBigData - German ...Romeo Kienzler
 
Data Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big DataData Science Day New York: The Platform for Big Data
Data Science Day New York: The Platform for Big Data
Cloudera, Inc.
 
Lego Cloud SAP Virtualization Week 2012
Lego Cloud SAP Virtualization Week 2012Lego Cloud SAP Virtualization Week 2012
Lego Cloud SAP Virtualization Week 2012
Benoit Hudzia
 
A27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practicesA27 Vectorwise Performance Considerations_implementation_best_practices
A27 Vectorwise Performance Considerations_implementation_best_practicesInsight Technology, Inc.
 
Transform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC TechnologiesTransform Your SAP Landscape Using EMC Technologies
Transform Your SAP Landscape Using EMC Technologies
Cenk Ersoy
 
Greenplum Database on HDFS
Greenplum Database on HDFSGreenplum Database on HDFS
Greenplum Database on HDFSDataWorks Summit
 
Virtual Hadoop Introduction In Chinese
Virtual Hadoop Introduction In ChineseVirtual Hadoop Introduction In Chinese
Virtual Hadoop Introduction In Chinese
天青 王
 
In-Place analytics with Unified Data Access
In-Place analytics with Unified Data AccessIn-Place analytics with Unified Data Access
In-Place analytics with Unified Data Access
DataWorks Summit
 
An Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive ApplicationsAn Active and Hybrid Storage System for Data-intensive Applications
An Active and Hybrid Storage System for Data-intensive Applications
Xiao Qin
 
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters Boosting Hadoop Performance with  Emulex OneConnect® 10Gb Ethernet Adapters
Boosting Hadoop Performance with Emulex OneConnect® 10Gb Ethernet Adapters
Emulex Corporation
 
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Track B-3 解構大數據架構 - 大數據系統的伺服器與網路資源規劃
Etu Solution
 

Similar to Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You? (20)

HugNov14
HugNov14HugNov14
HugNov14
 
Hadoop 101
Hadoop 101Hadoop 101
Hadoop 101
 
Operate your hadoop cluster like a high eff goldmine
Greenplum Analytics Workbench - What Can a Private Hadoop Cloud Do For You?

  • 1. Greenplum Analytics Workbench APURVA DESAI © Copyright 2012 EMC Corporation. All rights reserved. 1
  • 2. Overview
  • 3. What is Hadoop?
      • What is Hadoop?
          – Distributed computing paradigm
          – File system – HDFS
          – Processing framework – MapReduce
          – Languages – Pig, Hive
          – Key-value store – HBase
      • Why is it important?
          – BIG Data is everywhere
          – BIG Data is mostly unstructured
          – Need affordable, scalable NoSQL processing
  • 4. Analytics Workbench - Motivation
      • Open source
          – Hadoop industry is nascent
          – BIG Data development needs scale
      • Greenplum
          – Innovation & experimentation platform
          – Contribute to the community
          – GPDB & GPHD - mixed mode environment
  • 5. Greenplum Vision
  • 6. Buildout Pre-requisites
      • Hardware systems integration
      • Hadoop experience
      • Program management
      • Partner ecosystem
      Greenplum has in-house expertise
  • 7. Team Introduction
      • System Integration
          – Greg, Eric, Don, Dave, Patrick
      • Program Management
          – Mike, Joe
      • Hadoop
          – Apurva, Judes, Clinton, Chandra, Ashwin
  • 8. Partners
      • Intel
          – 2,000 Westmere CPUs
      • Mellanox
          – 1,000+ NICs
          – 72 IB switches
      • Micron
          – 6,000 8GB DRAM
      • Seagate
          – 12,000 2TB drives
      • Supermicro
          – 1,000 chassis/MB
  • 9. Partners
      • Switch
          – Hosting facilities
      • VMware
          – Operational support
          – Rubicon
  • 10. Peek @ the Cluster
  • 11. Cluster Statistics
      Largest cluster for Apache Hadoop validation!
      • # of physical hosts: > 1,000 (> 10,000 with VMs)
      • # of racks: 54 (50 just for the DataNodes)
      • # of processors: > 24,000
      • Amount of RAM: > 48TB
      • Amount of disk capacity: > 24PB
          – “Equivalent to nearly half of the entire written works of mankind from the beginning of recorded history”
  • 12. Namenode
  • 13. Job Tracker
  • 14. CPU
  • 15. Use Cases
  • 16. Hadoop Review
  • 17. Hadoop Shuffle
  • 18. Initial Use Cases
      • Apache Hadoop Validation
      • Mellanox UDA
      • Terasort Benchmark
  • 19. Apache Hadoop Validation
      • Purpose
          – Run Apache Hadoop validation at scale
          – Validate cluster configuration
      • Various configurations validated
          – Standard out-of-the-box configs
          – Configs modified for IO-intensive processing
  • 20. Apache Hadoop Preliminary Results
      [Chart: Apache Hadoop-1.0.0 validation; y-axis: Execution Time (Min); series: 1000 Nodes]
  • 21. Apache Hadoop Findings
      • Apache BigTop for integration tests
      • Functional validation passed as expected
      • Next steps
          – Identify integration cases
          – Contribute back to BigTop
          – Stabilize Hadoop 0.23
  • 22. Mellanox UDA - Overview
      • RDMA in the Hadoop shuffle stage
      • Register map & reduce task buffers
      • Hadoop JobTracker (JT) for task completion
      • Copy sorted map task output to reduce input
      • Perform in-memory merge at the reducer
      • Avoid disk spills for large inputs
      • Reduce CPU load for sort & merge
      • GP + Mellanox collaboration
          – Open sourcing UDA
  • 23. Mellanox UDA Preliminary Results
      • Preliminary UDA results provided by Mellanox
      • Show improvement with UDA vs. vanilla Hadoop
      • Better CPU utilization
      • Reduced execution time
      • Next steps
          – Run on Analytics Workbench, scheduled for June 2012
          – Configuration on the workbench to turn it on/off
  • 24. TeraSort Benchmark
      • Industry standard benchmark
      • Good validation of configuration
      • 3 steps
          – Teragen – Generate 1TB of data
          – Terasort – Sort generated data
          – Teravalidate – Validate the sort
      • Measure time for each step
  • 25. TeraSort Benchmark Preliminary Results
      [Chart: Apache Hadoop-1.0.0 validation - TeraSort; y-axis: Execution Time in Sec; x-axis: # of TB Generated and Sorted (1 TB, 10 TB); series: TeraGen, TeraSort]
  • 26. TeraSort Benchmark Findings
      • Minimal tuning of configuration
      • Results are within expected range
      • Next steps
          – Tune the cluster for optimal performance
          – Use the benchmark for every new release
  • 27. Lessons Learnt
  • 28. Buildout Progress
      [Chart: nodes by month, Dec '11 to April '12; y-axis: Number of nodes (0 to 1200); series: racked, ready]
  • 29. “Real” Hadoop Cluster
  • 30. Categories
      • Racking & Stacking
      • Hadoop Deployment
      • Networking
      • Post deployment
      • Non Hadoop Hosts
      • Process
      • Base OS Setup
  • 31. In Closing
  • 32. Upcoming work
      • Workbench tasks
          – Load various data sets
          – Load GPDB, Hive, HBase, Zookeeper, etc.
          – Load Chorus, Command Center, UAP stack
          – VM provisioning
          – Various audits
      • On-boarding candidates
          – HD Education
          – Apache Hadoop Build & Validate
          – Mellanox UDA
          – Intel HiBench
          – Big data benchmarking
          – Hi resolution image processing, etc.
  • 33. A day in the life @ Switch
  • 34. Q&A
  • 35. Other Relevant Greenplum Sessions
      Session / Presenter / Times
      • Unified Analytics Platform Introduction / Brian Wilson / Tues 10:00-11:00, Thurs 1:00-2:00
      • Greenplum Database Overview / Michael Crutcher / Mon 8:30-9:30, Wed 10:00-11:00
      • Greenplum Hadoop Overview / Susheel Kaushik / Mon 10:00-11:00, Wed 4:15-5:15
      • Greenplum DCA Overview / Hanxi Chen / Mon 4:00-5:00, Thurs 10:00-11:00
      • Greenplum Analytics Workbench / Apurva Desai / Wed 8:30-9:30, Thurs 10:00-11:00
      • Analytics on Hadoop / Don Miner / Tues 11:30-12:30, Thurs 8:30-9:30
      • Optimizing Greenplum Database on VMware Virtualized Infrastructure / Kevin O’Leary / Mon 4:00-5:00, Tues 4:15-5:15
      • Big Data Driven Businesses in Action: Creating Real Business Value Using Greenplum UAP (Panel w/4 Customers) / Mike Maxey / Wed 4:15-5:15, Thurs 11:30-12:30
      • Analytics for Business Value: Collaboration / Josh Klahr / Mon 10:00-11:00, Wed 2:45-3:45
      • Disruptive Data Science: How Data Science and Big Data are Transforming Business, IT and People / Annika Jimenez, David Dietrich / Tues 4:15-5:15, Thurs 11:30-12:30
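
The headline numbers on the Cluster Statistics slide (slide 11) can be cross-checked against the partner component counts on slide 8. The sketch below is only a back-of-the-envelope check, not anything from the deck itself. It assumes decimal units (1 PB = 1,000 TB) and assumes 6 cores with 2 hardware threads per Westmere CPU, which the slides do not state.

```python
# Component counts taken from the Partners slide (slide 8).
DRIVES, DRIVE_TB = 12_000, 2      # Seagate: 12,000 2TB drives
DIMMS, DIMM_GB = 6_000, 8         # Micron: 6,000 8GB DRAM modules
CPUS = 2_000                      # Intel: 2,000 Westmere CPUs
HOSTS = 1_000                     # slide 11: "> 1,000" physical hosts

# Assumption (not on the slides): decimal units.
disk_pb = DRIVES * DRIVE_TB / 1000    # raw disk capacity in PB
ram_tb = DIMMS * DIMM_GB / 1000       # total RAM in TB

# Assumption (not on the slides): 6 cores x 2 hardware threads per CPU.
CORES_PER_CPU, THREADS_PER_CORE = 6, 2
logical_processors = CPUS * CORES_PER_CPU * THREADS_PER_CORE

# Per-host averages, assuming the parts are spread evenly over 1,000 hosts.
drives_per_host = DRIVES // HOSTS
ram_gb_per_host = DIMMS * DIMM_GB // HOSTS

print(f"disk: {disk_pb} PB, RAM: {ram_tb} TB, "
      f"logical processors: {logical_processors}")
print(f"per host: {drives_per_host} drives, {ram_gb_per_host} GB RAM")
```

Under those assumptions the totals land on 24 PB, 48 TB and 24,000 processors, which is in line with the figures on slide 11; the ">" on the slide suggests the real inventory was slightly larger than the round partner counts.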
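
Slide 22 mentions an in-memory merge on the reduce side of the shuffle. As a rough illustration of that one idea, the sketch below merges several already-sorted map outputs into a single sorted stream without touching disk. It is not UDA code and involves no RDMA; the function name `merge_sorted_runs` and the sample data are hypothetical.

```python
import heapq


def merge_sorted_runs(runs):
    """Merge already-sorted lists of (key, value) pairs into one sorted list.

    Each run stands in for the sorted output of one map task. heapq.merge
    performs a k-way merge and only keeps one head element per run in its
    heap, so the merge itself needs very little extra memory.
    """
    return list(heapq.merge(*runs, key=lambda kv: kv[0]))


# Hypothetical sorted outputs from three map tasks.
map_outputs = [
    [("apple", 1), ("cherry", 1)],
    [("banana", 1), ("cherry", 1)],
    [("apple", 1), ("date", 1)],
]

merged = merge_sorted_runs(map_outputs)
print(merged)
```

The reducer can then walk `merged` once and group adjacent equal keys, which is why keeping the inputs sorted through the shuffle matters.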
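
The three TeraSort steps on slide 24, each timed separately, could be driven by a small script along the following lines. This is a minimal sketch, not the team's actual harness: the jar name `hadoop-examples.jar` and the HDFS paths under `/benchmarks/terasort` are placeholders that depend on the installation. The one fixed detail is that TeraGen takes a row count rather than a byte count, with 100-byte rows, so 1TB corresponds to 10 billion rows.

```python
import subprocess
import time

ROW_BYTES = 100  # TeraGen writes fixed-size 100-byte rows


def terasort_commands(total_bytes, jar="hadoop-examples.jar",
                      base="/benchmarks/terasort"):
    """Build the teragen, terasort and teravalidate command lines.

    `jar` and `base` are placeholder values; adjust them for the cluster.
    """
    rows = total_bytes // ROW_BYTES
    gen_dir, sort_dir, report_dir = (f"{base}/input", f"{base}/output",
                                     f"{base}/report")
    return [
        ("teragen", ["hadoop", "jar", jar, "teragen", str(rows), gen_dir]),
        ("terasort", ["hadoop", "jar", jar, "terasort", gen_dir, sort_dir]),
        ("teravalidate", ["hadoop", "jar", jar, "teravalidate",
                          sort_dir, report_dir]),
    ]


def run_timed(commands, runner=subprocess.run):
    """Run each step in order and return wall-clock seconds per step."""
    timings = {}
    for name, argv in commands:
        start = time.monotonic()
        runner(argv, check=True)  # raises if a step exits non-zero
        timings[name] = time.monotonic() - start
    return timings


if __name__ == "__main__":
    # Print the commands for a 1TB run instead of launching them.
    for name, argv in terasort_commands(10**12):
        print(name, " ".join(argv))
```

Calling `run_timed(terasort_commands(10**12))` on a machine with the `hadoop` client on its path would execute the three steps and return the per-step timings that the slide says to measure.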