Metrics Weightage Sub-Metrics Criteria Sub-Weightage CDH HW/HDP MapR Pivotal HD
Scalability /Fault tolerance Yes Yes Yes Yes
Multi-tenancy- Resource Pooling
1 - groups and resource pooling without YARN
2 - groups and resource pooling and YARN
3 - groups and resource pooling and YARN (significant
contributor) / groups and resource pooling and YARN
+other prop 3% 2 3 1 3
Open source Hadoop based on
products introduced
1 - 1 product introduced
2 - 2-3 products introduced
3 - >=4 products introduced 7% 2 3 0 3
Closed source products built or Closed
source products made open including
portability
1 - 1 product
2 - 2-3 products
3 - >=4 products 2% 3 2 3 3
Cloud based products introduced
1- 1-2 products introduced
2 - >2 products introduced
3 - hadoop integration products+other prop products
introduced 1% 1 1 2 3
No. of committer seats including PMC
1 - 0-25 committers,
2 - >25 and <=50 committers,
3 - >50 and 25+PMC committers 3% 3 3 1 2
Support and training provided
1 - OK
2 - good support and training
3 - Excellent support and training 3% 3 2 3 2
Revisions after release
0 - Multiple even after GA
2- Makes the product available only after suitable testing 3% 0 2 2 2
SQL Focus : Open source /Closed
source
0 - Closed source
2 - Open source 3% 0 2 2 2
Weighted subtotal: 0.45 0.56 0.29 0.57
Data management -
data lifecycle management, data
replication between HDFS and Hive,
governance, lineage, traceability and
data discovery, process coordination
and scheduling, leveraging existing
products like Oozie and Zookeeper
100% open source framework.
Allow other plug ins.
workflow orchestration /automation (using Oozie
underneath).
Dataset replication.
Dataset retention.
Hive /Hcat integration.
Dashboard /entity viewing.
Integration with system management tool. 2% 2 2 0 1
Data Ingestion - Tools offered etc
1 - Sqoop, Sqoop2 and Flume
2 - Additional 2% 2 2 2 1
Data storage - own, with other systems
1 - HDFS
2 - HDFS and others/prop 2% 1 1 2 2
Realtime Data or OLTP - using Storm,
Spark, or Gemfire, SQLfire
1 - Not sure
2 -Spark or Storm or Prop 2% 2 2 1 2
Streaming Data like Spark Streaming,
Storm
1 - Not sure
2- Spark
3 - Spark+storm 1% 2 3 1 3
Workload Management via Oozie,
Hawq or other tools
1 - Only oozie or only HAWQ
2 - Oozie+integration 3% 1 2 1 1
Data Frameworks working together and
contribution eg: Datastax, Databricks,
MS REEF
1 - very few or through few partnerships
2 - Multiple 1% 2 2 1 1
Data Analytics like Acunu, Rev R,
0 - only tieups
2 - tieups+prop 3% 0 0 0 2
Search - Integration with Search Tool
etc
1 - Prop or external
2 - Prop+external 3% 2 1 1 1
Batch Data Processing-MapReduce and
YARN
1 - Own MR
2 - Only MR+YARN
3 - MapReduce innovation and YARN+Tez or MR
innovation+YARN 5% 3 3 1 3
Multi-cluster management using prop
tools built
1 - good
2 - better
3 - best 2% 3 1 1 2
Monitoring and Managing cluster - like
Cloudera manager, Ambari, Command
Center
1 - Closed source /proprietary
2 - Open sourced
3 - Open sourced and better monitoring product / Closed
source and better monitoring 7% 3 3 2 2
Backup and Recovery/ DR: Availability
and replication
1 - Restart required
2 - Autorecovery of nodes or XDR
3 - Autorecovery and XDR 5% 2 2 3 2
CBO on SQL product (cost based
optimizer)
0 - No or not in current version
1 - Yes 2% 0 1 0 1
Security: Data security - Internal
1 - Not sure or None
2 - Good or prop
3 - Better 3% 3 3 2 1
Security: Access/Authentication
Security
External Security:
0 - Not sure and only Kerberos, LDAP, AD
1 - Tie-ups with vendors = Kerberos, LDAP, AD 3% 1 1 1 1
Security: System management
1- Good and prop
2- Better and prop 1% 2 1 1 2
Security: Data governance and audit
1 - Not sure
2 - Good and prop
3 - Better and prop 3% 2 3 2 1
Weighted subtotal: 0.99 1.00 0.70 0.84
NoSQL vendors like Cassandra, Redis
1 - <3 or not sure
2 - Prop
3 - >=3 2% 3 3 1 3
Document DBs like MongoDB,
CouchBase
1 - <3 or not sure
2 - few
3 - >=3 2% 3 3 1 3
Graph DBs like GraphX, InfiniDB,
Giraph
1 - <3 or not sure
2 - Prop
3 - >=3 1% 3 3 1 2
In-memory DBs like GridGain, HANA
1 - not sure
2 - no specific integration
3 - prop and specific integration 4% 2 2 1 3
MPP Databases like Greenplum,
Vertica, Netezza
1 - not sure
2 - integrates with others
3 - Prop 5% 2 2 1 3
Analytics Databases like Marklogic
1 - <3 or not sure
2 - Prop
3 - >=3 3% 1 1 1 2
Messaging tech. like Kafka, Trident,
Kinesis, Spark Streaming. BI tools like
Cognos, BusinessObjects. ETL tools
like Syncsort, Talend. Data
visualization, dashboard and reporting
tech like Tableau, Datameer, Ayasdi.
Analytical products/libraries like R,
SAS, Weka. Data security like
Protegrity, Dataguise, Vormetric.
Configuration management like Chef,
Puppet (for cluster and XDR replication)
etc. Search tools like Solr, ElasticSearch.
RDBMS and other integration like Oracle,
DB2, etc. List of connectors, drivers, APIs.
1 - integrates with fewer technologies
2 - prop and integrates with few other technologies where
prop option is not there
3 - integrates with most better known technologies 8% 3 3 1 3
Weighted subtotal: 0.60 0.60 0.25 0.71
Cost and Licensing Policy +
Relationship we have
Not included, to remove bias on price/relationship, so all
are 0. 0% 0 0 0 0
TOTAL 100% 100% 2.04 2.16 1.24 2.04
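The group figures in the scorecard (for example the row reading 0.60 0.60 0.25 0.71) appear to be the sum of sub-weightage times score for each vendor. The source does not state this formula, so the sketch below is an inference, shown on the integration rows (NoSQL through connectors). The row labels and the function name are shorthand introduced here for illustration; the weights and scores are copied from the table.

```python
# Sub-weightage (percent) and scores copied from the integration rows of the
# scorecard. Labels are shorthand, not the author's wording.
VENDORS = ("CDH", "HW/HDP", "MapR", "Pivotal HD")

ROWS = [
    # (label, weight_percent, scores in VENDORS order)
    ("NoSQL", 2, (3, 3, 1, 3)),
    ("Document DBs", 2, (3, 3, 1, 3)),
    ("Graph DBs", 1, (3, 3, 1, 2)),
    ("In-memory DBs", 4, (2, 2, 1, 3)),
    ("MPP databases", 5, (2, 2, 1, 3)),
    ("Analytics databases", 3, (1, 1, 1, 2)),
    ("Other integrations", 8, (3, 3, 1, 3)),
]


def weighted_subtotal(rows):
    """Return {vendor: sum(weight * score)}, treating weights as percentages.

    Assumed formula: the scorecard itself does not spell it out.
    """
    totals = [0] * len(VENDORS)
    for _label, weight_percent, scores in rows:
        for i, score in enumerate(scores):
            # Accumulate in integer percent points to avoid float drift.
            totals[i] += weight_percent * score
    return {vendor: total / 100 for vendor, total in zip(VENDORS, totals)}


subtotals = weighted_subtotal(ROWS)
print(subtotals)
```

Under this assumption the integration group and the 50% framework group both reproduce the printed figures for all four vendors. The first group reproduces only the CDH figure (0.45), so treat the formula as a reading of the table rather than the author's stated method.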
Industry Speak / Industry Norm
Our take: No one size fits all.
HADOOP framework, feature set comparison and Performance
Architectural philosophy /open source /proprietary: 25%
The industry norm is to run two implementations, e.g. Cloudera and Hortonworks, Hortonworks and Pivotal, or Cloudera and Pivotal, depending on requirements. This also reduces dependency on any one vendor and avoids being tied to one set of technologies.
Since we are looking at the entire stack/suite of products, note that Pivotal has a full product suite in its data lake: Pivotal CommandCenter, Cloud Foundry, GPDB, HAWQ, MADlib, SQLFire, GemFire, GemFire XD, Spring support, and HAMSTER. Pivotal adheres to open-source Hadoop and has added CommandCenter and features around the Hadoop ecosystem. It did not have Hadoop committers before but has recently hired numerous professionals in this area. Cloudera is becoming more and more closed source, as it introduced EDH and Impala. Hortonworks believes in the open-source philosophy, which is great. From speaking with Cloudera and Hortonworks executives, the open question is the vision, roadmap, and go-forward strategy: can they move beyond building a wrapper around Hadoop? Cloudera and Hortonworks do not currently have the deep pockets or capability to go beyond Hadoop. MapR offers tremendous advantages since it bypasses MapReduce and hits the proprietary MapR engine (auto-node feature), but new features take one or two months to be incorporated since it is closed and proprietary. Also, supporting legacy versions can be a challenge with Cloudera and MapR where customization is done.
Hadoop framework, feature set comparison and Performance and Management: 50%
Integration with other technologies or prop technologies provided and connectors, Partnership /Vendor strategic relationship: 25%

Hadoop comparative scorecard nick kabra sr mgmt 04042014 and stack integration demo
