SlideShare a Scribd company logo
Altiscale
Big Data-as-a-Service
Paul Tibaldi RSD & Ajay Jha SA
• Market Background
• Who is Altiscale?
• Why are we different/better?
• Hadoop Admin
• Apache Hadoop Stack
• Platform/Access/Demo
• Q/A
2
Big Data As A Service
Market Background
4
Interest in Big Data is growing fast
5
Big Data in The Cloud is Accelerating
On-
Premises
32%
Cloud
Only
23%
Cloud
Plus On-
Premises
29%
Source: “Hadoop Expansion Boosts Cloud and Unsupported On-Premises Deployments,” Merv Adrian, Nick Huedecker, 3 September 2015
But the journey has dangers
Gartner:
70% of independent
Big Data implementations
will fail to meet revenue
and cost objectives,
through 2018.
Who is Altiscale?
Altiscale Data Cloud GA in 2014
Financed by top-tier technology investors
Recognized innovator in Hadoop-as-a-Service
About Altiscale
About Altiscale
Led by experienced, renowned Hadoop team from Yahoo!
• Raymie Stata, CEO. Former Yahoo! CTO,
well-known advocate of Apache Software Foundation
• David Chaiken, CTO. Former Yahoo! Chief Architect
Built and managed by veterans of Big Data, SaaS, and
enterprise software
• From Google, Netflix, LinkedIn, VMware, Oracle, and Yahoo!
40,000 nodes
500 PB
1,000 users
$ billions at stake
Raymie Stata, CEO David Chaiken, CTO Ricardo Jenez
VP of Engineering
Charles Wimmer
Head of Operations
Big data built for speed
Fast time to value—days not months
Easier, faster scalability—with elastic scaling
Operations support—so your jobs get done
Lower TCO—for fast investment payback
11
Unmatched Security
Altiscale is the only provider
that delivers integrated security
encompassing its Big Data platform offering
Complete best of breed
Big Data is complex.
It gets more complicated as you scale.
Big Data-as-a-Service
The Altiscale Data Cloud Core
Altiscale Data Cloud is 100% based on Apache open source.
Our current Altiscale Data Cloud 4.0 release is composed of the following Apache components and
versions:
• Apache Hadoop 2.7.1
• Apache Spark 1.5*
• Apache Hive (& HCatalog) 1.2
• Apache Tez 0.7.0
• Apache Pig 0.15.1
• Apache Oozie 4.2.0
• Apache Flume 1.5.2
• Avro 1.7.4
• JDK/JRE 7 (Sun/Oracle version)
• HttpFS
In addition to the above, we also support the three latest versions of Spark to our customers. That
allows our customers the options of a conservative approach as well as a the option to work with
the “bleeding edge” fast moving Spark community.
Concurrency with Apache Versioning
Hire an expert to take care of the cluster
• Hardware setup and Cluster installation
• Address hardware failure
• Upgrade Hadoop stack
• Tuning config parameters
• yarn-site.xml  ex : yarn.nodemanager.resource.memory-mb
• mapred-site.xml  ex : mapreduce.task.io.sort.mb
• hdfs-site.xml  ex : dfs.blocksize
Hadoop Administration
Accessing the cloud
 Spark example
• Build Spark code laptop using maven
• Build the jar and copy over Altiscale’s workbench (Gateway) node.
• Launch Spark job on YARN.
• Monitor using Resource Manager
Quick Spark Demo
20
Thank You!

More Related Content

What's hot

Data-In-Motion Unleashed
Data-In-Motion UnleashedData-In-Motion Unleashed
Data-In-Motion Unleashed
DataWorks Summit
 
Hadoop for the Masses
Hadoop for the MassesHadoop for the Masses
Hadoop for the Masses
DataWorks Summit/Hadoop Summit
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
DataWorks Summit
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
Jeffrey T. Pollock
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to ProductionThe DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to Production
DataWorks Summit/Hadoop Summit
 
Addressing Enterprise Customer Pain Points with a Data Driven Architecture
Addressing Enterprise Customer Pain Points with a Data Driven ArchitectureAddressing Enterprise Customer Pain Points with a Data Driven Architecture
Addressing Enterprise Customer Pain Points with a Data Driven Architecture
DataWorks Summit
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Rittman Analytics
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
DataWorks Summit
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
DataWorks Summit
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big Data
DataWorks Summit
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
DataWorks Summit
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Cloudera, Inc.
 
Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop
Cloudera, Inc.
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Cloudera, Inc.
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
DataWorks Summit
 
Data Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the EnterpriseData Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the Enterprise
Cloudera, Inc.
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

Cloudera, Inc.
 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
DataWorks Summit/Hadoop Summit
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise
Smart Enterprise Big Data Bus for the Modern Responsive EnterpriseSmart Enterprise Big Data Bus for the Modern Responsive Enterprise
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise
DataWorks Summit
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
Cloudera, Inc.
 

What's hot (20)

Data-In-Motion Unleashed
Data-In-Motion UnleashedData-In-Motion Unleashed
Data-In-Motion Unleashed
 
Hadoop for the Masses
Hadoop for the MassesHadoop for the Masses
Hadoop for the Masses
 
Insights into Real World Data Management Challenges
Insights into Real World Data Management ChallengesInsights into Real World Data Management Challenges
Insights into Real World Data Management Challenges
 
2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration2017 OpenWorld Keynote for Data Integration
2017 OpenWorld Keynote for Data Integration
 
The DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to ProductionThe DAP - Where YARN, HBase, Kafka and Spark go to Production
The DAP - Where YARN, HBase, Kafka and Spark go to Production
 
Addressing Enterprise Customer Pain Points with a Data Driven Architecture
Addressing Enterprise Customer Pain Points with a Data Driven ArchitectureAddressing Enterprise Customer Pain Points with a Data Driven Architecture
Addressing Enterprise Customer Pain Points with a Data Driven Architecture
 
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle)
 
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
Journey to the Data Lake: How Progressive Paved a Faster, Smoother Path to In...
 
Built-In Security for the Cloud
Built-In Security for the CloudBuilt-In Security for the Cloud
Built-In Security for the Cloud
 
Scaling Data Science on Big Data
Scaling Data Science on Big DataScaling Data Science on Big Data
Scaling Data Science on Big Data
 
Innovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data WarehouseInnovation in the Enterprise Rent-A-Car Data Warehouse
Innovation in the Enterprise Rent-A-Car Data Warehouse
 
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for productionFaster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
Faster Batch Processing with Cloudera 5.7: Hive-on-Spark is ready for production
 
Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop Harnessing the Power of Apache Hadoop
Harnessing the Power of Apache Hadoop
 
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoProExtreme Sports & Beyond: Exploring a new frontier in data with GoPro
Extreme Sports & Beyond: Exploring a new frontier in data with GoPro
 
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
 
Data Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the EnterpriseData Science and Machine Learning for the Enterprise
Data Science and Machine Learning for the Enterprise
 
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera

Supercharge Splunk with Cloudera

 
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizonHadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
Hadoop and Friends as Key Enabler of the IoE - Continental's Dynamic eHorizon
 
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise
Smart Enterprise Big Data Bus for the Modern Responsive EnterpriseSmart Enterprise Big Data Bus for the Modern Responsive Enterprise
Smart Enterprise Big Data Bus for the Modern Responsive Enterprise
 
Part 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to EndPart 3: Models in Production: A Look From Beginning to End
Part 3: Models in Production: A Look From Beginning to End
 

Viewers also liked

Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
Hadoop as a service presented by Ajay Jha at Houston Hadoop MeetupHadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
Mark Kerzner
 
ODPi is Now Open for Business: Here's What it Means
ODPi is Now Open for Business: Here's What it MeansODPi is Now Open for Business: Here's What it Means
ODPi is Now Open for Business: Here's What it Means
PivotalOpenSourceHub
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing Hadoop
Linaro
 
Hadoop on ec2
Hadoop on ec2Hadoop on ec2
Hadoop on ec2
Mark Kerzner
 
Set up Hadoop Cluster on Amazon EC2
Set up Hadoop Cluster on Amazon EC2Set up Hadoop Cluster on Amazon EC2
Set up Hadoop Cluster on Amazon EC2IMC Institute
 
Toorcamp 2016
Toorcamp 2016Toorcamp 2016
Toorcamp 2016
Mark Kerzner
 
Cloudera search
Cloudera searchCloudera search
Cloudera search
Mark Kerzner
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data edition
Mark Kerzner
 
Apache Cassandra Certification
Apache Cassandra CertificationApache Cassandra Certification
Apache Cassandra Certification
Vskills
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2
benjaminwootton
 
The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop
MapR Technologies
 
SAP HANA Cloud Portal - Overview Presentation
SAP HANA Cloud Portal - Overview PresentationSAP HANA Cloud Portal - Overview Presentation
SAP HANA Cloud Portal - Overview Presentation
SAP Portal
 
Cost of Ownership for Hadoop Implementation
Cost of Ownership for Hadoop ImplementationCost of Ownership for Hadoop Implementation
Cost of Ownership for Hadoop ImplementationDataWorks Summit
 
ROI of Big Data Analytics Native on Hadoop
ROI of Big Data Analytics Native on HadoopROI of Big Data Analytics Native on Hadoop
ROI of Big Data Analytics Native on Hadoop
DataWorks Summit
 
Building a distributed search system with Hadoop and Lucene
Building a distributed search system with Hadoop and LuceneBuilding a distributed search system with Hadoop and Lucene
Building a distributed search system with Hadoop and Lucene
Mirko Calvaresi
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
EMC
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
Philippe Julio
 

Viewers also liked (17)

Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
Hadoop as a service presented by Ajay Jha at Houston Hadoop MeetupHadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
Hadoop as a service presented by Ajay Jha at Houston Hadoop Meetup
 
ODPi is Now Open for Business: Here's What it Means
ODPi is Now Open for Business: Here's What it MeansODPi is Now Open for Business: Here's What it Means
ODPi is Now Open for Business: Here's What it Means
 
BKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing HadoopBKK16-400B ODPI - Standardizing Hadoop
BKK16-400B ODPI - Standardizing Hadoop
 
Hadoop on ec2
Hadoop on ec2Hadoop on ec2
Hadoop on ec2
 
Set up Hadoop Cluster on Amazon EC2
Set up Hadoop Cluster on Amazon EC2Set up Hadoop Cluster on Amazon EC2
Set up Hadoop Cluster on Amazon EC2
 
Toorcamp 2016
Toorcamp 2016Toorcamp 2016
Toorcamp 2016
 
Cloudera search
Cloudera searchCloudera search
Cloudera search
 
Oil and gas big data edition
Oil and gas  big data editionOil and gas  big data edition
Oil and gas big data edition
 
Apache Cassandra Certification
Apache Cassandra CertificationApache Cassandra Certification
Apache Cassandra Certification
 
Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2Configuring Your First Hadoop Cluster On EC2
Configuring Your First Hadoop Cluster On EC2
 
The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop The TCO Calculator - Estimate the True Cost of Hadoop
The TCO Calculator - Estimate the True Cost of Hadoop
 
SAP HANA Cloud Portal - Overview Presentation
SAP HANA Cloud Portal - Overview PresentationSAP HANA Cloud Portal - Overview Presentation
SAP HANA Cloud Portal - Overview Presentation
 
Cost of Ownership for Hadoop Implementation
Cost of Ownership for Hadoop ImplementationCost of Ownership for Hadoop Implementation
Cost of Ownership for Hadoop Implementation
 
ROI of Big Data Analytics Native on Hadoop
ROI of Big Data Analytics Native on HadoopROI of Big Data Analytics Native on Hadoop
ROI of Big Data Analytics Native on Hadoop
 
Building a distributed search system with Hadoop and Lucene
Building a distributed search system with Hadoop and LuceneBuilding a distributed search system with Hadoop and Lucene
Building a distributed search system with Hadoop and Lucene
 
Hadoop Overview & Architecture
Hadoop Overview & Architecture  Hadoop Overview & Architecture
Hadoop Overview & Architecture
 
Big Data Analytics with Hadoop
Big Data Analytics with HadoopBig Data Analytics with Hadoop
Big Data Analytics with Hadoop
 

Similar to Hadoop Hadoop & Spark meetup - Altiscale

Oracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureOracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and Architecture
Riccardo Romani
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
Jeffrey T. Pollock
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Stefan Lipp
 
Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0
Inside Analysis
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Cloudera, Inc.
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataPentaho
 
Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016
Joan Novino
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
Matt Lord
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3
xKinAnx
 
HP Helion Webinar #4 - Open stack the magic pill
HP Helion Webinar #4 - Open stack the magic pillHP Helion Webinar #4 - Open stack the magic pill
HP Helion Webinar #4 - Open stack the magic pill
BeMyApp
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVRO
airisData
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop Acceleration
Inside Analysis
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
NoSQLmatters
 
Carpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP HavenCarpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP Haven
DataWorks Summit
 
Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019
Adam Doyle
 
Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?
DataWorks Summit
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
Pactera_US
 

Similar to Hadoop Hadoop & Spark meetup - Altiscale (20)

Oracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and ArchitectureOracle Cloud : Big Data Use Cases and Architecture
Oracle Cloud : Big Data Use Cases and Architecture
 
Big Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San JoseBig Data at Oracle - Strata 2015 San Jose
Big Data at Oracle - Strata 2015 San Jose
 
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
 
Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0Hot Technologies of 2013: Hadoop 2.0
Hot Technologies of 2013: Hadoop 2.0
 
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
 
Big Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big DataBig Data Integration Webinar: Getting Started With Hadoop Big Data
Big Data Integration Webinar: Getting Started With Hadoop Big Data
 
Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016Azure Cafe Marketplace with Hortonworks March 31 2016
Azure Cafe Marketplace with Hortonworks March 31 2016
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Unlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQLUnlocking Big Data Insights with MySQL
Unlocking Big Data Insights with MySQL
 
Presentation big dataappliance-overview_oow_v3
Presentation   big dataappliance-overview_oow_v3Presentation   big dataappliance-overview_oow_v3
Presentation big dataappliance-overview_oow_v3
 
HP Helion Webinar #4 - Open stack the magic pill
HP Helion Webinar #4 - Open stack the magic pillHP Helion Webinar #4 - Open stack the magic pill
HP Helion Webinar #4 - Open stack the magic pill
 
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
 
Parquet and AVRO
Parquet and AVROParquet and AVRO
Parquet and AVRO
 
Level Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop AccelerationLevel Up – How to Achieve Hadoop Acceleration
Level Up – How to Achieve Hadoop Acceleration
 
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
Alexandre Vasseur - Evolution of Data Architectures: From Hadoop to Data Lake...
 
Carpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP HavenCarpe Datum: Building Big Data Analytical Applications with HP Haven
Carpe Datum: Building Big Data Analytical Applications with HP Haven
 
Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019Big Data Retrospective - STL Big Data IDEA Jan 2019
Big Data Retrospective - STL Big Data IDEA Jan 2019
 
Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?Open Source SQL for Hadoop: Where are we and Where are we Going?
Open Source SQL for Hadoop: Where are we and Where are we Going?
 
Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
 
Munich HUG 21.11.2013
Munich HUG 21.11.2013Munich HUG 21.11.2013
Munich HUG 21.11.2013
 

More from Mark Kerzner

IBM Strategy for Spark
IBM Strategy for SparkIBM Strategy for Spark
IBM Strategy for Spark
Mark Kerzner
 
Joe Witt presentation on Apache NiFi
Joe Witt presentation on Apache NiFiJoe Witt presentation on Apache NiFi
Joe Witt presentation on Apache NiFi
Mark Kerzner
 
FreeEed popcorn overview
FreeEed popcorn overviewFreeEed popcorn overview
FreeEed popcorn overviewMark Kerzner
 
FreeEed presentation
FreeEed presentationFreeEed presentation
FreeEed presentationMark Kerzner
 
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Mark Kerzner
 
Night owl by Boyd Meyer of PROS
Night owl by Boyd Meyer of PROS Night owl by Boyd Meyer of PROS
Night owl by Boyd Meyer of PROS Mark Kerzner
 
SHMcloud vision
SHMcloud visionSHMcloud vision
SHMcloud vision
Mark Kerzner
 
Porting your hadoop app to horton works hdp
Porting your hadoop app to horton works hdpPorting your hadoop app to horton works hdp
Porting your hadoop app to horton works hdpMark Kerzner
 
Automated Hadoop Cluster Construction on EC2
Automated Hadoop Cluster Construction on EC2Automated Hadoop Cluster Construction on EC2
Automated Hadoop Cluster Construction on EC2
Mark Kerzner
 
Open source e_discovery
Open source e_discoveryOpen source e_discovery
Open source e_discovery
Mark Kerzner
 
FreEed - Open Source eDiscovery
FreEed - Open Source eDiscoveryFreEed - Open Source eDiscovery
FreEed - Open Source eDiscovery
Mark Kerzner
 
Houston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
Houston Hadoop Meetup Presentation by Vikram Oberoi of ClouderaHouston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
Houston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
Mark Kerzner
 
Google Office in Zurich, Switzerland
Google Office in Zurich, SwitzerlandGoogle Office in Zurich, Switzerland
Google Office in Zurich, SwitzerlandMark Kerzner
 
Fun art with fruit and vegetable
Fun art with fruit and vegetableFun art with fruit and vegetable
Fun art with fruit and vegetableMark Kerzner
 
Carnavale de Venice
Carnavale de VeniceCarnavale de Venice
Carnavale de VeniceMark Kerzner
 
Holocaust Memorial Tato
Holocaust Memorial TatoHolocaust Memorial Tato
Holocaust Memorial Tato
Mark Kerzner
 
Yehuda Pen
Yehuda PenYehuda Pen
Yehuda Pen
Mark Kerzner
 
Thailand Visite
Thailand VisiteThailand Visite
Thailand Visite
Mark Kerzner
 
Venice views with music
Venice views with musicVenice views with music
Venice views with music
Mark Kerzner
 

More from Mark Kerzner (20)

IBM Strategy for Spark
IBM Strategy for SparkIBM Strategy for Spark
IBM Strategy for Spark
 
Joe Witt presentation on Apache NiFi
Joe Witt presentation on Apache NiFiJoe Witt presentation on Apache NiFi
Joe Witt presentation on Apache NiFi
 
FreeEed popcorn overview
FreeEed popcorn overviewFreeEed popcorn overview
FreeEed popcorn overview
 
FreeEed presentation
FreeEed presentationFreeEed presentation
FreeEed presentation
 
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
Nutch + Hadoop scaled, for crawling protected web sites (hint: Selenium)
 
Night owl by Boyd Meyer of PROS
Night owl by Boyd Meyer of PROS Night owl by Boyd Meyer of PROS
Night owl by Boyd Meyer of PROS
 
SHMcloud vision
SHMcloud visionSHMcloud vision
SHMcloud vision
 
Porting your hadoop app to horton works hdp
Porting your hadoop app to horton works hdpPorting your hadoop app to horton works hdp
Porting your hadoop app to horton works hdp
 
Automated Hadoop Cluster Construction on EC2
Automated Hadoop Cluster Construction on EC2Automated Hadoop Cluster Construction on EC2
Automated Hadoop Cluster Construction on EC2
 
Open source e_discovery
Open source e_discoveryOpen source e_discovery
Open source e_discovery
 
FreEed - Open Source eDiscovery
FreEed - Open Source eDiscoveryFreEed - Open Source eDiscovery
FreEed - Open Source eDiscovery
 
Houston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
Houston Hadoop Meetup Presentation by Vikram Oberoi of ClouderaHouston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
Houston Hadoop Meetup Presentation by Vikram Oberoi of Cloudera
 
Google Office in Zurich, Switzerland
Google Office in Zurich, SwitzerlandGoogle Office in Zurich, Switzerland
Google Office in Zurich, Switzerland
 
Fun art with fruit and vegetable
Fun art with fruit and vegetableFun art with fruit and vegetable
Fun art with fruit and vegetable
 
Carnavale de Venice
Carnavale de VeniceCarnavale de Venice
Carnavale de Venice
 
Holocaust Memorial Tato
Holocaust Memorial TatoHolocaust Memorial Tato
Holocaust Memorial Tato
 
Yehuda Pen
Yehuda PenYehuda Pen
Yehuda Pen
 
Mark Chagall
Mark ChagallMark Chagall
Mark Chagall
 
Thailand Visite
Thailand VisiteThailand Visite
Thailand Visite
 
Venice views with music
Venice views with musicVenice views with music
Venice views with music
 

Recently uploaded

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Jeffrey Haguewood
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
DianaGray10
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Product School
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
Alan Dix
 

Recently uploaded (20)

Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
FIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdfFIDO Alliance Osaka Seminar: Overview.pdf
FIDO Alliance Osaka Seminar: Overview.pdf
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdfFIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
FIDO Alliance Osaka Seminar: Passkeys at Amazon.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 

Hadoop Hadoop & Spark meetup - Altiscale

  • 2. • Market Background • Who is Altiscale? • Why are we different/better? • Hadoop Admin • Apache Hadoop Stack • Platform/Access/Demo • Q/A 2 Big Data As A Service
  • 4. 4 Interest in Big Data is growing fast
  • 5. 5 Big Data in The Cloud is Accelerating On- Premises 32% Cloud Only 23% Cloud Plus On- Premises 29% Source: “Hadoop Expansion Boosts Cloud and Unsupported On-Premises Deployments,” Merv Adrian, Nick Huedecker, 3 September 2015
  • 6. But the journey has dangers Gartner: 70% of independent Big Data implementations will fail to meet revenue and cost objectives, through 2018.
  • 8. Altiscale Data Cloud GA in 2014 Financed by top-tier technology investors Recognized innovator in Hadoop-as-a-Service About Altiscale
  • 9. About Altiscale Led by experienced, renowned Hadoop team from Yahoo! • Raymie Stata, CEO. Former Yahoo! CTO, well-known advocate of Apache Software Foundation • David Chaiken, CTO. Former Yahoo! Chief Architect Built and managed by veterans of Big Data, SaaS, and enterprise software • From Google, Netflix, LinkedIn, VMware, Oracle, and Yahoo! 40,000 nodes 500 PB 1,000 users $ billions at stake Raymie Stata, CEO David Chaiken, CTO Ricardo Jenez VP of Engineering Charles Wimmer Head of Operations
  • 10. Big data built for speed Fast time to value—days not months Easier, faster scalability—with elastic scaling Operations support—so your jobs get done Lower TCO—for fast investment payback
  • 11. 11 Unmatched Security Altiscale is the only provider that delivers integrated security encompassing its Big Data platform offering
  • 13. Big Data is complex. It gets more complicated as you scale.
  • 15. The Altiscale Data Cloud Core
  • 16. Altiscale Data Cloud is 100% based on Apache open source. Our current Altiscale Data Cloud 4.0 release is composed of the following Apache components and versions: • Apache Hadoop 2.7.1 • Apache Spark 1.5* • Apache Hive (& HCatalog) 1.2 • Apache Tez 0.7.0 • Apache Pig 0.15.1 • Apache Oozie 4.2.0 • Apache Flume 1.5.2 • Avro 1.7.4 • JDK/JRE 7 (Sun/Oracle version) • HttpFS In addition to the above, we also support the three latest versions of Spark to our customers. That allows our customers the options of a conservative approach as well as a the option to work with the “bleeding edge” fast moving Spark community. Concurrency with Apache Versioning
  • 17. Hire an expert to take care of the cluster • Hardware setup and Cluster installation • Address hardware failure • Upgrade Hadoop stack • Tuning config parameters • yarn-site.xml  ex : yarn.nodemanager.resource.memory-mb • mapred-site.xml  ex : mapreduce.task.io.sort.mb • hdfs-site.xml  ex : dfs.blocksize Hadoop Administration
  • 19.  Spark example • Build Spark code laptop using maven • Build the jar and copy over Altiscale’s workbench (Gateway) node. • Launch Spark job on YARN. • Monitor using Resource Manager Quick Spark Demo