SlideShare a Scribd company logo
1 of 68
Download to read offline
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
A Deep Dive into What’s New with
Amazon EMR
Damon Cortesi
Big Data Architect
Amazon Web Services
A N T 3 4 0
Abhishek Sinha
Principal Product Manager
Amazon Web Services
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
EMR 5.x Release Velocity
Existing Apps
• Hadoop
• Flink
• Ganglia
• Hbase
• HBase on S3
• Hive & Hcatalog
• Hue
• Mahout
• Oozie
• Phoenix
• Pig
Existing Apps
• Presto
• Spark
• Sqoop
• Tez
• Zeppelin
• Zookeeper
New Apps this year
• JupyterHub
• Livy
• Tensorflow
• MXNET
Notable New Features
• Detailed guide to
migrate to HBase on S3
• EMRFS Role Auditing in
CloudTrail
• Tensorflow
optimizations
• SparkMagic and Livy
User Impersonation
• FairScheduler support
for YARN labels
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS Big Data Blog
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Persistent Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Two architectural patterns
Transient Clusters
• Large-scale
transformation
• ETL to other DWH or
Data Lake
• Building ML jobs
Persistent Clusters
• Notebooks
• Experimentation
• Ad-hoc jobs
• Streaming
• Continuous
transformation
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
“Run stateless, Automate everything, Enable self-service”
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Transient Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Run Stateless
• Maintain metastores off cluster
• Faster startup time lowers cost
Amazon RDS
Amazon
Redshift
Amazon
Athena
AWS Glue
AWS Glue
Data Catalog
EMR cluster EMR cluster EMR cluster
Cluster
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Connecting to Hive Metastore
aws emr create-cluster
--release-label emr-5.19.0
--instance-type m4.large --instance-count 2 --applications Name=Hive
--configurations ./hiveConfiguration.json
--use-default-roles
Note: Limit access to elasticmapreduce:DescribeCluster
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Use AWS Glue Data Catalog as Common Metadata Store
• Support for Spark, Hive, and
Presto
• Auto-generate schema and
partitions
• Managed table updates
• Fine-grained access control to
databases and tables
• Cross-account data catalog
access
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Connecting to the AWS Glue Data Catalog
[
{
"Classification": "hive-site",
"Properties": {
"hive.metastore.client.factory.class": “
com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory",
"hive.metastore.glue.catalogid": "acct-id”
}
}
]
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Fine Grained Access Control on
Glue Data Catalog
"arn:aws:glue:us-east-1:123456789012:catalog",
"arn:aws:glue:us-east-
1:123456789012:database/finegrainaccess",
"arn:aws:glue:us-east-
1:123456789012:tables/finegrainaccess/dev_*"
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Cross-Account AWS Glue Data Catalog
Access
Data Engineering Account
Glue Data
Catalog
Team A (AWS Account)
Team A (AWS Account)
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Migrate from Hive Metastore to AWS Glue Data Catalog
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Using Amazon S3 to persist your data
• Decoupling Storage and Compute
• Read depends on aggregate throughput
• Write committers
• File Output Committer
• File Output Committer 2
• Direct Write Committer
• EMRFS S3-optimized Committer (Spark and Parquet only)
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
EMRFS performance
• Committer Performance
• Allows you to enable speculative
execution, which improves straggler
performance
• Does not need EMRFS Consistent
View
• Faster performance with EMRFS
Consistent View enabled
Optimized committer performance improvements
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Integration with S3-Select
spark
.read
.format("s3selectCSV") // "s3selectJson" for Json
.schema(...) // optional, but recommended
.options(...) // optional
.load("s3://path/to/my/datafiles")
Available in Spark, Presto and Hive
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Presto/S3Select TPCDS-100 Query speed-up
-20%
-10%
0%
10%
20%
30%
40%
50%
60%
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Options to submit jobs
Amazon EMR
Step API
Submit a Spark
application
Amazon EMR
AWS Data Pipeline
Airflow, Luigi, or other
schedulers on EC2
Create a pipeline
to schedule job
submission or create
complex workflows
AWS Step Functions
Use AWS Lambda to
submit applications to
EMR Step API or directly
to Spark on your cluster
Use Oozie on your
cluster to build
DAGs of jobs
Livy Server
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Serverless Job submission: Step Functions & Livy
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Serverless Job submission
Task state – Invokes a Lambda function.
The first Task state submits the Spark job on Amazon EMR,
Next Task state is used to retrieve the previous Spark job status.
Wait state – Pauses the state machine until a job completes
execution.
Choice state – Each Spark job execution can return a failure,
an error, or a success state. Choice state to create a rule
that specifies the next action or Next Step
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Visual Graph Step Functions
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Spark Application History on the EMR Console
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Advanced orchestration
Trigger jobs using CloudWatch Events
Based on arrival of an event
Based on an schedule
Alerting in case of Failures
Parallel Execution Steps
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scale up with Spot Instances
10 node cluster running for 14 hours
Cost = 1.0 * 10 * 14 = $140
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scale Up Cluster with Spot Instances
Add 10 more nodes on Spot
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scale Up Cluster with Spot Instances
20 node cluster running for 7 hours
Cost = 1.0 * 10 * 7 = $70
= 0.5 * 10 * 7 = $35
Total $105
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scale Up Cluster with Spot Instances
50 % less run-time ( 14  7)
25% less cost (140  105)
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Scale Up Cluster with Spot Instances
On Demand
Nodes for worst
case
Spot Nodes to
save cost
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
What do customers tell us about Spot
Capacity Related
“I need a instance type and the capacity is not available”
“I am AZ agnostic, which AZ should I use to get the cheapest
capacity”
Interruption Related
“I was able to receive the capacity, but I was interrupted”
“I lost the cluster because only one instance was interrupted”
“How does Spark behave with interruption”
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Spot Interruption
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
3 Reasons for Spot Clusters to fail
1. Requested Capacity in a certain instance type/AZ was not available
2. Requested Capacity in a certain AZ was not available
3. Termination of a single instance caused the entire cluster to fail
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
3 Reasons for Spot Clusters to fail
1. Requested Capacity in a certain instance type/AZ was not available
2. Requested Capacity in a certain AZ was not available
Requested capacity available in different AZ or similar instance type
3. Termination of a single instance caused the cluster to fail
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
3 Reasons for Spot Clusters to fail
1. Requested Capacity in a certain instance type/AZ was not available
2. Requested Capacity in a certain AZ was not available
Requested capacity available in different AZ or similar instance type
3. Termination of a single instance caused the cluster to fail
New Implementation that changes the way we provision Spot instances
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
New Implementation
• Faster startup time
• Single node termination does not terminate the cluster
• Optional Spot bids
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Instance Fleets
• Can mix different instance types in one group
• Can mix different markets (OD or Spot) in one group
• Each instance can have different EBS volume options
• Choose the total capacity (VCPUs) that you want (say 64)
• Diversify your instance types ( c3.xlarge, c4.xlarge, c5.xlarge – all with 4
vcpu and 8 GB RAM)
• Don’t specify and AZ and we will find the cheapest one
• Template this configuration
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Graceful decommissioning in Spark
• Decommissioning nodes
• Blacklisting Node
• Contributed to open source
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
How do we template best practices across the
organization?
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Self-service with EMR & AWS Service Catalog
Use Case: Make it easy for your customers to launch EMR clusters on
AWS while:
• Remove/Reduce EMR/AWS learning curve
• Reduce on-boarding time
• Ensuring Security & Standards
• Adhere to Budgets
• Integrating with Internal Processes & Approval workflows
• Integrate with best practices
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
EMR with AWS Service Catalog
Standardize
Enforce Consistency and
Compliance
Limit Access
Enforce Tagging, Security
Groups
Developer Autonomy
One-Stop Shop
Automate Deployments
Agile Governance
Configure Consume
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Administrator Interaction
Creates portfolio and
assigns product portfolio
1
Adds constraints, grant access
and add tags
2 Creates
product
Authors
template
Portfolio BPortfolio A
• Users and Roles
• Constraints
• Tags
Service Catalog
AWS
Service Catalog
Administrator
4
3
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Browse
Products
5
4 3
2
1
Portfolio
AWS
Service Catalog
End-users
Select version,
Provision Product,
configure
parameters
Deploy
Notifications
and outputs
Notifications and outputs
4
Scheduled
functions
AWS
Administrato
r
End User Interaction
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Before: EMR Cluster Request Process
Data Scientist
IT Team
AWS Account
Logs into
applications
on cluster
EMR Cluster
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
EMR Self-Service with Service Catalog (SC)
Data Scientist
IT Team
AWS Account
AWS
Service Catalog
EMR Cluster
Requests cluster
AWS Management
Console
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Persistent Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
“Scale up and down”
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Introducing EMR Notebooks
• De-couple Notebooks from Clusters
• Based on Open Source Jupyter Notebooks
• Attach to a cluster to run jobs
• Multiple users can attach to the same cluster (set Autoscaling on a
cluster)
• Detach and Attach to other clusters
• Save notebooks to Amazon S3
• Tag-based permissions
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Notebooks
Off-cluster notebook based on Jupyter notebook application
Amazon EMR clusters
User S3 bucket
AWS Management Console
for EMR
EMR managed notebook based on
Jupyter notebook
users
Customer
VPC
EMR VPC
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Auto Scaling Clusters
Scaling options
Threshold
CloudWatch or custom metric
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Autoscaling Clusters
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Auto Scaling
• EMR scales-in at YARN task completion
• Selectively removes nodes with no running tasks
• yarn.resourcemanager.decommissioning.timeout
• Default timeout is one hour
• Dynamically scale storage (HDFS and EBS)
• Spark scale-in contributions
• Spark specific blacklisting of tasks
• Unregistering cached data and shuffle blocks
• Advanced error handling
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Multi-Master support for EMR Applications
Application Multi-master behavior HA details Notes
YARN HA Active/Standby with automatic failover and recovery Limited to YARN ResourceManager service
HDFS HA
Active/Standby with automatic failover using quorum
journaling
Limited to HDFS NameNode/metadata service
HBase HA Active/Standby utilizing zookeeper
Zookeeper HA Ensemble with automatic quorum
Ganglia Available on all masters HA agnostic
Hive
HA (service components
only)
Active/Standby Limited to Metastore and HiveServer2; Requires external metastore database or catalog
Spark Job specific Supported by YARN and HDFS HA Requires job designed for fault recovery
Flink Job specific
YARN HA session supported by YARN, HDFS and Zookeeper
HA
Requires job designed for fault recovery
Livy Non-HA, single master only State recovery supported
Oozie Non-HA, single master only
Hue Non-HA, single master only
Zeppelin Non-HA, single master only
JupyterHub Non-HA, single master only
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Reconfiguration of EMR Applications
Reconfigure applications on a running cluster
Reconfiguration applied to each instance group
Rolling restart of data nodes to prevent data loss
Automatic revert to last successfully applied version on failure
aws emr modify-instance-groups --instance-groups InstanceGroupId=ig-
123,Configurations=file://new-configurations.json
Coming Soon
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Thank you!
© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.

More Related Content

What's hot

What's hot (20)

Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
Amazon EC2 Foundations (CMP208-R1) - AWS re:Invent 2018
 
Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)Work Anywhere with Amazon Workspaces (Level: 200)
Work Anywhere with Amazon Workspaces (Level: 200)
 
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
Deep Learning Applications Using TensorFlow, ft. Advanced Microgrid Solutions...
 
Optimizing Amazon EBS for Performance (CMP317-R2) - AWS re:Invent 2018
Optimizing Amazon EBS for Performance (CMP317-R2) - AWS re:Invent 2018Optimizing Amazon EBS for Performance (CMP317-R2) - AWS re:Invent 2018
Optimizing Amazon EBS for Performance (CMP317-R2) - AWS re:Invent 2018
 
Migrating Data to the Cloud: Exploring Your Options from AWS (STG205-R1) - AW...
Migrating Data to the Cloud: Exploring Your Options from AWS (STG205-R1) - AW...Migrating Data to the Cloud: Exploring Your Options from AWS (STG205-R1) - AW...
Migrating Data to the Cloud: Exploring Your Options from AWS (STG205-R1) - AW...
 
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
Design, Deploy, and Optimize Microsoft SQL Server on AWS (WIN324-R1) - AWS re...
 
Accelerating Application Development with Amazon Aurora (DAT312-R2) - AWS re:...
Accelerating Application Development with Amazon Aurora (DAT312-R2) - AWS re:...Accelerating Application Development with Amazon Aurora (DAT312-R2) - AWS re:...
Accelerating Application Development with Amazon Aurora (DAT312-R2) - AWS re:...
 
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
Reliability of the Cloud: How AWS Achieves High Availability (ARC317-R1) - AW...
 
SRV313 Introduction to Building Web Apps on AWS
 SRV313 Introduction to Building Web Apps on AWS SRV313 Introduction to Building Web Apps on AWS
SRV313 Introduction to Building Web Apps on AWS
 
AWS Greengrass, Containers, and Your Dev Process for Edge Apps (GPSWS404) - A...
AWS Greengrass, Containers, and Your Dev Process for Edge Apps (GPSWS404) - A...AWS Greengrass, Containers, and Your Dev Process for Edge Apps (GPSWS404) - A...
AWS Greengrass, Containers, and Your Dev Process for Edge Apps (GPSWS404) - A...
 
Introduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day JerusalemIntroduction to Serverless on AWS - Builders Day Jerusalem
Introduction to Serverless on AWS - Builders Day Jerusalem
 
SRV205 Architectures and Strategies for Building Modern Applications on AWS
 SRV205 Architectures and Strategies for Building Modern Applications on AWS SRV205 Architectures and Strategies for Building Modern Applications on AWS
SRV205 Architectures and Strategies for Building Modern Applications on AWS
 
Deep Dive on MySQL Databases on Amazon RDS (DAT322) - AWS re:Invent 2018
Deep Dive on MySQL Databases on Amazon RDS (DAT322) - AWS re:Invent 2018Deep Dive on MySQL Databases on Amazon RDS (DAT322) - AWS re:Invent 2018
Deep Dive on MySQL Databases on Amazon RDS (DAT322) - AWS re:Invent 2018
 
Building an AWS Greengrass Machine Learning Solution at the Edge (IOT403-R1) ...
Building an AWS Greengrass Machine Learning Solution at the Edge (IOT403-R1) ...Building an AWS Greengrass Machine Learning Solution at the Edge (IOT403-R1) ...
Building an AWS Greengrass Machine Learning Solution at the Edge (IOT403-R1) ...
 
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
Better, Faster, Cheaper – Cost Optimizing Compute with Amazon EC2 Fleet #savi...
 
Achieving Business Value with AWS - AWS Online Tech Talks
Achieving Business Value with AWS - AWS Online Tech TalksAchieving Business Value with AWS - AWS Online Tech Talks
Achieving Business Value with AWS - AWS Online Tech Talks
 
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
Building Serverless Analytics Solutions with Amazon QuickSight (ANT391) - AWS...
 
DEM18 How SendBird Built a Serverless Log-Processing Pipeline in a Week
DEM18 How SendBird Built a Serverless Log-Processing Pipeline in a WeekDEM18 How SendBird Built a Serverless Log-Processing Pipeline in a Week
DEM18 How SendBird Built a Serverless Log-Processing Pipeline in a Week
 
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
Running Lean Architectures: How to Optimize for Cost Efficiency (ARC202-R2) -...
 
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
[NEW LAUNCH!] Introducing Amazon Managed Streaming for Kafka (Amazon MSK) (AN...
 

Similar to A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018

Similar to A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018 (20)

Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
Amazon EMR: Optimize Transient Clusters for Data Processing & ETL (ANT341) - ...
 
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
Best practices for optimizing your EC2 costs with Spot Instances | AWS Floor28
 
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
Running Amazon EKS Workloads on Amazon EC2 Spot Instances (CMP403-R1) - AWS r...
 
AWS Black Belt Online Seminar 2018 re:Invent Recap: Compute, Container and Ne...
AWS Black Belt Online Seminar 2018 re:Invent Recap: Compute, Container and Ne...AWS Black Belt Online Seminar 2018 re:Invent Recap: Compute, Container and Ne...
AWS Black Belt Online Seminar 2018 re:Invent Recap: Compute, Container and Ne...
 
AWS Lambda use cases and best practices - Builders Day Israel
AWS Lambda use cases and best practices - Builders Day IsraelAWS Lambda use cases and best practices - Builders Day Israel
AWS Lambda use cases and best practices - Builders Day Israel
 
Accelerate Analytics at Scale with Amazon EMR - AWS Summit Sydney 2018
Accelerate Analytics at Scale with Amazon EMR - AWS Summit Sydney 2018Accelerate Analytics at Scale with Amazon EMR - AWS Summit Sydney 2018
Accelerate Analytics at Scale with Amazon EMR - AWS Summit Sydney 2018
 
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
Migrate Your Hadoop/Spark Workload to Amazon EMR and Architect It for Securit...
 
Capacity Management Made Easy with Amazon EC2 Auto Scaling (CMP377) - AWS re:...
Capacity Management Made Easy with Amazon EC2 Auto Scaling (CMP377) - AWS re:...Capacity Management Made Easy with Amazon EC2 Auto Scaling (CMP377) - AWS re:...
Capacity Management Made Easy with Amazon EC2 Auto Scaling (CMP377) - AWS re:...
 
Mastering Kubernetes on AWS - Tel Aviv Summit
Mastering Kubernetes on AWS - Tel Aviv SummitMastering Kubernetes on AWS - Tel Aviv Summit
Mastering Kubernetes on AWS - Tel Aviv Summit
 
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
以 Amazon EC2 Spot 執行個體有效控制專案成本 (Level: 200)
 
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdf
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdfCome scalare da zero ai tuoi primi 10 milioni di utenti.pdf
Come scalare da zero ai tuoi primi 10 milioni di utenti.pdf
 
2018 10-19-jc conf-embrace-legacy-java-ee-by-aws-serverless
2018 10-19-jc conf-embrace-legacy-java-ee-by-aws-serverless2018 10-19-jc conf-embrace-legacy-java-ee-by-aws-serverless
2018 10-19-jc conf-embrace-legacy-java-ee-by-aws-serverless
 
AWSome Day - Solutions Architecture Best Practices
AWSome Day - Solutions Architecture Best PracticesAWSome Day - Solutions Architecture Best Practices
AWSome Day - Solutions Architecture Best Practices
 
AWSome Day MODULE 5 - Autoscaling and Next Steps
AWSome Day MODULE 5 - Autoscaling and Next StepsAWSome Day MODULE 5 - Autoscaling and Next Steps
AWSome Day MODULE 5 - Autoscaling and Next Steps
 
Amazon EC2 Spot Instances
Amazon EC2 Spot InstancesAmazon EC2 Spot Instances
Amazon EC2 Spot Instances
 
Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%Run Production Workloads on Spot, Save up to 90%
Run Production Workloads on Spot, Save up to 90%
 
20190306 AWS Black Belt Online Seminar Amazon EC2スポットインスタンス
20190306 AWS Black Belt Online Seminar Amazon EC2スポットインスタンス20190306 AWS Black Belt Online Seminar Amazon EC2スポットインスタンス
20190306 AWS Black Belt Online Seminar Amazon EC2スポットインスタンス
 
Scaling from zero to millions of users
Scaling from zero to millions of usersScaling from zero to millions of users
Scaling from zero to millions of users
 
Serverless best practices plus design principles 20m version
Serverless   best practices plus design principles 20m versionServerless   best practices plus design principles 20m version
Serverless best practices plus design principles 20m version
 
Serverless Architectural Patterns: Collision 2018
Serverless Architectural Patterns: Collision 2018Serverless Architectural Patterns: Collision 2018
Serverless Architectural Patterns: Collision 2018
 

More from Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

More from Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

A Deep Dive into What's New with Amazon EMR (ANT340-R1) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. A Deep Dive into What’s New with Amazon EMR Damon Cortesi Big Data Architect Amazon Web Services A N T 3 4 0 Abhishek Sinha Principal Product Manager Amazon Web Services
  • 3. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. EMR 5.x Release Velocity Existing Apps • Hadoop • Flink • Ganglia • Hbase • HBase on S3 • Hive & Hcatalog • Hue • Mahout • Oozie • Phoenix • Pig Existing Apps • Presto • Spark • Sqoop • Tez • Zeppelin • Zookeeper New Apps this year • JupyterHub • Livy • Tensorflow • MXNET Notable New Features • Detailed guide to migrate to HBase on S3 • EMRFS Role Auditing in CloudTrail • Tensorflow optimizations • SparkMagic and Livy User Impersonation • FairScheduler support for YARN labels
  • 4. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 5. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS Big Data Blog
  • 6. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 7. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 8. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Persistent Clusters
  • 9. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Two architectural patterns Transient Clusters • Large-scale transformation • ETL to other DWH or Data Lake • Building ML jobs Persistent Clusters • Notebooks • Experimentation • Ad-hoc jobs • Streaming • Continuous transformation
  • 10. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 11. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. “Run stateless, Automate everything, Enable self-service”
  • 12. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 13. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 14. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 15. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 16. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 17. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Transient Clusters
  • 18. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Run Stateless • Maintain metastores off cluster • Faster startup time lowers cost Amazon RDS Amazon Redshift Amazon Athena AWS Glue AWS Glue Data Catalog EMR cluster EMR cluster EMR cluster Cluster
  • 19. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Connecting to Hive Metastore aws emr create-cluster --release-label emr-5.19.0 --instance-type m4.large --instance-count 2 --applications Name=Hive --configurations ./hiveConfiguration.json --use-default-roles Note: Limit access to elasticmapreduce:DescribeCluster
  • 20. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Use AWS Glue Data Catalog as Common Metadata Store • Support for Spark, Hive, and Presto • Auto-generate schema and partitions • Managed table updates • Fine-grained access control to databases and tables • Cross-account data catalog access
  • 21. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Connecting to the AWS Glue Data Catalog [ { "Classification": "hive-site", "Properties": { "hive.metastore.client.factory.class": “ com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory", "hive.metastore.glue.catalogid": "acct-id” } } ]
  • 22. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Fine Grained Access Control on Glue Data Catalog "arn:aws:glue:us-east-1:123456789012:catalog", "arn:aws:glue:us-east- 1:123456789012:database/finegrainaccess", "arn:aws:glue:us-east- 1:123456789012:tables/finegrainaccess/dev_*"
  • 23. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Cross-Account AWS Glue Data Catalog Access Data Engineering Account Glue Data Catalog Team A (AWS Account) Team A (AWS Account)
  • 24. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Migrate from Hive Metastore to AWS Glue Data Catalog
  • 25. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Using Amazon S3 to persist your data • Decoupling Storage and Compute • Read depends on aggregate throughput • Write committers • File Output Committer • File Output Committer 2 • Direct Write Committer • EMRFS S3-optimized Committer (Spark and Parquet only)
  • 26. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. EMRFS performance • Committer Performance • Allows you to enable speculative execution, which improves straggler performance • Does not need EMRFS Consistent View • Faster performance with EMRFS Consistent View enabled Optimized committer performance improvements
  • 27. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Integration with S3-Select spark .read .format("s3selectCSV") // "s3selectJson" for Json .schema(...) // optional, but recommended .options(...) // optional .load("s3://path/to/my/datafiles") Available in Spark, Presto and Hive
  • 28. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Presto/S3Select TPCDS-100 Query speed-up -20% -10% 0% 10% 20% 30% 40% 50% 60% 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26
  • 29. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Options to submit jobs Amazon EMR Step API Submit a Spark application Amazon EMR AWS Data Pipeline Airflow, Luigi, or other schedulers on EC2 Create a pipeline to schedule job submission or create complex workflows AWS Step Functions Use AWS Lambda to submit applications to EMR Step API or directly to Spark on your cluster Use Oozie on your cluster to build DAGs of jobs Livy Server
  • 30. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Serverless Job submission: Step Functions & Livy
  • 31. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Serverless Job submission Task state – Invokes a Lambda function. The first Task state submits the Spark job on Amazon EMR, Next Task state is used to retrieve the previous Spark job status. Wait state – Pauses the state machine until a job completes execution. Choice state – Each Spark job execution can return a failure, an error, or a success state. Choice state to create a rule that specifies the next action or Next Step
  • 32. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Visual Graph Step Functions
  • 33. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Spark Application History on the EMR Console
  • 34. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Advanced orchestration Trigger jobs using CloudWatch Events Based on arrival of an event Based on an schedule Alerting in case of Failures Parallel Execution Steps
  • 35. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scale up with Spot Instances 10 node cluster running for 14 hours Cost = 1.0 * 10 * 14 = $140
  • 36. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scale Up Cluster with Spot Instances Add 10 more nodes on Spot
  • 37. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scale Up Cluster with Spot Instances 20 node cluster running for 7 hours Cost = 1.0 * 10 * 7 = $70 = 0.5 * 10 * 7 = $35 Total $105
  • 38. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scale Up Cluster with Spot Instances 50 % less run-time ( 14  7) 25% less cost (140  105)
  • 39. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Scale Up Cluster with Spot Instances On Demand Nodes for worst case Spot Nodes to save cost
  • 40. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. What do customers tell us about Spot Capacity Related “I need a instance type and the capacity is not available” “I am AZ agnostic, which AZ should I use to get the cheapest capacity” Interruption Related “I was able to receive the capacity, but I was interrupted” “I lost the cluster because only one instance was interrupted” “How does Spark behave with interruption”
  • 41. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Spot Interruption
  • 42. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. 3 Reasons for Spot Clusters to fail 1. Requested Capacity in a certain instance type/AZ was not available 2. Requested Capacity in a certain AZ was not available 3. Termination of a single instance caused the entire cluster to fail
  • 43. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. 3 Reasons for Spot Clusters to fail 1. Requested Capacity in a certain instance type/AZ was not available 2. Requested Capacity in a certain AZ was not available Requested capacity available in different AZ or similar instance type 3. Termination of a single instance caused the cluster to fail
  • 44. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 45. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. 3 Reasons for Spot Clusters to fail 1. Requested Capacity in a certain instance type/AZ was not available 2. Requested Capacity in a certain AZ was not available Requested capacity available in different AZ or similar instance type 3. Termination of a single instance caused the cluster to fail New Implementation that changes the way we provision Spot instances
  • 46. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. New Implementation • Faster startup time • Single node termination does not terminate the cluster • Optional Spot bids
  • 47. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Instance Fleets • Can mix different instance types in one group • Can mix different markets (OD or Spot) in one group • Each instance can have different EBS volume options • Choose the total capacity (VCPUs) that you want (say 64) • Diversify your instance types ( c3.xlarge, c4.xlarge, c5.xlarge – all with 4 vcpu and 8 GB RAM) • Don’t specify and AZ and we will find the cheapest one • Template this configuration
  • 48. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Graceful decommissioning in Spark • Decommissioning nodes • Blacklisting Node • Contributed to open source
  • 49. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. How do we template best practices across the organization?
  • 50. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Self-service with EMR & AWS Service Catalog Use Case: Make it easy for your customers to launch EMR clusters on AWS while: • Remove/Reduce EMR/AWS learning curve • Reduce on-boarding time • Ensuring Security & Standards • Adhere to Budgets • Integrating with Internal Processes & Approval workflows • Integrate with best practices
  • 51. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. EMR with AWS Service Catalog Standardize Enforce Consistency and Compliance Limit Access Enforce Tagging, Security Groups Developer Autonomy One-Stop Shop Automate Deployments Agile Governance Configure Consume
  • 52. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Administrator Interaction Creates portfolio and assigns product portfolio 1 Adds constraints, grant access and add tags 2 Creates product Authors template Portfolio BPortfolio A • Users and Roles • Constraints • Tags Service Catalog AWS Service Catalog Administrator 4 3
  • 53. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Browse Products 5 4 3 2 1 Portfolio AWS Service Catalog End-users Select version, Provision Product, configure parameters Deploy Notifications and outputs Notifications and outputs 4 Scheduled functions AWS Administrato r End User Interaction
  • 54. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Before: EMR Cluster Request Process Data Scientist IT Team AWS Account Logs into applications on cluster EMR Cluster
  • 55. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. EMR Self-Service with Service Catalog (SC) Data Scientist IT Team AWS Account AWS Service Catalog EMR Cluster Requests cluster AWS Management Console
  • 56. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 57. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Persistent Clusters
  • 58. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. “Scale up and down”
  • 59. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Introducing EMR Notebooks • De-couple Notebooks from Clusters • Based on Open Source Jupyter Notebooks • Attach to a cluster to run jobs • Multiple users can attach to the same cluster (set Autoscaling on a cluster) • Detach and Attach to other clusters • Save notebooks to Amazon S3 • Tag-based permissions
  • 60. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notebooks Off-cluster notebook based on Jupyter notebook application Amazon EMR clusters User S3 bucket AWS Management Console for EMR EMR managed notebook based on Jupyter notebook users Customer VPC EMR VPC
  • 61. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Auto Scaling Clusters Scaling options Threshold CloudWatch or custom metric
  • 62. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Autoscaling Clusters
  • 63. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Auto Scaling • EMR scales-in at YARN task completion • Selectively removes nodes with no running tasks • yarn.resourcemanager.decommissioning.timeout • Default timeout is one hour • Dynamically scale storage (HDFS and EBS) • Spark scale-in contributions • Spark specific blacklisting of tasks • Unregistering cached data and shuffle blocks • Advanced error handling
  • 64. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Multi-Master support for EMR Applications Application Multi-master behavior HA details Notes YARN HA Active/Standby with automatic failover and recovery Limited to YARN ResourceManager service HDFS HA Active/Standby with automatic failover using quorum journaling Limited to HDFS NameNode/metadata service HBase HA Active/Standby utilizing zookeeper Zookeeper HA Ensemble with automatic quorum Ganglia Available on all masters HA agnostic Hive HA (service components only) Active/Standby Limited to Metastore and HiveServer2; Requires external metastore database or catalog Spark Job specific Supported by YARN and HDFS HA Requires job designed for fault recovery Flink Job specific YARN HA session supported by YARN, HDFS and Zookeeper HA Requires job designed for fault recovery Livy Non-HA, single master only State recovery supported Oozie Non-HA, single master only Hue Non-HA, single master only Zeppelin Non-HA, single master only JupyterHub Non-HA, single master only
  • 65. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Reconfiguration of EMR Applications Reconfigure applications on a running cluster Reconfiguration applied to each instance group Rolling restart of data nodes to prevent data loss Automatic revert to last successfully applied version on failure aws emr modify-instance-groups --instance-groups InstanceGroupId=ig- 123,Configurations=file://new-configurations.json Coming Soon
  • 66. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 67. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Thank you!
  • 68. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.