© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
Using Amazon EMR Notebooks to
develop Apache Spark applications
Radhika Ravirala
Specialist solutions architect
Amazon EMR, Amazon Athena, and AWS Glue
ADB202
Agenda
• Amazon EMR Notebooks overview
• Why should I use it?
• How it works – Demo!
• Accessing notebooks
• Q&A
Amazon EMR Notebooks
A managed analytics environment based on Jupyter notebooks
[Architecture diagram. Labels: Users; AWS Management Console for EMR; Amazon EMR–managed notebook based on Jupyter notebook; Amazon EMR clusters; user's Amazon Simple Storage Service (Amazon S3) bucket; customer virtual private cloud (VPC); Amazon EMR VPC]
As a data scientist or developer
• Build applications, prepare and visualize data, collaborate with peers, and run interactive analysis through PySpark, Spark SQL, SparkR, and Scala
• Create multiple notebooks instantly from the console, attach them to an EMR cluster with as few as one node, and immediately start experimenting with Apache Spark
• Monitor the progress of your job from within the notebook with the integrated Spark monitor
• Visualize your results in rich graphical plots using the preinstalled open-source libraries from Anaconda
• Detach your notebook from a cluster and re-attach it to a different cluster suited to your workload
• Durably persist your work to Amazon S3 and share it with others; easily retrieve saved work from the console
As an IT administrator
• Easily set up a multi-tenant cluster for your data scientists and developers and make the most of your EMR cluster
• No need to deploy, maintain, and upgrade software or notebook instances
• Provide secure access to notebooks without providing multiuser access to the master node
• Safely terminate your cluster without fear of losing notebooks
• Set up fine-grained access control to notebooks and clusters through AWS Identity and Access Management (IAM) policies
• Track and audit your notebook users by enforcing user impersonation
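The fine-grained access control mentioned above could look something like the following sketch, which builds an IAM policy document that lets a user work only with notebooks tagged as created by that user. The action names and the `creatorUserId` tag key are assumptions based on the `elasticmapreduce` IAM namespace and are not taken from the slides; the helper name `notebook_user_policy` is hypothetical. Check the current Amazon EMR documentation before using any of it.

```python
import json

# Assumed notebook-related actions in the elasticmapreduce IAM namespace;
# verify the exact names against the current Amazon EMR documentation.
NOTEBOOK_ACTIONS = [
    "elasticmapreduce:DescribeEditor",
    "elasticmapreduce:StartEditor",
    "elasticmapreduce:StopEditor",
    "elasticmapreduce:OpenEditorInConsole",
]


def notebook_user_policy(creator_tag_key="creatorUserId"):
    """Build a policy that allows the actions above only on notebooks
    whose creator tag matches the calling user's ID.

    The tag key is an assumption (hypothetical default), not from the slides.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": NOTEBOOK_ACTIONS,
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        f"elasticmapreduce:ResourceTag/{creator_tag_key}": "${aws:userid}"
                    }
                },
            }
        ],
    }


# Serialize the policy so it can be pasted into the IAM console or a template.
policy_json = json.dumps(notebook_user_policy(), indent=2)
print(policy_json)
```

The tag-based condition is one possible design; an administrator could equally scope the `Resource` element to specific notebook ARNs.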
How do I access EMR notebooks?
• Log in to the AWS Management Console for Amazon EMR
• Create an IAM user policy that grants permissions to use notebooks
• Use the default service role for notebooks with an AWS managed policy, or provide your own service role with custom policies for the notebook
• Launch a cluster, or use a running one, with Amazon EMR release 5.18.0 or later and with Spark and Livy installed
• Provide an Amazon S3 path in the same region as your Amazon EMR cluster
• Click the “Create notebook” button to open the notebook user interface
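The cluster and storage prerequisites in this list can be expressed as a small pre-flight check. The sketch below is illustrative only: the function names are hypothetical, and it assumes the release is given as a label of the form `emr-5.18.0` and that the caller already knows the cluster's applications and both regions. It makes no AWS calls.

```python
import re

MIN_RELEASE = (5, 18, 0)           # from the slide: release 5.18.0 or later
REQUIRED_APPS = {"spark", "livy"}  # from the slide: Spark and Livy installed


def parse_release(label):
    """Turn a release label such as 'emr-5.18.0' into the tuple (5, 18, 0)."""
    match = re.fullmatch(r"emr-(\d+)\.(\d+)\.(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognized release label: {label!r}")
    return tuple(int(part) for part in match.groups())


def notebook_prerequisite_problems(release_label, applications,
                                   cluster_region, bucket_region):
    """Return a list of human-readable problems; an empty list means the
    cluster and S3 location satisfy the prerequisites listed above."""
    problems = []
    if parse_release(release_label) < MIN_RELEASE:
        problems.append(f"release {release_label} is older than emr-5.18.0")
    missing = REQUIRED_APPS - {app.lower() for app in applications}
    if missing:
        problems.append("missing applications: " + ", ".join(sorted(missing)))
    if cluster_region != bucket_region:
        problems.append(
            f"S3 bucket region {bucket_region} differs from cluster region {cluster_region}"
        )
    return problems


# Example: an older cluster without Livy and with a bucket in another region.
for problem in notebook_prerequisite_problems(
        "emr-5.17.0", ["Spark", "Hive"], "us-east-1", "us-west-2"):
    print(problem)
```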
Thank you!
Radhika Ravirala

More Related Content

What's hot

Running Amazon EC2 workloads at scale - CMP301 - New York AWS Summit
Running Amazon EC2 workloads at scale - CMP301 - New York AWS SummitRunning Amazon EC2 workloads at scale - CMP301 - New York AWS Summit
Running Amazon EC2 workloads at scale - CMP301 - New York AWS SummitAmazon Web Services
 
Securely deliver applications with AWS - SVC305 - Atlanta AWS Summit
Securely deliver applications with AWS - SVC305 - Atlanta AWS SummitSecurely deliver applications with AWS - SVC305 - Atlanta AWS Summit
Securely deliver applications with AWS - SVC305 - Atlanta AWS SummitAmazon Web Services
 
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS SummitIntroducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS SummitAmazon Web Services
 
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS Summit
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS SummitAI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS Summit
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS SummitAmazon Web Services
 
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...Amazon Web Services
 
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...Amazon Web Services
 
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Amazon Web Services
 
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...Amazon Web Services
 
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...Amazon Web Services
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Amazon Web Services
 
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...Making CI/CD pipelines safer with application monitoring and tracing - MAD202...
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...Amazon Web Services
 
Increasing the value of video with machine learning & AWS Media Services - SV...
Increasing the value of video with machine learning & AWS Media Services - SV...Increasing the value of video with machine learning & AWS Media Services - SV...
Increasing the value of video with machine learning & AWS Media Services - SV...Amazon Web Services
 
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...Amazon Web Services
 
Scalable serverless architectures using event-driven design - MAD301 - Atlant...
Scalable serverless architectures using event-driven design - MAD301 - Atlant...Scalable serverless architectures using event-driven design - MAD301 - Atlant...
Scalable serverless architectures using event-driven design - MAD301 - Atlant...Amazon Web Services
 
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...Amazon Web Services
 
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...Amazon Web Services
 
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...Amazon Web Services
 
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...Amazon Web Services
 
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...Amazon Web Services
 
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Amazon Web Services
 

What's hot (20)

Running Amazon EC2 workloads at scale - CMP301 - New York AWS Summit
Running Amazon EC2 workloads at scale - CMP301 - New York AWS SummitRunning Amazon EC2 workloads at scale - CMP301 - New York AWS Summit
Running Amazon EC2 workloads at scale - CMP301 - New York AWS Summit
 
Securely deliver applications with AWS - SVC305 - Atlanta AWS Summit
Securely deliver applications with AWS - SVC305 - Atlanta AWS SummitSecurely deliver applications with AWS - SVC305 - Atlanta AWS Summit
Securely deliver applications with AWS - SVC305 - Atlanta AWS Summit
 
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS SummitIntroducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS Summit
Introducing Open Distro for Elasticsearch - ADB201 - Atlanta AWS Summit
 
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS Summit
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS SummitAI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS Summit
AI Powered Speech Analytics for Amazon Connect - SVC305 - New York AWS Summit
 
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...
Visually developing IoT applications using AWS IoT Things Graph - SVC207 - Ch...
 
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...
Accelerating your Cloud Migration with VMware Cloud on AWS - SVC210 - Atlanta...
 
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
Sizing Amazon Elasticsearch Service for your workload - ADB303 - Santa Clara ...
 
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...
CI/CD best practices for building modern applications - MAD302 - Atlanta AWS ...
 
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...
Driving Overall Equipment Effectiveness with AWS IoT SiteWise - SVC213 - Chic...
 
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
Introduction to the Well-Architected Framework and Tool - SVC212 - Chicago AW...
 
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...Making CI/CD pipelines safer with application monitoring and tracing - MAD202...
Making CI/CD pipelines safer with application monitoring and tracing - MAD202...
 
Increasing the value of video with machine learning & AWS Media Services - SV...
Increasing the value of video with machine learning & AWS Media Services - SV...Increasing the value of video with machine learning & AWS Media Services - SV...
Increasing the value of video with machine learning & AWS Media Services - SV...
 
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...
Move desktops & applications to AWS with Amazon WorkSpaces & AppStream 2.0 - ...
 
Scalable serverless architectures using event-driven design - MAD301 - Atlant...
Scalable serverless architectures using event-driven design - MAD301 - Atlant...Scalable serverless architectures using event-driven design - MAD301 - Atlant...
Scalable serverless architectures using event-driven design - MAD301 - Atlant...
 
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...
Mythical Mysfits: Build & collaborate on a modern web application on AWS - MA...
 
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...
Next generation intelligent data lakes, powered by GraphQL & AWS AppSync - MA...
 
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...
A tale of two customers - Simplified data protection with Veeam, N2WS & AWS -...
 
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...
Migrate your Oracle and SQL Server databases to Amazon RDS - ADB210 - New Yor...
 
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...
Deep dive on AWS Cloud storage offerings - What to use, where, and why - STG3...
 
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...Machine learning for developers & data scientists with Amazon SageMaker - AIM...
Machine learning for developers & data scientists with Amazon SageMaker - AIM...
 

Similar to Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - Atlanta AWS Summit

How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...Amazon Web Services
 
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...Amazon Web Services
 
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...Amazon Web Services
 
利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統Amazon Web Services
 
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...Amazon Web Services
 
Building application and migrating workload to AWS
Building application and migrating workload to AWSBuilding application and migrating workload to AWS
Building application and migrating workload to AWSAmazon Web Services
 
Serverless data prep with AWS Glue - ADB306 - New York AWS Summit
Serverless data prep with AWS Glue - ADB306 - New York AWS SummitServerless data prep with AWS Glue - ADB306 - New York AWS Summit
Serverless data prep with AWS Glue - ADB306 - New York AWS SummitAmazon Web Services
 
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...Amazon Web Services
 
Control your cloud environment with AWS management tools
Control your cloud environment with AWS management toolsControl your cloud environment with AWS management tools
Control your cloud environment with AWS management toolsAmazon Web Services
 
Breaking Up the Monolith with Containers
Breaking Up the Monolith with ContainersBreaking Up the Monolith with Containers
Breaking Up the Monolith with ContainersAmazon Web Services
 
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...Amazon Web Services Japan
 
Scaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersScaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersAmazon Web Services
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...AWS Summits
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Amazon Web Services
 
Application Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternApplication Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternTom Laszewski
 
Building well architected .NET applications - SVC209 - Atlanta AWS Summit
Building well architected .NET applications - SVC209 - Atlanta AWS SummitBuilding well architected .NET applications - SVC209 - Atlanta AWS Summit
Building well architected .NET applications - SVC209 - Atlanta AWS SummitAmazon Web Services
 
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitAmazon Web Services
 
AWS ECS Workshop A Journey to Modern Applications
AWS ECS Workshop A Journey to Modern ApplicationsAWS ECS Workshop A Journey to Modern Applications
AWS ECS Workshop A Journey to Modern ApplicationsAmazon Web Services
 
Getting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesGetting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesAmazon Web Services
 
Wild rydes serverless website workshop
Wild rydes   serverless website workshopWild rydes   serverless website workshop
Wild rydes serverless website workshopAmazon Web Services
 

Similar to Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - Atlanta AWS Summit (20)

How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
How to Use Jupyter Notebooks with Amazon EMR for Better Productivity (ANT387)...
 
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...
Best practices for migrating big data workloads to Amazon EMR - ADB204 - Chic...
 
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...
Big Data Meets Machine Learning: Architecting Spark Environment for Data Scie...
 
利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統利用 Fargate - 無伺服器的容器環境建置高可用的系統
利用 Fargate - 無伺服器的容器環境建置高可用的系統
 
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...
Build, Deploy, and Serve Machine-Learning Models on Streaming Data Using Amaz...
 
Building application and migrating workload to AWS
Building application and migrating workload to AWSBuilding application and migrating workload to AWS
Building application and migrating workload to AWS
 
Serverless data prep with AWS Glue - ADB306 - New York AWS Summit
Serverless data prep with AWS Glue - ADB306 - New York AWS SummitServerless data prep with AWS Glue - ADB306 - New York AWS Summit
Serverless data prep with AWS Glue - ADB306 - New York AWS Summit
 
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...
Ditching the overhead - Moving Apache Kafka workloads into Amazon MSK - ADB30...
 
Control your cloud environment with AWS management tools
Control your cloud environment with AWS management toolsControl your cloud environment with AWS management tools
Control your cloud environment with AWS management tools
 
Breaking Up the Monolith with Containers
Breaking Up the Monolith with ContainersBreaking Up the Monolith with Containers
Breaking Up the Monolith with Containers
 
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...
20191127 AWS Black Belt Online Seminar Amazon CloudWatch Container Insights で...
 
Scaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M UsersScaling Up To and Beyond 10M Users
Scaling Up To and Beyond 10M Users
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
 
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
Microservices on AWS: Architectural Patterns and Best Practices | AWS Summit ...
 
Application Modernization using the Strangler Pattern
Application Modernization using the Strangler PatternApplication Modernization using the Strangler Pattern
Application Modernization using the Strangler Pattern
 
Building well architected .NET applications - SVC209 - Atlanta AWS Summit
Building well architected .NET applications - SVC209 - Atlanta AWS SummitBuilding well architected .NET applications - SVC209 - Atlanta AWS Summit
Building well architected .NET applications - SVC209 - Atlanta AWS Summit
 
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS SummitBuild your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
Build your own log analytics solution on AWS - ADB301 - Atlanta AWS Summit
 
AWS ECS Workshop A Journey to Modern Applications
AWS ECS Workshop A Journey to Modern ApplicationsAWS ECS Workshop A Journey to Modern Applications
AWS ECS Workshop A Journey to Modern Applications
 
Getting Started with Serverless Architectures
Getting Started with Serverless ArchitecturesGetting Started with Serverless Architectures
Getting Started with Serverless Architectures
 
Wild rydes serverless website workshop
Wild rydes   serverless website workshopWild rydes   serverless website workshop
Wild rydes serverless website workshop
 

More from Amazon Web Services

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Amazon Web Services
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Amazon Web Services
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateAmazon Web Services
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSAmazon Web Services
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Amazon Web Services
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Amazon Web Services
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...Amazon Web Services
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsAmazon Web Services
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareAmazon Web Services
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSAmazon Web Services
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAmazon Web Services
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareAmazon Web Services
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWSAmazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckAmazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without serversAmazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...Amazon Web Services
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceAmazon Web Services
 

Using Amazon EMR Notebooks to develop Apache Spark applications - ADB202 - Atlanta AWS Summit

  • 1. © 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT. Using Amazon EMR Notebooks to develop Apache Spark applications. Radhika Ravirala, Specialist solutions architect, Amazon EMR, Amazon Athena, and AWS Glue. ADB202
  • 2. Agenda
      • Amazon EMR Notebooks overview
      • Why should I use it?
      • How it works – Demo!
      • Accessing notebooks
      • Q&A
  • 3. Amazon EMR Notebooks: a managed analytics environment based on Jupyter notebooks. [Architecture diagram; labels: Users, AWS Management Console for EMR, Amazon EMR–managed notebook based on Jupyter notebook, Amazon EMR VPC, Customer virtual private cloud (VPC), Amazon EMR clusters, User Amazon Simple Storage Service (Amazon S3) bucket]
  • 5. As a data scientist or developer
      • Build applications, prepare and visualize data, collaborate with peers, and run interactive analysis through PySpark, Spark SQL, SparkR, and Scala
      • Create multiple notebooks instantly from the console, attach them to an EMR cluster with as few as one node, and immediately start experimenting with Apache Spark
      • Monitor the progress of your job from within the notebook with the integrated Spark monitor
      • Visualize your results in rich graphical plots using the preinstalled open-source libraries from Anaconda
      • Detach your notebook from a cluster and re-attach it to a different cluster suitable for your workload
      • Durably persist your work to Amazon S3 and share it with others; easily retrieve saved work from the console
  • 6. As an IT administrator
      • Easily set up a multi-tenant cluster for your data scientists and developers and make the most of your EMR cluster
      • No need to deploy, maintain, and upgrade software or notebook instances
      • Provide secure access to notebooks without providing multiuser access to the master node
      • Safely terminate your cluster without fear of losing notebooks
      • Set up fine-grained access control to notebooks and clusters through AWS Identity and Access Management (IAM) policies
      • Track and audit your notebook users by enforcing user impersonation
  • 8. How do I access EMR notebooks?
      • Log in to the AWS Management Console for Amazon EMR
      • Create an IAM user policy that grants permission to use notebooks
      • Use the default service role for notebooks with an AWS managed policy, or provide your own service role with custom policies for the notebook
      • Launch a cluster, or use a running one, with Amazon EMR release 5.18.0 or later and with Spark and Livy installed
      • Provide an Amazon S3 path in the same region as your Amazon EMR cluster
      • Click the “Create notebook” button to open the notebook user interface
  • 10. Thank you! Radhika Ravirala
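
The cluster prerequisites listed on slide 8 (Amazon EMR release 5.18.0 or later, Spark and Livy installed, an Amazon S3 path in the same region as the cluster) can be checked before creating a notebook. The sketch below is not part of the talk and calls no AWS API. The function names and the idea of passing the release label, application list, and regions as plain values are assumptions made for illustration; in practice those values would come from your own cluster description.

```python
# Minimal sketch: validate the slide-8 prerequisites from plain values.
# All names here are hypothetical helpers, not part of any AWS SDK.

MIN_RELEASE = (5, 18, 0)           # "release 5.18.0 or later" from the slide
REQUIRED_APPS = {"spark", "livy"}  # "with Spark and Livy installed"


def parse_release(label):
    """Turn a release label such as 'emr-5.18.0' into the tuple (5, 18, 0)."""
    version = label.lower()
    if version.startswith("emr-"):
        version = version[len("emr-"):]
    return tuple(int(part) for part in version.split("."))


def notebook_prerequisite_problems(release_label, applications,
                                   cluster_region, s3_region):
    """Return a list of unmet prerequisites; an empty list means all are met."""
    problems = []
    if parse_release(release_label) < MIN_RELEASE:
        problems.append("release earlier than 5.18.0")
    installed = {app.lower() for app in applications}
    for app in sorted(REQUIRED_APPS - installed):
        problems.append("missing application: " + app)
    if cluster_region != s3_region:
        problems.append("S3 path is in a different region than the cluster")
    return problems


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(notebook_prerequisite_problems(
        "emr-5.17.0", ["Spark"], "us-east-1", "us-west-2"))
```

An empty list means the cluster meets the three conditions named on the slide. The IAM user policy and service role from the same slide are separate steps and are not covered here.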