SlideShare a Scribd company logo
1 of 15
Download to read offline
AWS SummitEMEA
Claudio Pontili
AWS Senior Cloud Solution Architect
Claudio.pontili@besharp.it
AServerless approachtoBigDataonAWS
Claudio Pontili
10+ years of experience on AWS
Senior Cloud Solution Architect
AWS Authorized Instructor Champion
Claudio.pontili@besharp.it
https://www.linkedin.com/in/claudiopontili/
Agenda
• Using Lambda for ETLs
• Glue ETLs
• CI/CD to deploy code inside Lambdas and Glue Jobs
• Datawarehousing on Aurora Serverless v1
• A full serverless Big Data Architecture
• What we’ve learned
Using Lambda for ETLs 1/2
• You can use Python + Panda library
• A lambda can have 10 GB of memory and a lot of CPU
power
• A lambda can run for 15 minutes
• Max deployment package 50 MB (zipped)
• Container image code package size 10 GB
• /tmp directory storage 512 MB
Using Lambda for ETLs 2/2
Glue Jobs, Data Catalog and Crawler 1/2
• Fully managed Data Catlog and Extract-Transform-Load (ETL) service
• Automates data discovery, conversion, mapping and job scheduling
• Glue runs your ETL jobs in an Apache Spark serverless envinronment
• Allow to scale your ETLs jobs
• Can easily schedule a crawler to to create a catalog of files stored on
S3
• Too much code? Try Glue Databrew
Glue Databrew
Glue Jobs, Data Catalog and Crawler 2/2
A serverless CI/CD
RDS Aurora Serverless
• MySql and PostgreSQL supported (reuse
the experience of your team)
• Pay per ACU/hours (2 GB of memory)
• Scales from 1 to 256 ACU
• You can pause the cluster during the night
• Aurora Serverless v2 in preview for
MultiAz, Read-Replicas, faster scale
Glue Jobs, Data Catalog and Crawler 2/2
What we’ve learned
• Serverless gives you High Availability and
great scalability with no effort
• Pause Aurora Serverless v1 (it will take
about 30 seconds to restart)
• Use IaC (Cloudformation, Terraform, CDK,
etc) to deploy your infrastructure
• Tune your lamba memory using
https://github.com/alexcasalboni/aws-
lambda-power-tuning
• S3 is cheap but try not to write tiny
(<128KB) files
• Serverless can be pretty cheap if it’s used
in the right way
Questions?
www.besharp.it
info@besharp.it
+39 0382 1692920
beSharp srl - viale Ludovico il Moro 27 - 27100 Pavia (ITALY)
VAT ID IT02415160189
Follow beSharp on

More Related Content

What's hot

Benchmarking Aerospike on the Google Cloud - NoSQL Speed with Ease
Benchmarking Aerospike on the Google Cloud - NoSQL Speed with EaseBenchmarking Aerospike on the Google Cloud - NoSQL Speed with Ease
Benchmarking Aerospike on the Google Cloud - NoSQL Speed with EaseLynn Langit
 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Dave Pitts
 
AWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsAWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsMongoDB
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018Rachit Arora
 
Mining public datasets using opensource tools: Zeppelin, Spark and Juju
Mining public datasets using opensource tools: Zeppelin, Spark and JujuMining public datasets using opensource tools: Zeppelin, Spark and Juju
Mining public datasets using opensource tools: Zeppelin, Spark and Jujuseoul_engineer
 
Cloudsolutionday 2016: Getting Started with Severless Architecture
Cloudsolutionday 2016: Getting Started with Severless ArchitectureCloudsolutionday 2016: Getting Started with Severless Architecture
Cloudsolutionday 2016: Getting Started with Severless ArchitectureAWS Vietnam Community
 
Serverless log analytics with Amazon Kinesis
Serverless log analytics with Amazon KinesisServerless log analytics with Amazon Kinesis
Serverless log analytics with Amazon KinesisRob Greenwood
 
Meetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSMeetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSAWS Vietnam Community
 
Getting Started with Serverless PHP
Getting Started with Serverless PHPGetting Started with Serverless PHP
Getting Started with Serverless PHPAndrew Raines
 
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...ScyllaDB
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real lifeDataArt
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginerYousun Jeong
 
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzig
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup LeipzigVisualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzig
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzigroot360 GmbH
 
Apache Superset at Airbnb
Apache Superset at AirbnbApache Superset at Airbnb
Apache Superset at AirbnbBill Liu
 
Beyond Relational
Beyond RelationalBeyond Relational
Beyond RelationalLynn Langit
 
Meetup #3: Migrating an Oracle Application from on-premise to AWS
Meetup #3: Migrating an Oracle Application from on-premise to AWSMeetup #3: Migrating an Oracle Application from on-premise to AWS
Meetup #3: Migrating an Oracle Application from on-premise to AWSAWS Vietnam Community
 
Apache Cassandra in the Cloud
Apache Cassandra in the CloudApache Cassandra in the Cloud
Apache Cassandra in the CloudInstaclustr
 
Long running aws lambda - Joel Schuweiler, Minneapolis
Long running aws lambda -  Joel Schuweiler, MinneapolisLong running aws lambda -  Joel Schuweiler, Minneapolis
Long running aws lambda - Joel Schuweiler, MinneapolisAWS Chicago
 

What's hot (19)

Benchmarking Aerospike on the Google Cloud - NoSQL Speed with Ease
Benchmarking Aerospike on the Google Cloud - NoSQL Speed with EaseBenchmarking Aerospike on the Google Cloud - NoSQL Speed with Ease
Benchmarking Aerospike on the Google Cloud - NoSQL Speed with Ease
 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
Why learn jenkins via nomad_ci (nomad/consul/docker/jenkins) 
 
AWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and ResultsAWS to Bare Metal: Motivation, Pitfalls, and Results
AWS to Bare Metal: Motivation, Pitfalls, and Results
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
 
Mining public datasets using opensource tools: Zeppelin, Spark and Juju
Mining public datasets using opensource tools: Zeppelin, Spark and JujuMining public datasets using opensource tools: Zeppelin, Spark and Juju
Mining public datasets using opensource tools: Zeppelin, Spark and Juju
 
London HUG 8/3 - Nomad
London HUG 8/3 - NomadLondon HUG 8/3 - Nomad
London HUG 8/3 - Nomad
 
Cloudsolutionday 2016: Getting Started with Severless Architecture
Cloudsolutionday 2016: Getting Started with Severless ArchitectureCloudsolutionday 2016: Getting Started with Severless Architecture
Cloudsolutionday 2016: Getting Started with Severless Architecture
 
Serverless log analytics with Amazon Kinesis
Serverless log analytics with Amazon KinesisServerless log analytics with Amazon Kinesis
Serverless log analytics with Amazon Kinesis
 
Meetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWSMeetup #3: Migrate a fast scale system to AWS
Meetup #3: Migrate a fast scale system to AWS
 
Getting Started with Serverless PHP
Getting Started with Serverless PHPGetting Started with Serverless PHP
Getting Started with Serverless PHP
 
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...
Scylla Summit 2022: Rakuten’s Catalog Platform Migration from Cassandra to Sc...
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real life
 
Kafka for begginer
Kafka for begginerKafka for begginer
Kafka for begginer
 
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzig
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup LeipzigVisualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzig
Visualization of RDS metrics using AWS CLI and JQuery at AWS Usergroup Leipzig
 
Apache Superset at Airbnb
Apache Superset at AirbnbApache Superset at Airbnb
Apache Superset at Airbnb
 
Beyond Relational
Beyond RelationalBeyond Relational
Beyond Relational
 
Meetup #3: Migrating an Oracle Application from on-premise to AWS
Meetup #3: Migrating an Oracle Application from on-premise to AWSMeetup #3: Migrating an Oracle Application from on-premise to AWS
Meetup #3: Migrating an Oracle Application from on-premise to AWS
 
Apache Cassandra in the Cloud
Apache Cassandra in the CloudApache Cassandra in the Cloud
Apache Cassandra in the Cloud
 
Long running aws lambda - Joel Schuweiler, Minneapolis
Long running aws lambda -  Joel Schuweiler, MinneapolisLong running aws lambda -  Joel Schuweiler, Minneapolis
Long running aws lambda - Joel Schuweiler, Minneapolis
 

Similar to beSharp a serverless approach to big data on aws

Scality S3 Server: Node js Meetup Presentation
Scality S3 Server: Node js Meetup PresentationScality S3 Server: Node js Meetup Presentation
Scality S3 Server: Node js Meetup PresentationScality
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudQubole
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesRose Toomey
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesRose Toomey
 
re:Cap RVA - A Recap of AWS re:Invent 2019
re:Cap RVA - A Recap of AWS re:Invent 2019re:Cap RVA - A Recap of AWS re:Invent 2019
re:Cap RVA - A Recap of AWS re:Invent 2019Ford Prior
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersAmazon Web Services
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesAmazon Web Services
 
Serverless design considerations for Cloud Native workloads
Serverless design considerations for Cloud Native workloadsServerless design considerations for Cloud Native workloads
Serverless design considerations for Cloud Native workloadsTensult
 
Training AWS: Module 8 - RDS, Aurora, ElastiCache
Training AWS: Module 8 - RDS, Aurora, ElastiCacheTraining AWS: Module 8 - RDS, Aurora, ElastiCache
Training AWS: Module 8 - RDS, Aurora, ElastiCacheBùi Quang Lâm
 
Five Years of EC2 Distilled
Five Years of EC2 DistilledFive Years of EC2 Distilled
Five Years of EC2 DistilledGrig Gheorghiu
 
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenJ1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenMS Cloud Summit
 
AWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataAWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataWeCloudData
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely
 
Going Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS GlueGoing Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS GlueMichael Rainey
 
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...AWS Chicago
 
Running SQL Server on AWS | John McCormack | DataGrillen 2019
Running SQL Server on AWS | John McCormack | DataGrillen 2019Running SQL Server on AWS | John McCormack | DataGrillen 2019
Running SQL Server on AWS | John McCormack | DataGrillen 2019John McCormack
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...Radhika Puthiyetath
 
5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency DatabaseScyllaDB
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudCAMMS
 

Similar to beSharp a serverless approach to big data on aws (20)

Scality S3 Server: Node js Meetup Presentation
Scality S3 Server: Node js Meetup PresentationScality S3 Server: Node js Meetup Presentation
Scality S3 Server: Node js Meetup Presentation
 
Optimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public CloudOptimizing Big Data to run in the Public Cloud
Optimizing Big Data to run in the Public Cloud
 
Leveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark PipelinesLeveraging Databricks for Spark Pipelines
Leveraging Databricks for Spark Pipelines
 
Leveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelinesLeveraging Databricks for Spark pipelines
Leveraging Databricks for Spark pipelines
 
re:Cap RVA - A Recap of AWS re:Invent 2019
re:Cap RVA - A Recap of AWS re:Invent 2019re:Cap RVA - A Recap of AWS re:Invent 2019
re:Cap RVA - A Recap of AWS re:Invent 2019
 
Scaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million UsersScaling Up to Your First 10 Million Users
Scaling Up to Your First 10 Million Users
 
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar SeriesIntroducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
Introducing Amazon EMR Release 5.0 - August 2016 Monthly Webinar Series
 
Serverless design considerations for Cloud Native workloads
Serverless design considerations for Cloud Native workloadsServerless design considerations for Cloud Native workloads
Serverless design considerations for Cloud Native workloads
 
Training AWS: Module 8 - RDS, Aurora, ElastiCache
Training AWS: Module 8 - RDS, Aurora, ElastiCacheTraining AWS: Module 8 - RDS, Aurora, ElastiCache
Training AWS: Module 8 - RDS, Aurora, ElastiCache
 
Five Years of EC2 Distilled
Five Years of EC2 DistilledFive Years of EC2 Distilled
Five Years of EC2 Distilled
 
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. NielsenJ1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
J1 T1 3 - Azure Data Lake store & analytics 101 - Kenneth M. Nielsen
 
AWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudDataAWS Well Architected-Info Session WeCloudData
AWS Well Architected-Info Session WeCloudData
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Going Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS GlueGoing Serverless - an Introduction to AWS Glue
Going Serverless - an Introduction to AWS Glue
 
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...
AWS Community Day 2022 Shirish Joshi_Choosing between RDS and Aurora for MySQ...
 
Running SQL Server on AWS | John McCormack | DataGrillen 2019
Running SQL Server on AWS | John McCormack | DataGrillen 2019Running SQL Server on AWS | John McCormack | DataGrillen 2019
Running SQL Server on AWS | John McCormack | DataGrillen 2019
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
 
5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database5 Factors When Selecting a High Performance, Low Latency Database
5 Factors When Selecting a High Performance, Low Latency Database
 
Best of re:Invent
Best of re:InventBest of re:Invent
Best of re:Invent
 
Move your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in CloudMove your on prem data to a lake in a Lake in Cloud
Move your on prem data to a lake in a Lake in Cloud
 

More from Claudio Pontili

Infrastructure as code for enterprises
Infrastructure as code for enterprisesInfrastructure as code for enterprises
Infrastructure as code for enterprisesClaudio Pontili
 
(New)SQL on AWS: Aurora serverless
(New)SQL on AWS: Aurora serverless(New)SQL on AWS: Aurora serverless
(New)SQL on AWS: Aurora serverlessClaudio Pontili
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutionsClaudio Pontili
 
The Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityThe Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityClaudio Pontili
 
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...Claudio Pontili
 
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web Services
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web ServicesSeminario Cloud computing Ordine di latina - L'offerta di Amazon Web Services
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web ServicesClaudio Pontili
 
Seminario Cloud computing Ordine di latina - cloud computing
Seminario Cloud computing Ordine di latina - cloud computingSeminario Cloud computing Ordine di latina - cloud computing
Seminario Cloud computing Ordine di latina - cloud computingClaudio Pontili
 
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...Claudio Pontili
 

More from Claudio Pontili (9)

Infrastructure as code for enterprises
Infrastructure as code for enterprisesInfrastructure as code for enterprises
Infrastructure as code for enterprises
 
(New)SQL on AWS: Aurora serverless
(New)SQL on AWS: Aurora serverless(New)SQL on AWS: Aurora serverless
(New)SQL on AWS: Aurora serverless
 
Big problems Big Data, simple solutions
Big problems Big Data, simple solutionsBig problems Big Data, simple solutions
Big problems Big Data, simple solutions
 
The Fermilab HEPCloud Facility
The Fermilab HEPCloud FacilityThe Fermilab HEPCloud Facility
The Fermilab HEPCloud Facility
 
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...
Seminario Cloud computing Ordine di latina - Caso d'uso realizzazione sito wo...
 
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web Services
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web ServicesSeminario Cloud computing Ordine di latina - L'offerta di Amazon Web Services
Seminario Cloud computing Ordine di latina - L'offerta di Amazon Web Services
 
Seminario Cloud computing Ordine di latina - cloud computing
Seminario Cloud computing Ordine di latina - cloud computingSeminario Cloud computing Ordine di latina - cloud computing
Seminario Cloud computing Ordine di latina - cloud computing
 
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...
GARANTE DELLA PRIVACY - Cloud computing proteggere i dati per non cadere dall...
 
Fermilab aws on demand
Fermilab aws on demandFermilab aws on demand
Fermilab aws on demand
 

Recently uploaded

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一F sss
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort servicejennyeacort
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998YohFuh
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfJohn Sterrett
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]📊 Markus Baersch
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)jennyeacort
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 

Recently uploaded (20)

PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
办理学位证中佛罗里达大学毕业证,UCF成绩单原版一比一
 
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
9711147426✨Call In girls Gurgaon Sector 31. SCO 25 escort service
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998RA-11058_IRR-COMPRESS Do 198 series of 1998
RA-11058_IRR-COMPRESS Do 198 series of 1998
 
B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdfDBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 

beSharp a serverless approach to big data on aws

  • 1. AWS SummitEMEA Claudio Pontili AWS Senior Cloud Solution Architect Claudio.pontili@besharp.it AServerless approachtoBigDataonAWS
  • 2.
  • 3. Claudio Pontili 10+ years of experience on AWS Senior Cloud Solution Architect AWS Authorized Instructor Champion Claudio.pontili@besharp.it https://www.linkedin.com/in/claudiopontili/
  • 4. Agenda • Using Lambda for ETLs • Glue ETLs • CI/CD to deploy code inside Lambdas and Glue Jobs • Datawarehousing on Aurora Serverless v1 • A full serverless Big Data Architecture • What we’ve learned
  • 5. Using Lambda for ETLs 1/2 • You can use Python + Panda library • A lambda can have 10 GB of memory and a lot of CPU power • A lambda can run for 15 minutes • Max deployment package 50 MB (zipped) • Container image code package size 10 GB • /tmp directory storage 512 MB
  • 6. Using Lambda for ETLs 2/2
  • 7. Glue Jobs, Data Catalog and Crawler 1/2 • Fully managed Data Catlog and Extract-Transform-Load (ETL) service • Automates data discovery, conversion, mapping and job scheduling • Glue runs your ETL jobs in an Apache Spark serverless envinronment • Allow to scale your ETLs jobs • Can easily schedule a crawler to to create a catalog of files stored on S3 • Too much code? Try Glue Databrew
  • 9. Glue Jobs, Data Catalog and Crawler 2/2
  • 11. RDS Aurora Serverless • MySql and PostgreSQL supported (reuse the experience of your team) • Pay per ACU/hours (2 GB of memory) • Scales from 1 to 256 ACU • You can pause the cluster during the night • Aurora Serverless v2 in preview for MultiAz, Read-Replicas, faster scale
  • 12. Glue Jobs, Data Catalog and Crawler 2/2
  • 13. What we’ve learned • Serverless gives you High Availability and great scalability with no effort • Pause Aurora Serverless v1 (it will take about 30 seconds to restart) • Use IaC (Cloudformation, Terraform, CDK, etc) to deploy your infrastructure • Tune your lamba memory using https://github.com/alexcasalboni/aws- lambda-power-tuning • S3 is cheap but try not to write tiny (<128KB) files • Serverless can be pretty cheap if it’s used in the right way
  • 15. www.besharp.it info@besharp.it +39 0382 1692920 beSharp srl - viale Ludovico il Moro 27 - 27100 Pavia (ITALY) VAT ID IT02415160189 Follow beSharp on