SlideShare a Scribd company logo
AMAZON GLUE
Elif Nurber KARAKAS
AWS Glue is a serverless data integration service that
makes it easy to discover, prepare, and combine data
for analytics, machine learning, and application
development. AWS Glue provides all the capabilities
needed for data integration so that you can start
analyzing your data and putting it to use in minutes
instead of months.
What is AWS Glue?
USE CASES
Glue can integrate with Snowflake data
warehouse to help manage the data
integration process.
AWS data lake can integrate with Glue.
AWS Glue can integrate with Athena to
create schemas.
ETL code can be used for Glue on GitHub
as well.
Benefits of using Glue
Fault-tolerance: Failed jobs in Glue are retrievable, and
logs in Glue can be debugged.
Filtering: Filters for bad data.
Support: Supports several non-native Java Database
Connectivity (JDBC) data sources.
Maintenance and deployment: Simple maintenance
and deployment, because the service is completely
managed by AWS.
Drawbacks of
Using Glue
Limited compatibility: While AWS Glue does work with
a variety of commonly used data sources, it only works
with services running on AWS. Organizations may need
a third-party ETL service if sources are not AWS-based.
No incremental data sync: All data is staged on S3 first,
so Glue is not the best option for real-time ETL jobs.
Learning curve: Teams using Glue should have a strong
understanding of Apache spark.
Relational database queries: Glue has limited support
for queries of traditional relational databases, only SQL
queries.
SOME OF GLUE
CUSTOMERS
THANK YOU!
nurberelif@gmail.com

More Related Content

Similar to Amazon Glue

5 Reasons to Move Your BI to the Cloud
5 Reasons to Move Your BI to the Cloud5 Reasons to Move Your BI to the Cloud
5 Reasons to Move Your BI to the Cloud
Tableau Software
 
Azure quick-start-for-net-developers
Azure quick-start-for-net-developersAzure quick-start-for-net-developers
Azure quick-start-for-net-developers
imdurgesh
 
cloud computing.pptx
cloud computing.pptxcloud computing.pptx
cloud computing.pptx
GayathriP95
 
Extending your Hadoop Implementation to the Cloud
Extending your Hadoop Implementation to the CloudExtending your Hadoop Implementation to the Cloud
Extending your Hadoop Implementation to the Cloud
DataWorks Summit
 
Aswin
AswinAswin
Aswin
AswinT6
 
UNIT -IV.docx
UNIT -IV.docxUNIT -IV.docx
UNIT -IV.docx
Revathiparamanathan
 
Cloud Computing Serverless Architecture
Cloud Computing Serverless ArchitectureCloud Computing Serverless Architecture
Cloud Computing Serverless Architecture
YASH Technologies
 
Aws vs azure bakeoff
Aws vs azure bakeoffAws vs azure bakeoff
Aws vs azure bakeoff
SoHo Dragon
 
Working with azure database services platform
Working with azure database services platformWorking with azure database services platform
Working with azure database services platform
ssuser79fc19
 
Azure fundamental -Introduction
Azure fundamental -IntroductionAzure fundamental -Introduction
Azure fundamental -Introduction
ManishK55
 
Introdcution to Azure
Introdcution to AzureIntrodcution to Azure
Introdcution to Azure
Omid Vahdaty
 
When Should You Use AWS Lambda?
When Should You Use AWS Lambda?When Should You Use AWS Lambda?
When Should You Use AWS Lambda?
Whizlabs
 
Google apps engine
Google apps engineGoogle apps engine
Google apps engine
shubhravrat Deshpande
 
Azure from scratch part 3 By Girish Kalamati
Azure from scratch part 3 By Girish KalamatiAzure from scratch part 3 By Girish Kalamati
Azure from scratch part 3 By Girish Kalamati
Girish Kalamati
 
Google apps engine
Google apps engineGoogle apps engine
Google apps engine
shubhravrat Deshpande
 
Cloud service providers
Cloud service providersCloud service providers
Cloud service providers
AgnihotriGhosh1
 
Aws community day pune 2020 v3
Aws community day pune 2020 v3Aws community day pune 2020 v3
Aws community day pune 2020 v3
Sridevi Murugayen
 
Inside Microsoft Azure
Inside Microsoft AzureInside Microsoft Azure
Inside Microsoft Azure
Ernest Mueller
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure Infrastructures
CCG
 
Serverless Extract-transform-load (ETL) on AWS Webinar
Serverless Extract-transform-load (ETL) on AWS WebinarServerless Extract-transform-load (ETL) on AWS Webinar
Serverless Extract-transform-load (ETL) on AWS Webinar
Amazon Web Services
 

Similar to Amazon Glue (20)

5 Reasons to Move Your BI to the Cloud
5 Reasons to Move Your BI to the Cloud5 Reasons to Move Your BI to the Cloud
5 Reasons to Move Your BI to the Cloud
 
Azure quick-start-for-net-developers
Azure quick-start-for-net-developersAzure quick-start-for-net-developers
Azure quick-start-for-net-developers
 
cloud computing.pptx
cloud computing.pptxcloud computing.pptx
cloud computing.pptx
 
Extending your Hadoop Implementation to the Cloud
Extending your Hadoop Implementation to the CloudExtending your Hadoop Implementation to the Cloud
Extending your Hadoop Implementation to the Cloud
 
Aswin
AswinAswin
Aswin
 
UNIT -IV.docx
UNIT -IV.docxUNIT -IV.docx
UNIT -IV.docx
 
Cloud Computing Serverless Architecture
Cloud Computing Serverless ArchitectureCloud Computing Serverless Architecture
Cloud Computing Serverless Architecture
 
Aws vs azure bakeoff
Aws vs azure bakeoffAws vs azure bakeoff
Aws vs azure bakeoff
 
Working with azure database services platform
Working with azure database services platformWorking with azure database services platform
Working with azure database services platform
 
Azure fundamental -Introduction
Azure fundamental -IntroductionAzure fundamental -Introduction
Azure fundamental -Introduction
 
Introdcution to Azure
Introdcution to AzureIntrodcution to Azure
Introdcution to Azure
 
When Should You Use AWS Lambda?
When Should You Use AWS Lambda?When Should You Use AWS Lambda?
When Should You Use AWS Lambda?
 
Google apps engine
Google apps engineGoogle apps engine
Google apps engine
 
Azure from scratch part 3 By Girish Kalamati
Azure from scratch part 3 By Girish KalamatiAzure from scratch part 3 By Girish Kalamati
Azure from scratch part 3 By Girish Kalamati
 
Google apps engine
Google apps engineGoogle apps engine
Google apps engine
 
Cloud service providers
Cloud service providersCloud service providers
Cloud service providers
 
Aws community day pune 2020 v3
Aws community day pune 2020 v3Aws community day pune 2020 v3
Aws community day pune 2020 v3
 
Inside Microsoft Azure
Inside Microsoft AzureInside Microsoft Azure
Inside Microsoft Azure
 
Exploring Microsoft Azure Infrastructures
Exploring Microsoft Azure InfrastructuresExploring Microsoft Azure Infrastructures
Exploring Microsoft Azure Infrastructures
 
Serverless Extract-transform-load (ETL) on AWS Webinar
Serverless Extract-transform-load (ETL) on AWS WebinarServerless Extract-transform-load (ETL) on AWS Webinar
Serverless Extract-transform-load (ETL) on AWS Webinar
 

More from Elif Nurber Karakaş

Amazon Lake Formation
Amazon Lake FormationAmazon Lake Formation
Amazon Lake Formation
Elif Nurber Karakaş
 
Amazon Data Pipeline
Amazon Data PipelineAmazon Data Pipeline
Amazon Data Pipeline
Elif Nurber Karakaş
 
Amazon Quicksight
Amazon QuicksightAmazon Quicksight
Amazon Quicksight
Elif Nurber Karakaş
 
Amazon Redshift
Amazon RedshiftAmazon Redshift
Amazon Redshift
Elif Nurber Karakaş
 
Amazon OpenSearch Service
Amazon OpenSearch ServiceAmazon OpenSearch Service
Amazon OpenSearch Service
Elif Nurber Karakaş
 
Amazon Managed Streaming for Apache Kafka
Amazon Managed Streaming for Apache KafkaAmazon Managed Streaming for Apache Kafka
Amazon Managed Streaming for Apache Kafka
Elif Nurber Karakaş
 
Amazon Kinesis Data Streams
Amazon Kinesis Data StreamsAmazon Kinesis Data Streams
Amazon Kinesis Data Streams
Elif Nurber Karakaş
 
Amazon CloudSearch
Amazon CloudSearchAmazon CloudSearch
Amazon CloudSearch
Elif Nurber Karakaş
 
Amazon Elastic MapReduce
Amazon Elastic MapReduceAmazon Elastic MapReduce
Amazon Elastic MapReduce
Elif Nurber Karakaş
 
Amazon athena
Amazon athenaAmazon athena
Amazon athena
Elif Nurber Karakaş
 

More from Elif Nurber Karakaş (10)

Amazon Lake Formation
Amazon Lake FormationAmazon Lake Formation
Amazon Lake Formation
 
Amazon Data Pipeline
Amazon Data PipelineAmazon Data Pipeline
Amazon Data Pipeline
 
Amazon Quicksight
Amazon QuicksightAmazon Quicksight
Amazon Quicksight
 
Amazon Redshift
Amazon RedshiftAmazon Redshift
Amazon Redshift
 
Amazon OpenSearch Service
Amazon OpenSearch ServiceAmazon OpenSearch Service
Amazon OpenSearch Service
 
Amazon Managed Streaming for Apache Kafka
Amazon Managed Streaming for Apache KafkaAmazon Managed Streaming for Apache Kafka
Amazon Managed Streaming for Apache Kafka
 
Amazon Kinesis Data Streams
Amazon Kinesis Data StreamsAmazon Kinesis Data Streams
Amazon Kinesis Data Streams
 
Amazon CloudSearch
Amazon CloudSearchAmazon CloudSearch
Amazon CloudSearch
 
Amazon Elastic MapReduce
Amazon Elastic MapReduceAmazon Elastic MapReduce
Amazon Elastic MapReduce
 
Amazon athena
Amazon athenaAmazon athena
Amazon athena
 

Recently uploaded

Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
Kumud Singh
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
Octavian Nadolu
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
Pixlogix Infotech
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
SOFTTECHHUB
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
Daiki Mogmet Ito
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Zilliz
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
Neo4j
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
Edge AI and Vision Alliance
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
Safe Software
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
Matthew Sinclair
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
Claudio Di Ciccio
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
danishmna97
 

Recently uploaded (20)

Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
Why You Should Replace Windows 11 with Nitrux Linux 3.5.0 for enhanced perfor...
 
How to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For FlutterHow to use Firebase Data Connect For Flutter
How to use Firebase Data Connect For Flutter
 
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
20240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 202420240605 QFM017 Machine Intelligence Reading List May 2024
20240605 QFM017 Machine Intelligence Reading List May 2024
 
“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”“I’m still / I’m still / Chaining from the Block”
“I’m still / I’m still / Chaining from the Block”
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
How to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptxHow to Get CNIC Information System with Paksim Ga.pptx
How to Get CNIC Information System with Paksim Ga.pptx
 

Amazon Glue

  • 2. AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months. What is AWS Glue?
  • 3. USE CASES Glue can integrate with Snowflake data warehouse to help manage the data integration process. AWS data lake can integrate with Glue. AWS Glue can integrate with Athena to create schemas. ETL code can be used for Glue on GitHub as well.
  • 4. Benefits of using Glue Fault-tolerance: Failed jobs in Glue are retrievable, and logs in Glue can be debugged. Filtering: Filters for bad data. Support: Supports several non-native Java Database Connectivity (JDBC) data sources. Maintenance and deployment: Simple maintenance and deployment, because the service is completely managed by AWS.
  • 5. Drawbacks of Using Glue Limited compatibility: While AWS Glue does work with a variety of commonly used data sources, it only works with services running on AWS. Organizations may need a third-party ETL service if sources are not AWS-based. No incremental data sync: All data is staged on S3 first, so Glue is not the best option for real-time ETL jobs. Learning curve: Teams using Glue should have a strong understanding of Apache spark. Relational database queries: Glue has limited support for queries of traditional relational databases, only SQL queries.