Amazon Athena overview
Software Engineer at EPAM
Raman Maskalenka
30 November
© 2019 EPAM Systems, Inc.
Table of contents
Amazon Athena overview
Supported data types
Technologies under the hood
Simple use case
Integration with other services
Things to consider while using Athena
Amazon Athena Overview
• Serverless
  • No infrastructure to set up
  • Zero spin-up time
  • Transparent upgrades
• Interactive
  • Fast query execution
  • Descriptive error messages
SERVERLESS INTERACTIVE HIGHLY AVAILABLE SQL QUERY SERVICE
• Highly available
  • Athena uses warm compute pools across multiple Availability Zones
  • Your data is stored in S3, which is also designed for availability
• Core effective
  • Automatically parallelizes queries
  • Results are streamed to the console
  • Tuned for performance
• Uses ANSI SQL
  • Supports complex joins, nested queries and window functions
  • Supports complex data types (arrays, structs)
  • Supports partitioning by almost any key, except datetime timestamp
• Cost effective
  • Pay per query
  • $5 per TB scanned
Supported data types
• Text files (CSV, raw)
• Apache Web Logs, TSV
• JSON (simple, nested)
• Compressed files
• Apache Parquet and Apache ORC
Technologies under the hood
Presto
Originally created by Facebook to run interactive analytic queries over large amounts of data.
• In-memory distributed query engine, ANSI SQL compatible with extensions
• Used by Athena for SQL queries
Apache Hive
Data warehouse software built on top of Apache Hadoop for data query and analysis. Allows SQL queries to run over distributed data.
• Used by Athena for data definition language (DDL) functionality
• Supports complex data types and multiple formats
• Supports partitioning
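The split between Hive-style DDL and Presto-style SQL can be sketched in Python as plain string building. This is only an illustration: the table name `web_events`, its columns, the bucket `example-bucket`, and the helpers `build_create_table_ddl` and `partition_prefix` are all hypothetical, and the statement uses `STORED AS PARQUET` as one of several possible storage clauses.

```python
def build_create_table_ddl(table, columns, partitions, location):
    """Return a Hive-style CREATE EXTERNAL TABLE statement as a string.

    columns and partitions are lists of (name, type) pairs; partition
    columns are declared separately and are not repeated in the column list.
    """
    cols = ",\n  ".join(f"`{name}` {dtype}" for name, dtype in columns)
    parts = ", ".join(f"`{name}` {dtype}" for name, dtype in partitions)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"PARTITIONED BY ({parts})\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


def partition_prefix(location, **values):
    """Return the Hive-style key=value S3 prefix for one partition."""
    parts = "/".join(f"{key}={value}" for key, value in values.items())
    return f"{location.rstrip('/')}/{parts}/"


# Hypothetical table with complex types (array, struct) and two partition keys.
DDL = build_create_table_ddl(
    "web_events",
    [("user_id", "string"),
     ("tags", "array<string>"),
     ("geo", "struct<city:string,country:string>")],
    [("year", "string"), ("month", "string")],
    "s3://example-bucket/web_events/",
)

# A Presto-style query with a window function, filtered on the partition keys
# so that only the matching S3 prefix needs to be scanned.
QUERY = """
SELECT user_id,
       row_number() OVER (PARTITION BY user_id ORDER BY geo.city) AS row_no
FROM web_events
WHERE year = '2019' AND month = '11'
"""

if __name__ == "__main__":
    print(DDL)
    print(partition_prefix("s3://example-bucket/web_events/", year="2019", month="11"))
```

Filtering on partition columns is what lets the engine skip whole prefixes, which matters because the service bills by data scanned.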
Simple use case
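One possible simple use case is running a query from code and reading back the result. The sketch below assumes a client object with the same methods as the boto3 Athena client (`start_query_execution`, `get_query_execution`, `get_query_results`); the function name `run_query`, the `poll_seconds` parameter, and the database and bucket names in the usage note are hypothetical.

```python
import time

# States after which an Athena query will not change any more.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def run_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query, wait for it to finish, and return the first page of rows.

    client is expected to behave like boto3.client("athena").
    output_location is the S3 prefix where Athena writes the result files.
    """
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Queries run asynchronously, so poll until a terminal state is reached.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)["QueryExecution"]
        state = execution["Status"]["State"]
        if state in TERMINAL_STATES:
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        reason = execution["Status"].get("StateChangeReason", "no reason given")
        raise RuntimeError(f"Query {query_id} ended as {state}: {reason}")

    # Only the first page is fetched here; for SELECT queries the first row
    # holds the column names. Cells without a value are returned as None.
    rows = client.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]
```

With boto3 installed and credentials configured, a call could look like `run_query(boto3.client("athena"), "SELECT count(*) FROM web_events", "example_db", "s3://example-bucket/athena-results/")`. Passing the client in keeps the function easy to exercise with a stub.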
Integration with other services
Things to consider while using Athena
PROS
• Athena does not transform your data in S3
• You can use complex regular expressions when creating tables
• You don’t pay for data transformation
• You can store your data in a compressed format to lower costs
• Rich access control (IAM, ACL, S3 bucket policies)
• Can be integrated with many business intelligence (BI) tools
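The point about compressed storage can be illustrated with a small, local experiment. This is a rough sketch under the assumption that fewer bytes stored means fewer bytes scanned and billed; the sample rows and the helper `csv_bytes` are made up, and real compression ratios depend heavily on the data.

```python
import csv
import gzip
import io


def csv_bytes(rows):
    """Serialize rows to CSV and return the UTF-8 encoded bytes."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Hypothetical, highly repetitive web-log rows; real logs compress less well.
rows = [("2019-11-30", "GET", "/index.html", 200)] * 10_000

raw = csv_bytes(rows)
packed = gzip.compress(raw)

if __name__ == "__main__":
    print(f"raw CSV:  {len(raw)} bytes")
    print(f"gzip CSV: {len(packed)} bytes")
```

Columnar formats such as Apache Parquet and Apache ORC usually go further, because a query can read only the columns it needs.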
CONS
• Canceled queries are still charged for the data already scanned
• Data scanned per query is rounded up to the nearest MB, with a 10 MB minimum
• Query cost consists of S3 data read charges plus the Athena rate for data scanned
• Not all Hive DDL statements are supported by Athena
• Hive or Presto transactions are not supported by Athena
• User-defined functions and stored procedures are not supported
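The billing rules from the slides ($5 per TB scanned, rounding up to the nearest MB, 10 MB minimum) can be turned into a quick estimate. This is a sketch only: it assumes binary units (1 MB = 1024² bytes, 1 TB = 1024⁴ bytes), covers the Athena scan charge alone without S3 request or transfer costs, and the function names are hypothetical.

```python
MB = 1024 ** 2            # assumption: binary megabyte
TB = 1024 ** 4            # assumption: binary terabyte
PRICE_PER_TB_USD = 5.0    # rate quoted in the slides
MIN_BILLED_MB = 10        # per-query minimum quoted in the slides


def billed_megabytes(bytes_scanned):
    """Round the scanned bytes up to whole megabytes and apply the minimum."""
    whole_mb = -(-bytes_scanned // MB)  # integer ceiling division
    return max(whole_mb, MIN_BILLED_MB)


def query_cost_usd(bytes_scanned):
    """Estimate the Athena scan charge for one query in US dollars."""
    return billed_megabytes(bytes_scanned) * MB / TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    for scanned in (1, 10 * MB + 1, 250 * 1024 ** 3, TB):
        print(scanned, billed_megabytes(scanned), query_cost_usd(scanned))
```

Because of the minimum, many tiny queries can cost more than one larger query over the same data.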

Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your BudgetHyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
Hyderabad Call Girls Khairatabad ✨ 7001305949 ✨ Cheap Price Your Budget
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

Amazon Athena overview

  • 1. Amazon Athena overview
    Raman Maskalenka, Software Engineer at EPAM
    30 November
  • 2. © 2019 EPAM Systems, Inc.
    Table of contents
      • Amazon Athena overview
      • Supported data types
      • Technologies under the hood
      • Simple use case
      • Integration with other services
      • Things to consider while using Athena
  • 3. Amazon Athena Overview: serverless, interactive, highly available SQL query service
      • Serverless
          • No infrastructure to set up
          • Zero spin-up time
          • Transparent upgrades
      • Interactive
          • Fast query execution
          • Descriptive error messages
  • 4. Amazon Athena Overview
      • Highly available
          • Athena uses warm compute pools across multiple Availability Zones
          • Your data is stored in S3, which is also designed for high availability
      • Core effective
          • Queries are parallelized automatically
          • Results are streamed to the console
          • Tuned for performance
  • 5. Amazon Athena Overview
      • Uses ANSI SQL
          • Supports complex joins, nested queries and window functions
          • Supports complex data types (arrays, structs)
          • Supports partitioning by almost any key, except datetime timestamps
      • Cost effective
          • Pay per query
          • $5 per TB scanned
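    The partitioning point above can be illustrated with a minimal sketch. It assumes the common Hive-style `key=value` folder layout in S3; the bucket name, the partition keys and the helper `partition_prefix` are hypothetical and not taken from the slides.

```python
def partition_prefix(base: str, **partitions: str) -> str:
    """Build a Hive-style partition prefix such as base/key1=v1/key2=v2/.

    Keyword arguments are kept in the order they are passed, so the
    first key becomes the outermost folder.
    """
    parts = [f"{key}={value}" for key, value in partitions.items()]
    return "/".join([base.rstrip("/")] + parts) + "/"


if __name__ == "__main__":
    # Hypothetical bucket and partition keys, for illustration only.
    print(partition_prefix("s3://example-bucket/logs", region="eu", service="web"))
```

    A query that filters on the partition keys then only needs to read the objects under the matching prefixes, which is what keeps the amount of scanned data (and the bill) down.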
  • 6. Supported data types
      • Text files (CSV, raw)
      • Apache Web Logs, TSV
      • JSON (simple, nested)
      • Compressed files
      • Apache Parquet & Apache ORC
  • 7. Technologies under the hood
      • Presto: originally created by Facebook for their data analysis, to run interactive queries on large amounts of data.
          • In-memory distributed query engine, ANSI SQL compatible with extensions
          • Used by Athena for SQL queries
      • Apache Hive: data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Lets you run SQL queries over distributed data.
          • Used by Athena for data definition language (DDL) functionality
          • Supports complex data types and multiple formats
          • Supports partitioning
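    To make the DDL side concrete, here is a minimal sketch that assembles a Hive-style `CREATE EXTERNAL TABLE` statement as a string. The database, table, column names and S3 location are hypothetical, and Parquet is chosen only as an example storage format; the slides do not show an actual table definition.

```python
from typing import Optional, Sequence, Tuple

Column = Tuple[str, str]  # (column name, Hive type)


def build_create_table(
    database: str,
    table: str,
    columns: Sequence[Column],
    location: str,
    partitions: Optional[Sequence[Column]] = None,
) -> str:
    """Return a Hive-style DDL statement for an external, Parquet-backed table."""
    cols = ",\n  ".join(f"`{name}` {type_}" for name, type_ in columns)
    ddl = f"CREATE EXTERNAL TABLE IF NOT EXISTS {database}.{table} (\n  {cols}\n)\n"
    if partitions:
        parts = ", ".join(f"`{name}` {type_}" for name, type_ in partitions)
        ddl += f"PARTITIONED BY ({parts})\n"
    ddl += "STORED AS PARQUET\n"
    ddl += f"LOCATION '{location}'"
    return ddl


if __name__ == "__main__":
    # All names below are placeholders for illustration.
    print(
        build_create_table(
            "example_db",
            "requests",
            [("path", "string"), ("status", "int")],
            "s3://example-bucket/requests/",
            partitions=[("region", "string")],
        )
    )
```

    The table definition is only metadata: the data itself stays in S3 at the given location, and the SQL engine reads it at query time.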
  • 8. Simple use case
  • 9. Simple use case
  • 10. Integration with other services
  • 11. Things to consider while using Athena: pros
      • No data transformation is made in S3
      • You can write complex regexes for table creation
      • You don’t pay for data transformation
      • You can store your data in a compressed format to lower costs
      • Rich access control (IAM, ACLs, S3 bucket policies)
      • Integrates with many business intelligence (BI) tools
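    The compression point can be sketched with Python's standard `gzip` module. The sample rows are made up for illustration, and real ratios depend entirely on the data; the sketch only shows why storing compressed files tends to mean fewer bytes to scan.

```python
import gzip


def make_sample_csv(rows: int) -> bytes:
    """Generate repetitive, log-like CSV rows (synthetic data for illustration)."""
    lines = (f"2019-11-30,GET,/index.html,200,{i % 10}" for i in range(rows))
    return "\n".join(lines).encode("utf-8")


def compression_ratio(raw: bytes) -> float:
    """Return compressed size divided by raw size (smaller is better)."""
    return len(gzip.compress(raw)) / len(raw)


if __name__ == "__main__":
    raw = make_sample_csv(10_000)
    print(f"raw bytes: {len(raw)}, gzip ratio: {compression_ratio(raw):.3f}")
```

    Since billing is based on the data scanned, a smaller object in S3 generally translates into a cheaper query over the same rows.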
  • 12. Things to consider while using Athena: cons
      • Canceled queries are still charged for the data already scanned
      • Scanned data is rounded up to the nearest MB, with a 10 MB minimum per query
      • Query cost consists of S3 data-read charges plus Athena's rate for data scanned
      • Not all Hive DDL statements are supported by Athena
      • Hive or Presto transactions are not supported by Athena
      • User-defined functions and stored procedures are not supported
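    The rounding rule above, combined with the $5 per TB rate from the overview slide, can be turned into a rough estimator. This is a sketch under stated assumptions: binary units (1 MB = 1024² bytes, 1 TB = 1024⁴ bytes) are assumed, S3 read charges are ignored, and the function names are hypothetical.

```python
PRICE_PER_TB_USD = 5.0  # rate quoted on the overview slide
MB = 1024 ** 2          # assumption: binary megabyte
TB = 1024 ** 4          # assumption: binary terabyte
MIN_BILLED_MB = 10      # per-query minimum quoted on the slide


def billed_megabytes(bytes_scanned: int) -> int:
    """Round scanned bytes up to whole megabytes and apply the 10 MB minimum."""
    rounded_up = -(-bytes_scanned // MB)  # integer ceiling division
    return max(rounded_up, MIN_BILLED_MB)


def estimate_query_cost(bytes_scanned: int) -> float:
    """Estimate the Athena scan charge in USD for a single query."""
    return billed_megabytes(bytes_scanned) * MB / TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    for scanned in (1, 10 * MB + 1, 250 * 1024 ** 3, TB):
        print(f"{scanned:>15} bytes -> ${estimate_query_cost(scanned):.6f}")
```

    The minimum matters mostly for many small queries: a query that touches a single kilobyte is billed as if it scanned 10 MB.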