SlideShare a Scribd company logo
1 of 83
Download to read offline
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
AWS를 통한 데이터 분석 및 처리의
새로운 혁신 기법
김윤건
사업개발 담당
AWS Korea
Agenda
인사이트 획득을 위한 데이터
데이터 레이크 구축
데이터 분석을 통한 인사이트 획득
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
인사이트 획득을 위한 이상적 모습
현실의 어려움
현실의 장애물들
데이터 구조
• 여러 시스템에 산재
• 분석에 부적합한 형태
• 레이블이 없는 데이터
데이터 가치
• 포맷: 철자, 단위
• 결측치
• 편향된 데이터
• 무상관성 데이터
데이터 중요도
• 비선호 데이터
• 높은 수집비용
• 본연적 상관성 데이터
현실의 장애물들
데이터 구조
• 여러 시스템에 산재
• 분석에 부적합한 형태
• 레이블이 없는 데이터
데이터 가치
• 포맷: 철자, 단위
• 결측치
• 편향된 데이터
• 무상관성 데이터
데이터 중요도
• 비선호 데이터
• 높은 수집비용
• 본연적 상관성 데이터
Data Transformation
Feature Engineering
Feature Selection
머신러닝: 분류
범주형
머신러닝: 회귀
수치형
머신러닝: 이상탐지
머신러닝: 군집분석
머신러닝: 연관성 분석
머신러닝을 위한 데이터의 모습
데이터 레이블링
데이터 레이블링
Name Month - 3 Month - 2 Month - 1
Joe Schmo 123.23 0 0
Jane Plain 0 0 0
Mary Happy 0 55.22 243.33
Tom Thumb 12.34 8.34 14.56
레이블 있는 데이터
Name Month - 3 Month - 2 Month - 1 Default
Joe
Schmo
123.23 0 0 FALSE
Jane
Plain
0 0 0 TRUE
Mary
Happy
0 55.22 243.33 FALSE
Tom
Thumb
12.34 8.34 14.56 FALSE
데이터 레이블링
Name Date Duration (s) Genre Plays
Highway star 1984-05-24 - Rock 139
Blues alive 1990/03/01 281 Blues 239
Lonely
planet
2002-11-19 5:32s Techno 42
Dance,
dance
02/23/1983 312 Disco N/A
The wall 1943-01-20 218 Reagge 83
Offside
down
1965-02-19 4 minutes Techno 895
The
alchemist
2001-11-21 418 Bluesss 178
Bring me
down
18-10-98 328 Classic 21
The
scarecrow
1994-10-12 269 Rock 734
정제된 데이터
Name Date Duration (s) Genre Plays
Highway star 1984-05-24 Rock 139
Blues alive 1990-03-01 281 Blues 239
Lonely
planet
2002-11-19 332 Techno 42
Dance,
dance
1983-02-23 312 Disco
The wall 1943-01-20 218 Reagge 83
Offside
down
1965-02-19 240 Techno 895
The
alchemist
2001-11-21 418 Blues 178
Bring me
down
1998-10-18 328 Classic 21
The
scarecrow
1994-10-12 269 Rock 734
머신러닝을 위한 데이터의 모습
Join
데이터 취합
Content Genre
Duratio
n
Play Time User Device
Highway Rock 190 2019-05-12 User001 TV
Blues alive Blues 281 2019-05-14 User005 Tablet
Lonely
planet
Tech 332 2019-05-14 User003 TV
Dance,
dance
Disco 312 2019-05-14 User001 Tablet
The wall Reaggae 218 2019-05-14 User002 Smartphone
Offside
down
Tech 240 2019-05-14 User005 Tablet
The
alchemist
Blues 418 2019-05-14 User003 TV
Bring me
down
Class 328 2019-05-15 User001 Tablet
The one Rock 269 2019-05-15 User003 Smartphone
취합된 데이터
User
Num.Playback
s
Total Time Pref.Device
User001 3 830 Tablet
User002 1 218 Smartphone
User003 3 1019 TV
User004 2 521 Tablet
데이터 피벗
Content Genre Duration Play Time User Device
Highway Rock 190 2019-05-12 User001 TV
Blues alive Blues 281 2019-05-14 User005 Tablet
Lonely planet Tech 332 2019-05-14 User003 TV
Dance, dance Disco 312 2019-05-14 User001 Tablet
The wall Reaggae 218 2019-05-14 User002 Smartphone
Offside down Tech 240 2019-05-14 User005 Tablet
The alchemist Blues 418 2019-05-14 User003 TV
Bring me
down
Class 328 2019-05-15 User001 Tablet
The one Rock 269 2019-05-15 User003 Smartphone
취합되고 피벗된 컬럼
User Num.Playbacks
Total
Time
Pref.Device NP_TV NP_Tablet NP_Smartphone TT_TV TT_Tablet TT_Smartphone
User001 3 830 Tablet 1 2 0 190 640 0
User002 1 218 Smartphone 0 0 1 0 0 218
User003 3 1019 TV 2 0 1 750 0 269
User004 2 521 Tablet 0 2 0 0 521 0
ETL vs. ELT
ELT
데이터 스케일: 모든 데이터를 쿼리
Unified view: Local storage and Amazon S3 data lake
Data Lake
Amazon S3 Directly query exabytes in Amazon S3
No data loading, eliminate ingestion time
Unified view of data across Amazon Redshift
and Amazon S3
Scale compute and storage separately
No server to maintain for Amazon S3 query
Support for Parquet, ORC, Avro, CSV,
JSON, Grok, and other open file formats
Pay only for the amount of data scanned
Amazon Redshift Spectrum
Amazon Redshift
query engine
Amazon Redshift
data
JDBC/ODBC
ETL
ELT-ETL
ELT-ETL
데이터 레이크 구축
데이터를 통한 인사이트 획득
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
데이터 레이크로 시작하는 인사이트 획득
데이터 레이크 구축
데이터 분석을 통한 인사이트 획득
Amazon Redshift
Data warehousing
Amazon EMR
Hadoop + Spark
Amazon Athena
Interactive analytics
Amazon Kinesis
Real-time data analytics
Amazon Elasticsearch Service
Operational Analytics
Amazon
S3/Glacier
AWS GlueAWS Lake
Formation
Explosion of data Explosion of personas Demand for faster
decision-making on
real-time data
고객들이 겪고 있는 새로운 현실
전통적인 사일로를 통합
Data silos to
OLTP ERP CRM LOB
DW Silo 1
Business
intelligence
Devices Web sensors Social
DW Silo 2
Business
intelligence
MAC H I NE
LEARNI NG
BI +
ANALYTI C S
DATA
WAREH O U SI NG
Data lakes
OPEN
FORMATS
CENTRAL
CATALOG
클라우드 데이터 레이크
Customers want:
A single data store that is scalable and cost-effective
To use the standards-based data format of their choice
To analyze their data in a variety of ways
Cloud data lake
infrastructure
Decoupled storage
and compute resources
Security and governance
Data
migration
Streaming
services
Data warehouse
Big data
processing
Serverless data
processing
Real-time
analytics
Operational
analytics
Predictive
analyticsETL and catalog
data management
Amazon S3
Unmatched durability,
availability, and scalability
Most object-level
controls
Easiest to use with
cost optimization:
Intelligent Tiering
Best security, compliance,
and audit capabilities
Most ways to get data in
Broadest portfolio
of analytics tools
Amazon S3: 데이터 레이크를 위한 탁월한 선택
AWS의 서비스 구성
Amazon S3 | AWS Glue
AWS Lake Formation
Data lake
Amazon Redshift
Data warehousing
Amazon EMR
Hadoop + Spark
Amazon Athena
Interactive analytics
Amazon Kinesis Data
Analytics Real time
Amazon Elasticsearch Service
Operational Analytics
Analytics-optimized storage
Amazon Redshift AQUA
Amazon Elasticsearch Ultrawarm
Analytics
AWS Data
Exchange
QuickSight
Visualizations
Amazon
SageMaker ML
Comprehend
NLP
Transcribe
Speech-to-text
Textract
Extract text
Personalize
Recommendation
Forecast
Forecasts
Translate
Translation
Business intelligence and machine learning
Kibana in ES
Operational dashboarding
Third-party BI tools
Data movement
AWS Database Migration Service | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams
Managed Streaming for Kafka
새로운 현실을 위한 설계
Cloud-optimized: Architect services ground-up
for the cloud and for the explosion of data
Purpose-built: Offer a portfolio of purpose-
built services, optimized for your workloads
Fully managed: Help you innovate
faster through managed services
Now used by a very large number of
customers for mission-critical applications
가장 많은 데이터 레이크와 분석 고객을 보유
데이터 레이크 구축
데이터 레이크 구축
데이터 레이크 구축
데이터 레이크 구축
데이터 레이크 구축
AWS Lake Formation: 빠르게 데이터 레이크를 구축
Fastest to go from data to insights
Move, store, catalog, and
clean your data faster
Move, store, catalog,
and clean your data faster
with machine learning
Enforce security policies
across multiple services
Enforce security policies across
multiple services
Gain and manage new
insights
Empower analysts and data
scientists to gain and manage
new insights
Amazon Kinesis: 스트리밍 데이터 분석
Amazon Managed
Streaming for
Apache Kafka
(Amazon MSK)
Capture and store
data streams
Amazon Kinesis
Data Streams
Capture and store
data streams
Amazon Kinesis
Data Firehose
Analyze data streams
in real time
Amazon Kinesis
Data Analytics
Load streaming data
into streams, data
lakes, and data
warehouses
Amazon Kinesis: 스트리밍 분석 역량
Amazon MSK releases Open Monitoring with Prometheus
• Consume every Apache Kafka metric with low latency
• Enable time-series logging, alarming, and charting through Prometheus
Access resources within an Amazon VPC using Amazon Kinesis Data Analytics
• Read and write data from resources within your VPCs like Amazon Elasticsearch
Service clusters, RDS databases, and Redshift data warehouses
Run Apache Flink and Apache Kafka together using fully
managed services on AWS
• Use Kinesis Data Analytics to process streaming data stored
in Amazon MSK
• Run streaming solutions end-to-end using open source
software in fully managed services
AWS Data exchange: 외부 데이터 검색 및 구독
Easily find and subscribe to third-party data in the cloud
Efficiently access
third-party data
Simplifies access to data No need
to receive physical media,
manage FTP credentials, or
integrate with different APIs
Minimize legal reviews and
negotiations
Easily analyze data
Download or copy data to Amazon S3
Combine, analyze, and model with
existing data
Analyze data with EMR, Redshift,
Athena, and Glue
Quickly find diverse data
in one place
>1,000 data products
>80 data providers including Dow
Jones, Change Healthcare,
Foursquare, Dun & Bradstreet,
Thomson Reuters, Pitney Bowes,
Lexis Nexis, and Deloitte
Best performance,
most scalable
AUQA and RA3: 10x faster
than other cloud data
warehouses
Adds unlimited compute
capacity on demand to meet
unlimited concurrent access
Lowest cost
Cost-optimized workloads by
paying compute and storage
separately
1/10th cost of traditional DW at
$1000/TB/year
Up to 75% less than other cloud
data warehouses & predictable
costs
Most secure &
compliant
AWS-grade security
(e.g. VPC, encryption
with AWS KMS, AWS
CloudTrail)
All major certifications
such as SOC, PCI, DSS,
ISO, FedRAMP, HIPAA
Data lake & AWS
integration
Analyze exabytes of data
across data warehouse, data
lakes, and operational
database
Query data across various
analytics services
Amazon Redshift: 데이터웨어하우스
First and most popular cloud data warehouse
가장 광범위하게 사용되는 데이터웨어하우스
Tens of thousands of customers use Amazon Redshift
Robust result set
caching
Large # of tables
support ~20000
Copy command support
for ORC, Parquet
IAM role chaining Elastic resize Groups
Redshift Spectrum: Date formats,
scalar json and ION file formats
support, region expansion,
predicate filtering
Auto analyze
Health and performance
monitoring w/Amazon Cloud
watch
Automatic table
distribution style
Amazon CloudWatch
support for
WLM queues
Performance enhancements:
Hash join, vacuum, window
functions, resize ops,
aggregations, console, union all,
efficient compile code cache
Unload
to CSV
Auto WLM
~25 Query Monitoring
Rules (QMR) support
200+
new features in the past 18
months
AQUA
Concurrency scaling DC1 migration to DC2
Resiliency of ROLLBACK
processing
Manage multi-part
query in AWS
Management Console
Auto analyze for
incremental changes on
table
Spectrum Request
Accelerator
Apply new distribution key
Redshift Spectrum: Row
group filtering in Parquet
and ORC, nested data
support, enhanced VPC
routing, multiple partitions
Faster Classic resize
with optimized data
transfer protocol
Performance: Bloom filters in
joins, complex queries that
create internal table,
communication layer
Redshift Spectrum:
Concurrency scaling
AWS Lake Formation
integration
Auto-vacuum sort,
auto-analyze and auto
table sort
Auto WLM with query
priorities
Snapshot scheduler
Performance: Join pushdowns
to subquery, mixed workloads
temporary tables, rank functions,
null handling in join, single row
insert
Advisor recommendations
for distribution keys
AZ64 compression
encoding
Console redesign
Stored procedures
Spatial processing
Column level access
control
with AWS lake formation
RA3
Performance of inter-
Region snapshot
transfers
Federated
query
Materialized views Manual pause and resume
Amazon Redshift는 급격한 혁신을 제공
Amazon Redshift: Federated Query (Preview)
Analyze data across data warehouse, data lakes, and operational databases
Query across multiple systems from Amazon Redshift
Combine data warehouse and transactional data
Compatible with Amazon RDS and Amazon Aurora
(PostgreSQL)
SQL
A M A Z O N
R D S
A M A Z O N
A U R O R A
A M A Z O N
R E D S H I F T
A M A Z O N S 3 D A T A L A K E
Amazon Redshift: RA3 instances
Optimize your data warehouse by paying for compute and storage separately
Delivers 3x the performance of existing cloud DWs
DS2 customers can migrate and get 2x performance
and 2x storage for the same cost
Automatically scales your DW storage capacity
Supports workloads up to 8 PB (compressed)
COMPUTE NODE
(RA3)
SSD Cache
A M A Z O N S 3 S T O R A G E
COMPUTE NODE
(RA3)
SSD Cache
COMPUTE NODE
(RA3)
SSD Cache
COMPUTE NODE
(RA3)
SSD Cache
Managed storage
$/node/hour
$/TB/month
AQUA – Advanced Query Accelerator
Redshift runs 10x faster than any other cloud data warehouse without increasing cost
AQUA brings compute to the storage layer so data
doesn’t have to move back and forth
High-speed cache on top of S3 scales out to process data in
parallel across many nodes
AWS custom-designed analytics processors accelerate data
compression, encryption, and data processing
100% compatible with the current version of Redshift
C O M ING I N 2 0 2 0
Compute
Node
Compute
Node
Compute
Node
Compute
Node
Storage
Node
Storage
Node
Storage
Node
Storage
Node
Amazon Redshift: spatial demo
Amazon Redshift: Zoom and Drag
Amazon Redshift: Zoom and Drag
Amazon Redshift: Choropleth with zip
Amazon Redshift: Fetching POI from Spectrum
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
데이터 레이크로 시작하는 인사이트 획득
데이터 레이크 구축
데이터 분석을 통한 인사이트 획득
Amazon Redshift
Data warehousing
Amazon EMR
Hadoop + Spark
Amazon Athena
Interactive analytics
Amazon Kinesis
Real-time data analytics
Amazon Elasticsearch Service
Operational Analytics
Amazon
S3/Glacier
AWS GlueAWS Lake
Formation
데이터 분석을 통한 인사이트 획득
데이터 분석을 통한 인사이트 획득
데이터 분석을 통한 인사이트 획득
데이터 분석을 통한 인사이트 획득
데이터 분석을 통한 인사이트 획득
데이터 분석을 통한 인사이트 획득
AWS Glue: ETL and Data Catalog
Simple, flexible, and cost-effective ETL
Less hassle
Integrated across AWS: Supports
Amazon Aurora, Amazon RDS,
Amazon Redshift, Amazon S3,
and common database engines in
your VPC running on Amazon EC2
Serverless
Serverless: No infrastructure to
provision or manage
More power
Automatically generates the code
to execute your data
transformations and loading
processes
Amazon EMR
Run Apache Spark, Hadoop, Hive, Presto, HBase and more big data apps on AWS
Lowest cost, fully integrated with auto scaling, Spot, and Amazon S3
Enterprise-grade security, latest versions of OSS frameworks
Low cost
50-80% reduction in costs
with EC2 Spot and Reserved
Instances
Per-second billing
for flexibility
Easy
Fully managed
no cluster setup,
node provisioning,
cluster tuning
Use S3 storage
Process data in
Amazon S3
securely with high
performance using the
EMRFS connector
Latest versions
Updated with latest open
source frameworks within 30
days
Amazon EMR: Apache Hudi
Record-level insert, update, and delete on Amazon S3
Comply with data privacy laws
Consume real-time streams and change data capture
Reinstate late-arriving data
Track change history and rollback
Apache Hudi is open source, and uses open
data formats, enabling data lakes to:
Includes support for Spark, Hive, and Presto
Runtime optimized for Apache Spark performance
Best performance
• 2.6x faster than Spark with Amazon EMR without runtime
• 1.6x faster than third-party Managed Spark (with their runtime)
Lowest price
• 1/10th the cost of third-party Managed Spark (with their runtime)
100% compliant with Apache Spark APIs*Based on TPC-DS 3TB Benchmarking running 6 node C4x8
extra large clusters and EMR 5.28, Spark 2.4
10,164
16,478
26,478
0 5,000 10,000 15,000 20,000 25,000 30,000
Spark with EMR (with runtime)
3rd party Managed Spark (with their
runtime)
Spark with EMR (without runtime)
Runtime total on 104 queries
(seconds - lower is better)
Performance improvements in Spark for Amazon EMR
Performance-optimized runtime for Apache Spark, 2.6x faster performance at 1/10th the cost
Fully managed
Deploy Amazon ES clusters in
minutes: simplified hardware
provisioning, software
installation/patching, failure
recovery, backups, and
monitoring
Pay only for what
you use
No upfront fee or usage
requirement
Critical features built-in:
encryption, VPC support,
24x7 monitoring
Scalable, secure,
and compliant
Scale clusters up/down with a
single API call or a few clicks
Secured network isolation with
VPC, encrypt data at rest and
in transit
Compliant: HIPAA, PCI DSS,
and ISO
Open source Amazon ES
APIs, Kibana and Logstash
Open-source Amazon ES
APIs
Managed Kibana
Integration with
Logstash
Amazon Elasticsearch Service (Amazon ES)
Fully managed, scalable, secure
Amazon Athena
Run SQL queries on data in Amazon S3
No infrastructure to manage
Pay per query
Pay per query
Pay only for queries run
Save 30–90% on per-query
costs through compression
Easy
Serverless:
Zero infrastructure,
zero administration
Integrated with
QuickSight
Open
ANSI SQL
JDBC/ODBC drivers
Multiple formats,
compression types,
complex joins & data types
Query instantly
Zero setup cost
Point to Amazon S3 and start
querying
SQL
Run SQL queries on relational, non-relational, object,
or custom data sources; in the cloud or on-premises
Open Source Connectors for common data sources
Build connectors to custom data sources
Run connectors in AWS Lambda: no servers to manage
Redshift
Data warehousing
ElastiCache
Redis
Aurora
MySQL, PostgreSQL
DynamoDB
Key value, Document
DocumentDB
Document
S3/Glacier
Run SQL queries on data spanning multiple data stores
Amazon Athena: Federated Query (Preview)
Serverless, cloud-powered BI service (no servers to manage)
Scale from 10s of users to 100s of thousands of users
Pay only for what you use
• Readers: $0.30/30 min session with a $5/user/month max
• Authors: $18/month/Author
Integrates with Amazon S3, Amazon Athena, Amazon
Redshift, Amazon RDS, Amazon Aurora, and Amazon EMR
Amazon QuickSight
First BI service with pay-per-session pricing and ML insights
Customer churn analysis
Predicting price
Text analytics
Scoring sales leads
Assessing loan default risk
Demand forecastingDetecting fraudulent patterns
Employee attrition
prediction
Credit scoring
일반적인 비즈니스 과제들
Step 1: Find a data
scientist
Train, experiment, and build ML models for
predicting customer churn
Step 2: Find a data
engineer
ETL, build, and productionize machine
learning infrastructure
Step 3: Build reports
using data
Analyze and reports on ML predictions
Takes weeks to months and multiple
teams to complete
BI분석가가 머신러닝을 활용하는 방법
++
Time consuming, error prone
process even for ML experts
Combinatorial
Largely explorative
& iterative
Requires broad and
complete knowledge
of ML domain
머신러닝은 알고리즘, 데이터, 파라메터의 복잡한 조합임
DIY model training
• Manual effort by experts
• Fully controlled and auditable
• Experts make tradeoff
decisions
• Gets better over time with
experience
Automated ML
• Accessible to experts and non-
experts alike
• No visibility into the training
process
• Can’t make tradeoffs between
accuracy and other
characteristics
잘못된 선택의 여지가 있음
Amazon SageMaker Autopilot: 더 좋은 선택
Amazon SageMaker
AutopilotDIY model training
• Manual effort by experts
• Fully controlled and auditable
• Experts make tradeoff
decisions
• Gets better over time with
experience
Automated ML
• Accessible to experts and non-
experts alike
• No visibility into the training
process
• Can’t make tradeoffs between
accuracy and other
characteristics
Integrated
with Amazon
SageMaker
Studio
Amazon SageMaker Autopilot: 자동화된 머신러닝
Commented
notebook
describing actions
Specify
prediction target
Automated
feature
engineering
Regression &
classification
Automated
algorithm
selection & HPO
Under the hood
1 7 250
머신러닝을 BI에 포함시키는 것은 쉽지 않음
Typical steps require ML expertise & manual work
Write application
code to read data
from the database
2
Format the data for the
ML model
3 Call an ML service to
run the ML model on
the formatted data
4
Select and train
the ML model
1
Format the
output5
Load the
results back to
the database
6
AWS/On-premise data sources
• Excel
• CSV
• MySQL
• Postgre SQL
• Maria DB
• Presto
• Spark
• SQL Server
• Amazon Redshift
• RDS
• S3
• Athena
• Aurora
• Amazon EMR
• Snowflake
• Teradata
• Salesforce
• Square
• Adobe
Analytics
• Jira
• ServiceNow
• Twitter
• Github
1 Connect to any data:
Data lakes, SQL engines, 3rd
party applications, and on-
premises databases
3 Visualize and share:
Analyze results, create
visualizations, build dashboards
/ email reports, and share to
business stakeholders
2 Select an ML model:
Create models with Amazon
SageMaker Autopilot, choose
from existing custom models,
and packaged models from
AWS Marketplace.
Custom
Models
Amazon QuickSight
AWS
Marketplace
Amazon
SageMaker
Autopilot
ML predictions in Amazon QuickSight (preview)
Build predictive dashboards in hours with
point-and-click, no coding required
AWS를 선택해야 하는 이유!
Fastest way to get answers from all your data to all your users
Easiest to build data
lakes at scale
• AWS Lake Formation
• Redshift Data Lake Export
• Redshift Federated Query
• Query Federation for Amazon
Athena
• Data Streaming for AWS Glue
Most
secure
• Amazon Westeros
• Amazon Macie
• AWS Lake Formation
Most comprehensive
and open
• Amazon AWS Data Exchange
• Amazon EMR on AWS
Outposts
• Record-level insert/updates
for Amazon EMR
• ML in Amazon Athena
• ML in Amazon QuickSight
Best performance at
lowest cost
• AQUA for Redshift
• RA3 for Redshift
• Redshift Materialized Views
• UltraWarm Storage Tier for
Amazon ES
• Performance improvements
for Spark in Amazon EMR
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
감사합니다
© 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.

More Related Content

What's hot

Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Web Services Korea
 
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기Amazon Web Services Korea
 
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌BESPIN GLOBAL
 
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링Amazon Web Services Korea
 
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인Amazon Web Services Korea
 
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...Amazon Web Services Korea
 
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021AWSKRUG - AWS한국사용자모임
 
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...Amazon Web Services Korea
 
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWSAmazon Web Services Korea
 
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성Amazon Web Services Korea
 
Amazon Redshift로 데이터웨어하우스(DW) 구축하기
Amazon Redshift로 데이터웨어하우스(DW) 구축하기Amazon Redshift로 데이터웨어하우스(DW) 구축하기
Amazon Redshift로 데이터웨어하우스(DW) 구축하기Amazon Web Services Korea
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Hyun-Mook Choi
 
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...Amazon Web Services Korea
 
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?Amazon Web Services Korea
 
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드Amazon Web Services Korea
 
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...Amazon Web Services Korea
 

What's hot (20)

Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB DayAmazon Redshift의 이해와 활용 (김용우) - AWS DB Day
Amazon Redshift의 이해와 활용 (김용우) - AWS DB Day
 
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017
AWS 고객이 주로 겪는 운영 이슈에 대한 해법-AWS Summit Seoul 2017
 
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
AWS Summit Seoul 2023 | 실시간 CDC 데이터 처리! Modern Transactional Data Lake 구축하기
 
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌
AWS 상의 컨테이너 서비스 소개 ECS, EKS - 이종립 / Principle Enterprise Evangelist @베스핀글로벌
 
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링
Amazon OpenSearch Deep dive - 내부구조, 성능최적화 그리고 스케일링
 
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
Aurora MySQL Backtrack을 이용한 빠른 복구 방법 - 진교선 :: AWS Database Modernization Day 온라인
 
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...
대용량 데이터레이크 마이그레이션 사례 공유 [카카오게임즈 - 레벨 200] - 조은희, 팀장, 카카오게임즈 ::: Games on AWS ...
 
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021
Amazon EKS로 간단한 웹 애플리케이션 구축하기 - 김주영 (AWS) :: AWS Community Day Online 2021
 
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...
데이터 분석가를 위한 신규 분석 서비스 - 김기영, AWS 분석 솔루션즈 아키텍트 / 변규현, 당근마켓 소프트웨어 엔지니어 :: AWS r...
 
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS
있는 그대로 저장하고, 바로 분석 가능한, 새로운 관점의 데이터 애널리틱 플랫폼 - 정세웅 애널리틱 스페셜리스트, AWS
 
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성
AWS Summit Seoul 2023 | AWS에서 최소한의 비용으로 구현하는 멀티리전 DR 자동화 구성
 
Amazon Redshift로 데이터웨어하우스(DW) 구축하기
Amazon Redshift로 데이터웨어하우스(DW) 구축하기Amazon Redshift로 데이터웨어하우스(DW) 구축하기
Amazon Redshift로 데이터웨어하우스(DW) 구축하기
 
AWS Fargate on EKS 실전 사용하기
AWS Fargate on EKS 실전 사용하기AWS Fargate on EKS 실전 사용하기
AWS Fargate on EKS 실전 사용하기
 
Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부Fargate 를 이용한 ECS with VPC 1부
Fargate 를 이용한 ECS with VPC 1부
 
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...
AWS 기반 클라우드 아키텍처 모범사례 - 삼성전자 개발자 포털/개발자 워크스페이스 - 정영준 솔루션즈 아키텍트, AWS / 유현성 수석,...
 
Deep Dive on Amazon Aurora
Deep Dive on Amazon AuroraDeep Dive on Amazon Aurora
Deep Dive on Amazon Aurora
 
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...
효과적인 NoSQL (Elasticahe / DynamoDB) 디자인 및 활용 방안 (최유정 & 최홍식, AWS 솔루션즈 아키텍트) :: ...
 
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?
AWS Summit Seoul 2023 | 오픈소스 데이터베이스로 탈 오라클! Why not?
 
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드
AWS Summit Seoul 2023 | 당신만 모르고 있는 AWS 컨트롤 타워 트렌드
 
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...
롯데이커머스의 마이크로 서비스 아키텍처 진화와 비용 관점의 운영 노하우-나현길, 롯데이커머스 클라우드플랫폼 팀장::AWS 마이그레이션 A ...
 

Similar to AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020

AWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWSAWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWSAmazon Web Services
 
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Amazon Web Services LATAM
 
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...Amazon Web Services
 
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K..."Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...Provectus
 
Building your First Big Data Application on AWS
Building your First Big Data Application on AWSBuilding your First Big Data Application on AWS
Building your First Big Data Application on AWSAmazon Web Services
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSAmazon Web Services
 
Building a modern data platform in AWS
Building a modern data platform in AWSBuilding a modern data platform in AWS
Building a modern data platform in AWSAmazon Web Services
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)Amazon Web Services
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Amazon Web Services
 
Serverless Datalake Day with AWS
Serverless Datalake Day with AWSServerless Datalake Day with AWS
Serverless Datalake Day with AWSAmazon Web Services
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Amazon Web Services
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Amazon Web Services
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseData Con LA
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSAmazon Web Services
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSAmazon Web Services
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSLam Le
 
Fast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSFast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSAmazon Web Services
 

Similar to AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020 (20)

AWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWSAWS Summit Singapore - Architecting a Serverless Data Lake on AWS
AWS Summit Singapore - Architecting a Serverless Data Lake on AWS
 
AWS Big Data Platform
AWS Big Data PlatformAWS Big Data Platform
AWS Big Data Platform
 
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
Innovation Track AWS Cloud Experience Argentina - Data Lakes & Analytics en AWS
 
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
Build Data Lakes & Analytics on AWS: Patterns & Best Practices - BDA305 - Ana...
 
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K..."Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...
"Building a Modern Data platform in the Cloud", Alex Casalboni, AWS Dev Day K...
 
Building your First Big Data Application on AWS
Building your First Big Data Application on AWSBuilding your First Big Data Application on AWS
Building your First Big Data Application on AWS
 
BDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWSBDA305 Building Data Lakes and Analytics on AWS
BDA305 Building Data Lakes and Analytics on AWS
 
Building a modern data platform in AWS
Building a modern data platform in AWSBuilding a modern data platform in AWS
Building a modern data platform in AWS
 
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
AWS re:Invent 2016: Big Data Mini Con State of the Union (BDM205)
 
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
Building Data Lakes and Analytics on AWS; Patterns and Best Practices - BDA30...
 
Aws meetup 20190427
Aws meetup 20190427Aws meetup 20190427
Aws meetup 20190427
 
Serverless Datalake Day with AWS
Serverless Datalake Day with AWSServerless Datalake Day with AWS
Serverless Datalake Day with AWS
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
Understanding AWS Managed Database and Analytics Services | AWS Public Sector...
 
Implementing a Data Lake
Implementing a Data LakeImplementing a Data Lake
Implementing a Data Lake
 
Owning Your Own (Data) Lake House
Owning Your Own (Data) Lake HouseOwning Your Own (Data) Lake House
Owning Your Own (Data) Lake House
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWS
 
Building Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWSBuilding Data Lakes and Analytics on AWS
Building Data Lakes and Analytics on AWS
 
Module 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWSModule 1 - CP Datalake on AWS
Module 1 - CP Datalake on AWS
 
Fast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWSFast Track to Your Data Lake on AWS
Fast Track to Your Data Lake on AWS
 

More from Amazon Web Services Korea

AWS Modern Infra with Storage Roadshow 2023 - Day 2
AWS Modern Infra with Storage Roadshow 2023 - Day 2AWS Modern Infra with Storage Roadshow 2023 - Day 2
AWS Modern Infra with Storage Roadshow 2023 - Day 2Amazon Web Services Korea
 
AWS Modern Infra with Storage Roadshow 2023 - Day 1
AWS Modern Infra with Storage Roadshow 2023 - Day 1AWS Modern Infra with Storage Roadshow 2023 - Day 1
AWS Modern Infra with Storage Roadshow 2023 - Day 1Amazon Web Services Korea
 
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...Amazon Web Services Korea
 
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon Web Services Korea
 
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Web Services Korea
 
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...Amazon Web Services Korea
 
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...Amazon Web Services Korea
 
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...Amazon Web Services Korea
 
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon Web Services Korea
 
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon Web Services Korea
 
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Amazon Web Services Korea
 
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Web Services Korea
 
From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...Amazon Web Services Korea
 
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...Amazon Web Services Korea
 
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon Web Services Korea
 
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...Amazon Web Services Korea
 
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...Amazon Web Services Korea
 
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...Amazon Web Services Korea
 
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...Amazon Web Services Korea
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...Amazon Web Services Korea
 

More from Amazon Web Services Korea (20)

AWS Modern Infra with Storage Roadshow 2023 - Day 2
AWS Modern Infra with Storage Roadshow 2023 - Day 2AWS Modern Infra with Storage Roadshow 2023 - Day 2
AWS Modern Infra with Storage Roadshow 2023 - Day 2
 
AWS Modern Infra with Storage Roadshow 2023 - Day 1
AWS Modern Infra with Storage Roadshow 2023 - Day 1AWS Modern Infra with Storage Roadshow 2023 - Day 1
AWS Modern Infra with Storage Roadshow 2023 - Day 1
 
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...
사례로 알아보는 Database Migration Service : 데이터베이스 및 데이터 이관, 통합, 분리, 분석의 도구 - 발표자: ...
 
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
Amazon DocumentDB - Architecture 및 Best Practice (Level 200) - 발표자: 장동훈, Sr. ...
 
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
Amazon Elasticache - Fully managed, Redis & Memcached Compatible Service (Lev...
 
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...
Internal Architecture of Amazon Aurora (Level 400) - 발표자: 정달영, APAC RDS Speci...
 
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...
[Keynote] 슬기로운 AWS 데이터베이스 선택하기 - 발표자: 강민석, Korea Database SA Manager, WWSO, A...
 
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...
Demystify Streaming on AWS - 발표자: 이종혁, Sr Analytics Specialist, WWSO, AWS :::...
 
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
Amazon EMR - Enhancements on Cost/Performance, Serverless - 발표자: 김기영, Sr Anal...
 
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
Amazon OpenSearch - Use Cases, Security/Observability, Serverless and Enhance...
 
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
Enabling Agility with Data Governance - 발표자: 김성연, Analytics Specialist, WWSO,...
 
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
Amazon Redshift Deep Dive - Serverless, Streaming, ML, Auto Copy (New feature...
 
From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...From Insights to Action, How to build and maintain a Data Driven Organization...
From Insights to Action, How to build and maintain a Data Driven Organization...
 
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
[Keynote] Accelerating Business Outcomes with AWS Data - 발표자: Saeed Gharadagh...
 
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
Amazon DynamoDB - Use Cases and Cost Optimization - 발표자: 이혁, DynamoDB Special...
 
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
LG전자 - Amazon Aurora 및 RDS 블루/그린 배포를 이용한 데이터베이스 업그레이드 안정성 확보 - 발표자: 이은경 책임, L...
 
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
KB국민카드 - 클라우드 기반 분석 플랫폼 혁신 여정 - 발표자: 박창용 과장, 데이터전략본부, AI혁신부, KB카드│강병억, Soluti...
 
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
SK Telecom - 망관리 프로젝트 TANGO의 오픈소스 데이터베이스 전환 여정 - 발표자 : 박승전, Project Manager, ...
 
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
코리안리 - 데이터 분석 플랫폼 구축 여정, 그 시작과 과제 - 발표자: 김석기 그룹장, 데이터비즈니스센터, 메가존클라우드 ::: AWS ...
 
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
LG 이노텍 - Amazon Redshift Serverless를 활용한 데이터 분석 플랫폼 혁신 과정 - 발표자: 유재상 선임, LG이노...
 

Recently uploaded

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDGMarianaLemus7
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 

Recently uploaded (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
APIForce Zurich 5 April Automation LPDG
APIForce Zurich 5 April  Automation LPDGAPIForce Zurich 5 April  Automation LPDG
APIForce Zurich 5 April Automation LPDG
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 

AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 - 김윤건, AWS사업개발 담당:: AWS Summit Online Korea 2020

  • 1. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved. AWS를 통한 데이터 분석 및 처리의 새로운 혁신 기법 김윤건 사업개발 담당 AWS Korea
  • 2. Agenda 인사이트 획득을 위한 데이터 데이터 레이크 구축 데이터 분석을 통한 인사이트 획득
  • 3. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 4. 인사이트 획득을 위한 이상적 모습
  • 6. 현실의 장애물들 데이터 구조 • 여러 시스템에 산재 • 분석에 부적합한 형태 • 레이블이 없는 데이터 데이터 가치 • 포맷: 철자, 단위 • 결측치 • 편향된 데이터 • 무상관성 데이터 데이터 중요도 • 비선호 데이터 • 높은 수집비용 • 본연적 상관성 데이터
  • 7. 현실의 장애물들 데이터 구조 • 여러 시스템에 산재 • 분석에 부적합한 형태 • 레이블이 없는 데이터 데이터 가치 • 포맷: 철자, 단위 • 결측치 • 편향된 데이터 • 무상관성 데이터 데이터 중요도 • 비선호 데이터 • 높은 수집비용 • 본연적 상관성 데이터 Data Transformation Feature Engineering Feature Selection
  • 15. 데이터 레이블링 Name Month - 3 Month - 2 Month - 1 Joe Schmo 123.23 0 0 Jane Plain 0 0 0 Mary Happy 0 55.22 243.33 Tom Thumb 12.34 8.34 14.56 레이블 있는 데이터 Name Month - 3 Month - 2 Month - 1 Default Joe Schmo 123.23 0 0 FALSE Jane Plain 0 0 0 TRUE Mary Happy 0 55.22 243.33 FALSE Tom Thumb 12.34 8.34 14.56 FALSE
  • 16. 데이터 레이블링 Name Date Duration (s) Genre Plays Highway star 1984-05-24 - Rock 139 Blues alive 1990/03/01 281 Blues 239 Lonely planet 2002-11-19 5:32s Techno 42 Dance, dance 02/23/1983 312 Disco N/A The wall 1943-01-20 218 Reagge 83 Offside down 1965-02-19 4 minutes Techno 895 The alchemist 2001-11-21 418 Bluesss 178 Bring me down 18-10-98 328 Classic 21 The scarecrow 1994-10-12 269 Rock 734 정제된 데이터 Name Date Duration (s) Genre Plays Highway star 1984-05-24 Rock 139 Blues alive 1990-03-01 281 Blues 239 Lonely planet 2002-11-19 332 Techno 42 Dance, dance 1983-02-23 312 Disco The wall 1943-01-20 218 Reagge 83 Offside down 1965-02-19 240 Techno 895 The alchemist 2001-11-21 418 Blues 178 Bring me down 1998-10-18 328 Classic 21 The scarecrow 1994-10-12 269 Rock 734
  • 18. 데이터 취합 Content Genre Duratio n Play Time User Device Highway Rock 190 2019-05-12 User001 TV Blues alive Blues 281 2019-05-14 User005 Tablet Lonely planet Tech 332 2019-05-14 User003 TV Dance, dance Disco 312 2019-05-14 User001 Tablet The wall Reaggae 218 2019-05-14 User002 Smartphone Offside down Tech 240 2019-05-14 User005 Tablet The alchemist Blues 418 2019-05-14 User003 TV Bring me down Class 328 2019-05-15 User001 Tablet The one Rock 269 2019-05-15 User003 Smartphone 취합된 데이터 User Num.Playback s Total Time Pref.Device User001 3 830 Tablet User002 1 218 Smartphone User003 3 1019 TV User004 2 521 Tablet
  • 19. 데이터 피벗 Content Genre Duration Play Time User Device Highway Rock 190 2019-05-12 User001 TV Blues alive Blues 281 2019-05-14 User005 Tablet Lonely planet Tech 332 2019-05-14 User003 TV Dance, dance Disco 312 2019-05-14 User001 Tablet The wall Reaggae 218 2019-05-14 User002 Smartphone Offside down Tech 240 2019-05-14 User005 Tablet The alchemist Blues 418 2019-05-14 User003 TV Bring me down Class 328 2019-05-15 User001 Tablet The one Rock 269 2019-05-15 User003 Smartphone 취합되고 피벗된 컬럼 User Num.Playbacks Total Time Pref.Device NP_TV NP_Tablet NP_Smartphone TT_TV TT_Tablet TT_Smartphone User001 3 830 Tablet 1 2 0 190 640 0 User002 1 218 Smartphone 0 0 1 0 0 218 User003 3 1019 TV 2 0 1 750 0 269 User004 2 521 Tablet 0 2 0 0 521 0
  • 21. ELT
  • 22. 데이터 스케일: 모든 데이터를 쿼리 Unified view: Local storage and Amazon S3 data lake Data Lake Amazon S3 Directly query exabytes in Amazon S3 No data loading, eliminate ingestion time Unified view of data across Amazon Redshift and Amazon S3 Scale compute and storage separately No server to maintain for Amazon S3 query Support for Parquet, ORC, Avro, CSV, JSON, Grok, and other open file formats Pay only for the amount of data scanned Amazon Redshift Spectrum Amazon Redshift query engine Amazon Redshift data JDBC/ODBC
  • 23. ETL
  • 26. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 27. 데이터 레이크로 시작하는 인사이트 획득 데이터 레이크 구축 데이터 분석을 통한 인사이트 획득 Amazon Redshift Data warehousing Amazon EMR Hadoop + Spark Amazon Athena Interactive analytics Amazon Kinesis Real-time data analytics Amazon Elasticsearch Service Operational Analytics Amazon S3/Glacier AWS GlueAWS Lake Formation
  • 28. Explosion of data Explosion of personas Demand for faster decision-making on real-time data 고객들이 겪고 있는 새로운 현실
  • 29. 전통적인 사일로를 통합 Data silos to OLTP ERP CRM LOB DW Silo 1 Business intelligence Devices Web sensors Social DW Silo 2 Business intelligence MAC H I NE LEARNI NG BI + ANALYTI C S DATA WAREH O U SI NG Data lakes OPEN FORMATS CENTRAL CATALOG
  • 30. 클라우드 데이터 레이크 Customers want: A single data store that is scalable and cost-effective To use the standards-based data format of their choice To analyze their data in a variety of ways Cloud data lake infrastructure Decoupled storage and compute resources Security and governance Data migration Streaming services Data warehouse Big data processing Serverless data processing Real-time analytics Operational analytics Predictive analyticsETL and catalog data management
  • 31. Amazon S3 Unmatched durability, availability, and scalability Most object-level controls Easiest to use with cost optimization: Intelligent Tiering Best security, compliance, and audit capabilities Most ways to get data in Broadest portfolio of analytics tools Amazon S3: 데이터 레이크를 위한 탁월한 선택
  • 32. AWS의 서비스 구성 Amazon S3 | AWS Glue AWS Lake Formation Data lake Amazon Redshift Data warehousing Amazon EMR Hadoop + Spark Amazon Athena Interactive analytics Amazon Kinesis Data Analytics Real time Amazon Elasticsearch Service Operational Analytics Analytics-optimized storage Amazon Redshift AQUA Amazon Elasticsearch Ultrawarm Analytics AWS Data Exchange QuickSight Visualizations Amazon SageMaker ML Comprehend NLP Transcribe Speech-to-text Textract Extract text Personalize Recommendation Forecast Forecasts Translate Translation Business intelligence and machine learning Kibana in ES Operational dashboarding Third-party BI tools Data movement AWS Database Migration Service | AWS Snowball | AWS Snowmobile | Amazon Kinesis Data Firehose | Amazon Kinesis Data Streams Managed Streaming for Kafka
  • 33. 새로운 현실을 위한 설계 Cloud-optimized: Architect services ground-up for the cloud and for the explosion of data Purpose-built: Offer a portfolio of purpose- built services, optimized for your workloads Fully managed: Help you innovate faster through managed services Now used by a very large number of customers for mission-critical applications
  • 34. 가장 많은 데이터 레이크와 분석 고객을 보유
  • 40. AWS Lake Formation: 빠르게 데이터 레이크를 구축 Fastest to go from data to insights Move, store, catalog, and clean your data faster Move, store, catalog, and clean your data faster with machine learning Enforce security policies across multiple services Enforce security policies across multiple services Gain and manage new insights Empower analysts and data scientists to gain and manage new insights
  • 41. Amazon Kinesis: 스트리밍 데이터 분석 Amazon Managed Streaming for Apache Kafka (Amazon MSK) Capture and store data streams Amazon Kinesis Data Streams Capture and store data streams Amazon Kinesis Data Firehose Analyze data streams in real time Amazon Kinesis Data Analytics Load streaming data into streams, data lakes, and data warehouses
  • 42. Amazon Kinesis: 스트리밍 분석 역량 Amazon MSK releases Open Monitoring with Prometheus • Consume every Apache Kafka metric with low latency • Enable time-series logging, alarming, and charting through Prometheus Access resources within an Amazon VPC using Amazon Kinesis Data Analytics • Read and write data from resources within your VPCs like Amazon Elasticsearch Service clusters, RDS databases, and Redshift data warehouses Run Apache Flink and Apache Kafka together using fully managed services on AWS • Use Kinesis Data Analytics to process streaming data stored in Amazon MSK • Run streaming solutions end-to-end using open source software in fully managed services
  • 43. AWS Data exchange: 외부 데이터 검색 및 구독 Easily find and subscribe to third-party data in the cloud Efficiently access third-party data Simplifies access to data No need to receive physical media, manage FTP credentials, or integrate with different APIs Minimize legal reviews and negotiations Easily analyze data Download or copy data to Amazon S3 Combine, analyze, and model with existing data Analyze data with EMR, Redshift, Athena, and Glue Quickly find diverse data in one place >1,000 data products >80 data providers including Dow Jones, Change Healthcare, Foursquare, Dun & Bradstreet, Thomson Reuters, Pitney Bowes, Lexis Nexis, and Deloitte
  • 44. Best performance, most scalable AUQA and RA3: 10x faster than other cloud data warehouses Adds unlimited compute capacity on demand to meet unlimited concurrent access Lowest cost Cost-optimized workloads by paying compute and storage separately 1/10th cost of traditional DW at $1000/TB/year Up to 75% less than other cloud data warehouses & predictable costs Most secure & compliant AWS-grade security (e.g. VPC, encryption with AWS KMS, AWS CloudTrail) All major certifications such as SOC, PCI, DSS, ISO, FedRAMP, HIPAA Data lake & AWS integration Analyze exabytes of data across data warehouse, data lakes, and operational database Query data across various analytics services Amazon Redshift: 데이터웨어하우스 First and most popular cloud data warehouse
  • 45. 가장 광범위하게 사용되는 데이터웨어하우스 Tens of thousands of customers use Amazon Redshift
  • 46. Robust result set caching Large # of tables support ~20000 Copy command support for ORC, Parquet IAM role chaining Elastic resize Groups Redshift Spectrum: Date formats, scalar json and ION file formats support, region expansion, predicate filtering Auto analyze Health and performance monitoring w/Amazon Cloud watch Automatic table distribution style Amazon CloudWatch support for WLM queues Performance enhancements: Hash join, vacuum, window functions, resize ops, aggregations, console, union all, efficient compile code cache Unload to CSV Auto WLM ~25 Query Monitoring Rules (QMR) support 200+ new features in the past 18 months AQUA Concurrency scaling DC1 migration to DC2 Resiliency of ROLLBACK processing Manage multi-part query in AWS Management Console Auto analyze for incremental changes on table Spectrum Request Accelerator Apply new distribution key Redshift Spectrum: Row group filtering in Parquet and ORC, nested data support, enhanced VPC routing, multiple partitions Faster Classic resize with optimized data transfer protocol Performance: Bloom filters in joins, complex queries that create internal table, communication layer Redshift Spectrum: Concurrency scaling AWS Lake Formation integration Auto-vacuum sort, auto-analyze and auto table sort Auto WLM with query priorities Snapshot scheduler Performance: Join pushdowns to subquery, mixed workloads temporary tables, rank functions, null handling in join, single row insert Advisor recommendations for distribution keys AZ64 compression encoding Console redesign Stored procedures Spatial processing Column level access control with AWS lake formation RA3 Performance of inter- Region snapshot transfers Federated query Materialized views Manual pause and resume Amazon Redshift는 급격한 혁신을 제공
  • 47. Amazon Redshift: Federated Query (Preview) Analyze data across data warehouse, data lakes, and operational databases Query across multiple systems from Amazon Redshift Combine data warehouse and transactional data Compatible with Amazon RDS and Amazon Aurora (PostgreSQL) SQL A M A Z O N R D S A M A Z O N A U R O R A A M A Z O N R E D S H I F T A M A Z O N S 3 D A T A L A K E
  • 48. Amazon Redshift: RA3 instances Optimize your data warehouse by paying for compute and storage separately Delivers 3x the performance of existing cloud DWs DS2 customers can migrate and get 2x performance and 2x storage for the same cost Automatically scales your DW storage capacity Supports workloads up to 8 PB (compressed) COMPUTE NODE (RA3) SSD Cache A M A Z O N S 3 S T O R A G E COMPUTE NODE (RA3) SSD Cache COMPUTE NODE (RA3) SSD Cache COMPUTE NODE (RA3) SSD Cache Managed storage $/node/hour $/TB/month
  • 49. AQUA – Advanced Query Accelerator Redshift runs 10x faster than any other cloud data warehouse without increasing cost AQUA brings compute to the storage layer so data doesn’t have to move back and forth High-speed cache on top of S3 scales out to process data in parallel across many nodes AWS custom-designed analytics processors accelerate data compression, encryption, and data processing 100% compatible with the current version of Redshift C O M ING I N 2 0 2 0 Compute Node Compute Node Compute Node Compute Node Storage Node Storage Node Storage Node Storage Node
  • 54. Amazon Redshift: Fetching POI from Spectrum
  • 55. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 56. 데이터 레이크로 시작하는 인사이트 획득 데이터 레이크 구축 데이터 분석을 통한 인사이트 획득 Amazon Redshift Data warehousing Amazon EMR Hadoop + Spark Amazon Athena Interactive analytics Amazon Kinesis Real-time data analytics Amazon Elasticsearch Service Operational Analytics Amazon S3/Glacier AWS GlueAWS Lake Formation
  • 57. 데이터 분석을 통한 인사이트 획득
  • 58. 데이터 분석을 통한 인사이트 획득
  • 59. 데이터 분석을 통한 인사이트 획득
  • 60. 데이터 분석을 통한 인사이트 획득
  • 61. 데이터 분석을 통한 인사이트 획득
  • 62. 데이터 분석을 통한 인사이트 획득
  • 63. AWS Glue: ETL and Data Catalog Simple, flexible, and cost-effective ETL Less hassle Integrated across AWS: Supports Amazon Aurora, Amazon RDS, Amazon Redshift, Amazon S3, and common database engines in your VPC running on Amazon EC2 Serverless Serverless: No infrastructure to provision or manage More power Automatically generates the code to execute your data transformations and loading processes
  • 64. Amazon EMR Run Apache Spark, Hadoop, Hive, Presto, HBase and more big data apps on AWS Lowest cost, fully integrated with auto scaling, Spot, and Amazon S3 Enterprise-grade security, latest versions of OSS frameworks Low cost 50-80% reduction in costs with EC2 Spot and Reserved Instances Per-second billing for flexibility Easy Fully managed no cluster setup, node provisioning, cluster tuning Use S3 storage Process data in Amazon S3 securely with high performance using the EMRFS connector Latest versions Updated with latest open source frameworks within 30 days
  • 65. Amazon EMR: Apache Hudi Record-level insert, update, and delete on Amazon S3 Comply with data privacy laws Consume real-time streams and change data capture Reinstate late-arriving data Track change history and rollback Apache Hudi is open source, and uses open data formats, enabling data lakes to: Includes support for Spark, Hive, and Presto
  • 66. Runtime optimized for Apache Spark performance Best performance • 2.6x faster than Spark with Amazon EMR without runtime • 1.6x faster than third-party Managed Spark (with their runtime) Lowest price • 1/10th the cost of third-party Managed Spark (with their runtime) 100% compliant with Apache Spark APIs*Based on TPC-DS 3TB Benchmarking running 6 node C4x8 extra large clusters and EMR 5.28, Spark 2.4 10,164 16,478 26,478 0 5,000 10,000 15,000 20,000 25,000 30,000 Spark with EMR (with runtime) 3rd party Managed Spark (with their runtime) Spark with EMR (without runtime) Runtime total on 104 queries (seconds - lower is better) Performance improvements in Spark for Amazon EMR Performance-optimized runtime for Apache Spark, 2.6x faster performance at 1/10th the cost
  • 67. Fully managed Deploy Amazon ES clusters in minutes: simplified hardware provisioning, software installation/patching, failure recovery, backups, and monitoring Pay only for what you use No upfront fee or usage requirement Critical features built-in: encryption, VPC support, 24x7 monitoring Scalable, secure, and compliant Scale clusters up/down with a single API call or a few clicks Secured network isolation with VPC, encrypt data at rest and in transit Compliant: HIPAA, PCI DSS, and ISO Open source Amazon ES APIs, Kibana and Logstash Open-source Amazon ES APIs Managed Kibana Integration with Logstash Amazon Elasticsearch Service (Amazon ES) Fully managed, scalable, secure
  • 68. Amazon Athena Run SQL queries on data in Amazon S3 No infrastructure to manage Pay per query Pay per query Pay only for queries run Save 30–90% on per-query costs through compression Easy Serverless: Zero infrastructure, zero administration Integrated with QuickSight Open ANSI SQL JDBC/ODBC drivers Multiple formats, compression types, complex joins & data types Query instantly Zero setup cost Point to Amazon S3 and start querying SQL
  • 69. Run SQL queries on relational, non-relational, object, or custom data sources; in the cloud or on-premises Open Source Connectors for common data sources Build connectors to custom data sources Run connectors in AWS Lambda: no servers to manage Redshift Data warehousing ElastiCache Redis Aurora MySQL, PostgreSQL DynamoDB Key value, Document DocumentDB Document S3/Glacier Run SQL queries on data spanning multiple data stores Amazon Athena: Federated Query (Preview)
  • 70. Serverless, cloud-powered BI service (no servers to manage) Scale from 10s of users to 100s of thousands of users Pay only for what you use • Readers: $0.30/30 min session with a $5/user/month max • Authors: $18/month/Author Integrates with Amazon S3, Amazon Athena, Amazon Redshift, Amazon RDS, Amazon Aurora, and Amazon EMR Amazon QuickSight First BI service with pay-per-session pricing and ML insights
  • 71. Customer churn analysis Predicting price Text analytics Scoring sales leads Assessing loan default risk Demand forecastingDetecting fraudulent patterns Employee attrition prediction Credit scoring 일반적인 비즈니스 과제들
  • 72. Step 1: Find a data scientist Train, experiment, and build ML models for predicting customer churn Step 2: Find a data engineer ETL, build, and productionize machine learning infrastructure Step 3: Build reports using data Analyze and reports on ML predictions Takes weeks to months and multiple teams to complete BI분석가가 머신러닝을 활용하는 방법
  • 73. ++ Time consuming, error prone process even for ML experts Combinatorial Largely explorative & iterative Requires broad and complete knowledge of ML domain 머신러닝은 알고리즘, 데이터, 파라메터의 복잡한 조합임
  • 74. DIY model training • Manual effort by experts • Fully controlled and auditable • Experts make tradeoff decisions • Gets better over time with experience Automated ML • Accessible to experts and non- experts alike • No visibility into the training process • Can’t make tradeoffs between accuracy and other characteristics 잘못된 선택의 여지가 있음
  • 75. Amazon SageMaker Autopilot: 더 좋은 선택 Amazon SageMaker AutopilotDIY model training • Manual effort by experts • Fully controlled and auditable • Experts make tradeoff decisions • Gets better over time with experience Automated ML • Accessible to experts and non- experts alike • No visibility into the training process • Can’t make tradeoffs between accuracy and other characteristics
  • 76. Integrated with Amazon SageMaker Studio Amazon SageMaker Autopilot: 자동화된 머신러닝 Commented notebook describing actions Specify prediction target Automated feature engineering Regression & classification Automated algorithm selection & HPO
  • 78. 머신러닝을 BI에 포함시키는 것은 쉽지 않음 Typical steps require ML expertise & manual work Write application code to read data from the database 2 Format the data for the ML model 3 Call an ML service to run the ML model on the formatted data 4 Select and train the ML model 1 Format the output5 Load the results back to the database 6
  • 79. AWS/On-premise data sources • Excel • CSV • MySQL • Postgre SQL • Maria DB • Presto • Spark • SQL Server • Amazon Redshift • RDS • S3 • Athena • Aurora • Amazon EMR • Snowflake • Teradata • Salesforce • Square • Adobe Analytics • Jira • ServiceNow • Twitter • Github 1 Connect to any data: Data lakes, SQL engines, 3rd party applications, and on- premises databases 3 Visualize and share: Analyze results, create visualizations, build dashboards / email reports, and share to business stakeholders 2 Select an ML model: Create models with Amazon SageMaker Autopilot, choose from existing custom models, and packaged models from AWS Marketplace. Custom Models Amazon QuickSight AWS Marketplace Amazon SageMaker Autopilot ML predictions in Amazon QuickSight (preview)
  • 80. Build predictive dashboards in hours with point-and-click, no coding required
  • 81. AWS를 선택해야 하는 이유! Fastest way to get answers from all your data to all your users Easiest to build data lakes at scale • AWS Lake Formation • Redshift Data Lake Export • Redshift Federated Query • Query Federation for Amazon Athena • Data Streaming for AWS Glue Most secure • Amazon Westeros • Amazon Macie • AWS Lake Formation Most comprehensive and open • Amazon AWS Data Exchange • Amazon EMR on AWS Outposts • Record-level insert/updates for Amazon EMR • ML in Amazon Athena • ML in Amazon QuickSight Best performance at lowest cost • AQUA for Redshift • RA3 for Redshift • Redshift Materialized Views • UltraWarm Storage Tier for Amazon ES • Performance improvements for Spark in Amazon EMR
  • 82. © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 83. 감사합니다 © 2020, Amazon Web Services, Inc. or its affiliates. All rights reserved.