Business Innovation through Cloud-Based Data Analytics and Artificial Intelligence - Channy Yun (윤석찬), AWS Tech Evangelist

AWS Online Series:
Data, Analytics, and ML Edition

I want to do something with data!
However, these are the problems we have already run into

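Web logs
Click data
User behavior
Content
Purchase data
Sensor data
Social media
Artificial intelligence
Machine learning
Large-scale storage
Relational DB
NoSQL
Data warehouse
Real-time analytics
Business intelligence
Open-source tools
Data lake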

Source: IDC Digital Universe Study (2012)
• 70% of the data is projected to be user-generated content

© Daum internal big data and cloud technology use cases - Channy Yun (2012)
https://www.slideshare.net/Channy/daums-hadoop-usecases

Commercial analytics tools | Open-source platforms | Cloud computing | Managed services
Amazon EMR (Hadoop)
Amazon Athena (Presto)
Amazon EC2 (virtual servers)
Amazon S3 (storage)

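Relatively inexpensive
Data generation | Collection and storage | Analysis and prediction | Collaboration and sharing
Ingest throughput limits
Insufficient storage space
Periodic analysis and high-performance computing
Traditional DW/BI tools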

Data generation | Collection and storage | Analysis and prediction | Collaboration and sharing
Accelerated

Transactions
ERP
Data analysts
AWS Lambda
Amazon EMR
Amazon Redshift
Amazon Machine Learning
Amazon Elasticsearch Service
Amazon QuickSight
AWS Glue
Amazon S3
Amazon Kinesis
Amazon DynamoDB
Amazon Athena

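• Managing databases is a heavy burden. (RDB): ✓ Amazon RDS
• Relational DBs are not easy to scale. (NoSQL): ✓ Amazon DynamoDB
• Deploying and managing Hadoop is hard: ✓ Amazon EMR
• Existing DWs are complex, expensive, and slow: ✓ Amazon Redshift
• Commercial RDBs are costly and hard to manage and scale: ✓ Amazon Aurora
• Real-time data is hard to collect and analyze: ✓ Amazon Kinesis
• Can't data cleansing (ETL) be made a bit easier?: ✓ AWS Glue
• I want deep learning modeling and deployment to be a bit easier: ✓ Amazon SageMaker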

https://aws.amazon.com/solutions/case-studies/big-data/

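• Operating an ad platform across 1.3 billion mobile devices
• PICO: product recommendation service based on 2 billion payment logs and shopping data
• Restaurant recommendations and data analysis, more than 9x faster
• Service analyzing 100 GB per day of residential and commercial electricity usage
• Building a smart ship operation system by collecting ship data on land and at sea
• Operating a platform for collecting and analyzing Cookie Run game logs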

Transactions
ERP
Database
Data analysts
Data Warehouse
Amazon Redshift
Data Processing
Amazon EMR
Amazon DynamoDB
Amazon RDS & Aurora
AWS Database Migration Service
AWS Snowball
Amazon S3
Storage

Aurora
ElastiCache (Redis)
Redshift
Kinesis Firehose
S3
Historical queries on up to 2 years of data
Operational queries of real-time data
Staging near real-time data
Join / compare events
Real-time streams of lodging market data
Ingest multiple data streams
Reference data on-premises
EC2
https://www.youtube.com/watch?v=9hUVcH48eLg

Work Item Storage
Partition Assigner
Timer Router
Timer Nodes
Timer Hosts
View Router
Timer Nodes
View Hosts
DynamoDB
https://www.youtube.com/watch?v=83-IWlvJ__8

https://www.youtube.com/watch?v=W9ofQMdl48w
Amazon Elastic MapReduce
RDS/Redshift
Direct Connect
Amazon S3
Data Mart
DATA FROM DEVICES AND SERVICES

Transactions
ERP
Data Lake
Data
Data analysts
Direct Query: Amazon Athena
Data Storage: Amazon S3
Amazon QuickSight

https://aws.amazon.com/big-data/datalakes-and-analytics/what-is-a-data-lake/

Raw Data: Amazon S3
ETL (Hadoop): Amazon EMR
Triggered Code: AWS Lambda
Staged Data (Data Lake): Amazon S3
ETL & Catalog Management: AWS Glue
Data Warehouse: Amazon Redshift
Triggered Code: AWS Lambda
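
The "Triggered Code" step in this pipeline can be pictured as a small AWS Lambda handler that reacts to an S3 object-created notification and works out where the staged copy of each raw object belongs. The following is a minimal sketch, not the presenter's implementation: the `raw/` and `staged/` prefixes and the `staged_key_for` helper are hypothetical, and the actual copy or ETL call is left out so the example runs without AWS credentials.

```python
from urllib.parse import unquote_plus

RAW_PREFIX = "raw/"        # hypothetical prefix holding raw data
STAGED_PREFIX = "staged/"  # hypothetical prefix holding staged data


def staged_key_for(raw_key):
    """Map a raw object key to its staged location, or None if it is not raw data."""
    if not raw_key.startswith(RAW_PREFIX):
        return None
    return STAGED_PREFIX + raw_key[len(RAW_PREFIX):]


def handler(event, context=None):
    """Lambda-style entry point for S3 event notifications.

    Returns a list of (bucket, raw_key, staged_key) work items. A real
    function would hand each item to an ETL step; that part is omitted.
    """
    work = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        raw_key = unquote_plus(record["s3"]["object"]["key"])
        staged_key = staged_key_for(raw_key)
        if staged_key is not None:
            work.append((bucket, raw_key, staged_key))
    return work


if __name__ == "__main__":
    # Made-up event with one raw object and one unrelated object.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "raw/2019/web+log.json"}}},
            {"s3": {"bucket": {"name": "example-bucket"},
                    "object": {"key": "other/readme.txt"}}},
        ]
    }
    print(handler(sample_event))
```

Keeping the key mapping in a pure function makes the routing rule easy to test locally before any AWS call is wired in.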

Amazon S3 (Data lake)
AWS Glue (ETL & Data Catalog)
Amazon Athena
Amazon QuickSight
AWS IoT
Devices, Web, Sensors, Social
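
Querying the data lake in place with Amazon Athena comes down to submitting SQL against a table registered in the Glue Data Catalog and pointing the results at an S3 location. The sketch below builds such a request under stated assumptions: the `weblogs` database, the `clicks` table, and the `s3://example-athena-results/` bucket are hypothetical names. The keyword arguments follow boto3's Athena `start_query_execution` call, which is only invoked if a client is passed in.

```python
def build_athena_request(database, sql, output_location):
    """Build keyword arguments for boto3's Athena start_query_execution."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request, athena_client=None):
    """Submit the query if a client is supplied; otherwise do a dry run."""
    if athena_client is None:
        return None  # dry run: nothing is sent to AWS
    response = athena_client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical database, table, and bucket names, for illustration only.
request = build_athena_request(
    database="weblogs",
    sql=("SELECT page, COUNT(*) AS hits FROM clicks "
         "GROUP BY page ORDER BY hits DESC LIMIT 10"),
    output_location="s3://example-athena-results/",
)
```

With boto3 installed and credentials configured, `run_query(request, boto3.client("athena"))` would submit the query and return its execution ID; the results then land in the configured S3 location.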


https://aws.amazon.com/athena/#Case_studies

https://aws.amazon.com/quicksight/#Customers
“Amazon QuickSight's native integration
with Amazon Athena makes it the
ideal serverless analytics solution. With
QuickSight pay-per-session pricing, we
can easily extend access to interactive
dashboards across our teams and only
pay for what we use. The move from
static email reports and ad-hoc analysis
to always-available data in QuickSight
has been great!”
Anders Rahm-Nilzon
Cloud Manager, Volvo Group Connected Solutions
“The QuickSight pay-per-session dashboard
access is perfect as it allows secure, fast
and cost-effective access to interactive
data. As a cloud-based solution, QuickSight
automatically scales to our needs. The
combination of being able to connect to data
from a private Virtual Private Cloud (VPC)
through PrivateLink, authenticate users
via SAML.”
Massimilliano Ponticelli
Product Manager, Siemens

Data Lake
Business users
Transactions
ERP
Social media
Data
Stream Capture: Amazon Kinesis
Events
Amazon QuickSight
Data Warehouse: Amazon Redshift
Stream Data: Amazon Elasticsearch Service
Data Storage: Amazon S3

https://aws.amazon.com/ko/solutions/case-studies/supercell/
https://aws.amazon.com/solutions/case-studies/netflix-kinesis-streams/

"The AWS platform was an excellent choice for processing 17 PB of baseball game data and delivering it to customers in near real time."
- Joe Inzerillo, EVP and CTO, Major League Baseball Advanced Media

https://www.youtube.com/watch?v=AsyqdESMSG8

Transactions
Data scientists
Business users
Connected
devices
Data
Event
Insights
Data
Lake
ML / Deep Learning
Web logs / clickstream

https://nucleusresearch.com/research/single/guidebook-tensorflow-aws/
“In analyzing the experiences of researchers supporting more
than 388 unique projects, Nucleus found that 88 percent of
cloud-based TensorFlow projects are running on AWS.”

https://engineering.grab.com/driving-southeast-asia-forward-with-aws
https://aws.amazon.com/solutions/case-studies/grab/

https://www.youtube.com/watch?v=tIt2JeNkbys
EMR
Master Node
Data Node
Redshift
WAS
WEB
M
S
Collect Server
ElasticSearch
Shard 1
Shard 2
Shard 3
Shard 4
Kinesis
WAS
Log
S3
RDS
Aurora
Availability Zone
VPN
AWS Endpoint
Logstash
Spark Hive
Dashboard
Alert
Debug
Log
Real-time
Bastion
EC2
Sync
R Server
Woongjin
IDC
NoSQL & Prediction Engine

Direct Connect
80TB / day
Build Model
Feature Extraction
100 PB Archive
User
Application
Cache Hit Rate
Feedback
Optimized
S3 Cache
SM Decision:
Cache Image or Not
Cleaned
Feature
Vectors
Amazon
SageMaker
Jupyter/Pandas
Order
History
Data
Warehouse
Imagery
Metadata
Cache hit rate dropped by nearly 2x
“We plan to use Amazon SageMaker to train models against petabytes of
Earth observation imagery datasets using hosted Jupyter notebooks, so
DigitalGlobe's Geospatial Big Data Platform (GBDX) users can just push a
button, create a model, and deploy it all within one scalable distributed
environment at scale.”
- Dr. Walter Scott, CTO of Maxar Technologies and founder of DigitalGlobe


Transactions
ERP
Data analysts
Data scientists
Business users
Engagement platforms
Connected devices
Automation / events
Data
Event Action
Insights
Data
Lake
ML / Deep Learning
Predict / Recommend
AI Services
Social media
Web logs /
clickstream

"Persons": [
    {
        "Timestamp": number,
        "Person":
        {
            "Index": number,
            "BoundingBox":
            {
                "Width": number,
                "Top": number,
                "Height": number,
                "Left": number
            },
            "Face":
            {
                "BoundingBox": { ... },
                "Landmarks": { ... },
                "Pose": { ... },
                "Quality": { ... },
                "Confidence": number
            }
        },
        ...
GetPersonTracking
StartPersonTracking
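
In the `GetPersonTracking` response shape shown here, each `BoundingBox` value is a ratio of the overall frame size rather than a pixel count, and `Timestamp` is given in milliseconds from the start of the video. The sketch below turns one entry into pixel coordinates. The sample values and the 1920x1080 frame size are made up for illustration, and the helper names are hypothetical.

```python
def to_pixel_box(bounding_box, frame_width, frame_height):
    """Convert a ratio-based BoundingBox into integer pixel coordinates."""
    return {
        "left": round(bounding_box["Left"] * frame_width),
        "top": round(bounding_box["Top"] * frame_height),
        "width": round(bounding_box["Width"] * frame_width),
        "height": round(bounding_box["Height"] * frame_height),
    }


def person_boxes(response, frame_width, frame_height):
    """Yield (seconds, person index, pixel box) for each tracked person."""
    for entry in response.get("Persons", []):
        person = entry["Person"]
        box = person.get("BoundingBox")
        if box is None:
            continue  # skip entries that carry no bounding box
        yield (
            entry["Timestamp"] / 1000.0,  # milliseconds to seconds
            person["Index"],
            to_pixel_box(box, frame_width, frame_height),
        )


# Made-up response fragment with a single tracked person.
sample = {
    "Persons": [
        {
            "Timestamp": 1500,
            "Person": {
                "Index": 0,
                "BoundingBox": {"Width": 0.25, "Top": 0.5,
                                "Height": 0.5, "Left": 0.125},
            },
        }
    ]
}

for seconds, index, box in person_boxes(sample, 1920, 1080):
    print(seconds, index, box)
```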

Live Street Camera
Amazon Kinesis Video Streams
1. Camera-captured video streams are processed by Kinesis Video Streams
Amazon Rekognition Video
Face collection
2. Rekognition Video analyses the video and searches faces on screen against a collection of millions of faces
Amazon Kinesis Streams
AWS Lambda
Amazon SNS
End User
3. End user is notified in case of face matches
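
The notification step of this flow (Kinesis Streams to Lambda to SNS) can be sketched as a Lambda handler that decodes each Kinesis record and keeps only face matches above a similarity cut-off. This is an illustration under assumptions, not the architecture's actual code. The `FaceSearchResponse`, `MatchedFaces`, `Similarity`, and `Face.FaceId` fields reflect the Rekognition Video stream processor output as I understand it and should be checked against the current documentation. The 90% threshold is hypothetical, and `publish` is a stand-in for an SNS publish call.

```python
import base64
import json

SIMILARITY_THRESHOLD = 90.0  # hypothetical cut-off, in percent


def extract_matches(event, threshold=SIMILARITY_THRESHOLD):
    """Collect face matches at or above the threshold from a Kinesis batch."""
    matches = []
    for record in event.get("Records", []):
        # Lambda delivers Kinesis payloads base64-encoded.
        payload = json.loads(base64.b64decode(record["kinesis"]["data"]))
        for result in payload.get("FaceSearchResponse", []):
            for match in result.get("MatchedFaces", []):
                if match["Similarity"] >= threshold:
                    matches.append({
                        "face_id": match["Face"]["FaceId"],
                        "similarity": match["Similarity"],
                    })
    return matches


def handler(event, context=None, publish=print):
    """Notify once per match; `publish` stands in for an SNS publish call."""
    matches = extract_matches(event)
    for match in matches:
        publish("Face match {face_id} ({similarity:.1f}%)".format(**match))
    return len(matches)
```

Passing the publisher in as a parameter keeps the filtering logic testable without AWS access; in a deployed function it would wrap the SNS client instead of `print`.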

https://aws.amazon.com/ko/rekognition/customers/

A QUIET OFFICE
Amazon SageMaker
Image Classification
Amazon Rekognition
Image
CHAIR 97%
LAPTOP 95%
LAMP 88%
DESK 82%
Object Identification
WORKING!
<HISTORY>

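Collection | Storage/Processing | Analysis/Visualization | Collaboration/Sharing
Kinesis: streaming data
Database Migration Service: data import from Oracle, Netezza, and others
Amazon S3: secure, cost-effective storage
Direct Connect: connection to the data center
Snowball: bulk data load
Internal users and systems
Customer-facing services
Redshift: data warehouse
EMR: unstructured data processing, Apache Spark
Athena: ad-hoc queries
SageMaker: machine learning platform
QuickSight: visualization, BI
Integration with a variety of solutions
Glue: Data Catalog and ETL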


© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
aws-korea-marketing@amazon.com
twitter.com/AWSKorea
facebook.com/amazonwebservices.ko
youtube.com/user/AWSKorea
slideshare.net/awskorea
twitch.tv/aws
Campaign online seminar: Data, Analytics, and ML Edition
Thank you very much for attending.
How did you find the content we prepared?
Please be sure to fill out the survey so we can make future seminars better.
