© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Building a Sustainable Data Analytics Platform on AWS
Yan So, Zigzag
Yan So
AWSKRUG #datascience
E: 13imso@gmail.com
L: https://www.linkedin.com/in/yanso
- Amazon EMR
- Amazon Kinesis
- AWS Glue
- Amazon S3
- Tableau
3400
10,000+
1600
(MAU) 250
GB &
• Hadoop + Spark / Presto
• Time/Event Driven
S3
[Architecture diagram: Client, User Activity, Digestion, Amazon Kinesis Data Firehose, AWS Lambda, Amazon RDS, Amazon DynamoDB, Amazon S3]
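The diagram places AWS Lambda next to Amazon Kinesis Data Firehose on the path into Amazon S3. One common way to combine the two is a Firehose data-transformation function; whether the talk used Lambda this way is an assumption, not something the slides confirm. The sketch below follows the documented transformation contract: each incoming record carries `recordId` and base64-encoded `data`, and each returned record carries `recordId`, `result`, and `data`. The `processed` field is a hypothetical enrichment added only for illustration.

```python
import base64
import json


def lambda_handler(event, context):
    """Sketch of a Firehose transformation handler (assumed usage, see lead-in)."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers each payload base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            payload["processed"] = True  # hypothetical enrichment
        except (ValueError, TypeError):
            # Bad base64, bad JSON, or a non-object payload: return the record
            # unchanged and mark it so Firehose treats it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue
        # Newline-delimited JSON keeps records separable once they land in S3.
        line = json.dumps(payload) + "\n"
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
        })
    return {"records": output}
```

Every `recordId` received must appear exactly once in the response, which is why failed records are returned rather than skipped.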
S3
[Architecture diagram: Client, User Activity, Digestion, Amazon Kinesis Data Firehose, AWS Lambda, Amazon RDS, Amazon DynamoDB, Amazon S3, AWS Glue Crawler, AWS Glue Catalog]
S3
[Architecture diagram, labeled "Datalake on AWS": Client, User Activity, Digestion, Amazon Kinesis Data Firehose, AWS Lambda, Amazon RDS, Amazon DynamoDB, Amazon S3, AWS Glue Crawler, AWS Glue Catalog]
EMR
[Architecture diagram: Client, User Activity, Digestion, Amazon Kinesis Data Firehose, AWS Lambda, Amazon RDS, Amazon DynamoDB, Amazon S3, AWS Glue Crawler, AWS Glue Catalog, Amazon EMR]
EMR + Amazon Athena, QuickSight
[Architecture diagram: Client, User Activity, Digestion, Amazon Kinesis Data Firehose, AWS Lambda, Amazon RDS, Amazon DynamoDB, Amazon S3, AWS Glue Crawler, AWS Glue Catalog, Amazon EMR, Amazon Athena, Amazon QuickSight]
Ad-hoc
DAU
Ad-hoc
• Spark + Zeppelin
• Web
• PySpark: Python
• RStudio + sparklyr package
[Diagram: Tableau, RStudio, Zeppelin, Jupyter; Amazon EMR; AWS Glue Catalog; Amazon S3]
Zeppelin
• (PySpark) Pandas
SQL
• A/B
Tableau
Tableau Desktop
Tableau Server
Tableau Online
https://www.tableau.com/
Tableau
• Tableau Online
• Windows OS friendly
> df = spark.read.parquet(…)
> result = (df.join(…)
      .groupBy(…)
      .agg(…))
> result.write.parquet("s3n://bucketname")
[Chart: x-axis April, May, June, July]
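The `join` / `groupBy` / `agg` chain on this slide can be illustrated without a cluster. The sketch below is plain Python rather than PySpark, and the `events` and `users` rows, their column names, and the `daily_active_users` helper are all hypothetical stand-ins for the arguments the slide elides. It joins activity events to users, groups by date and platform, and counts distinct users, which is one way to arrive at a DAU figure.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for two Parquet tables.
events = [
    {"user_id": 1, "date": "2019-04-01"},
    {"user_id": 1, "date": "2019-04-01"},  # same user, same day: counted once
    {"user_id": 2, "date": "2019-04-01"},
    {"user_id": 3, "date": "2019-04-02"},
    {"user_id": 9, "date": "2019-04-02"},  # no matching user row
]
users = [
    {"user_id": 1, "platform": "ios"},
    {"user_id": 2, "platform": "android"},
    {"user_id": 3, "platform": "ios"},
]


def daily_active_users(events, users):
    """Join events to users, group by (date, platform), count distinct users."""
    platform_by_user = {u["user_id"]: u["platform"] for u in users}  # join lookup
    groups = defaultdict(set)
    for event in events:
        platform = platform_by_user.get(event["user_id"])
        if platform is None:
            continue  # inner join: drop events without a matching user
        groups[(event["date"], platform)].add(event["user_id"])  # group key
    # Aggregate: distinct user count per group, in sorted key order.
    return {key: len(ids) for key, ids in sorted(groups.items())}


result = daily_active_users(events, users)
for (date, platform), count in result.items():
    print(date, platform, count)
```

In PySpark the same shape would be a `join` on the user key, a `groupBy` on the two grouping columns, and an `agg` with `countDistinct`.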
[Diagram: Tableau, RStudio, Zeppelin, Jupyter; Amazon EMR; AWS Glue Catalog; Amazon S3; steps numbered 1 to 3]
#1 https://career.zigzag.kr/2018/07/23/007/
#2 https://career.zigzag.kr/2018/09/11/008/
Zeppelin
- EDA
- S3
AWS Glue
- EMR, Tableau
S3
- ETL
- R, Python Google Spreadsheet
- S3
- S3
Top 10
Case #1
✅ BI
Case #2
✅ S3
✅ AWS Glue
Case #3
• AWS (S3 + Glue + EMR)
https://career.zigzag.kr/
💌 yanso@croquis.com
#AWSSummit
AWS Summit Seoul 2019

Building a Sustainable Data Analytics Platform on AWS - Yan So (소성운), Zigzag :: AWS Summit Seoul 2019
