Hands-On: Big Data Analysis with Tajo on AWS
고영경 
ykko@gruter.com
©2014 Gruter. All rights reserved. 
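Lab Overview
In this lab you set up a Tajo cluster in the AWS cloud and run data analysis yourself using Tajo queries.
1. Set up a Tajo cluster with Tajo Cloud
2. Attach data in S3 as an external table
3. Run remote queries with Tajo Connector
4. A simple cohort analysis example
•Lab guide page: http://techday.gruter.com:2014/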
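What Is Tajo?
•Tajo
–A Hadoop-based data warehouse system for large-scale data
–Development began in 2010 as a research prototype
–An Apache top-level project
•Features
–Compatible with standard SQL
–Processes the entire query in a distributed fashion
–HDFS is the default storage
–Relational model (an extension to a nested model is under discussion)
–Supports low-latency queries as well as long-running ETL jobs
•Tajo on AWS
–Can run without Hadoop (it can also run on EMR)
–Accesses data stored in S3 directly (no need to copy it to a local HDFS)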
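Setting Up a Tajo Cluster on AWS
•You can download Tajo and install it yourself
•Using Gruter's Tajo Cloud AMI is easier
–Ships with the latest version of Tajo
–Setup finishes in a few clicks, with no complex configuration
–Optimized for the AWS environment
–Additional features developed by Gruter (Tajo Proxy, Tajo Connector, SQL Workbench)
–Technical support from Gruter
–Launch on the AWS Marketplace is in preparation
•Cluster setup with the Tajo Cloud AMI
–Method 1. Use the AWS Web Console
–Method 2. Use the AWS CLI (command-line) API
–Method 3. Use Gruter's Tajo Cloud service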
Tajo Cloud on AWS
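Cluster Setup with Tajo Cloud (Hands-On)
1. Go to http://taas.gruter.com/
2. Enter your AWS account
3. Configure the cluster
4. Cluster startup complete
Lab settings:
•Region: US East (Virginia)
•Instances: c3.xlarge, 2 worker nodes
•Keypair: demo2014
•Enter your personal lab account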
Tajo Cluster Setup: Security Group Settings
•AWS Console > EC2 > Security Group >
•"taas" security group > Inbound
Protocol  Port   Source
tcp       22     IPs allowed to connect over SSH
tcp       80     IPs allowed to access WebWorkbench
tcp       26080  IPs that will access the Tajo admin UI
tcp       26992  IPs that will access Tajo Connector
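Once the inbound rules are saved, a quick reachability check from your own machine can confirm that the four ports are open. The following is a minimal sketch using only the Python standard library; the helper names `is_port_open` and `check_ports` are hypothetical, and the port labels are shorthand for the rows of the table above.

```python
import socket

# Ports from the security group table; the labels are shorthand for this sketch.
TAJO_PORTS = {22: "SSH", 80: "WebWorkbench", 26080: "Tajo admin UI", 26992: "Tajo Connector"}


def is_port_open(host, port, timeout=3.0, connect=socket.create_connection):
    """Return True if a TCP connection to host:port can be opened within timeout.

    `connect` is injectable so the check can be exercised without a network.
    """
    try:
        conn = connect((host, port), timeout=timeout)
    except OSError:  # refused, timed out, unreachable, or name lookup failed
        return False
    conn.close()
    return True


def check_ports(host, ports=TAJO_PORTS, timeout=3.0, connect=socket.create_connection):
    """Return {port: True/False} for every port in `ports`."""
    return {port: is_port_open(host, port, timeout, connect) for port in ports}
```

Calling `check_ports("your-master-node-ip")` with the real master address returns one True/False entry per port. A False result only says the connection failed; it does not tell you whether the security group blocked it or the service is not running yet.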
Using Tajo: Hands-On 1
1. Connect to the Tajo master node over SSH
2. Start TSQL (the Tajo interactive shell)
3. Create your own database
$ ssh -i demo2014.pem ec2-user@your-master-node-ip
$ sudo su - tajo
$ /home/tajo/tajo/bin/tsql
Try \? for help.
default>
default> CREATE DATABASE db_name; -- use your ID as DB name
default> \c db_name;
You are now connected to database "db_name" as user "tajo".
Using Tajo: Hands-On 2
4. Attach the S3 data
CREATE EXTERNAL TABLE orders (
  O_ORDERKEY bigint, O_CUSTKEY bigint, O_ORDERSTATUS text,
  O_TOTALPRICE double, O_ORDERDATE text, O_ORDERPRIORITY text,
  O_CLERK text, O_SHIPPRIORITY int, O_COMMENT text)
USING csv WITH ('csvfile.delimiter'='|')
LOCATION 's3n://taas-bucket-us-east-1-1594485745/tajo/sampleData/tpch-1g/orders';
SELECT * FROM orders LIMIT 10;
Data file under this location: s3n://taas-bucket-us-east-1-1594485745/tajo/sampleData/tpch-1g/orders/orders.tbl
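To see what the '|' delimiter and the nine declared columns mean for a single line of the data file, here is a small sketch in Python that parses one pipe-delimited row into typed fields. The sample row is made up for illustration and is not taken from the real data set; the trailing '|' at the end of the line is an assumption based on how TPC-H generated files are usually laid out.

```python
import csv
import io

# Column names and types follow the CREATE EXTERNAL TABLE statement above.
ORDERS_COLUMNS = [
    ("o_orderkey", int), ("o_custkey", int), ("o_orderstatus", str),
    ("o_totalprice", float), ("o_orderdate", str), ("o_orderpriority", str),
    ("o_clerk", str), ("o_shippriority", int), ("o_comment", str),
]


def parse_orders(lines):
    """Yield one dict per '|'-delimited line, casting each field to its column type."""
    for fields in csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE):
        if not fields:  # skip blank lines
            continue
        # zip stops after the 9 declared columns, so a trailing empty field is ignored.
        yield {name: cast(value) for (name, cast), value in zip(ORDERS_COLUMNS, fields)}


# Made-up sample row, for illustration only.
SAMPLE = "1|3691|O|194029.55|1992-01-02|5-LOW|Clerk#000000951|0|example comment|\n"
rows = list(parse_orders(io.StringIO(SAMPLE)))
print(rows[0]["o_custkey"], rows[0]["o_orderdate"], rows[0]["o_totalprice"])
```

The date stays a string here, matching the `text` type the table declares for O_ORDERDATE; that is why the later queries can take the month with a plain substring.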
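Analysis Example: Cohort Analysis (1)
•Cohort analysis
–Group customers who share a characteristic into cohorts
–Measure and compare each group's performance (retention rate, usage, customer value, etc.) over time
* Source: 하용호, "스타트업은 데이터를 어떻게 바라봐야 할까?" (How Should Startups Look at Data?) http://www.slideshare.net/yongho/ss-32267675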
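The idea can be sketched in a few lines of plain Python before any SQL is involved. The example below assigns each customer to the month of their first order and then reports, for each cohort, what share of its members were active in each later month. The event data and the helper names (`month_index`, `retention`) are made up for illustration.

```python
from collections import defaultdict

# Made-up (customer, order month) events, for illustration only.
EVENTS = [
    ("alice", "1992-01"), ("alice", "1992-02"), ("alice", "1992-03"),
    ("bob", "1992-01"), ("bob", "1992-03"),
    ("carol", "1992-02"), ("carol", "1992-03"),
    ("dave", "1992-02"),
]


def month_index(ym):
    """Turn 'YYYY-MM' into a running month number so months can be subtracted."""
    year, month = ym.split("-")
    return int(year) * 12 + int(month) - 1


def retention(events):
    """Return {cohort: {months since first order: share of the cohort active}}."""
    first = {}
    for customer, ym in events:
        first[customer] = min(first.get(customer, ym), ym)  # earliest month per customer
    active = defaultdict(lambda: defaultdict(set))
    for customer, ym in events:
        cohort = first[customer]
        offset = month_index(ym) - month_index(cohort)
        active[cohort][offset].add(customer)
    result = {}
    for cohort, by_offset in sorted(active.items()):
        size = len(by_offset[0])  # every member is active in the first month
        result[cohort] = {off: len(members) / size
                          for off, members in sorted(by_offset.items())}
    return result


for cohort, rates in retention(EVENTS).items():
    print(cohort, rates)
```

Comparing the rows side by side is the point of the method: each cohort is measured at the same age (months since first order) rather than at the same calendar date.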
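Analysis Example: Cohort Analysis (2)
•In this example
–Using the orders table of the TPC-H sample data
–Users who made their first purchase in a given month are grouped into a cohort
–Each group's later month-by-month repurchase pattern is compared
Cohort                        First-purchase month  1 month later  2 months later  3 months later  4 months later  5 months later  Total
January first-purchase group  151,292               151,330        150,063         149,407         149,510         152,193         903,795
February first-purchase group 150,624               153,407        151,847         148,187         149,797         -               753,862
March first-purchase group    150,328               152,783        149,548         154,045         -               -               606,704
April first-purchase group    151,178               149,859        148,542         -               -               -               449,579
May first-purchase group      152,174               150,412        -               -               -               -               302,586
June first-purchase group     151,265               -              -               -               -               -               151,265
Total                         151,292               301,954        453,798         605,215         749,278         906,254         3,167,791
Note: the bottom Total row holds calendar-month column totals (1992-01 through 1992-06); for example, 301,954 = 151,330 + 150,624.
Table: orders
Column        Description
o_orderkey    Order number
o_custkey     Customer number
o_totalprice  Order amount
o_orderdate   Order date
…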
Analysis Example: Cohort Analysis (3)
•Computing the cohorts
•Cohort definition: the group of users who made their first purchase in a given month
CREATE TABLE cohort AS
SELECT o_custkey, -- customer number
  min(o_orderdate) as cohort_date, -- first order date
  min(substr(o_orderdate,0,8)) as cohort -- cohort group (first-purchase month)
FROM orders
WHERE o_orderdate between '1992-01-01' and '1992-06-30'
GROUP BY o_custkey
ORDER BY o_custkey;
•Try running it in the Tajo Cloud SQL Workbench
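If you want to try the shape of this query before the cluster is up, the sketch below runs an equivalent statement against an in-memory SQLite database from Python. This is a stand-in, not Tajo: the order rows are made up, only four of the columns are created, and `substr(o_orderdate, 1, 7)` is used here to take the 'YYYY-MM' prefix.

```python
import sqlite3

# Made-up orders: (o_orderkey, o_custkey, o_totalprice, o_orderdate).
ORDERS = [
    (1, 10, 100.0, "1992-01-05"),
    (2, 10, 300.0, "1992-02-11"),
    (3, 20, 200.0, "1992-02-03"),
    (4, 20, 400.0, "1992-04-20"),
    (5, 30, 500.0, "1992-07-01"),  # outside the analysis window
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, "
             "o_totalprice REAL, o_orderdate TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", ORDERS)

# One row per customer: first order date and first-purchase month.
conn.execute("""
    CREATE TABLE cohort AS
    SELECT o_custkey,
           min(o_orderdate)               AS cohort_date,
           min(substr(o_orderdate, 1, 7)) AS cohort
    FROM orders
    WHERE o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
    GROUP BY o_custkey
""")

cohorts = conn.execute(
    "SELECT o_custkey, cohort_date, cohort FROM cohort ORDER BY o_custkey"
).fetchall()
for row in cohorts:
    print(row)
```

Customer 30 does not appear in the result because its only order falls after 1992-06-30; a customer whose first order predates the window would likewise be assigned to the first month seen inside the window.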
SQL Workbench 
•http://tajo-master-ip/ 
•Tajo Cloud cluster list > ACTION > SQL Workbench
•Tip. The settings menu includes a feature for loading the sample data (TPC-H 1G)
Analysis Example: Cohort Analysis (4)
•Compute each cohort's monthly repurchases
-- cohort, order month, number of buyers, number of orders, total order amount, average order amount
CREATE TABLE cohort_analysis AS
SELECT c.cohort,
  substr(o.o_orderdate,0,8) as order_month,
  count(distinct(o.o_custkey)) as buyer_cnt,
  count(o.o_orderkey) as order_cnt,
  round(sum(o.o_totalprice)) as amount,
  round(avg(o.o_totalprice)) as avg_amount
FROM orders o JOIN cohort c ON o.o_custkey = c.o_custkey
WHERE o.o_orderdate between '1992-01-01' and '1992-06-30'
GROUP BY c.cohort, substr(o.o_orderdate,0,8)
ORDER BY c.cohort, substr(o.o_orderdate,0,8) ASC;
•Check the query's execution status in the Tajo admin UI
Tajo Admin UI
•http://tajo-master-ip:26080/
•Tajo Cloud cluster list > ACTION > Tajo Master
•Tip. Port 26080 must be opened in the security group settings
Analysis Example: Cohort Analysis (5)
•Combined into a single statement with a subquery:
-- cohort, order month, average order amount
CREATE TABLE cohort_analysis AS
SELECT c.cohort,
  substr(o.o_orderdate,0,8) as order_month,
  round(avg(o.o_totalprice)) as avg_amount
FROM orders o JOIN (
  SELECT o_custkey,
    min(o_orderdate) as cohort_date,
    min(substr(o_orderdate,0,8)) as cohort
  FROM orders
  WHERE o_orderdate between '1992-01-01' and '1992-06-30'
  GROUP BY o_custkey) c
ON o.o_custkey = c.o_custkey
WHERE o.o_orderdate between '1992-01-01' and '1992-06-30'
GROUP BY c.cohort, substr(o.o_orderdate,0,8)
ORDER BY c.cohort, substr(o.o_orderdate,0,8) ASC;
•Run it remotely from an external SQL tool through Tajo Connector
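The subquery form can also be tried locally. The sketch below runs it against an in-memory SQLite database from Python, keeping the buyer and order counts from the earlier query alongside the average. SQLite stands in for Tajo here, the order rows are made up, and `substr(..., 1, 7)` is used to take the 'YYYY-MM' prefix.

```python
import sqlite3

# Made-up orders: (o_orderkey, o_custkey, o_totalprice, o_orderdate).
ORDERS = [
    (1, 10, 100.0, "1992-01-05"),
    (2, 10, 300.0, "1992-02-11"),
    (3, 20, 200.0, "1992-02-03"),
    (4, 20, 400.0, "1992-02-20"),
    (5, 30, 600.0, "1992-01-09"),
    (6, 30, 500.0, "1992-02-14"),
]

QUERY = """
SELECT c.cohort,
       substr(o.o_orderdate, 1, 7) AS order_month,
       count(DISTINCT o.o_custkey) AS buyer_cnt,
       count(o.o_orderkey)         AS order_cnt,
       round(sum(o.o_totalprice))  AS amount,
       round(avg(o.o_totalprice))  AS avg_amount
FROM orders o
JOIN (SELECT o_custkey, min(substr(o_orderdate, 1, 7)) AS cohort
      FROM orders
      WHERE o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
      GROUP BY o_custkey) c
  ON o.o_custkey = c.o_custkey
WHERE o.o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
GROUP BY c.cohort, substr(o.o_orderdate, 1, 7)
ORDER BY c.cohort, substr(o.o_orderdate, 1, 7)
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, "
             "o_totalprice REAL, o_orderdate TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", ORDERS)

result = conn.execute(QUERY).fetchall()
for row in result:
    print(row)
```

Each output row is one (cohort, order month) cell of the cohort table: customers 10 and 30 form the 1992-01 cohort and show up again in 1992-02, while customer 20 starts the 1992-02 cohort.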
Remote Connection with Tajo Connector (Demo)
•Tools that support a custom JDBC driver (SQuirreL SQL, DbVisualizer, etc.)
•Connect through the proxy server included in Tajo Cloud (port 26992 must be open)
•jdbc:taas-tajo://tajo_master_node_ip:26992/db_name
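Analysis Example: Cohort Analysis (5)
Query result pivoted by calendar month:
Cohort   1992-01  1992-02  1992-03  1992-04  1992-05  1992-06  Total
1992-01  151,292  151,330  150,063  149,407  149,510  152,193  903,795
1992-02  -        150,624  153,407  151,847  148,187  149,797  753,862
1992-03  -        -        150,328  152,783  149,548  154,045  606,704
1992-04  -        -        -        151,178  149,859  148,542  449,579
1992-05  -        -        -        -        152,174  150,412  302,586
1992-06  -        -        -        -        -        151,265  151,265
Total    151,292  301,954  453,798  605,215  749,278  906,254  3,167,791
The same values realigned by months since first purchase:
Cohort                        First-purchase month  1 month later  2 months later  3 months later  4 months later  5 months later  Total
January first-purchase group  151,292               151,330        150,063         149,407         149,510         152,193         903,795
February first-purchase group 150,624               153,407        151,847         148,187         149,797         -               753,862
March first-purchase group    150,328               152,783        149,548         154,045         -               -               606,704
April first-purchase group    151,178               149,859        148,542         -               -               -               449,579
May first-purchase group      152,174               150,412        -               -               -               -               302,586
June first-purchase group     151,265               -              -               -               -               -               151,265
Legend: first purchase, repurchase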
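The step from the calendar-month layout to the "months since first purchase" layout is a re-keying of each row by the distance between the order month and the cohort month. A minimal Python sketch, using the values from the calendar-month table (the helper names `months_between` and `realign` are hypothetical):

```python
# Calendar-month values from the table above: {cohort: {order month: value}}.
PIVOT = {
    "1992-01": {"1992-01": 151292, "1992-02": 151330, "1992-03": 150063,
                "1992-04": 149407, "1992-05": 149510, "1992-06": 152193},
    "1992-02": {"1992-02": 150624, "1992-03": 153407, "1992-04": 151847,
                "1992-05": 148187, "1992-06": 149797},
    "1992-03": {"1992-03": 150328, "1992-04": 152783, "1992-05": 149548,
                "1992-06": 154045},
    "1992-04": {"1992-04": 151178, "1992-05": 149859, "1992-06": 148542},
    "1992-05": {"1992-05": 152174, "1992-06": 150412},
    "1992-06": {"1992-06": 151265},
}


def months_between(start, end):
    """Number of whole months from 'YYYY-MM' start to 'YYYY-MM' end."""
    sy, sm = map(int, start.split("-"))
    ey, em = map(int, end.split("-"))
    return (ey - sy) * 12 + (em - sm)


def realign(pivot):
    """Re-key each cohort's values by months since first purchase (0 = first month)."""
    return {cohort: {months_between(cohort, month): value
                     for month, value in by_month.items()}
            for cohort, by_month in pivot.items()}


aligned = realign(PIVOT)
for cohort, by_offset in aligned.items():
    print(cohort, [by_offset[k] for k in sorted(by_offset)])
```

After realignment, offset 0 holds each cohort's first-purchase month and offsets 1 and up hold the repurchase months, which is what makes the rows directly comparable.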
Analysis Example: Cohort Analysis (6)
[Chart: cohort table values plotted by months since first purchase (first-purchase month, then 1 to 5 months later) for the January, February, and March first-purchase groups; y-axis from 144,000 to 156,000]
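Wrap-up
•With the Tajo Cloud service you can easily create your own Tajo cluster on AWS.
•Tajo can directly access data stored in S3.
•The number of worker nodes in a cluster can be adjusted dynamically.
•Tajo Connector lets you integrate with external tools such as SQL clients, Excel, R, and BI tools.
•For more details, see the guide documents at taas.gruter.com
•Most important of all! Be sure to terminate your cluster when you are done.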
Q&A 
Reference 
•Apache Tajo Project Home: http://tajo.apache.org 
•Tajo Cloud Site : http://taas.gruter.com 
•Tajo Cloud User Guide: https://s3-us-west-2.amazonaws.com/tajo/taas/documents/TaaS_UserGuide.pdf
•Tajo SQL Language Reference: http://tajo.apache.org/docs/current/index.html 
•AWS Getting Started: http://aws.amazon.com/ko/documentation/gettingstarted/ 
GRUTER: YOUR PARTNER 
IN THE BIG DATA REVOLUTION 
Phone +82-70-8129-2950 
Fax +82-70-8129-2952
E-mail contact@gruter.com
Web www.gruter.com
Phone +1-415-841-3345

Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Subhajit Sahu
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
vcaxypu
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
NABLAS株式会社
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
haila53
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Linda486226
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
James Polillo
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
alex933524
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Boston Institute of Analytics
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
ewymefz
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 

Recently uploaded (20)

一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单一比一原版(TWU毕业证)西三一大学毕业证成绩单
一比一原版(TWU毕业证)西三一大学毕业证成绩单
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdfCh03-Managing the Object-Oriented Information Systems Project a.pdf
Ch03-Managing the Object-Oriented Information Systems Project a.pdf
 
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdfSample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
Sample_Global Non-invasive Prenatal Testing (NIPT) Market, 2019-2030.pdf
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
Jpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization SampleJpolillo Amazon PPC - Bid Optimization Sample
Jpolillo Amazon PPC - Bid Optimization Sample
 
Tabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflowsTabula.io Cheatsheet: automate your data workflows
Tabula.io Cheatsheet: automate your data workflows
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project PresentationPredicting Product Ad Campaign Performance: A Data Analysis Project Presentation
Predicting Product Ad Campaign Performance: A Data Analysis Project Presentation
 
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
一比一原版(UofM毕业证)明尼苏达大学毕业证成绩单
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 

Gruter_TECHDAY_2014_04_TajoCloudHandsOn (in Korean)

  • 2. Hands-on overview
    We set up a Tajo cluster in the AWS cloud and run data analysis ourselves with Tajo queries.
    1. Set up a Tajo cluster with Tajo Cloud
    2. Attach data in S3 as an external table
    3. Run remote queries through Tajo Connector
    4. A simple cohort analysis example
    • Hands-on guide page: http://techday.gruter.com:2014/
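  • 3. What is Tajo?
    • Tajo
      – A Hadoop-based data warehouse system for large data sets
      – Development started in 2010 as a research prototype
      – An Apache top-level project
    • Features
      – SQL standard compliant
      – The entire query is processed in a distributed manner
      – HDFS is the default storage
      – Relational model (an extension to a nested model is under discussion)
      – Supports low-latency queries as well as long-running ETL jobs
    • Tajo on AWS
      – Runs without Hadoop (it can also run on EMR)
      – Accesses data stored in S3 directly (no need to copy it to a local HDFS)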
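  • 4. Setting up a Tajo cluster on AWS
    • You can download Tajo and install it yourself
    • Using Gruter's Tajo Cloud AMI is easier
      – Ships with the latest version of Tajo
      – Setup finishes in a few clicks, with no complex configuration
      – Optimized for the AWS environment
      – Additional features developed by Gruter (Tajo Proxy, Tajo Connector, SQL Workbench)
      – Technical support from Gruter
      – A launch on AWS Marketplace is being prepared
    • Cluster setup with the Tajo Cloud AMI
      – Method 1: the AWS Web Console
      – Method 2: the AWS CLI (command line) or API
      – Method 3: Gruter's Tajo Cloud service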
  • 5. Tajo Cloud on AWS
  • 6. Cluster setup with Tajo Cloud (hands-on)
    1. Go to http://taas.gruter.com/
    2. Enter your AWS account (use your personal account for the hands-on)
    3. Configure the cluster
    4. Cluster startup completes
    Settings used in the hands-on:
      Region: US East (Virginia)
      Instances: c3.xlarge, 2 worker nodes
      Keypair: demo2014
  • 7. Tajo cluster setup: Security Group settings
    • AWS Console > EC2 > Security Group
    • "taas" security group > Inbound
    Protocol  Port   Source
    tcp       22     IPs allowed to connect over SSH
    tcp       80     IPs allowed to reach the Web Workbench
    tcp       26080  IPs that will access the Tajo admin UI
    tcp       26992  IPs that will connect to Tajo Connector
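Once the inbound rules are saved, a quick reachability check from your own machine can confirm that they took effect. The following is a minimal sketch, not part of Tajo Cloud: the function name `check_ports` and the timeout value are assumptions made for illustration, and the port list is copied from the security group table above.

```python
import socket

# Ports and their purposes, copied from the security group table above.
TAJO_CLOUD_PORTS = {
    22: "SSH",
    80: "Web Workbench",
    26080: "Tajo admin UI",
    26992: "Tajo Connector",
}


def check_ports(host, ports=TAJO_CLOUD_PORTS, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds.

    `host` is the master node address. A False result can mean the security
    group blocks your IP, or simply that the service is not running yet.
    """
    results = {}
    for port in ports:
        try:
            # create_connection performs the TCP handshake and raises OSError
            # (including timeouts and refused connections) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results
```

Calling `check_ports("your-master-node-ip")` returns one boolean per port; the host string here is the same placeholder the slides use, not a real address.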
  • 8. Tajo hands-on 1
    1. Connect to the Tajo master node over SSH
    2. Start TSQL (the Tajo interactive shell)
    3. Create your own database
    $ ssh -i demo2014.pem ec2-user@your-master-node-ip
    $ sudo su - tajo
    $ /home/tajo/tajo/bin/tsql
    Try \? for help.
    default>
    default> CREATE DATABASE db_name; -- use your ID as DB name
    default> \c db_name;
    You are now connected to database "db_name" as user "tajo".
  • 9. Tajo hands-on 2
    3. Connect the S3 data
    CREATE EXTERNAL TABLE orders (
      O_ORDERKEY bigint,
      O_CUSTKEY bigint,
      O_ORDERSTATUS text,
      O_TOTALPRICE double,
      O_ORDERDATE text,
      O_ORDERPRIORITY text,
      O_CLERK text,
      O_SHIPPRIORITY int,
      O_COMMENT text)
    USING csv WITH ('csvfile.delimiter'='|')
    LOCATION 's3n://taas-bucket-us-east-1-1594485745/tajo/sampleData/tpch-1g/orders';
    SELECT * FROM orders LIMIT 10;
    Data file: s3n://taas-bucket-us-east-1-1594485745/tajo/sampleData/tpch-1g/orders/orders.tbl
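To see what the `'csvfile.delimiter'='|'` option implies about the file layout, the sketch below parses one pipe-delimited line into the nine columns declared above. It is only an illustration in Python: the sample line is made up (not taken from `orders.tbl`), the function name `parse_orders` is hypothetical, and the trailing `|` reflects how TPC-H `.tbl` files are usually generated.

```python
import csv
import io

# Column names in the order declared by the CREATE EXTERNAL TABLE statement.
COLUMNS = [
    "o_orderkey", "o_custkey", "o_orderstatus", "o_totalprice", "o_orderdate",
    "o_orderpriority", "o_clerk", "o_shippriority", "o_comment",
]

# Hypothetical sample line, for illustration only.
SAMPLE = "1|36901|O|173665.47|1996-01-02|5-LOW|Clerk#000000951|0|sample comment|\n"


def parse_orders(text):
    """Parse pipe-delimited order lines into dicts with typed numeric fields."""
    reader = csv.reader(io.StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    records = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        # zip() stops at the ninth column, so the empty field produced by
        # the trailing '|' is dropped.
        record = dict(zip(COLUMNS, fields))
        record["o_orderkey"] = int(record["o_orderkey"])
        record["o_custkey"] = int(record["o_custkey"])
        record["o_totalprice"] = float(record["o_totalprice"])
        record["o_shippriority"] = int(record["o_shippriority"])
        records.append(record)
    return records


orders = parse_orders(SAMPLE)
print(orders[0]["o_orderdate"], orders[0]["o_totalprice"])
```

Note that `O_ORDERDATE` is declared as `text` in the table, which is why the later cohort queries work on it with string functions.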
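  • 10. Analysis example: Cohort Analysis (1)
    • Cohort analysis
      – Group customers who share a characteristic into cohorts
      – Measure and compare each group's results (retention rate, usage, customer value, etc.) over time
    * Source: Ha Yong-ho, "스타트업은 데이터를 어떻게 바라봐야 할까?" ("How should a startup look at data?") http://www.slideshare.net/yongho/ss-32267675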
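  • 11. Analysis example: Cohort Analysis (2)
    • In this example we
      – take the orders table of the TPC-H sample data,
      – group users who made their first purchase in a given month into a cohort, and
      – compare each group's month-by-month repurchase pattern afterwards
    Cohort                      First month  +1 month  +2 months  +3 months  +4 months  +5 months  Total
    Jan. first-purchase group   151,292      151,330   150,063    149,407    149,510    152,193    903,795
    Feb. first-purchase group   150,624      153,407   151,847    148,187    149,797               753,862
    Mar. first-purchase group   150,328      152,783   149,548    154,045                          606,704
    Apr. first-purchase group   151,178      149,859   148,542                                     449,579
    May first-purchase group    152,174      150,412                                               302,586
    Jun. first-purchase group   151,265                                                            151,265
    Total                       151,292      301,954   453,798    605,215    749,278    906,254    3,167,791
    Table: orders
    Column        Description
    o_orderkey    order number
    o_custkey     customer number
    o_totalprice  order amount
    o_orderdate   order date
    …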
  • 12. Analysis example: Cohort Analysis (3)
    • Computing the cohorts
    • Cohort definition: the group of users who made their first purchase in a given month
    CREATE TABLE cohort AS
    SELECT o_custkey,                              -- customer number
           min(o_orderdate) as cohort_date,        -- first order date
           min(substr(o_orderdate,0,8)) as cohort  -- cohort group
    FROM orders
    WHERE o_orderdate between '1992-01-01' and '1992-06-30'
    GROUP BY o_custkey
    ORDER BY o_custkey;
    • Try running it in the SQL Workbench of Tajo Cloud
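The cohort query can be tried on a handful of rows before running it against the full table. The sketch below uses Python's built-in SQLite as a stand-in for Tajo, so it only illustrates the logic: the sample orders are invented, the helper name `build_cohorts` is hypothetical, and the month is extracted with `substr(o_orderdate, 1, 7)` rather than the slide's `substr(o_orderdate,0,8)` because substring indexing can differ between engines.

```python
import sqlite3

# Invented sample rows: (o_orderkey, o_custkey, o_totalprice, o_orderdate)
SAMPLE_ORDERS = [
    (1, 100, 120.0, "1992-01-15"),
    (2, 100, 80.0, "1992-03-02"),
    (3, 200, 50.0, "1992-02-10"),
    (4, 200, 70.0, "1992-02-25"),
    (5, 300, 10.0, "1992-07-01"),  # outside the analysis window
]


def build_cohorts(rows):
    """Return (o_custkey, cohort_date, cohort) for each customer in the window."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, "
        "o_totalprice REAL, o_orderdate TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT o_custkey,
               min(o_orderdate) AS cohort_date,            -- first order date
               min(substr(o_orderdate, 1, 7)) AS cohort    -- first order month
        FROM orders
        WHERE o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
        GROUP BY o_custkey
        ORDER BY o_custkey
        """
    ).fetchall()
    conn.close()
    return result


cohorts = build_cohorts(SAMPLE_ORDERS)
for row in cohorts:
    print(row)
```

Because the dates are ISO-formatted strings, `min()` and `BETWEEN` on text give the same result as they would on real dates. Customer 300 does not appear in the output because its only order falls outside the window.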
  • 13. SQL Workbench
    • http://tajo-master-ip/
    • Tajo Cloud cluster list > ACTION > SQL Workbench
    • Tip: the settings menu includes a function for loading the sample data (TPC-H 1G)
  • 14. Analysis example: Cohort Analysis (4)
    • Computing monthly repurchases for each cohort
    -- cohort, order month, number of buyers, number of orders, total order amount, average order amount
    CREATE TABLE cohort_analysis AS
    SELECT c.cohort,
           substr(o_orderdate,0,8) as order_month,
           count(distinct(o.o_custkey)) as buyer_cnt,
           count(o.o_orderkey) as order_cnt,
           round(sum(o.o_totalprice)) as amount,
           round(avg(o.o_totalprice)) as avg_amount
    FROM orders o JOIN cohort c ON o.o_custkey = c.o_custkey
    WHERE o.o_orderdate between '1992-01-01' and '1992-06-30'
    GROUP BY c.cohort, substr(o_orderdate,0,8)
    ORDER BY c.cohort, substr(o_orderdate,0,8) ASC
    • Check the progress of the query in the Tajo admin UI
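The join between orders and the cohort table is easier to follow on a tiny data set. This is a sketch under the same assumptions as before: SQLite stands in for Tajo, the four orders are invented, `monthly_stats` is a hypothetical helper, and `substr(..., 1, 7)` replaces the slide's `substr(...,0,8)` to avoid relying on engine-specific indexing. Rounding is omitted because the sample amounts are already whole numbers.

```python
import sqlite3

# Invented sample rows: (o_orderkey, o_custkey, o_totalprice, o_orderdate)
ORDERS = [
    (1, 100, 100.0, "1992-01-15"),
    (2, 100, 300.0, "1992-02-20"),
    (3, 200, 200.0, "1992-02-10"),
    (4, 200, 400.0, "1992-02-25"),
]


def monthly_stats(rows):
    """Return (cohort, order_month, buyer_cnt, order_cnt, amount, avg_amount)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (o_orderkey INTEGER, o_custkey INTEGER, "
        "o_totalprice REAL, o_orderdate TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", rows)
    # Step 1: one row per customer with the month of the first purchase.
    conn.execute(
        """
        CREATE TABLE cohort AS
        SELECT o_custkey, min(substr(o_orderdate, 1, 7)) AS cohort
        FROM orders
        WHERE o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
        GROUP BY o_custkey
        """
    )
    # Step 2: join every order to its customer's cohort, then aggregate
    # per (cohort, order month).
    result = conn.execute(
        """
        SELECT c.cohort,
               substr(o.o_orderdate, 1, 7) AS order_month,
               count(DISTINCT o.o_custkey) AS buyer_cnt,
               count(o.o_orderkey) AS order_cnt,
               sum(o.o_totalprice) AS amount,
               avg(o.o_totalprice) AS avg_amount
        FROM orders o JOIN cohort c ON o.o_custkey = c.o_custkey
        WHERE o.o_orderdate BETWEEN '1992-01-01' AND '1992-06-30'
        GROUP BY c.cohort, substr(o.o_orderdate, 1, 7)
        ORDER BY c.cohort, order_month
        """
    ).fetchall()
    conn.close()
    return result


stats = monthly_stats(ORDERS)
for row in stats:
    print(row)
```

Customer 100 belongs to the 1992-01 cohort, so its February order is counted under cohort 1992-01 and order month 1992-02, which is exactly the "repurchase" cell of the cohort table.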
  • 15. Tajo admin UI
    • http://tajo-master-ip:26080/
    • Tajo Cloud cluster list > ACTION > Tajo Master
    • Tip: port 26080 must be opened in the Security Group settings
  • 16. Analysis example: Cohort Analysis (5)
    • The two steps combined into one statement with a subquery:
    -- cohort, order month, average order amount
    CREATE TABLE cohort_analysis AS
    SELECT c.cohort,
           substr(o_orderdate,0,8) as order_month,
           round(avg(o.o_totalprice)) as avg_amount
    FROM orders o JOIN (
           SELECT o_custkey,
                  min(o_orderdate) as cohort_date,
                  min(substr(o_orderdate,0,8)) as cohort
           FROM orders
           WHERE o_orderdate between '1992-01-01' and '1992-06-30'
           GROUP BY o_custkey) c
      ON o.o_custkey = c.o_custkey
    WHERE o.o_orderdate between '1992-01-01' and '1992-06-30'
    GROUP BY c.cohort, substr(o_orderdate,0,8)
    ORDER BY c.cohort, substr(o_orderdate,0,8) ASC
    • Try running it remotely from an external SQL tool through Tajo Connector
  • 17. Remote connection through Tajo Connector (demo)
    • Works with tools that support a custom JDBC driver (SQuirreL SQL, DbVisualizer, etc.)
    • Connects through the Proxy server included in Tajo Cloud (port 26992 must be open)
    • jdbc:taas-tajo://tajo_master_node_ip:26992/db_name
  • 19. Analysis example: Cohort Analysis (5)
    By calendar month:
    Cohort    1992-01  1992-02  1992-03  1992-04  1992-05  1992-06  Total
    1992-01   151,292  151,330  150,063  149,407  149,510  152,193  903,795
    1992-02            150,624  153,407  151,847  148,187  149,797  753,862
    1992-03                     150,328  152,783  149,548  154,045  606,704
    1992-04                              151,178  149,859  148,542  449,579
    1992-05                                       152,174  150,412  302,586
    1992-06                                                151,265  151,265
    Total     151,292  301,954  453,798  605,215  749,278  906,254  3,167,791
    By months since first purchase (first column: first purchase; remaining columns: repurchase):
    Cohort                      First month  +1 month  +2 months  +3 months  +4 months  +5 months  Total
    Jan. first-purchase group   151,292      151,330   150,063    149,407    149,510    152,193    903,795
    Feb. first-purchase group   150,624      153,407   151,847    148,187    149,797               753,862
    Mar. first-purchase group   150,328      152,783   149,548    154,045                          606,704
    Apr. first-purchase group   151,178      149,859   148,542                                     449,579
    May first-purchase group    152,174      150,412                                               302,586
    Jun. first-purchase group   151,265                                                            151,265
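Turning the query result (one row per cohort and order month) into the second table is a matter of re-indexing each value by its distance in months from the cohort month. A minimal Python sketch of that step follows; the function names are hypothetical, and the four input rows are the first two cells of the 1992-01 and 1992-02 cohorts from the tables above.

```python
def month_index(year_month):
    """Map 'YYYY-MM' to a running month count so months can be subtracted."""
    year, month = year_month.split("-")
    return int(year) * 12 + int(month) - 1


def pivot_by_offset(rows):
    """Turn (cohort, order_month, value) rows into {cohort: [v0, v1, ...]}.

    Index 0 holds the first-purchase month, index 1 the month after, and so
    on. Months with no row are filled with None.
    """
    cells = {}
    for cohort, order_month, value in rows:
        offset = month_index(order_month) - month_index(cohort)
        cells.setdefault(cohort, {})[offset] = value
    return {
        cohort: [by_offset.get(i) for i in range(max(by_offset) + 1)]
        for cohort, by_offset in sorted(cells.items())
    }


# Values taken from the 1992-01 and 1992-02 rows of the tables above.
rows = [
    ("1992-01", "1992-01", 151292),
    ("1992-01", "1992-02", 151330),
    ("1992-02", "1992-02", 150624),
    ("1992-02", "1992-03", 153407),
]
table = pivot_by_offset(rows)
print(table)
```

Aligning cohorts on the offset rather than the calendar month is what makes the rows directly comparable, which is the point of the second layout.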
  • 20. Analysis example: Cohort Analysis (6)
    [Chart: value per cohort by months since first purchase (first month through +5 months) for the Jan., Feb., and Mar. first-purchase groups; y-axis from 144,000 to 156,000]
    Cohort                      First month  +1 month  +2 months  +3 months  +4 months  +5 months  Total
    Jan. first-purchase group   151,292      151,330   150,063    149,407    149,510    152,193    903,795
    Feb. first-purchase group   150,624      153,407   151,847    148,187    149,797               753,862
    Mar. first-purchase group   150,328      152,783   149,548    154,045                          606,704
    Apr. first-purchase group   151,178      149,859   148,542                                     449,579
    May first-purchase group    152,174      150,412                                               302,586
    Jun. first-purchase group   151,265                                                            151,265
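  • 21. Wrap-up
    • With the Tajo Cloud service you can easily create your own Tajo cluster on AWS.
    • Tajo can access data stored in S3 directly.
    • The number of worker nodes in a cluster can be adjusted dynamically.
    • Tajo Connector lets you integrate with external tools such as SQL clients, Excel, R, and BI tools.
    • For more details, see the guide documents at taas.gruter.com.
    • Most important of all: shut down your cluster when you are done.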
  • 22. Q&A
  • 23. Reference
    • Apache Tajo Project Home: http://tajo.apache.org
    • Tajo Cloud Site: http://taas.gruter.com
    • Tajo Cloud User Guide: https://s3-us-west-2.amazonaws.com/tajo/taas/documents/TaaS_UserGuide.pdf
    • Tajo SQL Language Reference: http://tajo.apache.org/docs/current/index.html
    • AWS Getting Started: http://aws.amazon.com/ko/documentation/gettingstarted/
  • 24. GRUTER: YOUR PARTNER IN THE BIG DATA REVOLUTION
    Phone +82-70-8129-2950
    Fax +82-70-8129-2952
    E-mail contact@gruter.com
    Web www.gruter.com
    Phone +1-415-841-3345