1. [SSA] Big Data Analytics
SQL on Hadoop
(A New Generation of Analytic Databases)
Hyounggi Min
hg.min@samsung.com
2014. 2. 5.
2. Contents
I. Big Data Analytics
II. Approaches to Big Data Analytics
III. Analytic Database at Google
IV. Types of SQL on Hadoop
I. SQL to MapReduce
II. SQL natively on Hadoop
III. Split SQL Execution
8. Approaches to Big Data Analytics
Traditional Analytic Database
MapReduce
SQL on Hadoop
9. Motivation
Large-scale Internet companies
– e.g., Amazon, Google, Yahoo, Facebook, Twitter, Microsoft, Netflix
Collect terabytes of data every day
Need ad-hoc analysis to turn that data into information
– Must be analyzed alongside historical trends
– e.g., an engineer runs a trend analysis on search logs to improve the search engine's ranking algorithm
11. Solution 1 - Analytic Database Problems
Cost
– Traditional analytic databases have too high a TCO
– e.g., per-CPU/core pricing in particular is too costly for an Internet company to run
Scalability / availability
– Commercial databases fall short on scalability and availability
– Scalability and cost constraints force storing only pre-processed data
– At web scale, availability, scalability, and elasticity matter more than ACID
Architecture
– Traditional analytic databases need separate storage for log analysis
Vendors
– Teradata, IBM's Netezza, Greenplum, Oracle Exadata, HP Vertica, Sybase IQ, etc.
Source: http://sydney.edu.au/engineering/it/~zhouy/info5011/doc/08_DataAnalytics.pdf
12. Solution 2 - MapReduce Problems
Difficulty of MapReduce development
– MapReduce is hard, takes a lot of development effort, and makes performance hard to guarantee
– Performance varies with developer skill
– Productivity is much lower than with existing SQL implementations
Missing features: schemas, an optimizer, indexes, views, etc.
Limitations of the data processing model (not designed for relational processing)
– Fixed data flow
– Shuffle: merge sort / hashing / merge sort
– Overhead of exchanging data between jobs
Source: http://www.slideshare.net/Hadoop_Summit/steinbach-june26-405pmroom230av2
13. Solution 3 - SQL on Hadoop
Refers to a new generation of Hadoop-based analytic engines
Standard SQL support
– Easy to integrate with or replace existing systems
– Gentler learning curve than MapReduce
High processing performance
– Distributed processing frameworks that overcome the limits of MapReduce
– Processing models that make better use of CPU and memory
Products
– Hive (Stinger), Impala, Presto, Drill, Tajo, Hadapt, etc.
Source: http://www.slideshare.net/deview/deview2013-tajo
16. Google Dremel System
Trillion-record, multi-terabyte datasets at interactive speed
– Scales to thousands of nodes
– Fault and straggler tolerant execution
Nested data model
– Complex datasets; normalization is prohibitive
– Columnar storage and processing
Tree architecture (as in web search)
Interoperates with Google's data management tools
– In situ data access (e.g., GFS, Bigtable)
– MapReduce pipelines
Source: Dremel: Interactive Analysis of Web-Scale Datasets. VLDB'10
17. Google Dremel - Widely used inside Google
Analysis of crawled web documents
Tracking install data for applications on Android Market
Crash reporting for Google products
OCR results from Google Books
Spam analysis
Debugging of map tiles on Google Maps
Tablet migrations in managed Bigtable instances
Results of tests run on Google's distributed build system
Disk I/O statistics for hundreds of thousands of disks
Resource monitoring for jobs run in Google's data centers
Symbols and dependencies in Google's codebase
18. Google Dremel - Storage Format
Sample record r1:
DocId: 10
Links
  Forward: 20
Name
  Language
    Code: 'en-us'
    Country: 'us'
  Url: 'http://A'
Name
  Url: 'http://B'
[Figure: record-oriented vs. column-striped layout of records r1, r2 over fields A-E; reading a subset of columns means reading less data and paying a cheaper decompression cost]
Challenge: preserve structure, reconstruct from a subset of fields
Source: Dremel: Interactive Analysis of Web-Scale Datasets. VLDB'10
19. Google Dremel - Nested Data Model
[Figure: a schema and records annotated with repetition and definition levels]
r (repetition level): at which repeated field in the field's path the value has repeated
d (definition level): how many fields in the path that could be undefined (optional or repeated) are actually present
Source: Dremel: Interactive Analysis of Web-Scale Datasets. VLDB'10
20. Google Dremel - Column-striped representation

DocId           Name.Url            Links.Forward   Links.Backward
value r d       value     r d       value r d       value r d
10    0 0       http://A  0 2       20    0 2       NULL  0 1
20    0 0       http://B  1 2       40    1 2       10    0 2
                NULL      1 1       60    1 2       30    1 2
                http://C  0 2       80    0 2

Name.Language.Code      Name.Language.Country
value r d               value r d
en-us 0 2               us    0 3
en    2 2               NULL  2 2
NULL  1 1               NULL  1 1
en-gb 1 2               gb    1 3
NULL  0 1               NULL  0 1
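The r and d values in the column stripes above can be reproduced with a short sketch. This is an illustrative reimplementation of the striping idea, not Google's code; the record dictionaries and field modes below are transcribed from the paper's sample documents.

```python
def stripe(records, path, modes):
    """Stripe one field path of nested records into (value, r, d) triples.

    modes[i] is 'repeated', 'optional', or 'required' for path[i];
    r = repetition level, d = definition level, as in the Dremel paper.
    """
    # Repetition level contributed by each prefix of the path.
    rep_level, c = [], 0
    for m in modes:
        c += (m == 'repeated')
        rep_level.append(c)

    out = []

    def walk(node, i, r, d):
        if i == len(path):                      # reached the leaf value
            out.append((node, r, d))
            return
        child = node.get(path[i])
        if child is None:                       # field absent: emit NULL
            out.append((None, r, d))
            return
        d2 = d + (modes[i] != 'required')       # present non-required field
        if modes[i] == 'repeated':
            for j, item in enumerate(child):
                # Values after the first repeat at this field's level.
                walk(item, i + 1, r if j == 0 else rep_level[i], d2)
        else:
            walk(child, i + 1, r, d2)

    for rec in records:
        walk(rec, 0, 0, 0)                      # each record starts at r = 0
    return out

# Sample documents r1, r2 (Name.Language.Country only).
r1 = {'Name': [{'Language': [{'Country': 'us'}, {}]}, {},
               {'Language': [{'Country': 'gb'}]}]}
r2 = {'Name': [{}]}
col = stripe([r1, r2], ['Name', 'Language', 'Country'],
             ['repeated', 'repeated', 'optional'])
# col == [('us',0,3), (None,2,2), (None,1,1), ('gb',1,3), (None,0,1)]
```

The output matches the Name.Language.Country column in the table above.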
21. Google Dremel - Record assembly FSM
[Figure: finite state machine whose states are the fields DocId, Links.Backward, Links.Forward, Name.Language.Code, Name.Language.Country, and Name.Url, with transitions labeled with repetition levels (0, 1, 2)]
For record-oriented data processing (e.g., MapReduce)
22. Google Dremel - Reading two fields
[Figure: reduced FSM for reading only DocId and Name.Language.Country — DocId moves to Name.Language.Country on repetition level 0; Country loops on levels 1 and 2 and ends on 0]
Assembled records:
s1
  DocId: 10
  Name
    Language
      Country: 'us'
    Language
  Name
  Name
    Language
      Country: 'gb'
s2
  DocId: 20
  Name
Structure of parent fields is preserved.
Useful for queries like /Name[3]/Language[1]/Country
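One property worth making explicit: a repetition level of 0 occurs only at the first value of a record, so a single column stripe can be cut back into per-record groups without reading any other column. A minimal sketch, reusing the Name.Language.Country stripe from the earlier slide:

```python
def split_by_record(column):
    """Group the (value, r, d) triples of one column stripe by record.

    A repetition level of 0 marks the first value of a new record.
    """
    records = []
    for value, r, d in column:
        if r == 0:                # a new record begins here
            records.append([])
        records[-1].append((value, r, d))
    return records

country = [('us', 0, 3), (None, 2, 2), (None, 1, 1), ('gb', 1, 3), (None, 0, 1)]
groups = split_by_record(country)
# groups[0] holds the four s1 values, groups[1] the single s2 value
```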
23. Google Dremel - SQL dialect for nested data
SELECT DocId AS Id,
  COUNT(Name.Language.Code) WITHIN Name AS Cnt,
  Name.Url + ',' + Name.Language.Code AS Str
FROM t
WHERE REGEXP(Name.Url, '^http') AND DocId < 20;
Output table t1:
Id: 10
Name
  Cnt: 2
  Language
    Str: 'http://A,en-us'
    Str: 'http://A,en'
Name
  Cnt: 0
Output schema:
message QueryResult {
  required int64 Id;
  repeated group Name {
    optional uint64 Cnt;
    repeated group Language {
      optional string Str;
    }
  }
}
24. Google Dremel - Serving Tree
[Dean WSDM'09]
[Figure: multi-level serving tree between the client and the leaf servers, with a histogram of response times]
• Parallelizes scheduling and aggregation
• Fault tolerance
• Designed for "small" results (<1M records)
25. Google Dremel - Example: count()
Level 0 (root server):
SELECT A, SUM(c) FROM (R11 UNION ALL ... R110) GROUP BY A
rewritten from
SELECT A, COUNT(B) FROM T GROUP BY A, where T = {/gfs/1, /gfs/2, …, /gfs/100000}
Level 1 (intermediate servers):
R11: SELECT A, COUNT(B) AS c FROM T11 GROUP BY A, where T11 = {/gfs/1, …, /gfs/10000}
R12: SELECT A, COUNT(B) AS c FROM T12 GROUP BY A, where T12 = {/gfs/10001, …, /gfs/20000}
...
Level 3 (leaf servers, data access ops):
SELECT A, COUNT(B) AS c FROM T31 GROUP BY A, where T31 = {/gfs/1}
...
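The rewrite above can be simulated with a toy sketch (hypothetical tablet contents; leaf servers compute partial COUNTs per group and the root sums the partials):

```python
from collections import Counter

def leaf_count(rows):
    """Leaf server: SELECT A, COUNT(B) AS c FROM tablet GROUP BY A."""
    return Counter(a for a, b in rows if b is not None)  # COUNT(B) skips NULLs

def root_sum(partials):
    """Root server: SELECT A, SUM(c) FROM (R1 UNION ALL ...) GROUP BY A."""
    total = Counter()
    for p in partials:
        total.update(p)                                   # sum partial counts
    return dict(total)

# Two hypothetical tablets of T, each row an (A, B) pair.
tablet1 = [('x', 1), ('x', None), ('y', 2)]
tablet2 = [('x', 3), ('y', 4), ('y', 5)]
result = root_sum([leaf_count(tablet1), leaf_count(tablet2)])
# result == {'x': 2, 'y': 3}
```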
26. Google Dremel - Experiments
• 1 PB of real data (uncompressed, non-replicated)
• 100K-800K tablets per table
• Experiments run during business hours

Table  Number of    Size (unrepl.,  Number of  Data    Repl.
name   records      compressed)     fields     center  factor
T1     85 billion   87 TB           270        A       3×
T2     24 billion   13 TB           530        A       3×
T3     4 billion    70 TB           1200       A       3×
T4     1+ trillion  105 TB          50         B       3×
T5     1+ trillion  20 TB           30         B       2×
32. Google Dremel - Observations
Possible to analyze large disk-resident datasets interactively on commodity hardware
– 1T records, 1000s of nodes
MR can benefit from columnar storage just like a parallel DBMS
– But record assembly is expensive
– Interactive SQL and MR can be complementary
Parallel DBMSes may benefit from serving tree architecture just like search engines
34. Google Tenzing - Overview
A SQL query engine built on top of MapReduce for ad-hoc analysis of Google data
Features
– Complete SQL implementation (with several extensions): SQL92 plus some SQL99
Key characteristics
– Heterogeneity
– High performance
– Scalability
– Reliability
– Metadata awareness
– Low latency
– Storage extensions: columnar storage, structured data support
– Easy extensibility
Status (as of 2011)
– 10,000+ queries/day
– 1.5 PB of compressed data
Source: VLDB2011, http://www.vldb.org/pvldb/vol4/p1318-chattopadhyay.pdf
35. Google Tenzing - Motivation
In 2008, the commercial DW used for Google's ads data ran into problems
– Rising cost of scaling (to the PB level)
– Rapidly growing loading times
– Complex analyses constrained by SQL limitations and poor support for diverse data sources
Solution: Tenzing
– Scalability: handles thousands of cores, hundreds of users, and petabytes of data
– High reliability: runs on commodity hardware
– Performance: at least on par with the existing commercial DW
– Minimal ETL: must be able to read data in Google systems directly
– Standard SQL support: reduces the learning curve for analysts
– UDF support: must also be applicable to prediction and mining
Source: [VLDB2011]
37. Google Tenzing - SQL Features
Projection and Filtering
– Standard SQL operations: IN, LIKE, BETWEEN, CASE, etc.
– Built-in Sawzall functions
Aggregation
– Standard aggregate functions: SUM, COUNT, MIN, MAX, etc.
– Hash-based aggregation
Joins
– Inner, left, right, cross, and full outer joins; equi, semi-equi, non-equi, and function-based joins, etc.
– Distributed implementations: broadcast joins, remote lookup joins, distributed sort-merge joins, distributed hash joins
Analytic Functions: RANK, SUM, MIN, MAX, LEAD, LAG, NTILE
Set Operations: UNION, UNION ALL, MINUS, MINUS ALL
Nested Queries and Subqueries
Handling Structured Data
Views, DDL, DML, Table-Valued Functions, Data Formats
Source: http://www.vldb.org/pvldb/vol4/p1318-chattopadhyay.pdf
38. Google Tenzing - Hash-based Aggregation
Tenzing query:
SELECT dept_id, COUNT(1)
FROM Employee
/*+ HASH */ GROUP BY 1;
[Figure: corresponding MapReduce pseudo-code]
Source: http://www.vldb.org/pvldb/vol4/p1318-chattopadhyay.pdf
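A minimal sketch of the idea behind the hint (illustrative Python, not the paper's pseudo-code): instead of sorting rows by key on the way to the reducer, each mapper keeps an in-memory hash table of partial counts and emits it once at the end.

```python
from collections import defaultdict

def map_hash_aggregate(rows):
    """Mapper: build partial COUNT(1) per dept_id in an in-memory hash table."""
    partial = defaultdict(int)
    for dept_id in rows:
        partial[dept_id] += 1          # hash lookup instead of a sort
    return partial.items()             # emitted once, when the mapper finishes

def reduce_sum(pairs):
    """Reducer: sum the partial counts for each dept_id."""
    total = defaultdict(int)
    for dept_id, count in pairs:
        total[dept_id] += count
    return dict(total)

# Two hypothetical map shards of the Employee table's dept_id column.
shard1 = [1, 1, 2]
shard2 = [2, 3, 2]
counts = reduce_sum(list(map_hash_aggregate(shard1)) +
                    list(map_hash_aggregate(shard2)))
# counts == {1: 2, 2: 3, 3: 1}
```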
39. Google Tenzing - Analytic Functions
Tenzing query:
SELECT
  dept, emp, salary,
  RANK() OVER (
    PARTITION BY dept ORDER BY salary DESC)
  AS salary_rank
FROM Employee;
[Figure: corresponding MapReduce pseudo-code]
Source: http://www.vldb.org/pvldb/vol4/p1318-chattopadhyay.pdf
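The window function above can be sketched directly (illustrative Python; as in standard SQL RANK, ties share a rank and the following rank skips ahead):

```python
from itertools import groupby
from operator import itemgetter

def rank_by_dept(rows):
    """Compute (dept, emp, salary, salary_rank) per department."""
    out = []
    rows = sorted(rows, key=lambda r: (r[0], -r[2]))   # dept asc, salary desc
    for dept, group in groupby(rows, key=itemgetter(0)):
        prev_salary, rank = None, 0
        for i, (_, emp, salary) in enumerate(group, start=1):
            if salary != prev_salary:                  # ties keep the old rank
                rank = i
            prev_salary = salary
            out.append((dept, emp, salary, rank))
    return out

employees = [('eng', 'a', 300), ('eng', 'b', 300), ('eng', 'c', 200),
             ('hr', 'd', 150)]
ranked = rank_by_dept(employees)
# ranked == [('eng','a',300,1), ('eng','b',300,1), ('eng','c',200,3), ('hr','d',150,1)]
```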
40. Google Tenzing - Performance Enhancements
MapReduce enhancements
– Worker pool: master watcher, master pool, worker pool
– Streaming & in-memory chaining
– Sort avoidance
– Block shuffle
– Local execution
Query engine enhancements
– 1st implementation: SQL translated to Sawzall code (Sawzall JIT compiler)
Inefficient due to the cost of translating into the Sawzall system
– 2nd implementation: Dremel's SQL expression evaluation engine
Somewhat slow due to interpretation and row-based processing
– 3rd implementation: LLVM-based native code generation
Row-major block-based intermediate data
Column-major vector-based processing with columnar intermediate storage
Source: http://www.vldb.org/pvldb/vol4/p1318-chattopadhyay.pdf
42. Google F1 - Overview
A distributed relational database system built to support Google's AdWords business
A hybrid database, built on top of Google Spanner
– NoSQL traits: high availability, scalability
– Traditional SQL database traits: consistency, usability
Design goals
– Scalability: auto-sharded storage
– Availability & consistency: synchronous replication
– High commit latency: can be hidden
Hierarchical schema
Protocol Buffer column types
Efficient client code
Source: [VLDB2013], [SIGMOD2012]
43. Google F1 - AdWords Ecosystem
Source: [VLDB2011]
44. Google F1 - Motivation (1/2)
Legacy DB: sharded MySQL
– Sharding strategy
Sharded by customer
Apps optimized using shard awareness
Limitations
– Availability
Master/slave replication -> downtime during failover
Schema changes -> downtime for table locking
– Scalability
Grow by adding shards
Rebalancing shards is extremely difficult and risky
Therefore, limit size and growth of data stored in database
– Functionality
Can't do cross-shard transactions or joins
Source: http://research.google.com/pubs/pub38125.html
45. Google F1 - Motivation (2/2)
Critical applications driving Google's core ad business
– 24/7 availability, even with datacenter outages
– Consistency required
Can't afford to process inconsistent data
Eventual consistency too complex and painful
– Scale: 10s of TB, replicated to 1000s of machines
Shared schema
– Dozens of systems sharing one database
– Constantly evolving - multiple schema changes per week
SQL query
– Query without code
Solution: Google F1
– built from scratch,
– designed to operate at Google scale,
– without compromising on RDBMS features
Source: http://research.google.com/pubs/pub38125.html
46. Google F1 - Features
Implements additional features on top of Google Spanner (NewSQL)
– Has both OLTP and OLAP characteristics
Features provided by Spanner: extremely scalable data storage, synchronous replication, strong consistency, ordering properties
Additional features
– Distributed SQL queries, including joining data from external data sources
– Transactionally consistent secondary indexes
– Asynchronous schema changes, including database reorganizations
– Optimistic transactions
– Automatic change-history recording and publishing
Deployment: managing AdWords ad campaign data since early 2012
– 100s of applications, 1000s of users, all sharing the same database
– 100+ TB, hundreds of thousands of requests/sec, scanning tens of trillions of data rows per day
– 99.999% availability; observable latency does not increase even during unplanned outages
47. Google F1 - Architecture
Architecture
– Sharded Spanner servers
Data on GFS and in memory
– Stateless F1 servers
– Slave pool for query execution
Features
– Relational schema
Extensions for hierarchy and rich data types
Non-blocking schema changes
– Consistent indexes
– Parallel reads with SQL or MapReduce
52. SQL on Hadoop
Refers to Hadoop-based analytic databases that support SQL
SQL language support
– Easy to integrate with or replace existing systems
– Provides SQL-oriented features such as an optimizer and indexing
High processing performance
– Distributed processing that overcomes the limits of MapReduce
53. Why SQL on Hadoop?
Staffing and productivity
– MapReduce has a steep learning curve -> a staffing problem
– Implementing on MapReduce takes a lot of effort -> a productivity problem
Performance guarantees and protection from human error
– Performance depends on developer skill
– Performance tuning takes a lot of effort
– High likelihood of bugs
Problems with running a DB alongside Hadoop for ad-hoc queries
– Complex architecture and management burden
– Growing cost: additional DBMS licenses and storage required
– Data exchange problem: HDFS ↔ DBMS
54. History of SQL on Hadoop
SQL has been ruling since 1970!!
Hadoop came… but got little traction…
Facebook open-sourced Hive in 2008… Hadoop takes the next leap in adoption
RDBMS and MPP vendors brought Hadoop connectors
Niche players used SQL engines to run distributed queries on Hadoop
2010.10: the Google Dremel paper is published
2012.10: Cloudera Impala sets the trend for real-time queries over Hadoop
2013.03: the Apache Tajo project enters incubation
2013.11: Facebook open-sources Presto
Source: http://www.slideshare.net/SameerMandal1/sql-over-hadoop-ver-3
55. SQL on Hadoop Products
Source: http://www.slideshare.net/SameerMandal1/sql-over-hadoop-ver-3
56. SQL on Hadoop Classification (1/3)
451 Group
– SQL in Hadoop: Hive
– SQL on Hadoop: Impala, Drill, HAWQ, JethroData, Big SQL
– SQL and Hadoop: Hadapt, RainStor, PolyBase, Citus Data, SQL-H
Gruter
– Data warehouse systems: Hive (Stinger), Tajo
– Query engines: Impala, Drill, Presto
Sources: 1) http://blogs.the451group.com/information_management/2013/10/09/7-hadoop-questions-q5-sql-in-hadoop-sql-on-hadoop-or-sql-and-hadoop/
2) http://www.slideshare.net/hyunsikchoi/sqlonhadoop-tajo-tech-planet-2013
57. SQL on Hadoop Classification (2/3)
Datascientist.info
– SQL natively on Hadoop: Stinger, Impala, Drill, Presto
– DBMS on Hadoop: Hadapt, CitusDB, Tajo
Hadapt
– SQL translated to MapReduce jobs over a Hadoop cluster: Hive, Stinger (without Tez)
– SQL processed by a specialized (Google-inspired) SQL engine that sits on a Hadoop cluster: Impala (F1), Drill (Dremel)
– Processing of SQL queries split between MapReduce and storage that natively speaks SQL: Hadapt, PolyBase
Sources: http://datascientists.info/sql-and-hadoop/, http://hadapt.com/blog/2013/10/02/classifying-the-sql-on-hadoop-solutions/
58. SQL on Hadoop Classification (3/3)
SQL to MapReduce: Hive (Stinger)
SQL natively on Hadoop: Tajo, Impala, Drill, Presto
Split SQL execution: Hadapt, CitusDB, PolyBase
Others: Shark, Phoenix, Cascading Lingual, Vertica
61. Hive - Overview
Invented at Facebook; open-sourced to Apache in 2008
ASF, 2008~; version 0.12 released 2013.10.15
A database/data warehouse on top of Hadoop
– Structured data similar to a relational schema
Tables, columns, rows, and partitions
– SQL-like query language (HiveQL)
A subset of SQL with many traditional features
MR scripts can be embedded in HiveQL
– Queries are compiled into MR jobs that are executed on Hadoop
Key building principles:
– SQL as a familiar data warehousing tool
– Extensibility: types, functions, formats, scripts
– Scalability and performance
– Interoperability
Source: http://sydney.edu.au/engineering/it/~zhouy/info5011/doc/08_DataAnalytics.pdf
62. Hive - Motivation (Facebook)
Problem: data growth was exponential
– 200 GB per day in March 2008
– 2+ TB (compressed) raw data per day in April 2009
– 4+ TB (compressed) raw data per day in Nov. 2009
– 12+ TB (compressed) raw data per day today (2010)
The Hadoop experiment
– Availability and scalability much superior to commercial DBs
– Efficiency not that great, but you can throw more hardware at it
– Partial availability/resilience/scale more important than ACID
Problem: programmability and metadata
– MapReduce hard to program (users know SQL/bash/Python)
– Need to publish data in well-known schemas
Solution: SQL + MapReduce = Hive (2007)
Sources: 1) http://www.slideshare.net/zshao/hive-data-warehousing-analytics-on-hadoop-presentation
2) http://www.slideshare.net/royans/facebooks-petabyte-scale-data-warehouse-using-hive-and-hadoop
63. Hive - Data Flow Architecture at Facebook
[Figure: web servers feed a Scribe mid-tier and Scribe writers; data flows into a realtime Hadoop cluster and the Hadoop Hive warehouse for offline batch processing, with MySQL and Oracle RAC used for online querying of the summary data]
Sources: http://borthakur.com/ftp/hadoopmicrosoft.pdf, http://sydney.edu.au/engineering/it/~zhouy/info5011/doc/08_DataAnalytics.pdf
65. Hive - Data Model
Reused from relational databases
– Database, table, partition, row, column
Tables
– Typed columns (int, float, string, date, boolean)
– Also array/list/map/struct for JSON-like data
Partitions
– e.g., to range-partition tables by date
Buckets
– Hash partitions within ranges (useful for sampling and join optimization)
Column data types:
CREATE TABLE t (
  s STRING,
  f FLOAT,
  a ARRAY<MAP<STRING, STRUCT<p1:INT, p2:INT>>>
);
SELECT s, f, a[0]['foobar'].p2 FROM t;
Source: http://www.slideshare.net/cwsteinbach/hive-quick-start-tutorial
66. Hive - HiveQL (Hive Query Language)
Basic SQL
– FROM clause subqueries
– ANSI JOIN (equi-join only)
– Multi-table insert
– Multi group-by
– Sampling
– Object traversal
Extensibility
– Pluggable MapReduce scripts using TRANSFORM
Limitations
– Subset of SQL
– Metadata queries
– Limited equality and join predicates
– No inserts on existing tables (to preserve the WORM property)
– Can overwrite an entire table
67. Hive - Query Execution and MR Jobs
Query plan with 3 MapReduce jobs for a multi-table insert query
Source: YSmart (Yet Another SQL-to-MapReduce Translator), http://sydney.edu.au/engineering/it/~zhouy/info5011/doc/08_DataAnalytics.pdf
68. Hive - Problems
Performance gap
– For simple queries, Hive performance is comparable with hand-coded MR jobs
– Execution time is much longer for complex queries
HiveQL allows an arbitrary number of embedded subqueries
They are each converted to MapReduce jobs
The simple conversion may involve much unnecessary data scanning and transfer
e.g., for a simple aggregation query, a hand-coded MR program requires only 2 jobs while the automatically generated MR code uses 6 jobs
Missing features
– ANSI SQL
– Cost-based optimizer
– UDFs
– Data types
Sources: http://hortonworks.com/blog/100x-faster-hive/, http://sydney.edu.au/engineering/it/~zhouy/info5011/doc/08_DataAnalytics.pdf
70. Stinger - Overview
An initiative, not a project or product
Includes changes to Hive and a new project, Tez
Two main goals
– Improve Hive performance 100x over Hive 0.10
– Extend Hive SQL to include features needed for analytics
Hive will support:
– BI tools connecting to Hadoop
– Analysts performing ad-hoc, interactive queries
– Remaining excellent at the large batch jobs it is used for today
Source: http://www.slideshare.net/alanfgates/stinger-hadoop-summit-june-2013
79. Impala - Overview
In-memory, distributed SQL query engine (no MapReduce)
Inspired by Google Dremel (VLDB 2010)
Native code (C++); version 1.2.3 as of 2014.01.22
Distributed across HDFS data nodes
Interactive SQL
– Typically 5-65x faster than Hive (observed up to 100x faster)
– Responses in seconds instead of minutes (sometimes sub-second)
Nearly ANSI-92 standard SQL queries, compatible with Hive SQL
– Compatible SQL interface for existing Hadoop/CDH applications
– Based on industry-standard SQL
Runs natively on Hadoop/HBase storage and metadata
– Flexibility, scale, and cost advantages of Hadoop
– No duplication/synchronization of data and metadata
– Local processing to avoid network bottlenecks
Separate runtime from MapReduce
– MapReduce is designed for, and great at, batch
– Impala is purpose-built for low-latency SQL queries on Hadoop
Source: http://www.slideshare.net/insideHPC/impala-overview
82. Drill - Overview
Interactive analysis of big data using standard SQL
Fast
– Low latency
– Columnar execution
Inspired by Google Dremel / BigQuery
– Complements native interfaces and MapReduce/Hive/Pig
Open
– Community-driven open source project
– Under the Apache Software Foundation
Modern
– Standard ANSI SQL:2003 (SELECT/INTO)
– Nested / hierarchical data support
– Schema is optional
– Supports RDBMS, Hadoop, and NoSQL
Source: http://www.slideshare.net/tdunning/apache-drill-16513485
83. Drill - How Does it Work?
• Drillbits run on each node, designed to maximize data locality
• Processing is done outside the MapReduce paradigm (but possibly within YARN)
• Queries can be fed to any Drillbit
• Coordination, query planning, optimization, scheduling, and execution are distributed
Example query spanning multiple data sources:
SELECT * FROM
  oracle.transactions,
  mongo.users,
  hdfs.events
LIMIT 1
Source: http://www.slideshare.net/tdunning/apache-drill-16513485
85. Drill - Status
Heavy, active development by multiple organizations
Available
– Logical plan syntax and interpreter
– Reference interpreter
In progress
– SQL interpreter
– Storage engine implementations for Accumulo, Cassandra, HBase, and various file formats
Significant community momentum
– Over 200 people on the Drill mailing list
– Over 200 members of the Bay Area Drill User Group
– Drill meetups across the US and Europe
– OpenDremel team joined Apache Drill
Anticipated schedule:
– Prototype: Q1
– Alpha: Q2
– Beta: Q3
Source: http://www.slideshare.net/tdunning/apache-drill-16513485
88. Presto - Overview
Developed at Facebook and released as open source (2013.11.6)
Designed for queries that take seconds to minutes
No fault tolerance
Fast responses and online query processing
– Results computed first are returned first
Supports approximate queries for some query types
Developed as a complementary system to Hive
All operators are pipelined, and data transfer is streaming
Facebook usage
– 300 PB accumulated data warehouse, 30k queries per day, 1 PB processed per day
Source: http://www.slideshare.net/hyunsikchoi/sqlonhadoop-tajo-tech-planet-2013
93. Tajo - Overview
Tajo
– A Hadoop-based big data warehouse
– Development started in 2010 as a research prototype at Korea University
– Entered Apache incubation in 2013.03
– 0.2-incubating released in 2013.11
Features
– Compatibility
Standard SQL support, HiveQL support, UDF support
JDBC
– High performance and low response time
Flexible and efficient distributed processing engine
Cost-based optimizer
JIT query compilation and a vectorized query engine
Sources: http://tajo.incubator.apache.org/, http://www.slideshare.net/deview/deview2013-tajo
94. Tajo - Architecture
Master-worker model
– RPC based: Protocol Buffers, Netty
Tajo Master
– Handles client and application requests
– Catalog server
Table schemas, physical information, various statistics
Can use an external RDBMS as its store via JDBC
– Query parsing, planning, optimization, cluster resource management, Query Master management
Query Master
– Runs per query
– Controls execution blocks (query execution stages)
– Task scheduling
Tajo Worker
– Storage manager
– Local query engine
Source: http://www.slideshare.net/deview/deview2013-tajo
101. Hadapt - Overview
A commercialized version of Daniel Abadi's HadoopDB project
HadoopDB project: an architectural hybrid of MapReduce and DBMS technologies for analytical workloads
– Hadoop as the communication layer above multiple nodes running single-node DBMS instances
Fully open-source solution:
– PostgreSQL as the DB layer
– Hadoop as the communication layer
– Hive as the translation layer
Hadapt (HadoopDB) = Hadoop + PostgreSQL
102. Hadapt - HadoopDB
A research project from Yale University's database research group
A hybrid architecture of parallel databases and the MapReduce system
The idea is to combine the best qualities of both technologies
Multiple single-node databases are connected using Hadoop as the task coordinator and network communication layer
Queries are distributed across the nodes by the MapReduce framework, but as much work as possible is done in the database nodes
Source: https://wiki.aalto.fi/download/attachments/37102509/HadoopDB%2Bproject.ppt
107. Hadoop
An open source framework for distributed processing of large-scale data: HDFS (data storage) + MapReduce (distributed processing)
Flexibility
– Supports diverse data types (structured and unstructured data, text, etc.)
– Supports general-purpose programming languages
Scalability
– Performance and capacity scale linearly as nodes are added
Cost
– Designed to run on clusters of many commodity servers
– Adding nodes is a cheap way to grow capacity and improve processing performance
109. Star Schema (1/3)
Star schema:
– A database design
– Separates dimensional data from fact/event data
Two major types of components
– A fact table and dimension tables
– In a simple star schema, a fact table is surrounded by dimension tables
– Each dimension table has a one-to-many relationship to the central fact table
Source: http://www.fbe.hku.hk/~is/busi0092/Notes/t1_dataWarehouse_full_v3.pdf
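The fact/dimension split can be sketched in a few lines (hypothetical table and column names): fact rows carry foreign keys and measures, and each key is a lookup into a dimension table.

```python
from collections import defaultdict

# Hypothetical dimension tables: key -> descriptive attribute.
date_dim = {1: '2014-01', 2: '2014-02'}
product_dim = {10: 'book', 11: 'pen'}

# Hypothetical fact table: (date_key, product_key, sales_amount).
sales_fact = [(1, 10, 5.0), (1, 11, 2.0), (2, 10, 3.0)]

# Star join + aggregation: each fact row references its dimensions by key.
totals = defaultdict(float)
for date_key, product_key, amount in sales_fact:
    totals[(date_dim[date_key], product_dim[product_key])] += amount
# totals[('2014-01', 'book')] == 5.0, etc.
```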
110. Star Schema (2/3) - Components of a Star Schema
Fact tables contain factual or quantitative data
Dimension tables contain descriptions about the subjects of the business
1:N relationship between dimension tables and fact tables
Source: http://www.fbe.hku.hk/~is/busi0092/Notes/t1_dataWarehouse_full_v3.pdf
111. Star Schema (3/3) - Sample Data
Source: http://www.fbe.hku.hk/~is/busi0092/Notes/t1_dataWarehouse_full_v3.pdf
112. SQL on Hadoop Products by MapR
Source: http://www.mapr.com/products/sql-on-hadoop
113. Potential Use Cases for Big Data
Source: SAS and IDC, http://practicalanalytics.wordpress.com/2011/12/12/big-data-analytics-use-cases/