Building your bi system-HadoopCon Taiwan 2015

BUILD YOUR BI SYSTEM
PRACTICE IN DATA LAKE ECOSYSTEM
Bryan@Vpon Data

ABOUT ME
• Experience
Vpon Data Engineer
TWM, Keywear, Nielsen
• Bryan’s notes for data analysis
http://bryannotes.blogspot.tw
• Spark.TW
• LinkedIn
https://tw.linkedin.com/pub/bryan-yang/7b/763/a79

AGENDA
• User Story
• Data Lake
• Framework of BI

MORE COMPLEX AND
BIG…
http://www.slideteam.net/technology-powerpoint-templates/mobile-phones.html

3 KINDS OF PROBLEMS
https://kavyamuthanna.wordpress.com/category/big-data/

BIG DATA BIG PROBLEM
http://www.mn.uio.no/ifi/studier/masteroppgaver/nd/masteroppgave_cloud_bigdata_hpc.html

BIG DATA BIG COST
• The cost of data storage
What data do we keep?
For how long?
• The cost of data management
Are the machines and infrastructure easy to maintain?
Data flow (ETL)?
• The time cost of data processing
How long can users wait?
Accessibility of the data
Hidden human costs

SO MANY AD HOC QUERIES
SALES
MARKETING
FINANCE
BUSINESS

EVEN A SIMPLE QUERY
Q: HI, PLEASE TELL ME HOW MANY
USERS WE HAVE HAD SINCE THE BEGINNING.
A: SELECT COUNT(1)
FROM LOG
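
To make the question concrete, here is a minimal sketch that uses Python's built-in sqlite3 as a stand-in for the real log store. The `log` table layout and the `user_id` column are assumptions for illustration; the slide only shows the query itself. Note that `COUNT(1)` counts log rows, so counting users would likely need something like `COUNT(DISTINCT user_id)`.

```python
import sqlite3

def count_rows_and_users(rows):
    """Load (user_id, event) rows into an in-memory log table and count them."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the real LOG table is not described in the slides.
    conn.execute("CREATE TABLE log (user_id TEXT, event TEXT)")
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    # COUNT(1) counts every row in the log.
    total_rows = conn.execute("SELECT COUNT(1) FROM log").fetchone()[0]
    # COUNT(DISTINCT ...) counts each user once, however many rows they have.
    distinct_users = conn.execute(
        "SELECT COUNT(DISTINCT user_id) FROM log"
    ).fetchone()[0]
    conn.close()
    return total_rows, distinct_users

sample = [("u1", "click"), ("u1", "view"), ("u2", "click")]
row_count, user_count = count_rows_and_users(sample)
print(row_count, user_count)
```

Without a pre-aggregated table, both queries have to read the whole log, which is why even this simple question gets expensive as the data grows.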

ttps://myreelpov.wordpress.com/2012/12/23/which-story-do-you-prefer-life-of-pi/life-of-pi-2
Your Life
Boss
Family and Lover
Customers
Data Ocean

Overview
Business intelligence (BI) is the set of techniques and tools for
the transformation of raw data into meaningful and useful
information for business analysis purposes. —Wikipedia

DIFFERENT FEATURES
                 Price        Performance   Accessibility
Hadoop           Low          Medium        Low
SQL Server       Low-Medium   Depends on    Medium
Data Warehouse   High         High          Medium
BI System        High         Depends on    High

http://www.datalytyx.com/big-data-data-lakes/

WHY DATA LAKE
tp://thesologuide.com/332/the-seesaw-of-success-when-taking-a-rest-is-bes

HIVE
• Created at Facebook
• Data warehouse in the Hadoop ecosystem
• HiveQL (SQL-like interface)
• Metastore (stores the schema of the data;
schema on read)
• UDF (user-defined functions)
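
The "schema on read" idea can be sketched in plain Python. This is not the Hive API; the column names, the sample rows, and the `read_with_schema` helper are all hypothetical. The point is only that the raw text is stored untouched and the schema is applied when the data is read.

```python
import csv
import io

# Raw data as it might sit in storage: no types, no column names.
RAW_LOG = "u1,2015-01-01,click\nu2,2015-01-01,view\n"

# Hypothetical schema, kept separately from the raw data (the metastore's role).
SCHEMA = [("user_id", str), ("day", str), ("event", str)]

def read_with_schema(raw_text, schema):
    """Apply the schema only while reading; the stored text is never rewritten."""
    records = []
    for fields in csv.reader(io.StringIO(raw_text)):
        record = {name: cast(value) for (name, cast), value in zip(schema, fields)}
        records.append(record)
    return records

records = read_with_schema(RAW_LOG, SCHEMA)
print(records[0]["user_id"])
```

Changing the schema later only changes how the same stored bytes are interpreted; nothing has to be reloaded.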

http://www.stratapps.net/intro-hive.php

http://hortonworks.com/blog/hive-cheat-sheet-for-sql-users/

TERADATA
• Massively Parallel Processing
• Each processor handles different threads
of the program, and each processor
has its own disk
• Teradata SQL is fully certified against
SQL-92
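
The massively parallel idea above can be sketched as follows. This is an illustration only, not Teradata's implementation: threads stand in for separate processors, and `partition_rows`, `local_sum`, and `parallel_sum` are hypothetical names. Rows are spread across workers by hashing a key, each worker aggregates only its own rows, and the partial results are merged at the end.

```python
import zlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def partition_rows(rows, n_workers):
    """Assign each (key, value) row to a worker by hashing its key."""
    partitions = [[] for _ in range(n_workers)]
    for key, value in rows:
        worker = zlib.crc32(key.encode("utf-8")) % n_workers
        partitions[worker].append((key, value))
    return partitions

def local_sum(partition):
    """Each worker aggregates only the rows it owns."""
    totals = Counter()
    for key, value in partition:
        totals[key] += value
    return totals

def parallel_sum(rows, n_workers=4):
    """Run the local aggregations concurrently, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(local_sum, partition_rows(rows, n_workers)))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds counts together
    return dict(merged)

totals = parallel_sum([("tw", 2), ("jp", 1), ("tw", 5)])
print(totals)
```

Because all rows with the same key land on the same worker, no worker needs another worker's data to finish its part.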

http://www.slideshare.net/alam7/module-02-teradata-basics

https://www.safaribooksonline.com/library/view/teradata-architecture-for

TABLEAU
• Visualization Tool
• Connects to many kinds of databases
• VizQL
• Tableau Server

http://www.clearpeaks.com/blog/tableau/tableau-8-2-new-features

https://www.youtube.com/watch?v=fYpy04vmG_o

m/services/business-intelligence-services/tableau-consulting/table

JENKINS
• Manages ETL processes
• Free, with many plugins
• Monitors job status and dependencies
• Integrates with Git and SVN
• Email alerts

User Interface
Back to home page
Management menu
Builds in progress
Job list: build information
Build status
Next scheduled build and time
ip:port
Job Name List

Build Steps
call python script
call the remote shell
call local shell script
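
A Python script called from a build step might look like the sketch below. The `run_etl_step` helper and the stand-in extract and load callables are hypothetical. The relevant convention is that a shell build step in Jenkins is marked as failed when the command exits with a non-zero code, so the script turns errors into an exit code.

```python
import sys

def run_etl_step(extract, load):
    """Run one hypothetical ETL step and return a process exit code."""
    try:
        rows = extract()
        load(rows)
    except Exception as err:
        # Anything written to stderr shows up in the build's console output.
        print(f"ETL step failed: {err}", file=sys.stderr)
        return 1
    return 0

def main():
    loaded = []
    # Stand-in callables; a real job would talk to Hadoop or Teradata here.
    return run_etl_step(lambda: [("u1", 3)], loaded.extend)

exit_code = main()
print("exit code:", exit_code)
# In a real build step the script would end with sys.exit(exit_code).
```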

Build Graph
Job Name Job Name Job Name

[Diagram: Hadoop Cluster 1, Hadoop Cluster 2, Teradata, Tableau Server, User]
Flow labels: Data Transfer, Request, ETL, Data Slicing
Live Query: Too Slow
[Diagram: Hadoop Cluster 1, Hadoop Cluster 2, Teradata, Tableau Server, User]
Flow labels: Data Transfer, Request, ETL, Data Slicing
Extract Data: Insufficient Space
[Diagram: Hadoop Cluster 1, Hadoop Cluster 2, Teradata, Tableau Server, User]
Flow labels: Data Transfer, Request, ETL, Data Slicing
Extract Data: Every Day
Table, View, Statistical Tables
USER EXPERIENCE TUNING
[Bar chart: HIVE vs. TERADATA vs. BI, axis from 0 to 150]
120X Faster

HOW TO CHOOSE THE
COMPONENTS IN YOUR BI
FRAMEWORK?
• The cost of data storage
• The cost of data management
• The time cost of data processing

CONSIDERATIONS AND
SUGGESTIONS
• Time is money
• Trade HDD space or money for time
• Understand the components and
their relationships
• Balance needs against costs
• A good framework helps the business grow

COST CURVE
[Chart: x-axis Business Growth, y-axis Cost of Business Growth]

IN THE FUTURE
Hardware
*More Nodes
*More Memory
*Graphics Cards
…
Software
*Spark
*Tez
*Tachyon
*Algorithm
…
Cloud
*EC2
*BigQuery
*Bluemix
*SAP
…

THANK YOU FOR
LISTENING
Special Thanks
Vpon
Hood, Meiyen, Gil and OPS Team
