Building enterprise OLAP on Hadoop for the financial services industry, following a use case from CPIC (a Fortune 500 insurance company) on replacing a legacy IBM Cognos OLAP deployment with the Kyligence platform.
Apache Kylin and Use Cases - 2018 Big Data Spain (Luke Han)
Apache Kylin is rapidly being adopted around the world as the leading open source OLAP engine for Big Data. In this talk, Luke Han, creator and PMC chair of Apache Kylin, introduces the motivation behind building the project and its technical highlights, and also explores how various industries use Apache Kylin and the resulting business impact.
The Apache Way - Building Open Source Community in China - Luke Han (Luke Han)
My presentation at ApacheCon 2016 NA, covering our practices for building an open source community (Apache Kylin) in China: the challenges, the cultural differences, the language barrier, and so on.
It also gives an overview of open source in China and the changes happening there now.
It is a good reference for people interested in extending their community into China, engaging more Chinese and other Asian developers to grow their open source community and adoption.
Pivotal is a trusted partner for IT innovation and transformation. From the technology, to the people, to the way people interact with technology, Pivotal is transforming how the world builds software.
At Strata NYC 2015, Pivotal announced it will supercharge the Hadoop ecosystem by contributing the HAWQ advanced SQL-on-Hadoop analytics and MADlib machine learning technologies to The Apache Software Foundation.
Analytics at the Speed of Thought: Actian Express Overview (Actian Corporation)
Deliver faster insight – reduce query response times to seconds
Analyze more data faster – explore billions of rows of data in seconds
More concurrent users – enable more concurrent BI users to explore more data
Deep learning is not just hype: it outperforms state-of-the-art ML algorithms, one by one. In this talk we show how deep learning can be used to detect anomalies on IoT sensor data streams at high speed, using DeepLearning4J on top of different Big Data engines such as Apache Spark and Apache Flink. Key to this talk is the absence of any large training corpus, since we use unsupervised machine learning, a domain that current DL research treats step-motherly. As the demo shows, LSTM networks can learn very complex system behavior; in this case the data comes from a physical model simulating bearing vibration. One drawback of deep learning is that it normally requires a very large labeled training data set. This is particularly interesting because we show how unsupervised machine learning can be used in conjunction with deep learning: no labeled data set is necessary. We are able to detect anomalies and predict breaking bearings with 10-fold confidence. All examples and all code will be made publicly available and open sourced; only open source components are used.
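As a rough illustration of the approach this talk describes (the speakers use DeepLearning4J on Spark/Flink; the sketch below uses Python with Keras instead, and the window size, threshold, and synthetic vibration signal are assumptions), an LSTM autoencoder can flag anomalies without labels by scoring reconstruction error:

```python
# Minimal sketch of unsupervised anomaly detection with an LSTM autoencoder.
# Illustrative only: the talk uses DeepLearning4J; window size, threshold,
# and the synthetic vibration signal below are assumptions.
import numpy as np
from tensorflow import keras

WINDOW = 50  # samples per sliding window (assumed)

# Synthetic "bearing vibration" signal: a sine wave plus noise.
t = np.linspace(0, 100, 5000)
signal = np.sin(2 * np.pi * t) + 0.1 * np.random.randn(t.size)

# Slice the stream into overlapping windows: shape (n, WINDOW, 1).
windows = np.stack([signal[i:i + WINDOW] for i in range(signal.size - WINDOW)])
windows = windows[..., np.newaxis]

# LSTM autoencoder: learn to reconstruct normal windows; high
# reconstruction error later signals an anomaly (no labels needed).
model = keras.Sequential([
    keras.layers.LSTM(32, input_shape=(WINDOW, 1)),
    keras.layers.RepeatVector(WINDOW),
    keras.layers.LSTM(32, return_sequences=True),
    keras.layers.TimeDistributed(keras.layers.Dense(1)),
])
model.compile(optimizer="adam", loss="mse")
model.fit(windows, windows, epochs=5, batch_size=64, verbose=0)

# Flag windows whose reconstruction error exceeds an assumed threshold.
errors = np.mean((model.predict(windows, verbose=0) - windows) ** 2, axis=(1, 2))
threshold = errors.mean() + 3 * errors.std()
print("anomalous windows:", np.where(errors > threshold)[0])
```

The same idea transfers to DL4J: train on windows of normal sensor data, then alert when reconstruction error spikes.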
Best Practices for Supercharging Cloud Analytics on Amazon Redshift (SnapLogic)
In this webinar, we discuss how the secret sauce of your business analytics strategy remains rooted in your approach, your methodologies, and the amount of data incorporated into this critical exercise. We also address best practices to supercharge your cloud analytics initiatives, along with tips and tricks for designing the right information architecture, data models, and other tactical optimizations.
To learn more, visit: http://www.snaplogic.com/redshift-trial
InfoTrack: Creating a single source of truth with the Elastic Stack (Elasticsearch)
Ashim Joshi, Head of Innovation at InfoTrack, discusses how the Elasticsearch Service helped tackle a variety of use cases at InfoTrack, such as building a data lake and architecting a data mart layer.
See the video: https://www.elastic.co/elasticon/tour/2019/sydney/infotrack-creating-a-single-source-of-truth-with-the-elastic-stack
Pivotal Big Data Suite: A Technical Overview (VMware Tanzu)
How and why companies like Uber, Netflix, and Airbnb are so successful, what you need to do in order to become successful in the same way, and how Pivotal can help you with that.
Speaker: Les Klein, EMEA CTO Data, Pivotal
Northwestern Mutual Journey – Transform BI Space to Cloud (Databricks)
The volume of available data is growing by the second (to an estimated 175 zettabytes by 2025), and it is becoming increasingly granular. With that change, every organization is moving toward building a data-driven culture. We at Northwestern Mutual share a similar story of driving toward data-driven decisions to improve both efficiency and effectiveness. Legacy system analysis revealed bottlenecks, excesses, duplications, and more. Based on the ever-growing need to analyze more data, our BI team decided to move to a more modern, scalable, cost-effective data platform. As a financial company, data security is as important as data ingestion: in addition to fast ingestion and compute, we needed a solution that supports column-level encryption and role-based access for different teams to our data lake.
In this talk we describe our journey moving hundreds of ELT jobs from our MSBI stack to Databricks and building a data lake (using the Lakehouse architecture), and how we reduced our daily data load time from 7 hours to 2 hours while gaining the capacity to ingest more data. We share our experience, challenges, learnings, architecture, and the design patterns used in this huge migration effort, as well as the tools and frameworks our engineers built to ease the learning curve for our non-Apache Spark engineers during the migration. You will leave this session with a better understanding of what migrating to Apache Spark/Databricks would mean for you and your organization.
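Since the talk calls out column-level encryption as a hard requirement, here is a minimal sketch of one common way to do it in PySpark, using a Fernet-encrypting UDF. This is an illustration under stated assumptions, not Northwestern Mutual's actual implementation, and key handling is deliberately simplified:

```python
# Sketch of column-level encryption in PySpark: encrypt only the sensitive
# column with a UDF. Illustrative only; key management is simplified.
from cryptography.fernet import Fernet
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("column-encryption-demo").getOrCreate()

key = Fernet.generate_key()  # in practice, fetch from a secrets manager

@udf(returnType=StringType())
def encrypt(value):
    # Encrypt non-null values; leave nulls untouched.
    return Fernet(key).encrypt(value.encode()).decode() if value else None

df = spark.createDataFrame([("alice", "123-45-6789")], ["name", "ssn"])
# Only the sensitive column is encrypted; role-based views can then
# expose either the encrypted or decrypted form to different teams.
df.withColumn("ssn", encrypt("ssn")).show(truncate=False)
```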
Weathering the Data Storm – How SnapLogic and AWS Deliver Analytics in the Cl... (SnapLogic)
In this webinar, learn how SnapLogic and Amazon Web Services helped Earth Networks create a responsive, self-service cloud for data integration, preparation and analytics.
We also discuss how Earth Networks gained faster data insights using SnapLogic’s Amazon Redshift data integration and other connectors to quickly integrate, transfer and analyze data from multiple applications.
To learn more, visit: www.snaplogic.com/redshift
Cloud Experience: Data-driven Applications Made Simple and Fast (Databricks)
Implementing a complex real-time data workflow is very challenging. This session describes the architecture of a data platform that provides a single, secure, high-performance system that can be deployed in a hybrid cloud architecture. We present how to support simultaneous, consistent, high-performance access through multiple open source and cloud-compatible standards: streaming, table, TSDB, object, and file APIs. A serverless technology is also used in the architecture to support dynamic and flexible implementations. The presenter also outlines how the platform was integrated with the Spark ecosystem, including AI and ML tools, to simplify the development process.
Life occurs in real time, and not surprisingly, more solutions are being built using streaming technologies. Event-based architectures are becoming the norm, and customers expect immediate access to their data. This new world offers many exciting opportunities, but also some new challenges. What do you do when your streaming data is not complete? What if it relies on another data source? Does the dependent data exist yet, and does it come from a third party? How do we assemble a complete picture when data arrives from multiple places at the same time? This is the new norm in the world of distributed services. Join us as we dive deep into the technical details of these scenarios and more. Expect to learn about stream-stream joins, enriching stream data using local or remote data, and ways to anticipate and correct errors within the stream. Leave with a better understanding of managing data dependencies within a Spark Structured Streaming application.
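As a taste of the stream-stream join pattern the session covers, here is a minimal Spark Structured Streaming sketch in Python; the Kafka broker address, topic names, schema, and the 15-minute join window are placeholders, not the presenters' code:

```python
# Sketch of a watermarked stream-stream join in Spark Structured Streaming.
# Broker, topics, schema, and time bounds below are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, expr, from_json
from pyspark.sql.types import StructType, StringType, TimestampType

spark = SparkSession.builder.appName("stream-stream-join").getOrCreate()

schema = (StructType()
          .add("order_id", StringType())
          .add("event_time", TimestampType()))

def read_topic(topic):
    # Each stream arrives as Kafka bytes; parse the JSON payload.
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", "localhost:9092")  # placeholder
           .option("subscribe", topic)
           .load())
    return (raw.select(from_json(col("value").cast("string"), schema).alias("v"))
               .select("v.*"))

# Watermarks bound how long each side waits for late or missing data,
# so the engine can eventually drop join state.
orders = read_topic("orders").withWatermark("event_time", "10 minutes").alias("o")
payments = (read_topic("payments")
            .withColumnRenamed("event_time", "pay_time")
            .withWatermark("pay_time", "10 minutes")
            .alias("p"))

# Inner join constrained to a time range: a payment matches an order
# only if it arrives within 15 minutes of the order event.
joined = orders.join(payments, expr(
    "o.order_id = p.order_id AND "
    "p.pay_time BETWEEN o.event_time AND o.event_time + interval 15 minutes"))

joined.writeStream.format("console").start().awaitTermination()
```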
Virgin Hyperloop One is the leader in realizing a Hyperloop mass transportation system (VHOMTS), which will bring cities and people closer together than ever before while reducing pollution, greenhouse gas emissions, transit times, and more. To build a safe and user-friendly Hyperloop, we need to answer key technical and business questions, including: 'What is the maximum safe speed the hyperloop can go?' and 'How many pods (the vehicles that carry people) do we need to fulfill a given demand?'
How to Operationalise Real-Time Hadoop in the Cloud (Attunity)
Hadoop and the Cloud are two of the most disruptive technologies to have emerged from the last decade, but how can you adapt to the increasing rate of change whilst providing the enterprise with the right data, quickly?
Watch this webinar with Attunity, Cloudera and Microsoft and learn:
-How to ingest the most valuable enterprise data into Hadoop
-About real life use cases of Cloudera on Azure
-How to combine the power of Hadoop and the scalable flexibility of Azure
Enable your business with more data in less time. Visit www.attunity.com for more information.
Getting Into the Business Intelligence Game: Migrating OBIA to the Cloud (Datavail)
This presentation discusses best-practice architecture for migrating Oracle BI Applications to the cloud. It focuses on the Oracle cloud platform and database services, with a nod to infrastructure services, laying out the idea of the hybrid cloud and variations of the new-age cloud BI/DW architecture so your analytics environment can succeed, operating at the same reliability or better while benefiting from what the cloud offers best.
PayPal datalake journey | teradata - edge of next | san diego | 2017 october ... (Deepak Chandramouli)
PayPal Data Lake Journey | 2017-Oct | San Diego | Teradata Edge of Next
Gimel [http://www.gimel.io] is a Big Data Processing Library, open sourced by PayPal.
https://www.youtube.com/watch?v=52PdNno_9cU&t=3s
Gimel empowers analysts, scientists, and data engineers alike to access a variety of Big Data and traditional data stores with just SQL or a single line of code (the Unified Data API).
This is possible via a catalog of technical properties abstracted from users, along with a rich collection of data store connectors available in the Gimel library.
A catalog provider can be Hive, user-supplied (at runtime), or UDC.
In addition, PayPal recently open sourced UDC (Unified Data Catalog), which can host and serve the technical metadata of data stores and objects. Visit http://www.unifieddatacatalog.io to experience it first hand.
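To make the "single line of code" idea concrete, here is a purely hypothetical Python sketch of the unified-data-API pattern: a catalog resolves a logical dataset name to a connector, so callers never touch store-specific details. None of these class or function names are Gimel's real API (Gimel itself is Scala/SQL; see gimel.io):

```python
# Hypothetical sketch of the unified-data-API pattern described above: a
# catalog maps logical dataset names to store-specific connectors, so one
# read() call works regardless of the backing store. All names invented.
class HiveConnector:
    def read(self, props):
        return f"SELECT * FROM {props['table']}"   # stand-in for a real read

class KafkaConnector:
    def read(self, props):
        return f"consume topic {props['topic']}"   # stand-in for a real read

CATALOG = {  # technical properties abstracted away from users
    "sales.orders": {"connector": HiveConnector(), "table": "dw.orders"},
    "events.clicks": {"connector": KafkaConnector(), "topic": "clicks"},
}

def read(dataset_name):
    """Single entry point: resolve the dataset in the catalog and delegate."""
    props = CATALOG[dataset_name]
    return props["connector"].read(props)

print(read("sales.orders"))   # Hive-backed
print(read("events.clicks"))  # Kafka-backed, same call
```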
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code... (Codemotion)
Modern race cars produce a lot of data, all in real time. In this presentation I show how data can be generated and used by various applications in the car, at the track, or at team headquarters. The demonstration shows how to move data using messaging systems like Apache Kafka, process it using Apache Spark, and use various storage techniques: distributed file systems and NoSQL databases. This presentation is a great opportunity to see how to build a near-real-time big data application. The code from this talk will be made available as open source.
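For a feel of the ingestion side of such a pipeline, here is a small Python sketch that pushes simulated car telemetry into Kafka with kafka-python; the broker address, topic name, and message shape are assumptions, not the presenter's code:

```python
# Sketch of telemetry ingestion: push simulated car sensor readings into
# Kafka for downstream Spark processing. Topic and message shape assumed.
import json
import random
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",               # placeholder broker
    value_serializer=lambda v: json.dumps(v).encode())

for _ in range(100):
    reading = {
        "car": 44,
        "speed_kph": 280 + random.uniform(-15, 15),   # simulated sensor
        "tyre_temp_c": 95 + random.uniform(-5, 5),
        "ts": time.time(),
    }
    producer.send("car-telemetry", reading)           # near real time
producer.flush()
```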
Offload, Transform, and Present - The New World of Data Integration (Gluent)
This session explores how one organization built an integrated analytics platform by implementing Gluent to offload its Oracle enterprise data warehouse (EDW) data to Hadoop, and to transparently present native Hadoop data back to its EDW. As a result of its efforts, the company is now able to support operational reporting, OLAP, data discovery, predictive analytics, and machine learning from a single scalable platform that combines the benefits of an enterprise data warehouse with those of a data lake. This session includes a brief overview of the platform and use cases to demonstrate how the company has utilized the solution to provide business value.
Cloud-native Semantic Layer on Data Lake (Databricks)
With larger volumes of data, and more real-time data, stored in the data lake, managing that data and serving analytics and applications becomes more complex. Faced with different service interfaces, inconsistent data definitions, and performance that varies across scenarios, business users begin to lose confidence in the quality and efficiency of getting insight from data.
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat... (Tyler Wishnoff)
See how extreme query speeds and ultra-high concurrency on MicroStrategy, and any other business intelligence (BI) tool, on Big Data is possible through the Kyligence platform. Learn more here: https://kyligence.io/
Take the Bias out of Big Data Insights With Augmented Analytics (Tyler Wishnoff)
Is bias impacting your Big Data insights? Learn how augmented analytics and the latest advancements in OLAP technology are making analytics (including on cloud) from business intelligence, data science, and machine learning more accurate and impactful. Learn more at https://kyligence.io
Integrating and fully utilizing data is a critical prerequisite for ensuring the success of data-driven operations and decision making. This is especially true as more and more corporations begin transforming legacy data warehouses and transitioning to the Cloud. See how Augmented OLAP technology is leading the way in streamlining Big Data analytics on the Cloud with this presentation by Kyligence CEO Luke Han at Big Things Conference 2019. Learn more here: https://kyligence.io
Accelerating Big Data Analytics with Apache Kylin (Tyler Wishnoff)
Learn about the latest advancements in Apache Kylin and how its OLAP technology is making analytics faster and insights more actionable.
Learn more about Apache Kylin: https://kyligence.io/apache-kylin-overview/
Learn more about Apache Kylin's enterprise version Kyligence: https://kyligence.io/
Architecting Snowflake for High Concurrency and High Performance (SamanthaBerlant)
Cloud Data Warehousing juggernaut Snowflake has raced out ahead of the pack to deliver a data management platform from which a wealth of new analytics can be run. Using Snowflake as a traditional data warehouse has some obvious cost advantages over a hardware solution. But the real value of Snowflake as a data platform lies in its ability to support a high-concurrency analytics platform using Kyligence Cloud, powered by Apache Kylin.
In this presentation, Senior Solutions Architect Robert Hardaway will describe a modern data service architecture using precomputation and distributed indexes to provide interactive analytics to hundreds or even thousands of users running against very large Snowflake datasets (TBs to PBs).
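The core trick behind serving thousands of concurrent users is precomputation: scan the big table once, then answer every dashboard query from a tiny aggregate. Below is a toy Python sketch of that idea, with SQLite standing in for Snowflake purely so the example runs anywhere; Kyligence builds and routes to such aggregates (indexes/cubes) automatically:

```python
# Toy sketch of precomputation for high-concurrency serving: aggregate
# once, then answer many concurrent queries from the small result instead
# of rescanning the raw fact table. SQLite stands in for Snowflake.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (region TEXT, day TEXT, amount REAL)")
db.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("EMEA", "2021-01-01", 120.0), ("EMEA", "2021-01-01", 80.0),
    ("APAC", "2021-01-01", 200.0), ("APAC", "2021-01-02", 50.0),
])

# One expensive pass builds the aggregate (the "cube"/index)...
db.execute("""CREATE TABLE sales_by_region_day AS
              SELECT region, day, SUM(amount) AS total, COUNT(*) AS n
              FROM sales GROUP BY region, day""")

# ...then every concurrent query is a cheap lookup, not a scan.
for row in db.execute(
        "SELECT region, total FROM sales_by_region_day WHERE day = '2021-01-01'"):
    print(row)
```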
Building Modern Data Platform with Microsoft Azure (Dmitry Anoshin)
This presentation will cover Cloud history and Microsoft Azure Data Analytics capabilities. Moreover, it has a real-world example of DW modernization. Finally, we will check the alternative solution on Azure using Snowflake and Matillion ETL.
AWS Partner Webcast - Reporting and Analytics in the Cloud (Amazon Web Services)
Do you need to make sense of increasing volumes of data coming from a variety of sources like web logs, sensor data, social media monitoring and mobile apps? Jaspersoft BI reporting, analytics and dashboard tools integrate with the Amazon Redshift data warehouse service, so you can visually analyze data right from your web browser.
The payment model is pay-by-the-hour, with no up-front hardware or software costs, starting at less than $1/hour. Jaspersoft also integrates with other AWS data sources such as Amazon RDS and Amazon EMR. You'll also hear from a Jaspersoft/Amazon Redshift customer, Kony, who will share their insights and best practices based on their experience.
What you'll learn:
• How Redshift is architected and how to leverage it
• How to use Jaspersoft reporting, analytics and dashboarding tools for Amazon Redshift, and other AWS data sources
• A customer’s perspective, from an active customer that’s done the learning for you.
Who should view:
• Solution Architects, Development Leads, Developers and other Technical IT Leaders.
ICP for Data - Enterprise platform for AI, ML and Data Science (Karan Sachdeva)
IBM Cloud Private for Data is a unified platform for AI, ML, and data science workloads: an integrated analytics platform based on containers and microservices. It works with Kubernetes and Docker, and even with Red Hat OpenShift, and delivers a variety of business use cases across industries: financial services, telco, retail, manufacturing, and more.
The Impact of SMACT on the Data Management Stack (SnapLogic)
This presentation introduces the concept of the "Integrator's Dilemma" and reviews some of the challenges faced by traditional data and application integration technologies when it comes to keeping up with the new enterprise data, application and API connectivity and management requirements. We review the landscape and share examples of the steps more and more IT organizations are taking to improve business alignment through faster access to trusted data.
To learn more, visit http://www.snaplogic.com/ipaas
See how the world’s leading open source solution for query acceleration on massive datasets is revolutionizing analytics for enterprises across every industry, and how you can get started using it in your organization.
https://www.brighttalk.com/webcast/18317/413952
Apache Kylin 101 - Get Sub-Second Analytics on Massive Datasets (Tyler Wishnoff)
Learn how the world’s leading open source solution for query acceleration on massive datasets is revolutionizing analytics for enterprises across every industry, and how you can get started with it yourself.
This presentation will provide you with everything you need to understand the basics of Apache Kylin, as well as clear steps for deploying it in your organization. Learn more here: https://kyligence.io/apache-kylin-overview/
With an explosion of data, today’s emerging needs are not being met by existing technologies, which require rich skill sets and expertise. Companies that want to lead changes in highly competitive markets must optimize their storage, speed, and spending. The key is for them to augment their data management and analytics platforms with artificial intelligence and machine learning for analysts, engineers, and other users.
A general introduction to Apache Kylin, covering the background, business needs and technical challenges, theory and architecture, features, and some technical detail, followed by performance and benchmarks and, finally, the ecosystem and roadmap.
For more detail, please visit http://kylin.io or follow @ApacheKylin.
Kylin is an open source distributed analytics engine from eBay Inc. that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets.
Kylin Open Source Web Site: http://kylin.io
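For a quick feel of Kylin's SQL interface, here is a minimal Python sketch that submits a query through Kylin's REST API. The host, project, and table below come from Kylin's bundled sample (ADMIN/KYLIN is the demo login), so adjust for a real deployment; Kylin also ships JDBC and ODBC drivers for BI tools:

```python
# Minimal sketch of querying Kylin over its REST API with standard SQL.
# Host, project, and credentials are the demo defaults; change as needed.
import requests

KYLIN = "http://localhost:7070/kylin/api"   # placeholder host
auth = ("ADMIN", "KYLIN")                   # Kylin's default demo login

resp = requests.post(
    f"{KYLIN}/query",
    auth=auth,
    json={
        "sql": "SELECT part_dt, SUM(price) FROM kylin_sales GROUP BY part_dt",
        "project": "learn_kylin",           # Kylin's sample project
        "limit": 10,
    },
)
resp.raise_for_status()
for row in resp.json().get("results", []):
    print(row)
```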
Building Enterprise OLAP on Hadoop for FSI
1. Building Enterprise OLAP on Hadoop for Financial Services Industry
Luke Han
luke@kyligence.io | @lukehq
Co-founder & CEO of Kyligence
Creator & VP of Apache Kylin
Microsoft Regional Director & MVP
2. About Kyligence
• Formed by creators of Apache Kylin in 2016
• Offers Enterprise and Cloud versions of Apache Kylin
• Funding from Redpoint, Cisco, CBC and Shunwei
• Member of Microsoft Accelerator Shanghai 2017
• Dual HQ in Silicon Valley & Shanghai, China
Kyligence booth: #855
3. Transition to Big Data…
How about your traditional data warehouse?
How about your existing OLAP/BI application?
4. Data Warehouse/OLAP in Financial Services Industry
o The industry most reliant on DW/OLAP applications
o Thousands of applications built on top of the EDW
o Experienced analysts with decades of expertise …in data…but not in technologies
6. But you are asked to…
o Migrate existing OLAP/BI apps to Big Data, or build new ones on it
o Deliver better performance…just because you have Big Data now
o Train yourself in MR/Spark/ML…and AI
8. Apache Kylin: Bring OLAP back to Big Data
o MOLAP on Hadoop
o Simplified data modeling
o Optimized for aggregation queries
o ANSI SQL
o Native on Hadoop
o On-Prem & In the Cloud
[Architecture diagram: Presentation/Visualization layer on top of an OLAP/Data Mart layer, fed by a Data Lake (Hive, Impala, Spark SQL, Drill over MapReduce, Spark) and Data Sources]
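The reason a MOLAP engine is "optimized for aggregation queries" is that it precomputes aggregates ("cuboids") for combinations of dimensions ahead of time. Here is a toy Python sketch of the concept only, not of Kylin's actual engine, which builds cuboids at scale with MapReduce/Spark:

```python
# Toy illustration of the MOLAP idea: precompute aggregates ("cuboids")
# for every subset of dimensions so aggregation queries become lookups.
from itertools import combinations
from collections import defaultdict

DIMENSIONS = ("region", "product")
rows = [
    {"region": "EMEA", "product": "A", "sales": 100},
    {"region": "EMEA", "product": "B", "sales": 40},
    {"region": "APAC", "product": "A", "sales": 70},
]

# Build one aggregate table per dimension subset (2^N cuboids).
cuboids = {}
for r in range(len(DIMENSIONS) + 1):
    for dims in combinations(DIMENSIONS, r):
        agg = defaultdict(float)
        for row in rows:
            agg[tuple(row[d] for d in dims)] += row["sales"]
        cuboids[dims] = dict(agg)

# An aggregation query is now a constant-time lookup, not a table scan.
print(cuboids[("region",)][("EMEA",)])   # SELECT SUM(sales) ... GROUP BY region
print(cuboids[()][()])                   # grand total
```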
9. Kylin vs Hive: Star-Schema Benchmark
[Chart: query response time (seconds) vs. data volume (scale factor); lower is better]
Scale factor 2: Apache Kylin (KAP) 0.17 s, Apache Hive 142.42 s
Scale factor 10: Apache Kylin (KAP) 0.17 s, Apache Hive 161.66 s
Scale factor 20: Apache Kylin (KAP) 0.18 s, Apache Hive 189.17 s
* Based on 4 nodes, 16-core CPU, 96 GB memory per node
10. Global Users
500+ use cases in production globally
FSI: ABC, CCB, CMB, CPIC, Citic Bank, China UnionPay, HUATAI Securities, GUOTAI Securities, Lufax
Telecom: China Mobile, China Telecom, China Unicom, AT&T
Internet: eBay, Yahoo! Japan, Baidu, Meituan, NetEase, Expedia, JD.com, VIP.com, 360, Toutiao
Manufacturing: SAIC, HUAWEI, Lenovo, OPPO, XIAOMI, VIVO
Others: MachineZone, Glispa, Inovex, Adobe, iFLYTEK
Data collected from public information and the Kylin community
15. TPC-DS
[Chart: KAP response times across all 99 TPC-DS queries]
• Hive: 33 queries unsupported or timed out
• KAP: all 99 queries supported
• Query routing between SQL on Hadoop and Apache Kylin
23. CPIC: China Pacific Insurance (Group) Co., LTD
• Global Fortune 500 insurance company
• Top 2 insurance company in China
• $40+ billion revenue
• 8+ million customers
• 97,000+ employees
24. Challenges
• Legacy IBM Cognos + DB2 solution can’t support Big Data scenarios
• Long waiting time (minutes ~ hours for reporting)
• Low concurrency (100,000+ employees!)
• High cost
25. Journey of Kyligence Analytics Platform
• 2016.12 ~ 2017.01: KAP POC, performance testing (query latency, concurrency)
• 2017.01 ~ 2017.03: KAP POC, compatibility (Cognos connection, Cognos syntax)
• 2017.03 ~ 2017.05: Development (fixed reports, flexible reports)
• 2017.05 ~ 2017.06: Go live (all dataset aggregation and testing; fixed reports released)
Highlights: no changes on the Hadoop side; no additional engineers required; most of the work done by analysts
26. KAP + Cognos: Deployment
[Deployment diagram: Cognos dynamic reports connect via JDBC and fixed reports via ODBC to the KAP Query Server, spanning the Reporting & Dashboard, OLAP & Data Mart, and Big Data Platform layers]
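From the BI tool's point of view, the ODBC leg of this diagram is just a SQL connection. Here is a minimal Python sketch of what any client (Cognos included) effectively does through it; the DSN, login, and table names are placeholders for a configured Kylin/KAP ODBC data source:

```python
# Sketch of the ODBC route in the diagram above: open a connection to the
# KAP/Kylin query server and run plain SQL. "KylinDSN" is a placeholder
# DSN for the Kylin ODBC driver; table and columns are illustrative.
import pyodbc

conn = pyodbc.connect("DSN=KylinDSN;UID=ADMIN;PWD=KYLIN")  # placeholder DSN
cursor = conn.cursor()
cursor.execute(
    "SELECT branch, SUM(premium) FROM policy_facts GROUP BY branch")
for branch, total in cursor.fetchall():
    print(branch, total)
conn.close()
```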
27. Benefits after Adopting Kyligence
• One-stop BI platform generates complicated reports
• Over 90% of queries return within 3 seconds (including high-dimensional queries)
• Seamless integration with IBM Cognos, no change at the front end
• 2 KAP cubes replaced 2,000+ IBM Cognos cubes
• Cost reduced significantly by adopting open source technology
28. Customer Quote
“Kyligence enables us to find valuable insights faster, from every insurance policy, within seconds. Kyligence’s platform allows us to achieve more with less. Our lean management system has improved significantly.”
-- Minchen Wu, Deputy GM of IT, CPIC
29. Fusion Big Data Platform
• Open: connects to Teradata/Greenplum and IBM Cognos/Saiku…
• Flexible: self-service for end users
• Efficient: speeds up the PC and mobile analytics experience
China Construction Bank (CCB): 2nd largest bank in the world
“Apache Kylin is the last piece of the puzzle for serving data asset management between the legacy DW and the new Big Data platform.”
-- Zhi Zhu, Vice Senior Manager of Tech Dept, CCB
30. Enterprise OLAP on Hadoop
Speed Up Mission Critical Analytics
Booth #855
luke@kyligence.io
http://kyligence.io