SlideShare a Scribd company logo
1 of 22
Apache Kylin Use Cases
in China and Japan
Luke Han | luke@kyligence.io
VP of Apache Kylin, ASF Member
Co-founder & CEO of Kyligence Inc.
Agenda
§About Apache Kylin
§Why Apache Kylin
§Use Cases
§Q & A
About Kyligence Inc.
Enterprise
Product
Services
Building World Wild
Open Source
Community
DevOps
CloudSolutions
üFormed	by	the	team	who	created	Apache	Kylin
üFunding	from	Redpoint Ventures,	CBC	and	
Shunwei Capital
üOffering	Data	Warehouse	products	and	services	
powered	by	Apache	Kylin
About Apache Kylin
--Kylin / ˈkiːˈlɪn / 麒麟
--n.	(in	Chinese	art)	a	mythical	animal	of	composite	form	
--Apache Kylin
Extreme OLAP Engine for Big Data
Leading open source OLAP on Hadoop
Fast growing open source community
Adopted by 200+ global organizations
First born in China Apache Top Level Project
InfoWorld Bossie Award:
Best Open Source Big Data Tool (2015)
Best Open Source Big Data Tool (2016)
http://kylin.apacheorg
Apache Kylin: the missing part of Big Data
üHigh	Performance
üHigh	Concurrency
üANSI	SQL
üNative	on	Hadoop
üCloud	Ready
Global Users 200+ use cases in production
Yahoo! Japan
OLAP Cube: Space for Time
time, item
time, item, location
time, item, location, supplier
time item location supplier
time, location
Time, supplier
item, location
item, supplier
location, supplier
time, item, supplier
time, location, supplier
item, location, supplier
0-D(apex) cuboid
1-D cuboids
2-D cuboids
3-D cuboids
4-D(base) cuboid
• Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells
1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier>
2. (9/15, milk, Urbana, *) - <time, item, location>
3. (*, milk, Urbana, *) - <item, location>
4. (*, milk, Chicago, *) - <item, location>
5. (*, milk, *, *) - <item>
• Cuboid = one combination of dimensions
• Cube = all combination of dimensions
(all cuboids)
OLAP	Cube
The Magic: Pre-Calculation
Apache Kylin Architecture
§ Kylin vs Hive: O(1) vs O(n)
§ Kylin’s query latency is better than Hive and others
§ Lower is better
Why Apache Kylin? Speed!
0
10
20
30
40
50
60
70
80
90
SF 10 SF 20 SF 40
Kylin vs A popular SQL on Hadoop
Kylin A Popular SQL on Hadoop
Kylin is aim to speed up SQL analytics for OLAP, not aim to replace SQL on HadoopSF: Scale Factor, SF 10 = 60 millions rows
Why Apache Kylin? Stable!
Why Apache Kylin? Scale!
§ Run Star-Schema Benchmark at 10, 20 and 40 million-row scales
§ Kylin offers second latency when scale to billion even trillion data level
SF: Scale Factor, SF 10 = 60 millions rows
Apache Kylin
Use Cases
Use Case – Analytics on Trillion Data
TOUTIAO:	Top	1	news	feed	app	in	China
Use Case – OLAP on Hadoop
Meituan:	Top	1	O2O	company	in	China
Challenge
• Slow performance with previous MySQL option Heavy
development efforts with Hive solution
• Huge resources for Hive job
• Analysts can’t access directly for data on Hadoop
Solution
• Apache Kylin as core OLAP on Hadoop solution
• SQL interface for internal users
• Active participate in open source Kylin community
Supporting all critical business lines including E-Takeaways, Hotel, Movie…
Use Case – Migrating DW/BI to Big Data
CPIC: China Pacific Insurance (Group) Co., Ltd,
Top3 insurance company in China
Challenge
Legacy DW/BI system is facing challenge for today’s need:
• IBM Cognos can’t fulfill requirement with huge volume data
• Billions data volume requires second query latency
• Hard for Analysts to learn Spark/MapReduce
Solution
• Built a big data analytics platform with Enterprise Kylin on top of
Hadoop
• Using IBM Cognos as presentation layer only
• Analyst build data model via Enterprise KAP without IT
Benefits
The new platform enables analysts to:
• Continue DW/BI experts daily work with new tech stack
without learning programing or other “high-tech”
• Supports national wide analytics
• Unleashed Big Data platform's power with Apache Kylin 0
100
200
300
400
500
600
700
800
50并发 100并发 500并发
TPS
Cube_8 Cube_16 Cube32 Cube_all
Use Case – Mobile OS Usage Analytics
OPPO:	Top	4	Global	Smartphone	Vendor
With	Apache	Kylin	(now	KAP)
ü100+B	new	row/day
ü90%+	query	latency	<	2	second
üData	preparation	<	15	minutes
Built	a	Mobile	OS	Usage	Analytics	
platform	with	Apache	Kylin	to	replace	
Hive/Impala	reporting	system	which	has	
lack	interactive	analytics	capability	for	
their	analysts	group.
The	most	important	value	of	Apache	Kylin	
(KAP)	is	it	aggressively	reduced	efforts	
and	duration	to	build	a	Data	Warehouse	
on	Hadoop	which	speeded	up,	simplified	
data	access	for	Analysts,	increase	
efficiency	is	the	key	for	productivity.
-- Louis	Wong
-- Head	of	Big	Data	,OPPO
Use Case – Unified KPI Platform
Home	Link	(Lianjia):	Top	1	O2O	real	estate	agency	service	provider	in	China
Challenge
• Rapid business growth (5k stores in 2015, 8K stores in 2016
) requires ultra-fast reporting system which could serve huge
volume data
• Business require unified KPI definition and system
Benefits
• Unified KPI Platform for KPIs
• Managing critical KPIs in one place
• KPI number: 170+
• Cube Storage: 6+ TB
• Biggest Cube: 2+B rows
• 80% queries less than 1 second
Use Case – Shopping Reporting System
Yahoo!	Japan:	the	most	visited	website	in	Japan
Yahoo! Japan
§ Our reporting system used Impala as a
backend database previously.
- It took a long time (about 60 sec) to show
Web UI.
§ In order to lower the latency, we moved to
Apache Kylin.
- Average latency < 1sec for most cases
§ Thanks to low latency with Kylin, we become
possible to focus on adding functions for users.
§ We provide a reporting system that show
statistics for store owners.
- e. g. impressions, clicks and sales.
Use Case – Kylin on Cloud
Strikingly:	first	Chinese	company	to	graduate	from	the	Y-Combinator seed	accelerator.
§ Built a Cloud Analytics Platform with AWS EMR &
Apache Kylin to replaced existing technology which
provided by Cloud vendor.
- Reduced query latency from 5~10 to less than 1
second
- Bring high concurrency queries with stable
performance (latency be guaranteed)
- Elastic resizing computing cluster to save cost
(only keep query cluster serving daily reporting
needs)
More…
§ eBay
- Apache Kylin origins from eBay, various use cases from behavior, streaming, reporting and so on
§ JD.com Cloud
- Cloud deployment serves cloud usage and dashboard for merchant
§ Glispa
- using Apache Kylin as an OLAP component within its data management platform (DMP)
§ Exponential
- Using Kylin as a component in it’s data platform LAMBDA to power inventory, campaign, behavior and
demand analysis for advertising
§ Powered By:
- http://kylin.apache.org/community/poweredby.html
Thanks
§ For Apache Kylin
- Please visit: http://kylin.apache.org
§ For Kyligence Inc:
- Please visit: http://kyligence.io
§ Contact me:
- luke@kyligence.io

More Related Content

What's hot

Apache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecYang Li
 
Kylin OLAP Engine Tour
Kylin OLAP Engine TourKylin OLAP Engine Tour
Kylin OLAP Engine TourLuke Han
 
Apache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataApache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataLuke Han
 
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @ShanghaiLuke Han
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin IntroductionLuke Han
 
Kylin Engineering Principles
Kylin Engineering PrinciplesKylin Engineering Principles
Kylin Engineering PrinciplesXu Jiang
 
Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)qhzhou
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupLuke Han
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...Luke Han
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopTed Dunning
 
Kylin olap part 1- getting started
Kylin olap   part 1- getting startedKylin olap   part 1- getting started
Kylin olap part 1- getting startedShubham Shirude
 
Apache Kylin 1.5 Updates
Apache Kylin 1.5 UpdatesApache Kylin 1.5 Updates
Apache Kylin 1.5 UpdatesYang Li
 
Big Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylininovex GmbH
 
Apache Kylin Open Source Journey for QCon2015 Beijing
Apache Kylin Open Source Journey for QCon2015 BeijingApache Kylin Open Source Journey for QCon2015 Beijing
Apache Kylin Open Source Journey for QCon2015 BeijingLuke Han
 
Apache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingApache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingLuke Han
 
Apache Kylin Streaming
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming hongbin ma
 
Design cube in Apache Kylin
Design cube in Apache KylinDesign cube in Apache Kylin
Design cube in Apache KylinYang Li
 
Datacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheConDatacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheConamarsri
 
Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Seshu Adunuthula
 

What's hot (20)

Apache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 DecApache Kylin: Hadoop OLAP Engine, 2014 Dec
Apache Kylin: Hadoop OLAP Engine, 2014 Dec
 
Kylin OLAP Engine Tour
Kylin OLAP Engine TourKylin OLAP Engine Tour
Kylin OLAP Engine Tour
 
Apache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big DataApache Kylin Extreme OLAP Engine for Big Data
Apache Kylin Extreme OLAP Engine for Big Data
 
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
6. Apache Kylin Roadmap and Community - Apache Kylin Meetup @Shanghai
 
Apache Kylin Introduction
Apache Kylin IntroductionApache Kylin Introduction
Apache Kylin Introduction
 
Kylin Engineering Principles
Kylin Engineering PrinciplesKylin Engineering Principles
Kylin Engineering Principles
 
Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)Apache kylin (china hadoop summit 2015 shanghai)
Apache kylin (china hadoop summit 2015 shanghai)
 
Adding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark MeetupAdding Spark support to Kylin at Bay Area Spark Meetup
Adding Spark support to Kylin at Bay Area Spark Meetup
 
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
1. Apache Kylin Deep Dive - Streaming and Plugin Architecture - Apache Kylin ...
 
Apache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on HadoopApache Kylin - OLAP Cubes for SQL on Hadoop
Apache Kylin - OLAP Cubes for SQL on Hadoop
 
Kylin olap part 1- getting started
Kylin olap   part 1- getting startedKylin olap   part 1- getting started
Kylin olap part 1- getting started
 
Apache Kylin 1.5 Updates
Apache Kylin 1.5 UpdatesApache Kylin 1.5 Updates
Apache Kylin 1.5 Updates
 
Big Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache KylinBig Data MDX with Mondrian and Apache Kylin
Big Data MDX with Mondrian and Apache Kylin
 
Apache Kylin Open Source Journey for QCon2015 Beijing
Apache Kylin Open Source Journey for QCon2015 BeijingApache Kylin Open Source Journey for QCon2015 Beijing
Apache Kylin Open Source Journey for QCon2015 Beijing
 
Apache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 BeijingApache kylin - Big Data Technology Conference 2014 Beijing
Apache kylin - Big Data Technology Conference 2014 Beijing
 
Apache Kylin Streaming
Apache Kylin Streaming Apache Kylin Streaming
Apache Kylin Streaming
 
Design cube in Apache Kylin
Design cube in Apache KylinDesign cube in Apache Kylin
Design cube in Apache Kylin
 
The Evolution of Apache Kylin
The Evolution of Apache KylinThe Evolution of Apache Kylin
The Evolution of Apache Kylin
 
Datacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheConDatacubes in Apache Hive at ApacheCon
Datacubes in Apache Hive at ApacheCon
 
Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015Apache Kylin @ Big Data Europe 2015
Apache Kylin @ Big Data Europe 2015
 

Similar to Apache Kylin Use Cases in China and Japan

Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeDatabricks
 
Apache kylin 101 - Get Sub-Second Analytics on Massive Datasets
Apache kylin 101 - Get Sub-Second Analytics on Massive DatasetsApache kylin 101 - Get Sub-Second Analytics on Massive Datasets
Apache kylin 101 - Get Sub-Second Analytics on Massive DatasetsTyler Wishnoff
 
SQL + Hadoop: The High Performance Advantage�
SQL + Hadoop:  The High Performance Advantage�SQL + Hadoop:  The High Performance Advantage�
SQL + Hadoop: The High Performance Advantage�Actian Corporation
 
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...Tyler Wishnoff
 
PCM18 (Big Data Analytics)
PCM18 (Big Data Analytics)PCM18 (Big Data Analytics)
PCM18 (Big Data Analytics)Stratebi
 
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...DataWorks Summit
 
Augmented OLAP for Big Data Analytics
Augmented OLAP for Big Data AnalyticsAugmented OLAP for Big Data Analytics
Augmented OLAP for Big Data AnalyticsTyler Wishnoff
 
Apache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetssuser931288
 
Apache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetApache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetChun'en Ni
 
ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015Luke Han
 
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for HadoopHBaseCon
 
Batched To Perfection: Modeling & Solving Business Problems With Apache Spark
Batched To Perfection: Modeling & Solving Business Problems  With Apache SparkBatched To Perfection: Modeling & Solving Business Problems  With Apache Spark
Batched To Perfection: Modeling & Solving Business Problems With Apache SparkEliav Lavi
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionAlessandro Salvatico
 
HBaseConAsia2018 Track3-5: HBase Practice at Lianjia
HBaseConAsia2018 Track3-5: HBase Practice at LianjiaHBaseConAsia2018 Track3-5: HBase Practice at Lianjia
HBaseConAsia2018 Track3-5: HBase Practice at LianjiaMichael Stack
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Big Data Spain
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal GemfireIn-Memory Computing Summit
 
Architecting Snowflake for High Concurrency and High Performance
Architecting Snowflake for High Concurrency and High PerformanceArchitecting Snowflake for High Concurrency and High Performance
Architecting Snowflake for High Concurrency and High PerformanceSamanthaBerlant
 
Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Inside Analysis
 

Similar to Apache Kylin Use Cases in China and Japan (20)

Cloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data LakeCloud-native Semantic Layer on Data Lake
Cloud-native Semantic Layer on Data Lake
 
Apache kylin 101 - Get Sub-Second Analytics on Massive Datasets
Apache kylin 101 - Get Sub-Second Analytics on Massive DatasetsApache kylin 101 - Get Sub-Second Analytics on Massive Datasets
Apache kylin 101 - Get Sub-Second Analytics on Massive Datasets
 
Apache Kylin 101
Apache Kylin 101Apache Kylin 101
Apache Kylin 101
 
SQL + Hadoop: The High Performance Advantage�
SQL + Hadoop:  The High Performance Advantage�SQL + Hadoop:  The High Performance Advantage�
SQL + Hadoop: The High Performance Advantage�
 
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...
Lightning-Fast, Interactive Business Intelligence Performance with MicroStrat...
 
PCM18 (Big Data Analytics)
PCM18 (Big Data Analytics)PCM18 (Big Data Analytics)
PCM18 (Big Data Analytics)
 
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
Can you Re-Platform your Teradata, Oracle, Netezza and SQL Server Analytic Wo...
 
Augmented OLAP for Big Data Analytics
Augmented OLAP for Big Data AnalyticsAugmented OLAP for Big Data Analytics
Augmented OLAP for Big Data Analytics
 
Apache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large datasetApache kylin boost your sqls on extremely large dataset
Apache kylin boost your sqls on extremely large dataset
 
Apache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large datasetApache kylin boost your SQLs on extremely large dataset
Apache kylin boost your SQLs on extremely large dataset
 
ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015ApacheKylin_HBaseCon2015
ApacheKylin_HBaseCon2015
 
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for HadoopHBaseCon 2015: Apache Kylin - Extreme OLAP  Engine for Hadoop
HBaseCon 2015: Apache Kylin - Extreme OLAP Engine for Hadoop
 
ESGYN Overview
ESGYN OverviewESGYN Overview
ESGYN Overview
 
Batched To Perfection: Modeling & Solving Business Problems With Apache Spark
Batched To Perfection: Modeling & Solving Business Problems  With Apache SparkBatched To Perfection: Modeling & Solving Business Problems  With Apache Spark
Batched To Perfection: Modeling & Solving Business Problems With Apache Spark
 
Actian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL EditionActian Analytics Platform - Hadoop SQL Edition
Actian Analytics Platform - Hadoop SQL Edition
 
HBaseConAsia2018 Track3-5: HBase Practice at Lianjia
HBaseConAsia2018 Track3-5: HBase Practice at LianjiaHBaseConAsia2018 Track3-5: HBase Practice at Lianjia
HBaseConAsia2018 Track3-5: HBase Practice at Lianjia
 
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
Shortening the Feedback Loop: How Spotify’s Big Data Ecosystem has evolved to...
 
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal GemfireIMCSummit 2015 - 1 IT Business  - The Evolution of Pivotal Gemfire
IMCSummit 2015 - 1 IT Business - The Evolution of Pivotal Gemfire
 
Architecting Snowflake for High Concurrency and High Performance
Architecting Snowflake for High Concurrency and High PerformanceArchitecting Snowflake for High Concurrency and High Performance
Architecting Snowflake for High Concurrency and High Performance
 
Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?Hadoop as an Analytic Platform: Why Not?
Hadoop as an Analytic Platform: Why Not?
 

More from Luke Han

Augmented OLAP for Big Data
Augmented OLAP for Big DataAugmented OLAP for Big Data
Augmented OLAP for Big DataLuke Han
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsLuke Han
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanLuke Han
 
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @ShanghaiLuke Han
 
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @ShanghaiLuke Han
 
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @ShanghaiLuke Han
 
Actuate presentation 2011
Actuate presentation   2011Actuate presentation   2011
Actuate presentation 2011Luke Han
 

More from Luke Han (7)

Augmented OLAP for Big Data
Augmented OLAP for Big DataAugmented OLAP for Big Data
Augmented OLAP for Big Data
 
Refactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics ProductsRefactoring your EDW with Mobile Analytics Products
Refactoring your EDW with Mobile Analytics Products
 
The Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke HanThe Apache Way - Building Open Source Community in China - Luke Han
The Apache Way - Building Open Source Community in China - Luke Han
 
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
3. Apache Tez Introducation - Apache Kylin Meetup @Shanghai
 
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
5. Apache Kylin的金融大数据应用场景 - Apache Kylin Meetup @Shanghai
 
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
4.Building a Data Product using apache Zeppelin - Apache Kylin Meetup @Shanghai
 
Actuate presentation 2011
Actuate presentation   2011Actuate presentation   2011
Actuate presentation 2011
 

Recently uploaded

What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noidabntitsolutionsrishis
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityNeo4j
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...Technogeeks
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)jennyeacort
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odishasmiwainfosol
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 

Recently uploaded (20)

What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in NoidaBuds n Tech IT Solutions: Top-Notch Web Services in Noida
Buds n Tech IT Solutions: Top-Notch Web Services in Noida
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...What is Advanced Excel and what are some best practices for designing and cre...
What is Advanced Excel and what are some best practices for designing and cre...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
Call Us🔝>༒+91-9711147426⇛Call In girls karol bagh (Delhi)
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 

Apache Kylin Use Cases in China and Japan

  • 1. Apache Kylin Use Cases in China and Japan Luke Han | luke@kyligence.io VP of Apache Kylin, ASF Member Co-founder & CEO of Kyligence Inc.
  • 2. Agenda §About Apache Kylin §Why Apache Kylin §Use Cases §Q & A
  • 3. About Kyligence Inc. Enterprise Product Services Building World Wild Open Source Community DevOps CloudSolutions üFormed by the team who created Apache Kylin üFunding from Redpoint Ventures, CBC and Shunwei Capital üOffering Data Warehouse products and services powered by Apache Kylin
  • 4. About Apache Kylin --Kylin / ˈkiːˈlɪn / 麒麟 --n. (in Chinese art) a mythical animal of composite form --Apache Kylin Extreme OLAP Engine for Big Data Leading open source OLAP on Hadoop Fast growing open source community Adopted by 200+ global organizations First born in China Apache Top Level Project InfoWorld Bossie Award: Best Open Source Big Data Tool (2015) Best Open Source Big Data Tool (2016) http://kylin.apacheorg
  • 5. Apache Kylin: the missing part of Big Data üHigh Performance üHigh Concurrency üANSI SQL üNative on Hadoop üCloud Ready
  • 6. Global Users 200+ use cases in production Yahoo! Japan
  • 7. OLAP Cube: Space for Time time, item time, item, location time, item, location, supplier time item location supplier time, location Time, supplier item, location item, supplier location, supplier time, item, supplier time, location, supplier item, location, supplier 0-D(apex) cuboid 1-D cuboids 2-D cuboids 3-D cuboids 4-D(base) cuboid • Base vs. aggregate cells; ancestor vs. descendant cells; parent vs. child cells 1. (9/15, milk, Urbana, Dairy_land) - <time, item, location, supplier> 2. (9/15, milk, Urbana, *) - <time, item, location> 3. (*, milk, Urbana, *) - <item, location> 4. (*, milk, Chicago, *) - <item, location> 5. (*, milk, *, *) - <item> • Cuboid = one combination of dimensions • Cube = all combination of dimensions (all cuboids) OLAP Cube
  • 10. § Kylin vs Hive: O(1) vs O(n) § Kylin’s query latency is better than Hive and others § Lower is better Why Apache Kylin? Speed! 0 10 20 30 40 50 60 70 80 90 SF 10 SF 20 SF 40 Kylin vs A popular SQL on Hadoop Kylin A Popular SQL on Hadoop Kylin is aim to speed up SQL analytics for OLAP, not aim to replace SQL on HadoopSF: Scale Factor, SF 10 = 60 millions rows
  • 11. Why Apache Kylin? Stable!
  • 12. Why Apache Kylin? Scale! § Run Star-Schema Benchmark at 10, 20 and 40 million-row scales § Kylin offers second latency when scale to billion even trillion data level SF: Scale Factor, SF 10 = 60 millions rows
  • 14. Use Case – Analytics on Trillion Data TOUTIAO: Top 1 news feed app in China
  • 15. Use Case – OLAP on Hadoop Meituan: Top 1 O2O company in China Challenge • Slow performance with previous MySQL option Heavy development efforts with Hive solution • Huge resources for Hive job • Analysts can’t access directly for data on Hadoop Solution • Apache Kylin as core OLAP on Hadoop solution • SQL interface for internal users • Active participate in open source Kylin community Supporting all critical business lines including E-Takeaways, Hotel, Movie…
  • 16. Use Case – Migrating DW/BI to Big Data CPIC: China Pacific Insurance (Group) Co., Ltd, Top3 insurance company in China Challenge Legacy DW/BI system is facing challenge for today’s need: • IBM Cognos can’t fulfill requirement with huge volume data • Billions data volume requires second query latency • Hard for Analysts to learn Spark/MapReduce Solution • Built a big data analytics platform with Enterprise Kylin on top of Hadoop • Using IBM Cognos as presentation layer only • Analyst build data model via Enterprise KAP without IT Benefits The new platform enables analysts to: • Continue DW/BI experts daily work with new tech stack without learning programing or other “high-tech” • Supports national wide analytics • Unleashed Big Data platform's power with Apache Kylin 0 100 200 300 400 500 600 700 800 50并发 100并发 500并发 TPS Cube_8 Cube_16 Cube32 Cube_all
  • 17. Use Case – Mobile OS Usage Analytics OPPO: Top 4 Global Smartphone Vendor With Apache Kylin (now KAP) ü100+B new row/day ü90%+ query latency < 2 second üData preparation < 15 minutes Built a Mobile OS Usage Analytics platform with Apache Kylin to replace Hive/Impala reporting system which has lack interactive analytics capability for their analysts group. The most important value of Apache Kylin (KAP) is it aggressively reduced efforts and duration to build a Data Warehouse on Hadoop which speeded up, simplified data access for Analysts, increase efficiency is the key for productivity. -- Louis Wong -- Head of Big Data ,OPPO
  • 18. Use Case – Unified KPI Platform Home Link (Lianjia): Top 1 O2O real estate agency service provider in China Challenge • Rapid business growth (5k stores in 2015, 8K stores in 2016 ) requires ultra-fast reporting system which could serve huge volume data • Business require unified KPI definition and system Benefits • Unified KPI Platform for KPIs • Managing critical KPIs in one place • KPI number: 170+ • Cube Storage: 6+ TB • Biggest Cube: 2+B rows • 80% queries less than 1 second
  • 19. Use Case – Shopping Reporting System Yahoo! Japan: the most visited website in Japan Yahoo! Japan § Our reporting system used Impala as a backend database previously. - It took a long time (about 60 sec) to show Web UI. § In order to lower the latency, we moved to Apache Kylin. - Average latency < 1sec for most cases § Thanks to low latency with Kylin, we become possible to focus on adding functions for users. § We provide a reporting system that show statistics for store owners. - e. g. impressions, clicks and sales.
  • 20. Use Case – Kylin on Cloud Strikingly: first Chinese company to graduate from the Y-Combinator seed accelerator. § Built a Cloud Analytics Platform with AWS EMR & Apache Kylin to replaced existing technology which provided by Cloud vendor. - Reduced query latency from 5~10 to less than 1 second - Bring high concurrency queries with stable performance (latency be guaranteed) - Elastic resizing computing cluster to save cost (only keep query cluster serving daily reporting needs)
  • 21. More… § eBay - Apache Kylin origins from eBay, various use cases from behavior, streaming, reporting and so on § JD.com Cloud - Cloud deployment serves cloud usage and dashboard for merchant § Glispa - using Apache Kylin as an OLAP component within its data management platform (DMP) § Exponential - Using Kylin as a component in it’s data platform LAMBDA to power inventory, campaign, behavior and demand analysis for advertising § Powered By: - http://kylin.apache.org/community/poweredby.html
  • 22. Thanks § For Apache Kylin - Please visit: http://kylin.apache.org § For Kyligence Inc: - Please visit: http://kyligence.io § Contact me: - luke@kyligence.io