King hug uk
Dr Relational or: How I Learned to Stop Worrying and Love the Database (Andy Done, Data Warehouse Lead, King)

In the face of explosive growth, King's Hadoop data warehouse simply wasn't scaling fast enough. Find out why King is extending its Big Data platform with the MPP database ExaSol and processing its data hundreds of times faster.


    King hug uk Presentation Transcript

    • Slide 2: Database Relational (© King.com Ltd 2013 – Public)
    • Slide 3: Agenda
      • Welcome!
      • A brief history of King
      • King data platform evolution
      • Enter Hive
      • Hive + DB
      • Hive + better DB
      • Questions?
    • Slide 4: A brief history of King
    • Slide 5: Who?
    • Slide 6: Where?
    • Slide 7: Web, social, mobile
    • Slide 8: King in numbers
      • 100 million daily active users
      • 1 billion game plays per day
      • 8 offices
      • 10 billion events per day
      • Lots and lots of data…
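The daily figures on the "King in numbers" slide can be restated as per-second and per-user rates. The following is a rough sketch only: it assumes load is spread evenly over 24 hours (real traffic has peaks), and the variable names are illustrative rather than anything from the deck.

```python
# Daily figures as stated on the "King in numbers" slide.
EVENTS_PER_DAY = 10_000_000_000
GAME_PLAYS_PER_DAY = 1_000_000_000
DAILY_ACTIVE_USERS = 100_000_000

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Assumption: uniform load across the day; peak rates will be higher.
events_per_second = EVENTS_PER_DAY / SECONDS_PER_DAY
plays_per_second = GAME_PLAYS_PER_DAY / SECONDS_PER_DAY

# Averages per daily active user.
events_per_user = EVENTS_PER_DAY / DAILY_ACTIVE_USERS
plays_per_user = GAME_PLAYS_PER_DAY / DAILY_ACTIVE_USERS

print(f"{events_per_second:,.0f} events/s, {plays_per_second:,.0f} plays/s")
print(f"{events_per_user:.0f} events and {plays_per_user:.0f} plays per user per day")
```

Under these assumptions the averages come to roughly 116 thousand events per second and about 100 events per active user per day.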
    • Slide 9: A brief history of me (andy.done@king.com)
    • Slide 10: King data platform evolution
    • Slide 11: Enter Hive
    • Slide 12: The road to big [Chart: compressed events, gigabytes/day, by platform (Browser, Mobile); y-axis 0 to 350; x-axis 2011-02-16 to 2013-04-26; annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no"]
    • Slide 13: Scaling accomplished
    • Slide 14: Hive says…
    • Slide 15: Data exploration
      • COUNT(*)
      • SELECT DISTINCT
      • COUNT, SUM… GROUP BY date
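The three query shapes named on the "Data exploration" slide can be illustrated with a small runnable sketch. This is not King's schema or Hive itself: the `events` table, its columns, and the sample rows are hypothetical, and Python's built-in `sqlite3` stands in for Hive so the example runs anywhere.

```python
import sqlite3

# In-memory SQLite database standing in for a Hive table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_id INTEGER, platform TEXT, amount INTEGER)"
)
# Hypothetical sample rows, invented purely for illustration.
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [
        ("2013-04-25", 1, "browser", 3),
        ("2013-04-25", 2, "mobile", 5),
        ("2013-04-26", 1, "mobile", 2),
        ("2013-04-26", 3, "mobile", 7),
    ],
)

# COUNT(*): how many rows are there at all?
total_rows = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# SELECT DISTINCT: which values does a column take?
platforms = [
    row[0]
    for row in conn.execute("SELECT DISTINCT platform FROM events ORDER BY platform")
]

# COUNT, SUM ... GROUP BY date: daily aggregates.
daily = conn.execute(
    "SELECT event_date, COUNT(*), SUM(amount) "
    "FROM events GROUP BY event_date ORDER BY event_date"
).fetchall()

print(total_rows, platforms, daily)
```

Each of these statements forces a scan of the table, which is cheap on four rows and is the kind of work that becomes slow on billions of rows.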
    • Slide 16: Hive + DB = ?
    • Slide 17: Data platform 1.0 [Diagram components: Games, Event data, ETL, Hive, Reports, Data scientists]
    • Slide 18: Data platform 1.5 [Diagram components: Games, Event data, ETL, Hive, DB, Reports, Data scientists]
    • Slide 19: Selection criteria
      • 'Accessible' pricing (free?)
      • Single node
      • Easy to set up
      • Low maintenance
    • Slide 20: Contenders ready
      • Infobright
        • Columnar MySQL engine
        • Light tuning and hinting
      • InfiniDB
        • Columnar MySQL engine
        • Tuning-less
        • Faster for our use case
    • Slide 21: How's that work out?
      • Paid its way
      • Popular
        • 100s queries / day
      • Stability
      • Ceilings
      • Screwed by mobile
    • Slide 22: The road to big [Chart: compressed events, gigabytes/day, by platform (Browser, Mobile); y-axis 0 to 350; x-axis 2011-02-16 to 2013-04-26; annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no", InfiniDB]
    • Slide 23: ETL?
    • Slide 24: Hive + better DB = ?
    • Slide 25: Data platform 2.0 [Diagram components: Game, Event data, ETL, Hive, Better DB, Reports, Data scientists]
    • Slide 26: State of the market Jan 2013
      • Hadoop on steroids
        • Hadapt…
        • Impala
        • Nouvaeu Data
        • Platfora
        • SiSense
      • MPP analytics databases
        • Vertica
        • ExaSol
    • Slide 27: Contenders ready
      Feature         ExaSol         Vertica
      Processing      In memory      Disc optimised
      Administration  Web based      Command line
      Backup          Web based      Command line
      Resiliency      Hot spare      Gradual degradation
      Tuning          Self tuning    User tuning
      Licensing       Allocated RAM  Total storage
      Vendor          Smaller        Larger
    • Slide 28: Disclaimers
      • Our data
      • Our queries
      • Our use case
      • Our results
    • Slide 29: This is our data
      Table             Row count
      Mobile dimension  161 million
      Social dimension  600 million
      Mobile facts      1 billion
      Social facts      6.7 billion
    • Slides 30-33: Single query
    • Slide 34: Cluster stats
                           Vertica  ExaSol  Hive     InfiniDB
      Nodes                4        4       19       1
      Cores                64       48      228      32
      RAM                  512 GB   288 GB  1216 GB  300 GB
      Discs                96       32      76       4
      Hardware cost / USD  $$$$     $$      $$       $
      Total cost / USD     $$$$$$   $$$$$   $$       $$
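Because the four clusters differ in node count, the totals on the "Cluster stats" slide are easier to compare per node. The sketch below derives those ratios from the slide's own figures; it assumes "Gb" on the slide means gigabytes, and the `clusters` dictionary and `per_node` helper are illustrative names, not part of the original deck.

```python
# Totals as given on the "Cluster stats" slide (RAM assumed to be gigabytes).
clusters = {
    "Vertica":  {"nodes": 4,  "cores": 64,  "ram_gb": 512,  "discs": 96},
    "ExaSol":   {"nodes": 4,  "cores": 48,  "ram_gb": 288,  "discs": 32},
    "Hive":     {"nodes": 19, "cores": 228, "ram_gb": 1216, "discs": 76},
    "InfiniDB": {"nodes": 1,  "cores": 32,  "ram_gb": 300,  "discs": 4},
}


def per_node(stats):
    """Divide every cluster-wide total by the node count."""
    nodes = stats["nodes"]
    return {key: value / nodes for key, value in stats.items() if key != "nodes"}


for name, stats in clusters.items():
    ratios = per_node(stats)
    print(
        f"{name:<9} {ratios['cores']:>5.0f} cores/node "
        f"{ratios['ram_gb']:>5.0f} GB RAM/node {ratios['discs']:>3.0f} discs/node"
    )
```

On these numbers an ExaSol node and a Hive node have the same core count (12 each) and similar RAM (72 GB versus 64 GB), so the hardware per node is broadly comparable between the two.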
    • Slide 35: Concurrency 2
    • Slide 36: Concurrency 4
    • Slide 37: Concurrency 8
    • Slide 38: Concurrency 16
    • Slide 39: Overall run time
    • Slide 40: Picture:words
      • $1.9m
      • = 4 ExaSol nodes
      • 420 Hive nodes
    • Slide 41: This is a test
      • Ad hoc query tests
      • DML
        • INSERTs
        • UPDATEs
        • DELETEs
    • Slide 42: And in the real world
      • Faster processing times
        • 4.5 hours to 20 minutes
      • Happier analysts
      • Happier data warehouse engineers
      • Happier ops
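The processing-time improvement quoted on the "And in the real world" slide (4.5 hours down to 20 minutes) can be expressed as a speedup factor and a percentage reduction. This is plain arithmetic on the two figures from the slide; the variable names are illustrative.

```python
# Figures from the slide: processing went from 4.5 hours to 20 minutes.
before_minutes = 4.5 * 60  # 270 minutes
after_minutes = 20

# How many times faster the new run is.
speedup = before_minutes / after_minutes

# Share of the original run time that was eliminated.
reduction_pct = (1 - after_minutes / before_minutes) * 100

print(f"{speedup:.1f}x faster, {reduction_pct:.1f}% less run time")
```

That works out to a 13.5x speedup for this end-to-end workload.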
    • Slide 43: Conclusions
      • For structured workloads, consider a good analytic database to complement your Hadoop infrastructure
      • ExaSol was an excellent fit for our use case
      • We'll let you know how we get on!
    • Slide 44: Questions?
    • Slide 45: We're hiring!
    • Slide 46: Thank you