King hug uk

Dr Relational or: How I Learned to Stop Worrying and Love the Database (Andy Done, Data Warehouse Lead, King)

In the face of explosive growth, King's Hadoop data warehouse simply wasn't scaling fast enough. Find out why King is extending its Big Data platform with the MPP database ExaSol and processing its data 100s of times faster.

Published in: Technology

Transcript of "King hug uk"

  1. Database Relational (© King.com Ltd 2013 – Public)
  2. Agenda
       •  Welcome!
       •  A brief history of King
       •  King data platform evolution
       •  Enter Hive
       •  Hive + DB
       •  Hive + better DB
       •  Questions?
  3. A brief history of King
  4. A brief history of King: Who?
  5. A brief history of King: Where?
  6. A brief history of King: Web, social, mobile
  7. A brief history of King: King in numbers
       •  100 million daily active users
       •  1 billion game plays per day
       •  8 offices
       •  10 billion events per day
       •  Lots and lots of data…
  8. A brief history of me (andy.done@king.com)
  9. King data platform evolution
  10. Enter Hive
  11. Enter Hive: The road to big
       [Chart: compressed events, gigabytes/day (axis 0 to 350), 2011-02-16 to 2013-04-26. Series: Browser, Mobile. Annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no".]
  12. Enter Hive: Scaling accomplished
  13. Enter Hive: Hive says…
  14. Enter Hive: Data exploration
       •  COUNT(*)
       •  SELECT DISTINCT
       •  COUNT, SUM… GROUP BY date
  15. Hive + DB = ?
  16. Hive + DB: Data platform 1.0
       [Diagram components: Games, Event data, Hive, ETL, Reports, Data scientists]
  17. Hive + DB: Data platform 1.5
       [Diagram components: Games, Event data, Hive, DB, ETL, Reports, Data scientists]
  18. Hive + DB: Selection criteria
       •  ‘Accessible’ pricing (free?)
       •  Single node
       •  Easy to set up
       •  Low maintenance
  19. Contenders ready
       •  Infobright
            •  Columnar MySQL engine
            •  Light tuning and hinting
       •  InfiniDB
            •  Columnar MySQL engine
            •  Tuning-less
            •  Faster for our use case
  20. How’s that work out?
       •  Paid its way
       •  Popular
       •  100s of queries / day
       •  Stability
       •  Ceilings
       •  Screwed by mobile
  21. Enter Hive: The road to big
       [Chart: compressed events, gigabytes/day (axis 0 to 350), 2011-02-16 to 2013-04-26. Series: Browser, Mobile. Annotations: 10 nodes, 20 nodes, 40 nodes, "Qlikview says no", "Infobright CE says no", InfiniDB.]
  22. ETL?
  23. Hive + better DB = ?
  24. Hive + better DB: Data platform 2.0
       [Diagram components: Game, Event data, Hive, Better DB, ETL, Reports, Data scientists]
  25. Hive + better DB: State of the market Jan 2013
       •  Hadoop on steroids
            •  Hadapt…
            •  Impala
            •  Nouvaeu Data
            •  Platfora
            •  Sisense
       •  MPP analytics databases
            •  Vertica
            •  ExaSol
  26. Hive + better DB: Contenders ready
       Feature          ExaSol          Vertica
       Processing       In memory       Disc optimised
       Administration   Web based       Command line
       Backup           Web based       Command line
       Resiliency       Hot spare       Gradual degradation
       Tuning           Self tuning     User tuning
       Licensing        Allocated RAM   Total storage
       Vendor           Smaller         Larger
  27. Hive + better DB: Disclaimers
       •  Our data
       •  Our queries
       •  Our use case
       •  Our results
  28. Hive + better DB: This is our data
       Table              Row count
       Mobile dimension   161 million
       Social dimension   600 million
       Mobile facts       1 billion
       Social facts       6.7 billion
  29. Hive + better DB: Single query
  30. Hive + better DB: Single query
  31. Hive + better DB: Single query
  32. Hive + better DB: Single query
  33. Hive + better DB: Cluster stats
                             Vertica   ExaSol   Hive      InfiniDB
       Nodes                 4         4        19        1
       Cores                 64        48       228       32
       RAM                   512 GB    288 GB   1216 GB   300 GB
       Discs                 96        32       76        4
       Hardware cost / USD   $$$$      $$       $$        $
       Total cost / USD      $$$$$$    $$$$$    $$        $$
  34. Hive + better DB: Concurrency 2
  35. Hive + better DB: Concurrency 4
  36. Hive + better DB: Concurrency 8
  37. Hive + better DB: Concurrency 16
  38. Hive + better DB: Overall run time
  39. Hive + better DB: Picture:words
       $1.9m = 4 ExaSol nodes; 420 Hive nodes
  40. Hive + better DB: This is a test
       •  Ad hoc query tests
       •  DML
            •  INSERTs
            •  UPDATEs
            •  DELETEs
  41. Hive + better DB: And in the real world
       •  Faster processing times
            •  4.5 hours to 20 minutes
       •  Happier analysts
       •  Happier data warehouse engineers
       •  Happier ops
  42. Hive + better DB: Conclusions
       •  For structured workloads, consider a good analytic database to complement your Hadoop infrastructure
       •  ExaSol was an excellent fit for our use case
       •  We’ll let you know how we get on!
  43. Questions?
  44. We’re hiring!
  45. Thank you
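
Slide 14 names three kinds of data exploration query: COUNT(*), SELECT DISTINCT, and COUNT/SUM with GROUP BY date. As a rough illustration only, the sketch below runs one query of each kind against a tiny in-memory table. The `events` table, its columns, and its rows are all hypothetical (the deck does not show King's schema), and SQLite stands in for Hive purely so the example runs locally.

```python
import sqlite3

# Hypothetical schema and rows, invented for illustration; not King's data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_id INTEGER, plays INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2013-04-25", 1, 3),
        ("2013-04-25", 2, 5),
        ("2013-04-26", 1, 2),
        ("2013-04-26", 3, 7),
        ("2013-04-26", 3, 1),
    ],
)

# COUNT(*): how many rows does the table hold?
total_rows = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]

# SELECT DISTINCT: which distinct values does a column take?
distinct_users = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT user_id FROM events ORDER BY user_id"
    )
]

# COUNT, SUM ... GROUP BY date: per-day row counts and totals.
daily = conn.execute(
    "SELECT event_date, COUNT(*), SUM(plays) "
    "FROM events GROUP BY event_date ORDER BY event_date"
).fetchall()

print(total_rows)
print(distinct_users)
print(daily)
```

The three statements use only standard SQL, so the same query text should be close to what one would type into Hive, although execution there runs as batch jobs over the cluster rather than in-process.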