
Extra performance out of thin air

Conference: HP Big Data Conference 2015
Session: Real-world Methods for Boosting Query Performance
Presentation: "Extra performance out of thin air"
Presenter: Konstantine Krutiy, Principal Software Engineer / Vertica Whisperer
Company: Localytics

Description:
Learn how to get extra performance out of Vertica from areas you never expected.

This presentation will illustrate how you can improve performance of your Vertica cluster without extra budget.
All you need is ingenuity, knowledge of Vertica internals, and the ability to challenge conventional wisdom.

We will show you real-world examples of gaining performance by eliminating unneeded work, eliminating unneeded system waits, and making your system operate more efficiently.

Visit my blog http://www.dbjungle.com for more Vertica insights


  1. Konstantine Krutiy, Principal Engineer, Crew Lead
  2. PATH TO EXTRA PERFORMANCE
     Eliminate unneeded work
     § Choose data types wisely
     Eliminate unneeded waits
     § Reduce number of locks
     Make system operate in more efficient way
     § Optimize BIOS settings
     § Stay in same “technology slice”
     § Make sure you have enough RAM
  3. ART OF CHOOSING DATATYPES
  4. WHY DATA TYPE MATTERS?
  5. WHY DATA TYPE MATTERS?
     Fastest CPU today is 3.7 GHz
     It takes 1 / 3,700,000,000 of a second to do a single operation
  6. WHY DATA TYPE MATTERS?
     Fastest CPU today is 3.7 GHz
     It takes 1 / 3,700,000,000 of a second to do a single operation
     “BIG DATA” record set starts from 100 billion records
  7. WHY DATA TYPE MATTERS?
     Fastest CPU today is 3.7 GHz
     It takes 1 / 3,700,000,000 of a second to do a single operation
     “BIG DATA” record set starts from 100 billion records
     Processing time: 1 / 3,700,000,000 sec x 100,000,000,000 = 27 sec
  8. DO YOU NEED TO STORE DATA SAME WAY IT IS PRESENTED?
  9. DO YOU NEED TO STORE DATA SAME WAY IT IS PRESENTED?
     Presentation: $395.17
  10. DO YOU NEED TO STORE DATA SAME WAY IT IS PRESENTED?
      Presentation: $395.17
      Data: 395.17
  11. DO YOU NEED TO STORE DATA SAME WAY IT IS PRESENTED?
      Presentation: $395.17
      Data: 395.17
      Storage            Data type   Internal data type
      Store as money     MONEY       NUMERIC(18,4)
      Store as numeric   NUMERIC     NUMERIC(37,15)
      Store as integer   INT         INT
  13. DATA TYPE BENCHMARK DATA
  14. DATA TYPE BENCHMARK AVERAGES IN SEC
      INT              27.2
      NUMERIC(18,5)    29.7
      NUMERIC(37,15)   37
  15. MAKING RIGHT CHOICES
      • If you can store data as INTEGER
        • Choose INTEGER
      • If your data fits into 18 digits of PRECISION
        • Choose NUMERIC(18)
      • If your data is larger than 18 digits of PRECISION
        • Choose NUMERIC(your-desired-precision)
      Vertica default for NUMERIC is NUMERIC(37,15)
  16. ELIMINATING UNNECESSARY LOCKING
  17. LOCKING BEHAVIOR
      AUTOCOMMIT = ON (JDBC driver default)
      § Each statement is treated as a complete transaction
      § When the statement completes, changes are automatically committed to the database
      AUTOCOMMIT = OFF
      § Transaction continues until you manually run COMMIT or ROLLBACK
      § Locks are kept on objects for the duration of the transaction
  18. CONTROLLING AUTOCOMMIT STATE
      JAVA:
        // myProperties holds the connection properties (user, password, ...)
        Connection conn = DriverManager.getConnection("jdbc:vertica://DBHost:5433/MyDB", myProperties);
        // get the state of the auto commit parameter
        System.out.println("Autocommit state: " + conn.getAutoCommit());
        // change the auto commit state to false
        conn.setAutoCommit(false);
      SQL:
  19. IMPACT ON LOCK COUNTS BY CHANGING AUTOCOMMIT SETTING TO OFF
  20. HOW TO DISABLE – OBVIOUS METHOD
  21. HOW TO DISABLE – BETTER METHOD
  22. BIOS SETTINGS OPTIMIZATIONS
  23. WHAT IS TUNABLE IN BIOS?
  24. HOW TO TUNE?
      http://h10032.www1.hp.com/ctg/Manual/c01804533.pdf
  25. DOES IT REALLY MATTER?
      BIOS profile                                                       Sec
      DSS BIOS settings with 1x DRAM refresh rate                        738.949439
      DSS BIOS settings with 4x DRAM refresh rate                        745.111176
      HPC BIOS settings with 4x DRAM refresh rate                        552.148285
      HPC + HyperThreading BIOS settings with 4x DRAM refresh rate       877.838469
      HPC - NO TurboBoost BIOS settings with 4x DRAM refresh rate        561.260084
      Performance increase potential about 40%
  26. WHAT DOES THE TUNING DOC SAY?
  27. STAYING IN THE SAME “TECHNOLOGY SLICE”
  28. WHAT WILL I BE SLICING THROUGH?
      CPU and chipset
      Hardware
      Operating System (OS)
      Database Management System (DBMS)
  29. WHAT IS “TECHNOLOGY SLICE” ANYWAY?
      CPU:    Gen3, Gen4, Gen5, Gen6, Gen7
      Server: Gen-A, Gen-B, Gen-C, Gen-D, Gen-E, Gen-F
      OS:     v. 34, v. 35, v. 36, v. 37, v. 38
      DBMS:   v. 3, v. 4, v. 5, v. 6, v. 7
  31. COMMON “TECHNOLOGY SLICE” TRAP
      CPU:    Gen3, Gen4 ✔, Gen5, Gen6, Gen7
      Server: Gen-A, Gen-B, Gen-C ✔, Gen-D, Gen-E, Gen-F
      OS:     v. 34, v. 35, v. 36, v. 37 ✔, v. 38
      DBMS:   v. 3, v. 4, v. 5, v. 6, v. 7 ✔
  32. COMMON “TECHNOLOGY SLICE” TRAP
      CPU:    Gen3, Gen4 ✔, Gen5, Gen6, Gen7
      Server: Gen-A, Gen-B, Gen-C ✔, Gen-D, Gen-E, Gen-F
      OS:     v. 34, v. 35, v. 36, v. 37 ✔, v. 38
      DBMS:   v. 3, v. 4, v. 5, v. 6, v. 7 ✔
      ? ?
  33. SYMPTOMS OF “TECHNOLOGY SLICE” ISSUES
      System AVG: 57.90, Nice AVG: 46.56
      System AVG > Nice AVG; System AVG / Nice AVG = 1.24
      System AVG: 11.19, Nice AVG: 57.38
      System AVG < Nice AVG; System AVG / Nice AVG = 0.19
  34. “TECHNOLOGY SLICE” PERFORMANCE IMPACT
      [Bar chart, run time in sec: different “TECHNOLOGY SLICE” kernel vs. proper “TECHNOLOGY SLICE” kernel]
  35. SUFFICIENT RAM CALCULATIONS
  36. DO I REALLY NEED MORE RAM?
        select event_type, count(1)
        from query_events
        group by event_type
        order by 2 desc;
      Spilled events are a very good indication of queries not fitting in RAM.
  37. HOW CAN I QUANTIFY IMPACT?
        select 'event_timestamp' as timestamp_type,
               min(event_timestamp) as min_timestamp,
               max(event_timestamp) as max_timestamp
        from query_events
        union
        select 'query_timestamp' as timestamp_type,
               min(start_timestamp) as min_timestamp,
               max(start_timestamp) as max_timestamp
        from query_requests;
      Each system table in Vertica has its own rolling window. Make sure you understand how the available histories relate to each other.
  38. HOW CAN I QUANTIFY IMPACT? CONT.
        select spilled_queries, total_queries,
               round( spilled_queries / total_queries * 100 , 2 ) as spilled_queries_percent
        from (select count(1) as total_queries
              from query_requests
              where request_type = 'QUERY'
                and start_timestamp > (select min(event_timestamp) from query_events)) query_data,
             (select count(1) as spilled_queries
              from (select session_id, transaction_id, statement_id
                    from query_events
                    where event_type ilike '%SPILLED%'
                    group by session_id, transaction_id, statement_id) spill_data) spill_data2;
      Number of spilled queries in relation to the entire query volume.
  39. CAN MY SPILLED DATA FIT INTO RAM?
        select min(counter_value) as min_bytes_spilled,
               max(counter_value) as max_bytes_spilled,
               avg(counter_value) as avg_bytes_spilled
        from execution_engine_profiles
        where counter_name = 'bytes spilled' and counter_value > 0;
      Understanding the size of spillage to disk.
  40. WHO IS CAUSING SPILLS?
        select user_name, count(1) as spill_event_count
        from query_events
        where event_type ilike '%SPILLED%'
        group by user_name
        order by 2 desc;
      In Vertica, RAM is allocated to queries through resource pools. Resource pools are connected to users. Knowing the user points us to the resource pool that needs tuning.
  41. WHAT SHOULD I TUNE?
        select distinct resource_pool
        from users
        where user_name in ('peter', 'john');
      Identified the resource pool with spilled queries. Now we know what to tune.
  42. WHAT SHOULD I CHANGE?
      The resource pool parameters of MEMORYSIZE and PLANNEDCONCURRENCY provide the options that let you tune the target memory allocated to queries.
      Source: HP Vertica Analytics Platform Version 7.1.x Documentation, Administrator's Guide, Managing the Database, Managing Workloads, Resource Pool Architecture, Target Memory Determination for Queries in Concurrent Environments
  43. Q & A
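
The data type slides (7 and 9 to 15) can be illustrated with a small Java sketch. It repeats the slide 7 arithmetic under the same simplification the slide makes (one operation per record per clock cycle), and it shows one way to hold a presented amount such as $395.17 in an integer column by storing whole cents. Storing cents is an assumption made for illustration; the class and method names are hypothetical and do not come from the talk.

```java
import java.math.BigDecimal;

class DataTypeSketch {

    // Seconds needed to touch every record once, assuming (as slide 7 does)
    // one operation per record per clock cycle.
    static double secondsForOnePass(double clockHz, long records) {
        return records / clockHz;
    }

    // Convert a decimal amount such as "395.17" to whole cents.
    // Throws ArithmeticException if the amount has more than two decimals,
    // so no precision is lost silently.
    static long toCents(String amount) {
        return new BigDecimal(amount).movePointRight(2).longValueExact();
    }

    // Turn stored cents back into the presentation form, e.g. "$395.17".
    static String formatCents(long cents) {
        return "$" + BigDecimal.valueOf(cents, 2).toPlainString();
    }

    public static void main(String[] args) {
        double seconds = secondsForOnePass(3.7e9, 100_000_000_000L);
        System.out.printf("One pass over 100 billion records: %.1f sec%n", seconds);

        long stored = toCents("395.17");
        System.out.println("Stored as integer: " + stored);
        System.out.println("Presented as: " + formatCents(stored));
    }
}
```

The presentation format ($ sign, decimal point) is applied only when the value is read back, which is the separation between presentation and storage that slides 9 to 11 describe.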

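
Slides 17 and 18 describe switching AUTOCOMMIT off through JDBC. A minimal sketch of how several statements might then be grouped into one explicit transaction follows. It uses only standard java.sql calls; the class name, the method name, and the choice to restore the previous autocommit state afterwards are assumptions made for illustration, not the method shown in the talk.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.List;

class AutoCommitSketch {

    // Run all statements as one transaction instead of one transaction per statement.
    // Commits on success, rolls back on failure, and restores the caller's
    // autocommit setting in both cases.
    static void runInOneTransaction(Connection conn, List<String> statements) throws SQLException {
        boolean previous = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (Statement st = conn.createStatement()) {
            for (String sql : statements) {
                st.execute(sql);
            }
            // With AUTOCOMMIT off, locks are held until this point (slide 17),
            // so commit as soon as the group of statements is done.
            conn.commit();
        } catch (SQLException e) {
            conn.rollback();
            throw e;
        } finally {
            conn.setAutoCommit(previous);
        }
    }
}
```

The connection would come from DriverManager.getConnection as on slide 18, which requires the Vertica JDBC driver on the classpath. Because locks stay in place for the whole transaction, the group of statements should be kept short.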