SlideShare a Scribd company logo
1 of 9
Download to read offline
Tuning Up Apache Phoenix.
Secondary Indexes
By Vlad Krava

2019
Setup
For out basic performance and explain plan overview we are going to use 3 tables:
•USER_DATA_150K (150 000 records)
•USER_DATA_15M (15 000 000 records)
•USER_DATA_30M (30 000 000 records)
Performance
Overview
Search by Primary Key (REC_KEY)
Filtering and Sorting by Table Fields
(REC_LAST_NAME & REC_CREATE_DATE)
Tuning Up
4 Ways to Improve Performance
• Include all fields into index. Pros: guarantee that searching operation
will go by index table. Cons: MAJOR performance downgrades for
UPSERT and DELETE operations, index’s table size grows and by default it
is wrong architectural decision unless you're perform a search by all table
fields.
• Index hinting. Pros: lightweight approach, haven’t been noticed problems
with table out of sync (critical bug in version 5.0.0 and below:
PHOENIX-4045). Cons: doesn’t guarantee traversal through index table.
• Covered index. Pros: guarantee that searching operation will go by index
table. Cons: index’s table size grows.
• Local index. Pros: index and data table are consistent. Cons: low read
intensity.
Important Notes
• Global Index will not be used by Phoenix unless all of the columns referenced in
the query are contained in the index. A Local Index can be an alternative to a
Global Index.
• After including all fields into Secondary Index and introducing Covered Indexes
the index’s table size increased in 2 times comparing to Index Hinting strategy. It
depends mostly on amount and type of fields of your tables. To check the index’s
table size run the following command on Hadoop’s Namenode:
$ hadoop fs -du -h hdfs://{PATH_TO_HBASE}/data/data/{SCHEMA}/{INDEX}
• An index fields ordinal position should be chosen in a way that aligns with the
common query patterns — choose the most frequently queried column or column
which is going to stand always in front of every filter criteria as the first one and
so on.
Q&A
Email: vkrava4@gmail.com

Twitter: vkrava4

More Related Content

What's hot

Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Shani729
 
Informatica perf points
Informatica perf pointsInformatica perf points
Informatica perf pointsdba3003
 
Intro to Data warehousing Lecture 04
Intro to Data warehousing   Lecture 04Intro to Data warehousing   Lecture 04
Intro to Data warehousing Lecture 04AnwarrChaudary
 
MS Sql Server: Reporting manipulating data
MS Sql Server: Reporting manipulating dataMS Sql Server: Reporting manipulating data
MS Sql Server: Reporting manipulating dataDataminingTools Inc
 
Strongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache PhoenixStrongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache PhoenixYugabyteDB
 
Data wrangling with dplyr
Data wrangling with dplyrData wrangling with dplyr
Data wrangling with dplyrC. Tobin Magle
 
Stored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sqlStored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sqlnishantdavid9
 
Database indexing framework
Database indexing frameworkDatabase indexing framework
Database indexing frameworkNitin Pande
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersAdam Hutson
 
Reconciliation Tool
Reconciliation ToolReconciliation Tool
Reconciliation ToolAchal Kagwad
 
X rec extened reconciliation using excel vba
X rec   extened reconciliation using excel vbaX rec   extened reconciliation using excel vba
X rec extened reconciliation using excel vbasenthilsundaresan
 

What's hot (14)

Dwh lecture slides-week5&6
Dwh lecture slides-week5&6Dwh lecture slides-week5&6
Dwh lecture slides-week5&6
 
Informatica perf points
Informatica perf pointsInformatica perf points
Informatica perf points
 
Intro to Data warehousing Lecture 04
Intro to Data warehousing   Lecture 04Intro to Data warehousing   Lecture 04
Intro to Data warehousing Lecture 04
 
MS Sql Server: Reporting manipulating data
MS Sql Server: Reporting manipulating dataMS Sql Server: Reporting manipulating data
MS Sql Server: Reporting manipulating data
 
Strongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache PhoenixStrongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache Phoenix
 
Data wrangling with dplyr
Data wrangling with dplyrData wrangling with dplyr
Data wrangling with dplyr
 
ML whitepaper v0.2
ML whitepaper v0.2ML whitepaper v0.2
ML whitepaper v0.2
 
Stored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sqlStored procedure tuning and optimization t sql
Stored procedure tuning and optimization t sql
 
Database indexing framework
Database indexing frameworkDatabase indexing framework
Database indexing framework
 
Cis266 final review
Cis266 final reviewCis266 final review
Cis266 final review
 
SQL Server 2008 Development for Programmers
SQL Server 2008 Development for ProgrammersSQL Server 2008 Development for Programmers
SQL Server 2008 Development for Programmers
 
Dbms 3 sem
Dbms 3 semDbms 3 sem
Dbms 3 sem
 
Reconciliation Tool
Reconciliation ToolReconciliation Tool
Reconciliation Tool
 
X rec extened reconciliation using excel vba
X rec   extened reconciliation using excel vbaX rec   extened reconciliation using excel vba
X rec extened reconciliation using excel vba
 

Similar to Tuning Up Apache Phoenix. Secondary Indexes

Pl sql best practices document
Pl sql best practices documentPl sql best practices document
Pl sql best practices documentAshwani Pandey
 
Db2 sql tuning and bmc catalog manager
Db2 sql tuning and bmc catalog manager Db2 sql tuning and bmc catalog manager
Db2 sql tuning and bmc catalog manager Krishan Singh
 
Query Optimization in SQL Server
Query Optimization in SQL ServerQuery Optimization in SQL Server
Query Optimization in SQL ServerRajesh Gunasundaram
 
Optimizing Application Performance - 2022.pptx
Optimizing Application Performance - 2022.pptxOptimizing Application Performance - 2022.pptx
Optimizing Application Performance - 2022.pptxJasonTuran2
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at AlibabaMichael Stack
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouseUday Kothari
 
Goldilocks and the Three MySQL Queries
Goldilocks and the Three MySQL QueriesGoldilocks and the Three MySQL Queries
Goldilocks and the Three MySQL QueriesDave Stokes
 
SQL Server Index and Partition Strategy
SQL Server Index and Partition StrategySQL Server Index and Partition Strategy
SQL Server Index and Partition StrategyHamid J. Fard
 
Tuning database performance
Tuning database performanceTuning database performance
Tuning database performanceBinay Acharya
 
Sap abap
Sap abapSap abap
Sap abapnrj10
 
Db performance optimization with indexing
Db performance optimization with indexingDb performance optimization with indexing
Db performance optimization with indexingRajeev Kumar
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publishGleydson Lima
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsNirav Shah
 
Maximizing SAP ABAP Performance
Maximizing SAP ABAP PerformanceMaximizing SAP ABAP Performance
Maximizing SAP ABAP PerformancePeterHBrown
 

Similar to Tuning Up Apache Phoenix. Secondary Indexes (20)

Pl sql best practices document
Pl sql best practices documentPl sql best practices document
Pl sql best practices document
 
Db2 sql tuning and bmc catalog manager
Db2 sql tuning and bmc catalog manager Db2 sql tuning and bmc catalog manager
Db2 sql tuning and bmc catalog manager
 
Query Optimization in SQL Server
Query Optimization in SQL ServerQuery Optimization in SQL Server
Query Optimization in SQL Server
 
San diegophp
San diegophpSan diegophp
San diegophp
 
Sql performance tuning
Sql performance tuningSql performance tuning
Sql performance tuning
 
Optimizing Application Performance - 2022.pptx
Optimizing Application Performance - 2022.pptxOptimizing Application Performance - 2022.pptx
Optimizing Application Performance - 2022.pptx
 
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
 
Designing high performance datawarehouse
Designing high performance datawarehouseDesigning high performance datawarehouse
Designing high performance datawarehouse
 
Goldilocks and the Three MySQL Queries
Goldilocks and the Three MySQL QueriesGoldilocks and the Three MySQL Queries
Goldilocks and the Three MySQL Queries
 
SQL Server Index and Partition Strategy
SQL Server Index and Partition StrategySQL Server Index and Partition Strategy
SQL Server Index and Partition Strategy
 
Indexes overview
Indexes overviewIndexes overview
Indexes overview
 
Tuning database performance
Tuning database performanceTuning database performance
Tuning database performance
 
Sap abap
Sap abapSap abap
Sap abap
 
Dbms schemas for decision support
Dbms schemas for decision supportDbms schemas for decision support
Dbms schemas for decision support
 
Db performance optimization with indexing
Db performance optimization with indexingDb performance optimization with indexing
Db performance optimization with indexing
 
Tech Talk - JPA and Query Optimization - publish
Tech Talk  -  JPA and Query Optimization - publishTech Talk  -  JPA and Query Optimization - publish
Tech Talk - JPA and Query Optimization - publish
 
The ABAP Query
The ABAP QueryThe ABAP Query
The ABAP Query
 
Tunning overview
Tunning overviewTunning overview
Tunning overview
 
Top 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tipsTop 10 Oracle SQL tuning tips
Top 10 Oracle SQL tuning tips
 
Maximizing SAP ABAP Performance
Maximizing SAP ABAP PerformanceMaximizing SAP ABAP Performance
Maximizing SAP ABAP Performance
 

Recently uploaded

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 

Recently uploaded (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 

Tuning Up Apache Phoenix. Secondary Indexes

  • 1. Tuning Up Apache Phoenix. Secondary Indexes By Vlad Krava 2019
  • 2. Setup For out basic performance and explain plan overview we are going to use 3 tables: •USER_DATA_150K (150 000 records) •USER_DATA_15M (15 000 000 records) •USER_DATA_30M (30 000 000 records)
  • 4. Search by Primary Key (REC_KEY)
  • 5. Filtering and Sorting by Table Fields (REC_LAST_NAME & REC_CREATE_DATE)
  • 7. 4 Ways to Improve Performance • Include all fields into index. Pros: guarantee that searching operation will go by index table. Cons: MAJOR performance downgrades for UPSERT and DELETE operations, index’s table size grows and by default it is wrong architectural decision unless you're perform a search by all table fields. • Index hinting. Pros: lightweight approach, haven’t been noticed problems with table out of sync (critical bug in version 5.0.0 and below: PHOENIX-4045). Cons: doesn’t guarantee traversal through index table. • Covered index. Pros: guarantee that searching operation will go by index table. Cons: index’s table size grows. • Local index. Pros: index and data table are consistent. Cons: low read intensity.
  • 8. Important Notes • Global Index will not be used by Phoenix unless all of the columns referenced in the query are contained in the index. A Local Index can be an alternative to a Global Index. • After including all fields into Secondary Index and introducing Covered Indexes the index’s table size increased in 2 times comparing to Index Hinting strategy. It depends mostly on amount and type of fields of your tables. To check the index’s table size run the following command on Hadoop’s Namenode: $ hadoop fs -du -h hdfs://{PATH_TO_HBASE}/data/data/{SCHEMA}/{INDEX} • An index fields ordinal position should be chosen in a way that aligns with the common query patterns — choose the most frequently queried column or column which is going to stand always in front of every filter criteria as the first one and so on.