In Sync11 Presentation The Biggest Loser
1. The biggest loser database. Paul Guerin, Sydney Convention Centre, August 17 2011.
2. The weigh-in: capacity right-sizing to achieve business outcomes.
3. Size
   Starting size 3 years ago = 730GB
   Size 2 years later = 550GB
   Total loss = 180GB + 2 years of growth = 850GB
   $$$ = GB * num_entities * $/GB = 850 * 8 * $38.50 = $261,800 (over 2 years)
   Roughly $1/4m over 2 years.
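The arithmetic on this slide can be checked with a short sketch. The figures come from the slide; the variable names are illustrative, and the slide does not say what `num_entities` counts.

```python
# Figures are taken from the slide; variable names are illustrative only.
start_gb = 730        # database size 3 years ago
later_gb = 550        # database size 2 years later
avoided_gb = 850      # slide figure: shrinkage plus 2 years of avoided growth
num_entities = 8      # multiplier from the slide formula (meaning not stated there)
cost_per_gb = 38.50   # dollars per GB

shrinkage_gb = start_gb - later_gb
saving = avoided_gb * num_entities * cost_per_gb

print(f"shrinkage: {shrinkage_gb} GB")
print(f"saving over 2 years: ${saving:,.0f}")
```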
4. Growth rate: was 29 GB/month, now 12 GB/month. Less than half the previous growth rate.
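As a rough check on the "less than half" claim, the sketch below compares the two rates. The 24-month horizon is an assumption, chosen only to match the two-year window used for the cost figure.

```python
old_rate = 29   # GB/month, from the slide
new_rate = 12   # GB/month, from the slide
months = 24     # assumed horizon, for illustration

ratio = new_rate / old_rate
avoided_growth_gb = (old_rate - new_rate) * months

print(f"new rate is {ratio:.0%} of the old rate")
print(f"growth avoided over {months} months: {avoided_growth_gb} GB")
```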
6. Unused tables
   - Check for tables that are no longer used. Suspect tables may be named *old, *bkp, etc.
   - Monitor the table for DML activity: v$segment_statistics
   - Analyse the stored procedures for dependencies: dba_source
   - Set up an audit of the table: AUDIT select, insert, delete, update ON <schema.object>
   Example: A table and its indexes (84GB in total) were identified as unused and dropped.
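The name-based first pass can be sketched as a simple pattern filter. The table names below are hypothetical, and a name match only nominates a candidate: a name such as GOLD also ends in "OLD", so the DML monitoring, dependency analysis and audit steps are still needed before anything is dropped.

```python
from fnmatch import fnmatch

# Patterns from the slide (*old, *bkp); compared in upper case.
SUSPECT_PATTERNS = ["*OLD", "*BKP"]

def suspect_tables(table_names, patterns=SUSPECT_PATTERNS):
    """Return the names that match any suspect pattern, case-insensitively."""
    return [name for name in table_names
            if any(fnmatch(name.upper(), pattern) for pattern in patterns)]

# Hypothetical table names, for illustration only.
tables = ["ORDERS", "ORDERS_OLD", "site_bkp", "FORECAST"]
candidates = suspect_tables(tables)
print(candidates)
```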
7. Tables in use may contain data that has expired.
   Question: "Do we really need 10 years of data in this table?"
   Answer: "No, we only need the last 3 months."
   If required, archive data using the data pump query clause:
     expdp hr QUERY=employees:"WHERE dte < sysdate-100"
   Example: Deleted from a 62GB table, then rebuilt to 5GB.
8. Direct-path inserts
   Potential performance benefits from inserting above the high-water mark (HWM):
     INSERT /*+ append */ INTO … SELECT * FROM …;
   Potential problem: inserts are always above the HWM, but deletes are always below the HWM. Low block density results, as deleted space is not reused by a direct-path insert.
   Example: A low block density table rebuilt from 42GB to 2GB.
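The block-density problem can be illustrated with a toy model. This is a sketch under simplifying assumptions (a fixed number of rows per block, deletes that remove the oldest rows first), not a description of Oracle's storage internals. All names and numbers are illustrative.

```python
ROWS_PER_BLOCK = 100  # assumed block capacity, for illustration only

class Segment:
    """Toy table segment: a list holding the live-row count of each block."""

    def __init__(self):
        self.blocks = []  # len(self.blocks) plays the role of the HWM

    def direct_path_insert(self, n_rows):
        # Direct-path loads fill fresh blocks above the HWM and
        # never reuse free space in blocks below it.
        while n_rows > 0:
            used = min(n_rows, ROWS_PER_BLOCK)
            self.blocks.append(used)
            n_rows -= used

    def delete_oldest(self, n_rows):
        # Deletes remove rows from existing blocks, all below the HWM.
        for i, live in enumerate(self.blocks):
            if n_rows == 0:
                break
            removed = min(live, n_rows)
            self.blocks[i] -= removed
            n_rows -= removed

    def density(self):
        capacity = len(self.blocks) * ROWS_PER_BLOCK
        return sum(self.blocks) / capacity if capacity else 0.0

seg = Segment()
for _ in range(10):                # ten load-and-purge cycles
    seg.direct_path_insert(1000)   # batch load above the HWM
    seg.delete_oldest(900)         # later purge of most of the loaded rows

live_rows = sum(seg.blocks)
blocks_after_rebuild = -(-live_rows // ROWS_PER_BLOCK)  # ceiling division
print(len(seg.blocks), seg.density(), blocks_after_rebuild)
```

In this model the segment holds ten times more blocks than a rebuild would need, which is the same shape of saving as the 42GB to 2GB example on the slide.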
9. Table compression
   - OLTP compression (licence required)
   - Conventional compression:
       ALTER TABLE <schema.tablename> NOLOGGING COMPRESS;
       INSERT /*+ APPEND */ INTO <schema.tablename> SELECT * FROM …..;
   Tips: Order low cardinality columns first. Order columns with many nulls last (otherwise costs 1 byte per null).
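The "1 byte per null" tip can be made concrete with a small sketch. It assumes the rule stated on the slide: a NULL that is followed by a non-NULL column costs one byte, while trailing NULLs cost nothing. The rows are hypothetical.

```python
def null_overhead_bytes(row):
    """Bytes spent on NULL markers in one row under the slide's rule:
    NULLs before the last non-NULL column cost 1 byte each, trailing NULLs cost 0."""
    last_non_null = max((i for i, value in enumerate(row) if value is not None),
                        default=-1)
    return sum(1 for value in row[:last_non_null + 1] if value is None)

# Same data, two column orders (hypothetical rows).
nulls_in_middle = (1, None, None, "x")
nulls_last = (1, "x", None, None)

print(null_overhead_bytes(nulls_in_middle))
print(null_overhead_bytes(nulls_last))
```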
11. Index waste: Many index configurations are possible, and they are often not well understood by developers and DBAs. The many SQL statements to consider make analysis laborious. There is large potential for index waste and poor DML performance. Start looking for waste by analysing the existing indexes.
12. SQL statements decide which indexes are used
    This predicate will not use an index on x:
      WHERE x NOT IN (0,1);
    This predicate may use an index on x:
      WHERE x <0 OR (x>0 AND x<1) OR x >1; -- equivalent
13. This predicate will not use an index on y:
      WHERE SUBSTR(y, 1, 10) = '6102339976';
    This predicate may use an index on y:
      WHERE y LIKE '6102339976__'; -- equivalent when y is always 12 characters long
    Opportunities: change the operator to use the index, or drop the index that is not being used.
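Before rewriting a predicate, it is worth confirming that both forms select the same rows. The sketch below does that with Python's built-in sqlite3 module on hypothetical data. It says nothing about which index Oracle's optimizer would choose. The SUBSTR test is written here as an equality on the first 10 characters, and y is assumed to always hold 12-character values.

```python
import sqlite3

def same_rows(conn, table, pred_a, pred_b):
    """True when both predicates select exactly the same rows of the table."""
    query = "SELECT rowid FROM {} WHERE {} ORDER BY rowid"
    rows_a = conn.execute(query.format(table, pred_a)).fetchall()
    rows_b = conn.execute(query.format(table, pred_b)).fetchall()
    return rows_a == rows_b

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x REAL, y TEXT)")
# Hypothetical sample rows covering both sides of each boundary.
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (-2.0, "610233997600"),
    (0.0,  "610233997612"),
    (0.5,  "610233997700"),
    (1.0,  "710233997600"),
    (3.0,  "610233997699"),
])

not_in_ok = same_rows(conn, "t",
                      "x NOT IN (0, 1)",
                      "x < 0 OR (x > 0 AND x < 1) OR x > 1")
prefix_ok = same_rows(conn, "t",
                      "SUBSTR(y, 1, 10) = '6102339976'",
                      "y LIKE '6102339976__'")
print(not_in_ok, prefix_ok)
```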
14. Unused indexes
      hh_agg_bucket$bckt(bucket) -- 7.5GB
      hh_agg_bucket$cntrv(cont_id, rev) -- 6.4GB
      hh_agg_bucket$exe(execution_number) -- 4.7GB
    Analysis & testing:
    - No evidence of statements referencing bucket, cont_id, rev.
    - No indexes on foreign keys.
    - No column transitivity on join statements.
    - Found useful access paths only on the 3rd index.
    Freed 13.9GB by dropping the unused indexes.
15. Redundant indexes
      SITE$NDX1(datetm, siteid) -- 32GB
      SITE$NDX2(siteid, datetm) -- 34GB
    Proposition: Only 1 index is used for the access path.
    Analysis & testing: Found the only used access path on SITE$NDX2.
    Dropped SITE$NDX1 to free 32GB.
16. NDX$PK(A, B) /* primary key on this index */
    NDX1(A, B, C) /* can relocate PK to this index */
    Proposition A: If SQL statements reference A & B only, NDX$PK is more efficient than NDX1, so NDX1 is redundant.
17. NDX$PK(A, B) /* primary key on this index */
    NDX1(A, B, C) /* can relocate PK to this index */
    Proposition B: If SQL statements reference A, B, and C (via a fast full index scan, FFIS, or a full index scan, FIS), NDX1 is more efficient than NDX$PK. NDX$PK is redundant, so put the PK on NDX1.
18. NDX$PK(A, B) /* primary key on this index */
    NDX1(A, B, C) /* can relocate PK to this index */
    Proposition C: If SQL statements reference A, B, and C (and neither FFIS nor FIS is present), NDX1 is redundant as C doesn't make the index more unique. Keep NDX$PK.
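Candidate pairs like the ones on these slides can be found mechanically from the index column lists. The helpers below are a sketch, not part of the talk: they flag indexes whose columns are a leading prefix of another index, and indexes that cover the same columns in a different order. Deciding which index of a pair to keep still depends on the SQL workload, as the propositions above describe.

```python
from itertools import permutations

def prefix_pairs(indexes):
    """(shorter, longer) pairs where shorter's columns are a leading prefix of longer's."""
    return [(a, b) for a, b in permutations(indexes, 2)
            if len(indexes[a]) < len(indexes[b])
            and indexes[b][:len(indexes[a])] == indexes[a]]

def same_column_sets(indexes):
    """Pairs of indexes on the same columns in a different order."""
    names = sorted(indexes)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if indexes[a] != indexes[b] and set(indexes[a]) == set(indexes[b])]

# Index definitions copied from the slides, one dict per table.
pk_table = {"NDX$PK": ("A", "B"), "NDX1": ("A", "B", "C")}
site_table = {"SITE$NDX1": ("datetm", "siteid"), "SITE$NDX2": ("siteid", "datetm")}

print(prefix_pairs(pk_table))
print(same_column_sets(site_table))
```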
19. B-tree compression
    - B-tree indexes can be compressed.
    - Low cardinality keys.
    - Potential performance benefits for FFIS, FIS, and index range scans (IRS).
      ANALYZE INDEX <schema.indexname> VALIDATE STRUCTURE;
      SELECT name, partition_name, opt_cmpr_count, opt_cmpr_pctsave FROM INDEX_STATS;
      ALTER INDEX <schema.indexname> REBUILD COMPRESS <#prefix columns>;
20. Compressed B-tree examples
      FCASTDTL$FCASTID_DATETIME -- 4.8GB compressed to 2.9GB
      FCASTDTL$FCASTID_REVISION -- 3.5GB compressed to 1.9GB
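The savings in these two examples can be expressed as percentages, which is the same unit as the opt_cmpr_pctsave estimate queried on the previous slide. The sizes are from the slide; the helper name is illustrative.

```python
def pct_saved(before_gb, after_gb):
    """Percentage of space saved by compression, rounded to one decimal place."""
    return round((before_gb - after_gb) / before_gb * 100, 1)

# Sizes from the slide: (before, after) in GB.
examples = {
    "FCASTDTL$FCASTID_DATETIME": (4.8, 2.9),
    "FCASTDTL$FCASTID_REVISION": (3.5, 1.9),
}

savings = {name: pct_saved(*sizes) for name, sizes in examples.items()}
total_freed_gb = round(sum(before - after for before, after in examples.values()), 1)
print(savings, total_freed_gb)
```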
21. Bitmap indexes - already compressed
    - For extreme compression, use bitmap indexes.
    - Best for single column low cardinality keys.
    - No cluster factor.
    - Potential performance benefits for FIS.
    - Good for SQL statements that aggregate, but with few updates and deletes.
      CREATE BITMAP INDEX <schema.indexname> ON …;
    Bitmap compression ratio is in the order of 100:1, so a 5GB B-tree may compress to a 0.05GB bitmap.
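To show why bitmaps suit low cardinality keys and aggregate queries, here is a toy bitmap index that uses Python integers as bit sets. It is an uncompressed sketch with hypothetical data. It does not model Oracle's on-disk format or reproduce the 100:1 ratio quoted on the slide.

```python
from collections import defaultdict

def build_bitmap_index(values):
    """Map each distinct key to an int whose bit i is set when row i holds that key."""
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)

def count_rows(bitmap):
    """Number of rows whose bit is set in the bitmap."""
    return bin(bitmap).count("1")

# Hypothetical low cardinality columns, one entry per row.
status = ["open", "closed", "open", "open", "closed", "void"]
region = ["N", "N", "S", "S", "S", "N"]

status_idx = build_bitmap_index(status)
region_idx = build_bitmap_index(region)

# An aggregate such as COUNT(*) with two equality filters becomes a bitwise AND.
open_in_south = status_idx["open"] & region_idx["S"]
print(count_rows(open_in_south))
```

Each distinct key needs only one bitmap, so a column with few distinct values yields few bitmaps. An update, by contrast, has to touch the bitmaps of both the old and the new key, which fits the slide's advice about few updates and deletes.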
22. Last resort - rebuild
    Rebuilding is not as effective as eliminating.
      -- Determine the amount of deleted space inside an index
      ANALYZE INDEX <schema.indexname> VALIDATE STRUCTURE;
      -- % of B-tree that is deleted.
      SELECT DECODE(LF_ROWS,0,NULL,ROUND(DEL_LF_ROWS/LF_ROWS*100,1)) FROM INDEX_STATS;
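The DECODE expression on this slide guards against division by zero and otherwise returns the deleted leaf rows as a percentage. The same logic, sketched in Python with hypothetical row counts:

```python
def pct_deleted(lf_rows, del_lf_rows):
    """Mirror of the slide's expression: None when there are no leaf rows,
    otherwise the deleted leaf rows as a percentage, to one decimal place."""
    if lf_rows == 0:
        return None
    return round(del_lf_rows / lf_rows * 100, 1)

# Hypothetical INDEX_STATS values, for illustration only.
print(pct_deleted(200_000, 50_000))
print(pct_deleted(0, 0))
```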
23. Business outcomes from capacity right-sizing
    - Better database scalability: leads to performance improvements.
    - Lower storage footprint: equates to lower costs ($1/4m over 2 years).
    - Growth rate reductions are sustainable, compared to index rebuilding, which is often performed over and over again.
    Good diets: cut the fat, not the muscle.