Thomas Kyte discusses different types of indexes in Oracle databases. He describes the structure and functionality of B-tree indexes, which are the most common type. Some key facts about B-trees include that they provide fast retrieval of data regardless of table size, have a fixed traversal path from root to leaf nodes, and typically have a height of 2-3 even for large tables with millions of records. Kyte also covers techniques for compressing index keys to improve storage and performance.
2. Who am I
• Been with Oracle since 1993
• User of Oracle since 1987
• The “Tom” behind AskTom in Oracle Magazine
www.oracle.com/oramag
• Expert Oracle Database Architecture
• Effective Oracle by Design
• Expert One on One Oracle
• Beginning Oracle
4. B*Tree
• What I call ‘conventional’ indexes
• Most common, some people might have only used this type
and nothing else
• Similar in implementation to a binary search tree
– Only not “binary” – they are N-ary, branches don’t go just left or
right
• Goal: minimize time to find small amounts of data
– Go ahead, define small
• Structurally they look like
5. [Diagram: B*Tree structure for Create index I on T(numColumn)]
• It all starts with a root block; the root could be all there is
– Root entries in the diagram: 0..50, 51..100, 101..150, …, 10000..10050
• Interior blocks are known as branch blocks; they are navigational
– Branch entries in the diagram: 0..10, 11..19, 20..25, …, 47..50; then 51..58, 59..63, 64..75, …, 98..100; then 10000..10009, 10010..10020, 10021..10028, …, 10046..10050
• Lowest level blocks are called leaf blocks; they contain every indexed key and a rowid
– Leaf entries in the diagram: 0,rowid; 0,rowid; 1,rowid; …; 10,rowid; then 11,rowid; 11,rowid; 12,rowid; …; 19,rowid; and so on up to 10050,rowid
• Leaf nodes are actually a doubly linked list – once we find where to start, range scanning is easy
6. B*Tree Facts
• No such thing as a non-unique index under the covers
– Create index I on T(x,y) is sort of like Create UNIQUE index I on
T(x,y,rowid)
• All leaf blocks are at the same level
– Level is also known as the HEIGHT. BLEVEL (another metric
reported frequently) differs from height by one (it does not count leaf
blocks)
– Any traversal from root to leaf takes the same number of IO’s
• Select indexed_col from T where indexed_col = :x will
take same number of IO’s regardless of the value of :x at
runtime.
– Most B*Trees will have a height of 2 or 3, even for millions of
records, for example
7. B*Tree Facts
ops$tkyte%ORA11GR1> create table t
2 as
3 select level x, level y
4 from dual
5 connect by level <= 10000000;
Table created.
ops$tkyte%ORA11GR1> alter table t add constraint t_pk primary key(x);
Table altered.
ops$tkyte%ORA11GR1> select index_name, blevel, num_rows
2 from user_indexes
3 where index_name = 'T_PK';
INDEX_NAME BLEVEL NUM_ROWS
------------------------------ ---------- ----------
T_PK 2 10000000
8. B*Tree Facts
ops$tkyte%ORA11GR1> set autotrace traceonly statistics
ops$tkyte%ORA11GR1> select x from t where x = 1;
3 consistent gets
ops$tkyte%ORA11GR1> select * from t where x = 1;
4 consistent gets
ops$tkyte%ORA11GR1> select x from t where x = 5000000;
3 consistent gets
ops$tkyte%ORA11GR1> select * from t where x = 5000000;
4 consistent gets
ops$tkyte%ORA11GR1> select x from t where x = 10000000;
3 consistent gets
ops$tkyte%ORA11GR1> select * from t where x = 10000000;
4 consistent gets
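The numbers in the two sessions above can be sanity-checked with a little arithmetic. The sketch below is a toy model, not Oracle's block format: `btree_height` is a hypothetical helper, and `ASSUMED_FANOUT` (entries per block) is an assumption, since the real figure depends on block size and key length. It only shows why the height grows so slowly and why an index-only lookup costs one IO per level, plus one more IO when the table row is fetched as well.

```python
import math


def btree_height(num_rows: int, entries_per_block: int) -> int:
    """Levels from root to leaf (inclusive) in an idealized, fully packed B*Tree."""
    if num_rows < 1 or entries_per_block < 2:
        raise ValueError("need at least one row and a fan-out of two or more")
    blocks = math.ceil(num_rows / entries_per_block)  # leaf blocks
    height = 1
    while blocks > 1:
        # Each level above needs one entry per block in the level below.
        blocks = math.ceil(blocks / entries_per_block)
        height += 1
    return height


ASSUMED_FANOUT = 500  # assumption for illustration only

height = btree_height(10_000_000, ASSUMED_FANOUT)
blevel = height - 1            # BLEVEL does not count the leaf level
index_only_gets = height       # root -> branch -> leaf
with_table_gets = height + 1   # plus one visit to the table block

print(height, blevel, index_only_gets, with_table_gets)
```

With the assumed fan-out this gives a height of 3 and a BLEVEL of 2 for ten million rows, in line with the 3 and 4 consistent gets reported above regardless of which key is looked up.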
9. B*Tree Facts
• Index_stats is an important “V$” table
– Only one row
– Result of last validate structure
– Validate structure is an OFFLINE (blocking) operation
• Excellent General Purpose Indexing Mechanism
• Works well for small tables
• Works well for large tables
• Experiences little, if any, degradation in retrieval
performance as the size of the underlying table grows
• We’ll investigate when to use them shortly
– But first, compression and reverse key
Index_stats.sql
10. Index Key Compression
• Remove redundant leading edge values in index
keys
– Break key into “prefix” and “suffix”
– Repeating prefix values are not stored on leaf block –
only stored once
– Each leaf block is self contained
– For example
12. Index Key Compression
• Some Facts
– Available with Oracle8i R1 (version 8.1.5) and above
– Index probably consumes less disk space
– Can reduce I/Os on the system -- both physical and
logical
– Can improve buffer cache efficiency, there is less to
cache
– May increase contention as there are now more rows
per leaf block
– May require increased CPU to access
Indc.sql
Indc2.sql
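To make the prefix/suffix idea concrete, here is a minimal sketch of a single compressed leaf block. It is a toy model, not Oracle's on-disk layout, and it is not the content of Indc.sql or Indc2.sql. The helper names and the sample keys (owner, object type, object name) are hypothetical.

```python
from itertools import groupby


def compress_leaf_block(keys, prefix_len):
    """Store each distinct prefix once, followed by the suffixes sharing it.

    `keys` must be sorted tuples, as they would be inside an index leaf block.
    """
    block = []
    for prefix, group in groupby(keys, key=lambda k: k[:prefix_len]):
        suffixes = [k[prefix_len:] for k in group]
        block.append((prefix, suffixes))
    return block


def stored_values(block):
    """Count the column values physically kept in the compressed block."""
    return sum(
        len(prefix) + sum(len(s) for s in suffixes) for prefix, suffixes in block
    )


def decompress(block):
    """Rebuild the full keys; the block is self contained, nothing else is needed."""
    return [prefix + suffix for prefix, suffixes in block for suffix in suffixes]


# Hypothetical sample keys: (owner, object_type, object_name)
keys = sorted([
    ("SYS", "TABLE", "T1"),
    ("SYS", "TABLE", "T2"),
    ("SYS", "INDEX", "I1"),
    ("SCOTT", "TABLE", "EMP"),
    ("SCOTT", "TABLE", "DEPT"),
])

uncompressed = sum(len(k) for k in keys)
block = compress_leaf_block(keys, prefix_len=2)
compressed = stored_values(block)

print(uncompressed, compressed)
```

The more often the leading columns repeat, the larger the saving. With few repeats the extra prefix bookkeeping buys nothing, which is why the facts above say "probably" rather than "always".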
14. B*Tree When to use
• I do not like rules of thumb (ROT)
• Why? Consider these two – both are valid:
– Use a B*Tree index to index columns if you are going to access a
very small number of the rows in the table via the index.
– Use a B*Tree index if you are going to process many/most/all of
the rows in a table via the index
– They conflict – but they are both valid
– Discuss
• What is small
• What is initial response versus total throughput about
• What about when the index can be used instead of the table
• I like to understand how something works – and use that to
decide
15. B*Tree When to use
• So, the ‘rules are’
– As the means to access rows in a table: You will read the index
to get to a row in the table. Here you want to access a very small
number of the rows in the table.
– As the means to answer a query: The index contains enough
information to answer the entire query, so we will not have to go to
the table at all. The index will be used as a “thinner” version of the
table.
– As the means to optimize for initial response time: You want to
retrieve all of the rows in a table, including columns that are not in
the index itself – in some sorted order – the index will possibly
allow for immediate response (but slower overall throughput).
16. B*Tree When to use
• Organization counts
– Index on primary key populated by sequence/SYSDATE.
• Data in table is mostly sorted by sequence/SYSDATE
• Data in index is sorted by sequence/SYSDATE.
• Index very efficient for range scans
• But, how often do you range scan on a primary key populated
by sequence?
• How often by date range?
– Index on LAST_NAME
• Data in table is randomly organized (you don’t hire everyone
with a last name of ‘A%’ on the same day)
• Data in index is sorted
• Index is not efficient for large range scans
o It would skip all around in the table
17. B*Tree When to use
• Enter the CLUSTERING FACTOR
– A measure of how sorted the table is by the key in the
index
– It measures how many IO’s it would take to read the
entire table via the index – row after row after row
– If table is sorted by key, clustering factor near number
of blocks in the table.
– If table is not sorted by key, clustering factor nearer
number of ROWS in table.
– Please ask yourself, how many ways can the table be
sorted on disk?
cf.sql
18. [Diagram: range scan through the primary key index; 2 total IO’s against the table]
Select * from t where pk between 1 and 8
• The index is drawn as the same B*Tree as on slide 5; the leaf entries shown are 1,Alice; 2,Bob; 3,Candy; 4,Doug; 5,Ellen; 6,Frank; 7,George; 8,Hank
• The slide also carries the statement Create index nm_idx on t(name)
• Table blocks, four rows each (Pk Name):
– Block 1: 1 Alice, 2 Sue, 3 Victor, 4 Will
– Block 2: 5 Irene, 6 Kelly, 7 Melanie, 8 Oliver
– Block 3: 9 George, 10 Candy, 11 Uwe, 12 Wally
– Block 4: 13 Ellen, 14 Tom, 15 Rick, 16 Paul
– Block 5: 17 Doug, 18 Irene, 19 Lance, 20 Jack
– Block 6: 21 Hank, 22 Frank, 23 Nicole, 24 Bob
19. [Diagram: range scan through nm_idx; 8 total IO’s against the table]
Create index nm_idx on t(name)
Select * from t where Name between 'Alice' and 'Hank'
• The index is drawn as the same B*Tree as on slide 5; the leaf entries shown are 1,Alice; 2,Bob; 3,Candy; 4,Doug; 5,Ellen; 6,Frank; 7,George; 8,Hank
• Table blocks, four rows each (Pk Name):
– Block 1: 1 Alice, 2 Sue, 3 Victor, 4 Will
– Block 2: 5 Irene, 6 Kelly, 7 Melanie, 8 Oliver
– Block 3: 9 George, 10 Candy, 11 Uwe, 12 Wally
– Block 4: 13 Ellen, 14 Tom, 15 Rick, 16 Paul
– Block 5: 17 Doug, 18 Irene, 19 Lance, 20 Jack
– Block 6: 21 Hank, 22 Frank, 23 Nicole, 24 Bob
cf.sql
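The two IO counts on slides 18 and 19 can be reproduced with a small model. This is a sketch under simplifying assumptions, not the content of cf.sql: the table layout is copied from the diagrams, the helper names are hypothetical, and a table IO is counted each time the next index entry points to a different block than the previous one. Counting the same way over the whole index gives a figure in the spirit of the clustering factor.

```python
# Table layout copied from the diagrams: 6 blocks of 4 rows, stored in PK order.
TABLE_BLOCKS = [
    [(1, "Alice"), (2, "Sue"), (3, "Victor"), (4, "Will")],
    [(5, "Irene"), (6, "Kelly"), (7, "Melanie"), (8, "Oliver")],
    [(9, "George"), (10, "Candy"), (11, "Uwe"), (12, "Wally")],
    [(13, "Ellen"), (14, "Tom"), (15, "Rick"), (16, "Paul")],
    [(17, "Doug"), (18, "Irene"), (19, "Lance"), (20, "Jack")],
    [(21, "Hank"), (22, "Frank"), (23, "Nicole"), (24, "Bob")],
]
PK, NAME = 0, 1  # column positions within a row


def index_entries(column):
    """(key, block_no) pairs in key order, like the leaf level of an index."""
    entries = [
        (row[column], block_no)
        for block_no, rows in enumerate(TABLE_BLOCKS)
        for row in rows
    ]
    return sorted(entries)


def table_ios(entries):
    """Count table block visits: a new IO each time the block changes."""
    ios = 0
    last_block = None
    for _, block_no in entries:
        if block_no != last_block:
            ios += 1
            last_block = block_no
    return ios


pk_scan = [e for e in index_entries(PK) if 1 <= e[0] <= 8]
name_scan = [e for e in index_entries(NAME) if "Alice" <= e[0] <= "Hank"]

print(table_ios(pk_scan), table_ios(name_scan))
print(table_ios(index_entries(PK)), table_ios(index_entries(NAME)))
```

Both range scans return eight rows, yet the scan by name touches four times as many table blocks. Over the full index the PK ordering needs 6 block visits (the number of blocks) while the name ordering needs 22 (close to the 24 rows), which is the contrast the clustering factor is meant to capture.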
20. B*Tree in summary
• Most common
• Well understood (and mis-understood!)
• Very scalable access times
– To return a row from a 1,000 row index takes about as
much work as from a 10,000,000 row index
• Indexing should be thought of as a design time
thing
• You might index to retrieve a few rows, all rows, or
to avoid the table in the first place
22. Bitmap Index
• Introduced in version 7.3, EE
• Designed for read mostly or read only application
• Specifically not designed or usable with OLTP
tables
– Tables undergoing concurrent modification
– Tables undergoing single row modifications
• A single bitmap key entry points to many rows
– In contrast to b*tree where there is a 1:1 relation
between keys in the index and rows in the table
bm1.sql
23. Bitmap Index – bitwise operations
JOB               BITS
----------------- ---------------------------
ANALYST           0-0-0-0-0-0-0-1-0-0-0-0-1-0
CLERK             1-0-0-0-0-0-0-0-0-0-1-1-0-1
MANAGER           0-0-0-1-0-1-1-0-0-0-0-0-0-0
PRESIDENT         0-0-0-0-0-0-0-0-1-0-0-0-0-0
SALESMAN          0-1-1-0-1-0-0-0-0-1-0-0-0-0
CLERK OR MANAGER  1-0-0-1-0-1-1-0-0-0-1-1-0-1
CLERK AND MANAGER 0-0-0-0-0-0-0-0-0-0-0-0-0-0
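The OR and AND rows of the table can be recomputed directly from the CLERK and MANAGER bitmaps. The sketch below uses the bit strings exactly as shown on the slide; `parse_bits` and `format_bits` are hypothetical helpers, and a plain Python list stands in for the bitmap (an actual bitmap index stores it in compressed form).

```python
def parse_bits(text):
    """Turn '1-0-0-1' into [1, 0, 0, 1]."""
    return [int(bit) for bit in text.split("-")]


def format_bits(bits):
    """Turn [1, 0, 0, 1] back into '1-0-0-1'."""
    return "-".join(str(bit) for bit in bits)


CLERK = parse_bits("1-0-0-0-0-0-0-0-0-0-1-1-0-1")
MANAGER = parse_bits("0-0-0-1-0-1-1-0-0-0-0-0-0-0")

# One bit per row: combine position by position.
clerk_or_manager = [a | b for a, b in zip(CLERK, MANAGER)]
clerk_and_manager = [a & b for a, b in zip(CLERK, MANAGER)]

print(format_bits(clerk_or_manager))   # rows that are CLERK or MANAGER
print(format_bits(clerk_and_manager))  # rows that are both (none)
print(sum(clerk_or_manager))           # counting is just adding up the 1's
```

No table block is touched to answer either question; the row count falls out of the bitmaps alone.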
24. Bitmap Index – structure
JOB LO-ROWID HI-ROWID BITS
--------- -------- -------- --------------------------
ANALYST AAAR4AAA AAABEEAAH 0-0-0-0-0-0-0-1-0-0-0-0-1-0
ANALYST AAAR4AAB AAABEEAAM 1-0-0-0-0-0-0-0-0-0-1-1-0-1
ANALYST AAAR4AAC AAABEEAAN 0-0-0-1-0-1-1
PRESIDENT AAAR4AAA AAABEEAAI 0-0-0-0-0-0-0-0-1-0-0-0-0-0
PRESIDENT AAAR4AAX AAABEEAAC 0-1-1-0-1-0-0-0-0-1-0-0-0-0
PRESIDENT AAAR4AAY AAABEEAAJ 1-0-1-0-0-1-1
• Key + Lo Rowid – Hi Rowid + Bitmap
• 0’s and 1’s map to rowids in that range
• If we know max number of rows/block – simple math
• alter table emp minimize records_per_block;
• Note the multiple entries per JOB!
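The "simple math" can be sketched as follows. This is a deliberately reduced model: a real rowid also encodes the data object and file, and the stored bitmap is compressed. Here a row address is just a (block, slot) pair, and `lo_block` and `rows_per_block` are assumed values chosen for illustration.

```python
def bit_to_row_address(bit_offset, lo_block, rows_per_block):
    """Map a bit position to (block, slot), counting from the low end of the range."""
    block = lo_block + bit_offset // rows_per_block
    slot = bit_offset % rows_per_block
    return block, slot


def matching_rows(bits, lo_block, rows_per_block):
    """Row addresses for every bit that is set."""
    return [
        bit_to_row_address(i, lo_block, rows_per_block)
        for i, bit in enumerate(bits)
        if bit
    ]


# First ANALYST bitmap from the table above; block 100 and 4 rows per block are assumptions.
analyst_bits = [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
rows = matching_rows(analyst_bits, lo_block=100, rows_per_block=4)
print(rows)
```

The mapping only works if every block is assumed to hold the same maximum number of rows. A smaller maximum means fewer bits reserved per block, which is the motivation for the `minimize records_per_block` setting mentioned above.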
25. Bitmap Index
• Can answer questions like:
– How many of this match (count 0’s and 1’s)
– How many of this that or the other thing match
• Bitwise and/or bitmaps
• Count 0’s and 1’s
– Good for accessing a few rows (just like B*Tree)
– Good for counting, identifying many, all, some of the
rows (just like B*Tree)
26. Bitmap Index - when
• Most common rule of thumb going is “low distinct
cardinality”
• Now, I defy you to define “low distinct cardinality”
– Is 2 “low distinct cardinality”?
– Yes it is
– No it isn’t
– It depends
• In pure ad-hoc, even high distinct cardinality
columns could/should be considered for bitmaps
• Consider
27. Bitmap Index - when
• How many men in regions 1, 10 and 30 are there in the 41
and over age group?
• How many men in region 20 or women in region 22 are 18
and under?
• How many people are in regions 11, 20 or 30
• How many over 41 year olds are there that are women?
• Etc etc etc
• Now, come up with an indexing scheme using B*Trees for
that
• And then maintain it as the questions change (and change
and change)
bm2.sql
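The questions above all reduce to combining one bitmap per column value. The sketch below is a toy illustration, not the content of bm2.sql: the sample rows are invented, `build_bitmaps` and `count` are hypothetical helpers, and a Python integer stands in for each bitmap (bit i set means row i has that value).

```python
from collections import defaultdict

# Hypothetical sample rows: (gender, region, age_group)
PEOPLE = [
    ("M", 1, "41 and over"),
    ("F", 22, "18 and under"),
    ("M", 20, "18 and under"),
    ("M", 10, "41 and over"),
    ("F", 30, "41 and over"),
    ("M", 30, "19 to 40"),
    ("F", 11, "19 to 40"),
    ("M", 30, "41 and over"),
]


def build_bitmaps(rows, column):
    """One integer bitmap per distinct value of the given column."""
    bitmaps = defaultdict(int)
    for i, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << i
    return bitmaps


def count(bitmap):
    """Number of rows whose bit is set."""
    return bin(bitmap).count("1")


# Three independent single-column "indexes".
gender = build_bitmaps(PEOPLE, 0)
region = build_bitmaps(PEOPLE, 1)
age = build_bitmaps(PEOPLE, 2)

# Men in regions 1, 10 and 30 in the 41 and over age group
q1 = gender["M"] & (region[1] | region[10] | region[30]) & age["41 and over"]
# Men in region 20 or women in region 22 who are 18 and under
q2 = ((gender["M"] & region[20]) | (gender["F"] & region[22])) & age["18 and under"]
# People in regions 11, 20 or 30
q3 = region[11] | region[20] | region[30]
# Women aged 41 and over
q4 = age["41 and over"] & gender["F"]

print(count(q1), count(q2), count(q3), count(q4))
```

Three single-column bitmaps answer every one of the questions, and a new question only needs a new AND/OR expression. Covering the same combinations with B*Trees would take a separate concatenated index for many of the column orders.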
29. Function Based Indexes
• Added in Oracle 8i release 1 (8.1.5) as a feature of
EE and PE
• In 9i, a feature of SE, EE and PE
• Great for
– Case insensitive searches/sorts
– Searching on derived attributes -- complex formulas
• Provide immediate, transparent value to the
application
• Function you index must be “pure” - deterministic
pure.sql
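The idea behind a function based index can be sketched in Python: the index stores the result of the function, not the raw column value, so a search that applies the same function can use a binary search. In Oracle the index would be declared on an expression such as `upper(ename)`; the helper names below are hypothetical and the sorted list only stands in for a B*Tree.

```python
import bisect

def build_function_index(rows, fn):
    """Sorted list of (fn(value), rowid). fn must be deterministic."""
    return sorted((fn(value), rowid) for rowid, value in enumerate(rows))

def lookup(index, fn, search_value):
    """Binary-search the index for fn(search_value); return matching rowids."""
    key = fn(search_value)
    start = bisect.bisect_left(index, (key, -1))
    rowids = []
    for k, rowid in index[start:]:
        if k != key:
            break
        rowids.append(rowid)
    return rowids

names = ["King", "SCOTT", "scott", "Adams", "Scott"]
upper_idx = build_function_index(names, str.upper)
print(lookup(upper_idx, str.upper, "scott"))
```

If the function could return different results for the same input, the stored keys would no longer match what the query computes, which is why the indexed function has to be deterministic.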
30. Function Based Indexes
• Thinking outside the box with FBI's
– Two facts
• B*Tree indexes will never have an entirely NULL
entry. If the entire key is NULL, it will not be placed
in the B*Tree
• We have function based indexes that allow us to
incorporate complex, procedural logic in them
– We can solve common problems
• Indexing only some of the rows in a table (like
indexing a where clause)
• Enforcing complex integrity in the database
• Using an index for “where column is null”
31. Function Based Indexes
• Selective Indexing
– You have a table with a flag column
– You want to index this column when the flag = 'x'
– A small % of the table is accessed by this value
– Sounds like what you've read about bitmap indexes but
• This table is modified all day long
• Bitmaps would kill concurrency
• The bitmap would grow to outrageous sizes quickly
– The answer -- a function based index
• Or a better model
• Or a better structure (AQ)
fbi.sql
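Selective indexing combines the two facts: index a function that returns NULL for the rows you do not care about, and those rows never reach the B*Tree. A minimal Python model, assuming a hypothetical queue-like table and a hypothetical `pending_only` function; `None` plays the role of NULL.

```python
def build_btree_entries(rows, key_fn):
    """Model of a B*Tree: keep (key, rowid) only when the key is not NULL."""
    entries = []
    for rowid, row in enumerate(rows):
        key = key_fn(row)
        if key is not None:
            entries.append((key, rowid))
    return sorted(entries)

def pending_only(row):
    """Hypothetical indexed function: the flag when it is 'x', else NULL."""
    return row["flag"] if row["flag"] == "x" else None

# Hypothetical table: most rows are already processed ('y').
table = [{"id": i, "flag": "x" if i % 100 == 0 else "y"} for i in range(1000)]

full_index = build_btree_entries(table, lambda row: row["flag"])
selective_index = build_btree_entries(table, pending_only)
print(len(full_index), len(selective_index))
```

The selective index holds only the flagged rows, so it stays small however large the table grows, and updates to unflagged rows do not touch it.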
32. Function Based Indexes
• Selective Uniqueness
– You have a table with versioned information in it
• A project table with status "ACTIVE" and
"INACTIVE"
– When status is "ACTIVE", some set of columns must be
unique
– When status is "INACTIVE", those columns may
contain any values -- any number of duplicates
– How can you do it?
selind.sql
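One way to do it is a unique index on an expression that yields the key only for ACTIVE rows and NULL otherwise. The sketch below uses SQLite through Python's `sqlite3` module as a stand-in for Oracle, with a hypothetical `project(name, status)` table. The mechanism differs slightly (Oracle leaves entirely NULL keys out of the index, SQLite stores them but treats NULLs as distinct), yet the visible behavior is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (name TEXT, status TEXT)")
# Unique index on an expression that is NULL for every INACTIVE row.
conn.execute(
    "CREATE UNIQUE INDEX active_name_uq "
    "ON project (CASE WHEN status = 'ACTIVE' THEN name END)"
)

def try_insert(name, status):
    """Insert one row; return False if the unique index rejects it."""
    try:
        conn.execute("INSERT INTO project VALUES (?, ?)", (name, status))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert("apollo", "INACTIVE"),
    try_insert("apollo", "INACTIVE"),  # duplicates allowed while inactive
    try_insert("apollo", "ACTIVE"),
    try_insert("apollo", "ACTIVE"),    # second active copy is rejected
]
print(results)
```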
33. Function Based Indexes
• Where column is NULL
– Nulls are not indexed, right?
– Wrong – entirely NULL key entries are not, but if *any*
part of a concatenated index is not null, an entry is made.
• Fear no longer the NULL value!
null.sql
35. Mythology
• Do nulls and indexes work together?
– Obviously, we’ve seen an example with FBI’s
– Bitmap indexes – always index nulls.
– B*Tree cluster indexes – always index nulls.
– Conventional B*Tree indexes leave a row out if and only if the
entire key is null (all of the columns)
• Why is that? See null2.sql
null2.sql
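The rule for conventional B*Tree indexes can be modeled in a few lines of Python. The table `t(x, y)` and the helper are hypothetical, and `None` stands for NULL.

```python
def btree_entries(rows, columns):
    """Model of a conventional B*Tree: skip rows whose entire key is NULL."""
    entries = []
    for rowid, row in enumerate(rows):
        key = tuple(row[c] for c in columns)
        if any(part is not None for part in key):
            entries.append((key, rowid))
    return entries

# Hypothetical table t(x, y) where y is NOT NULL and x is nullable.
t = [
    {"x": 1, "y": 10},
    {"x": None, "y": 20},
    {"x": None, "y": 30},
]

idx_x = btree_entries(t, ["x"])        # single-column index on a nullable column
idx_xy = btree_entries(t, ["x", "y"])  # concatenated index with a NOT NULL column

rows_with_null_x = [rowid for key, rowid in idx_xy if key[0] is None]
print(len(idx_x), len(idx_xy), rows_with_null_x)
```

The index on `x` alone is missing two rows, so it cannot answer "where x is null". The concatenated index has an entry for every row, because `y` guarantees the key is never entirely NULL.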
36. Mythology
• Do I need to index foreign keys?
– Probably
– If you
• Update the parent table primary key OR
• Delete from parent OR
• Merge into parent
– 9i and later – lock taken on child table for duration of update or
delete
• Which could take long since the table lock gets blocked,
blocking others
• Which could take long due to full scan of child, since there is no
index!
– 8i and before – lock taken for duration of transaction
fkey.sql
37. Mythology – why isn’t it using my index
• Maybe we are not using the leading edge?
• Create index I on T(x,y)
– Where y = value will tend to not use the index
– Unless
• We index skip scan
• We select x,y from t where y = value – and we use
the index I as a skinny table to full scan
case1.sql
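Both exceptions can be illustrated with a sorted list of (x, y) keys standing in for index I. This is a simplified Python model, not the optimizer's real access paths: the hypothetical `full_index_scan_on_y` reads the index as a skinny table, and `skip_scan_on_y` probes once per distinct leading value.

```python
import bisect

# Hypothetical index I on T(x, y): sorted (x, y) keys, x has few distinct values.
index = sorted((x, y) for x in ("F", "M") for y in range(1000))

def full_index_scan_on_y(y):
    """No leading-edge predicate: every index entry has to be inspected."""
    matches = [key for key in index if key[1] == y]
    return matches, len(index)

def skip_scan_on_y(y):
    """Skip scan: probe for (x, y) once per distinct leading value x."""
    matches, probes = [], 0
    pos = 0
    while pos < len(index):
        x = index[pos][0]
        probes += 1
        i = bisect.bisect_left(index, (x, y))
        if i < len(index) and index[i] == (x, y):
            matches.append(index[i])
        # Jump past every remaining entry for this x.
        pos = bisect.bisect_right(index, (x, float("inf")))
    return matches, probes

print(full_index_scan_on_y(42)[1], skip_scan_on_y(42))
```

The skip scan pays off only when the leading column has few distinct values. With many distinct values of x, the number of probes approaches the number of entries.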
38. Mythology – why isn’t it using my index
• Select count(*) from t
• Full scanning table, not using any existing index
• Two causes
– You are still using the RBO
– None of the columns indexed was defined “NOT NULL”
39. Mythology – why isn’t it using my index
• Select * from table where indexed_column = value
• Indexed column is on the leading edge
• No index being used
• Likely an implicit conversion
case2.sql
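Why an implicit conversion defeats the index can be sketched as follows. Comparing a character column with a number makes Oracle convert the column (in effect `to_number(indexed_column) = value`), and the index is sorted by the strings, not by their numeric value. The data and helper names below are hypothetical.

```python
import bisect

# Hypothetical character column holding numbers as strings, indexed as strings.
values = ["5", "05", "5.0", "50", "7"]
string_index = sorted((v, rowid) for rowid, v in enumerate(values))

def seek_string(literal):
    """String literal: like is compared with like, so a binary search works."""
    i = bisect.bisect_left(string_index, (literal, -1))
    hits = []
    while i < len(string_index) and string_index[i][0] == literal:
        hits.append(string_index[i][1])
        i += 1
    return hits

def scan_number(number):
    """Numeric literal: the column is converted row by row, so all rows are visited."""
    return [rowid for rowid, v in enumerate(values) if float(v) == number]

print(seek_string("5"), scan_number(5))
```

The strings '5', '05' and '5.0' are all the number 5 but sit in different places in string order, so there is no single spot in the index to seek to.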
40. Mythology – why isn’t it using my index
• This is my favorite one
• Because if the index were to be used, the query
would run incredibly slowly.
• Remember:
– Loop
• Say indexes are not all goodness
• Say full scans are not evil
• Exit when (you really believe it)
– End loop
• Following example from asktom..
41. Mythology – why isn’t it using my index
So, joe (or josephine) sql coder needs to run the following query:
select t1.object_name, t2.object_name
  from t t1, t t2
 where t1.object_id = t2.object_id
   and t1.owner = 'WMSYS'
Rows    Row Source Operation
------- ---------------------------------------------------
 528384 HASH JOIN
   8256 TABLE ACCESS FULL T
1833856 TABLE ACCESS FULL T
Suppose they ran it or explain planned it -- and saw that plan. "Stupid
stupid CBO" they say -- "I have indexes, why won't it use them. We all know
that indexes mean fast=true! Ok, let me use the faithful RBO and see what
happens"
42. Mythology – why isn’t it using my index
select /*+ RULE */ t1.object_name, t2.object_name
  from t t1, t t2
 where t1.object_id = t2.object_id
   and t1.owner = 'WMSYS'
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T'
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'T'
4 2 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
See, now that’s what I’m talking about – indexes are good…
Or are they?
43. Mythology – why isn’t it using my index
call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch    35227     5.63       9.32      23380      59350          0     528384
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    35229     5.63       9.33      23380      59350          0     528384
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 80
Rows Row Source Operation
------- ---------------------------------------------------
528384 HASH JOIN
8256 TABLE ACCESS FULL T
1833856 TABLE ACCESS FULL T
44. Mythology – why isn’t it using my index
call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch    35227   912.07    3440.70    1154555  121367981          0     528384
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total    35229   912.07    3440.70    1154555  121367981          0     528384
Misses in library cache during parse: 0
Optimizer goal: RULE
Parsing user id: 80
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'T'
2 1 NESTED LOOPS
3 2 TABLE ACCESS (FULL) OF 'T'
4 2 INDEX (RANGE SCAN) OF 'T_IDX' (NON-UNIQUE)
45. Mythology – why isn’t it using my index
SELECT phy.value,
       cur.value,
       con.value,
       1-((phy.value)/((cur.value)+(con.value))) "Cache hit ratio"
  FROM v$sysstat cur, v$sysstat con, v$sysstat phy
 WHERE cur.name='db block gets'
   AND con.name='consistent gets'
   AND phy.name='physical reads'
VALUE VALUE VALUE Cache hit ratio
-------- ---------- ---------- ---------------
1277377 58486 121661490 .989505609
98.9% cache hit, not bad eh?
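The arithmetic behind the punch line is worth spelling out. The Python lines below only recompute figures already shown in the two tkprof reports and the v$sysstat output; the dictionary keys are labels chosen for this sketch.

```python
# Figures copied from the tkprof totals (CBO hash join vs. RULE hint plan).
cbo = {"elapsed": 9.33, "logical_io": 59350, "physical_reads": 23380}
rbo = {"elapsed": 3440.70, "logical_io": 121367981, "physical_reads": 1154555}

# Figures copied from the v$sysstat query output.
physical, db_block_gets, consistent_gets = 1277377, 58486, 121661490
hit_ratio = 1 - physical / (db_block_gets + consistent_gets)

elapsed_factor = rbo["elapsed"] / cbo["elapsed"]
logical_io_factor = rbo["logical_io"] / cbo["logical_io"]

print(f"cache hit ratio          : {hit_ratio:.9f}")
print(f"elapsed, RBO vs CBO      : {elapsed_factor:.0f}x")
print(f"logical I/O, RBO vs CBO  : {logical_io_factor:.0f}x")
```

The index-driven plan took several hundred times longer and did about two thousand times the logical I/O, yet it is exactly that flood of cached block visits that produces the flattering hit ratio.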
46. Mythology
• Space is never reused in an index
– Indexes are complex data structures
– Data has a location
• If you insert monotonically increasing values
(1,2,3, …)
• And you delete many – not all, many – of the older
values over time (1,3,5,7, …)
• Then, since the value 123456 does not fit “near” the
value 2 – that space won’t be reused (the block that
2 is on)
• But indexes on say “last name” or a reverse key
index
47. Mythology
• Most discriminating elements should go first
– Say you have a copy of all objects
– You ask
• What does scott own?
• What tables does scott own?
• What about that EMP table scott owns?
– The only sensible index would be on
(owner,object_type,object_name).
– Object_name is the most ‘discriminating’
– Object_name would be the worst thing to put first
– How you query the data dictates the ordering
order.sql
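How one index on (owner, object_type, object_name) serves all three questions can be sketched with a sorted list of tuples and a prefix range scan. The catalog rows are hypothetical and the list only stands in for a B*Tree.

```python
import bisect

# Hypothetical object catalog, indexed on (owner, object_type, object_name).
index = sorted([
    ("SCOTT", "TABLE", "DEPT"),
    ("SCOTT", "TABLE", "EMP"),
    ("SCOTT", "VIEW", "EMP_V"),
    ("SYS", "TABLE", "EMP"),
    ("SYS", "TABLE", "OBJ$"),
])

def prefix_scan(*prefix):
    """Range scan: seek to the first key with this prefix, read while it matches."""
    start = bisect.bisect_left(index, prefix)
    hits = []
    for key in index[start:]:
        if key[:len(prefix)] != prefix:
            break
        hits.append(key)
    return hits

print(len(prefix_scan("SCOTT")))                  # What does scott own?
print(len(prefix_scan("SCOTT", "TABLE")))         # What tables does scott own?
print(len(prefix_scan("SCOTT", "TABLE", "EMP")))  # That EMP table scott owns
```

With object_name first, only the third question would have a usable prefix; the first two would have nothing to seek on.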
48. Interesting
• NOSEGMENT
– What would happen to my plan if I created this index?
– Would the optimizer likely choose to use it?
– ALTER SESSION SET "_use_nosegment_indexes" =
true;
– Dbms_stats.set_index_stats with your best guess might
be necessary
nosegment.sql
49. Interesting – but wait, there’s more!
• INVISIBLE
– What would happen to my plan if I created this index?
– Would the optimizer likely choose to use it?
– Invisible indexes actually
• Create the index
• Maintain the index
• But keep the index from the optimizer’s view!
• alter session set
OPTIMIZER_USE_INVISIBLE_INDEXES =true;
invisible.sql
50. Interesting – and in closing
• Where string like '%stuff'
– B*tree – nope
– Bitmap – nope
– Text – yes
• Indexes the substrings
• Case insensitive even
• Everyone has it
leading.sql
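Why indexing the substrings helps with a leading wildcard can be sketched in Python. This is not how Oracle Text is implemented; it is a hypothetical suffix dictionary that shows the principle: once every trailing substring is a key, LIKE '%stuff' becomes a single lookup instead of a scan.

```python
from collections import defaultdict

strings = ["Some Stuff", "stuffing", "more stuff", "other"]

# Hypothetical substring index: every lowercased suffix points at its rows.
suffix_index = defaultdict(set)
for rowid, s in enumerate(strings):
    lowered = s.lower()
    for start in range(len(lowered)):
        suffix_index[lowered[start:]].add(rowid)

def ends_with(pattern):
    """Answer LIKE '%pattern' with one dictionary probe, ignoring case."""
    return sorted(suffix_index.get(pattern.lower(), ()))

print(ends_with("STUFF"))
```

The price is space: each string contributes one entry per character, which is the usual trade-off for this kind of index.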