Physical Database
Design and Tuning
R&G - Chapter 20
Contents
• Physical Database Design
• Database Workloads
• Physical Design and tuning Decisions
• Need for Tuning
• Guidelines for Index Selection
• Clustering & indexing tools for index selection
• Database Tuning: Tuning index
• Tuning Conceptual schema
• Tuning queries and views
• Impact of Concurrency
• Benchmarking
Physical Database Design
• Process of producing a description of the implementation
of the database on secondary storage.
• It describes the base relations, file organizations, and
indexes used to achieve efficient access to the data, and
any associated integrity constraints and security
measures.
Physical Database Design
• We will describe the plan for how to build the tables,
including appropriate data types, field sizes, attribute
domains, and indexes.
• The plan should have enough detail that if someone else
were to use the plan to build a database, the database they
build is the same as the one you are intending to create.
• The conceptual design and logical design were
independent of physical considerations. Now, we not only
know that we want a relational model, we have selected a
database management system (DBMS) such as Access or
Oracle, and we focus on those physical considerations.
Logical vs. Physical Design:
• Logical database design is concerned with what to
store;
• physical database design is concerned with how to
store it.
Introduction
• We will be talking at length about “database
design”
– Conceptual Schema: info to capture, tables, columns,
views, etc.
– Physical Schema: indexes, clustering, etc.
• Physical design linked tightly to query optimization
– So we’ll study this “bottom up”
– But note: DB design is usually “top-down”
• conceptual then physical. Then iterate.
• We must begin by understanding the workload:
– The most important queries and how often they arise.
– The most important updates and how often they arise.
– The desired performance for these queries and updates.
Understanding the Workload
• For each query in the workload:
– Which relations does it access?
– Which attributes are retrieved?
– Which attributes are involved in selection/join conditions?
How selective are these conditions likely to be?
• For each update in the workload:
– Which attributes are involved in selection/join conditions?
How selective are these conditions likely to be?
– The type of update (INSERT/DELETE/UPDATE), and the
attributes that are affected.
– For UPDATE commands, the fields that are modified.
Creating an ISUD Chart
Employee Table
Transaction    Frequency   % table   Name   Salary   Address
Payroll Run    monthly     100       S      S        S
Add Emps       daily       0.1       I      I        I
Delete Emps    daily       0.1       D      D        D
Give Raises    monthly     10        S      U        NA
Key: I = Insert, S = Select, U = Update, D = Delete (ISUD frequencies)
Physical Design and Tuning Decisions
• Choice of indexes to create
– Which relations should be indexed, and on which field(s)?
– Which field(s) should be the search key?
– Should we build several indexes?
– For each index, should it be clustered or unclustered?
• Tuning the conceptual schema
– Alternative normalized schemas
– Denormalization
– Vertical partitioning
– Views
• Query and transaction tuning
– Frequently executed queries are rewritten to run faster.
Need for Database Tuning
• A detailed workload is hard to obtain at initial design time.
• The distinction between design and tuning is somewhat arbitrary.
• The design process can be considered over once a conceptual schema and a set of clustering and indexing decisions have been made.
• Tuning is then any subsequent change to the conceptual schema or the indexes.
Index Selection
• One approach:
– Consider most important queries.
– Consider best plan using the current indexes, and see if
better plan is possible with an additional index.
– If so, create it.
• Before creating an index, must also consider the
impact on updates in the workload.
– Trade-off of slowing some updates in order to speed up
some queries.
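The approach above can be tried out on a small scale. The following is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in DBMS; the Emp table contents and the index name emp_dno are hypothetical. It compares the plan chosen for one query before and after an index is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Emp table with synthetic rows, for illustration only.
conn.execute(
    "CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, "
    "age INTEGER, sal REAL, dno INTEGER)"
)
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?, ?, ?)",
    [(i, f"emp{i}", 20 + i % 40, 1000.0 + i, i % 50) for i in range(1000)],
)

def plan(sql):
    """Return the human-readable steps of SQLite's plan for sql."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

query = "SELECT ename FROM Emp WHERE dno = 7"
plan_before = plan(query)                       # no usable index yet
conn.execute("CREATE INDEX emp_dno ON Emp(dno)")
plan_after = plan(query)                        # the new index is now available

print("before:", plan_before)
print("after: ", plan_after)
```

The same comparison should be repeated for the important updates, since every extra index must be maintained on INSERT, DELETE, and UPDATE.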
Whether to index (Guideline 1)
• Do not build an index unless some query, including the query components of updates, benefits from it.
• Whenever possible, choose indexes that speed up more than one query.
Multi-attribute search keys (Guideline 3)
– Indexes with multi-attribute search keys should be considered in two situations:
1. A WHERE clause includes conditions on more than one attribute of a relation.
2. They enable index-only evaluation strategies (i.e., accessing the relation can be avoided) for important queries.
Whether to cluster (Guideline 4)
• As a rule of thumb, range queries are likely
to benefit the most from clustering
• If an index enables an index-only evaluation
strategy for the query it is intended to speed
up, the index need not be clustered
Hash versus tree index (Guideline 5)
– A hash index is better in the following situations:
• The index is intended to support an index nested loops join; the indexed relation is the inner relation, and the search key includes the join columns.
• There is a very important equality query, and no range queries, involving the search key attributes.
Balancing the cost of Index
Maintenance (Guideline 6)
• If maintaining an index slows down frequent update operations, consider dropping the index.
• Keep in mind, however, that adding an index may well speed up a given update operation.
– E.g., an index on employee IDs could speed up the operation of increasing the salary of an employee (specified by ID).
Example 1
SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE D.dname='Toy' AND E.dno=D.dno
• Hash index on D.dname supports the 'Toy' selection.
– Given this, an index on D.dno is not needed.
• Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple.
• What if WHERE included: "... AND E.age=25"?
– Could retrieve Emp tuples using an index on E.age, then join with Dept tuples satisfying the dname selection. Comparable to the strategy that used the E.dno index.
– So, if an E.age index is already created, this query provides much less motivation for adding an E.dno index.
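The index choices discussed for Example 1 can be tried out as follows. This is only a sketch under stated assumptions: SQLite (via Python's sqlite3) is used as the DBMS, it offers B-tree indexes only, so the hash indexes of the example are approximated by B-tree indexes; the index names dept_dname and emp_dno and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dept (dno INTEGER, dname TEXT, mgr TEXT);
    CREATE TABLE Emp  (eid INTEGER, ename TEXT, age INTEGER, dno INTEGER);
    -- SQLite has no hash indexes; these B-tree indexes stand in for them.
    CREATE INDEX dept_dname ON Dept(dname);
    CREATE INDEX emp_dno    ON Emp(dno);
    INSERT INTO Dept VALUES (1, 'Toy', 'Ann'), (2, 'Shoe', 'Bob');
    INSERT INTO Emp  VALUES (1, 'Al', 25, 1), (2, 'Bea', 30, 2), (3, 'Cy', 25, 1);
""")

query = """
    SELECT E.ename, D.mgr
    FROM Emp E, Dept D
    WHERE D.dname = 'Toy' AND E.dno = D.dno
"""
# Each plan step names the table accessed and the index used, if any.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
rows = sorted(conn.execute(query).fetchall())

print(plan_steps)
print(rows)
```

No index on D.dno is created here, in line with the observation that the dname index makes it unnecessary for this query.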
Example 2
SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE E.sal BETWEEN 10000 AND 20000
AND E.hobby='Stamps' AND E.dno=D.dno
• All selections are on Emp, so it should be the outer relation in any index NL join.
– Suggests that we build a B+ tree index on D.dno.
• What index should we build on Emp?
– A B+ tree on E.sal could be used, OR an index on E.hobby could be used. Only one of these is needed, and which is better depends upon the selectivity of the conditions.
• As a rule of thumb, equality selections are more selective than range selections.
• As both examples indicate, our choice of indexes is guided by the plan(s) that we expect an optimizer to consider for a query. Have to understand optimizers!
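Which of the two Emp indexes is more useful depends on how selective each condition is, and this can be measured directly on the data. The sketch below assumes SQLite via Python's sqlite3 and entirely synthetic data (the salary distribution and the share of stamp collectors are made-up assumptions); it only illustrates the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Emp (eid INTEGER, sal INTEGER, hobby TEXT)")
# Synthetic data (an assumption): salaries 5000, 5020, ..., 24980, and
# every 20th employee collects stamps.
conn.executemany(
    "INSERT INTO Emp VALUES (?, ?, ?)",
    [(i, 5000 + 20 * i, "Stamps" if i % 20 == 0 else "Chess") for i in range(1000)],
)

def selectivity(condition):
    """Fraction of Emp tuples that satisfy the given SQL condition."""
    total = conn.execute("SELECT COUNT(*) FROM Emp").fetchone()[0]
    hits = conn.execute(f"SELECT COUNT(*) FROM Emp WHERE {condition}").fetchone()[0]
    return hits / total

sal_sel = selectivity("sal BETWEEN 10000 AND 20000")
hobby_sel = selectivity("hobby = 'Stamps'")
print(f"range on sal:      {sal_sel:.3f}")
print(f"equality on hobby: {hobby_sel:.3f}")
```

With this particular data the equality condition retains far fewer tuples than the range condition, so an index on E.hobby would be the better candidate; other data could reverse the outcome.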
Clustering and Indexing
• Clustered indexes can be especially important when accessing the inner relation in an index nested loops join.
• Revisit the query from Example 1:
SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE D.dname='Toy' AND E.dno=D.dno
• Should the indexes used be clustered?
• Only a few tuples are likely to satisfy D.dname='Toy', so an unclustered index on dname suffices.
• On the other hand, Emp is the inner relation in an index NL join and dno is not a candidate key, so the index on E.dno should be clustered.
Examples of Clustering
SELECT E.dno
FROM Emp E
WHERE E.age>40
• B+ tree index on E.age can be used to get qualifying tuples.
– How selective is the condition?
– Is the index clustered?
SELECT E.dno, COUNT (*)
FROM Emp E
WHERE E.age>10
GROUP BY E.dno
• Consider the GROUP BY query.
– If many tuples have E.age > 10, using the E.age index and sorting the retrieved tuples may be costly.
– Clustered E.dno index may be better!
SELECT E.dno
FROM Emp E
WHERE E.hobby='Stamps'
• Equality queries and duplicates:
– Clustering on E.hobby helps!
Impact of Clustering
Co-clustering Two Relations
• It can speed up joins, in particular key-foreign key joins corresponding to 1:N relationships.
• A sequential scan of either relation becomes slower.
• All inserts, deletes, and updates that alter record lengths become slower because of the overhead involved in maintaining the clustering.
Index-Only Plans
• A number of queries can be answered without retrieving any tuples from one or more of the relations involved if a suitable index is available.
SELECT D.mgr
FROM Dept D, Emp E
WHERE D.dno=E.dno
– Index: <E.dno>
SELECT D.mgr, E.eid
FROM Dept D, Emp E
WHERE D.dno=E.dno
– Index: <E.dno,E.eid>
SELECT E.dno, COUNT(*)
FROM Emp E
GROUP BY E.dno
– Index: <E.dno>
SELECT E.dno, MIN(E.sal)
FROM Emp E
GROUP BY E.dno
– Index: <E.dno,E.sal> (B-tree trick!)
SELECT AVG(E.sal)
FROM Emp E
WHERE E.age=25 AND
E.sal BETWEEN 3000 AND 5000
– Index: <E.age,E.sal> or <E.sal,E.age>
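An index-only plan can be observed in a system that reports it. The following is a minimal sketch, assuming SQLite via Python's sqlite3, where such an index is reported as a "covering index"; the index name emp_dno_sal and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp (eid INTEGER, ename TEXT, age INTEGER, sal REAL, dno INTEGER);
    -- Composite search key <E.dno, E.sal>, as in the MIN(E.sal) example.
    CREATE INDEX emp_dno_sal ON Emp(dno, sal);
    INSERT INTO Emp VALUES
        (1, 'Al', 25, 3000.0, 1), (2, 'Bea', 25, 4000.0, 1), (3, 'Cy', 30, 5000.0, 2);
""")

query = "SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno"
# The plan text says whether the query is answered from the index alone.
plan_steps = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
rows = sorted(conn.execute(query).fetchall())

print(plan_steps)
print(rows)
```

Because every column the query touches (dno and sal) is part of the index key, the Emp tuples themselves never need to be fetched.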
Tools to Assist in Index Selection
• First generation of such tools:
– Index tuning wizards, or
– Index advisors
• Drawback of these systems
– They had to replicate the database query optimizer's cost model
• The DB2 Index Advisor
– Tool for automatic index recommendation given a workload
– Workload table: ADVISE_WORKLOAD
– It is populated either
• by SQL statements from the DB2 dynamic SQL statement cache (recently executed SQL statements),
• with SQL statements from packages (statically compiled statements), or
• with SQL statements from an online monitor called Query Patroller
• Output: SQL DDL statements whose execution creates the recommended indexes
Tools to Assist in Index Selection
• The Microsoft SQL Server 2000 Index Tuning Wizard
– Tuning wizard integrated with the database query optimizer
– Three tuning modes (fast, medium, and thorough) let the user trade off the running time of the analysis against the number of candidate index configurations examined; fast has the lowest running time and thorough examines the largest number of configurations
– A sampling mode further reduces the running time
– Parameters include the maximum space allowed for the recommended indexes
– Allows table scaling
Overview of Database Tuning
• Actual use of the database provides a valuable source of detailed information that can be used to refine the initial design
• Original assumptions about the workload can be replaced by observed usage patterns
• The initial workload specification is validated
• Initial guesses about the size of the data can be replaced with actual statistics
• Tuning is important to get the best possible performance
• Three kinds of tuning: tuning indexes, tuning the conceptual schema, and tuning queries
Tuning indexes
• Queries and updates considered important in the initial workload specification may turn out not to be very frequent
• The observed workload may also identify some new queries and updates
• The initial choice of indexes has to be reviewed in light of this new information
• Some original indexes may be dropped and new ones added
Tuning indexes (continued)
SELECT E.ename, D.mgr
FROM Emp E, Dept D
WHERE D.dname='Toy' AND E.dno=D.dno
• The expected plan for this query uses an index-only scan with Emp as the inner relation
• If this query takes an unexpectedly long time to execute, consider replacing the existing index on the dno field with a clustered index
Tuning indexes (continued)
• In addition, we have to periodically reorganize indexes
– E.g., a static index (ISAM) may have developed long overflow chains; dropping and rebuilding it, if feasible, improves access time through this index
– For a dynamic structure (B+ tree), if the implementation does not merge pages on deletes, space occupancy can decrease considerably in some situations. This in turn makes the size of the index (in pages) larger than necessary, and could increase the height and therefore the access time
Tuning conceptual Schema
• If the initial schema does not meet our performance objectives for the given workload with any set of physical design choices, we may have to redesign the conceptual schema
• Such a change is called schema evolution
• Issues involved in tuning the conceptual schema:
– We may decide to settle for a 3NF design instead of BCNF
– Among alternative 3NF or BCNF decompositions, our choice should be guided by the workload
– Sometimes we might decide to further decompose a relation that is already in BCNF
– We might denormalize
– We might partition a relation
Tuning Queries and Views
• If a query runs slower than expected, check if an index needs to be re-built, or if statistics are too old.
• Sometimes, the DBMS may not be executing the plan you had
in mind. Common areas of optimizer weakness:
– Selections involving null values (bad selectivity estimates)
– Selections involving arithmetic or string expressions (ditto)
– Selections involving OR conditions (ditto)
– Complex, correlated subqueries
– Lack of evaluation features like index-only strategies or certain join
methods or poor size estimation.
• Check the plan that is being used! Then adjust the choice of
indexes or rewrite the query/view.
– E.g., check via the PostgreSQL EXPLAIN command
– Some systems rewrite for you under the covers (e.g. DB2)
• Can be confusing and/or helpful!
More Guidelines for Query Tuning
• Minimize the use of DISTINCT: don’t need it if
duplicates are acceptable, or if answer contains a
key.
• Minimize the use of GROUP BY and HAVING. For example:
SELECT MIN (E.age)
FROM Employee E
GROUP BY E.dno
HAVING E.dno=102
can be written more simply as:
SELECT MIN (E.age)
FROM Employee E
WHERE E.dno=102
• Consider DBMS use of index when writing arithmetic expressions: E.age=2*D.age will benefit from an index on E.age, but might not benefit from an index on D.age!
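The GROUP BY / HAVING rewrite can be checked on sample data. This is a small sketch assuming SQLite via Python's sqlite3 and a made-up Employee table; it also probes one edge case, a department number with no employees (999 here is an arbitrary, hypothetical value).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (eid INTEGER, age INTEGER, dno INTEGER);
    INSERT INTO Employee VALUES (1, 30, 102), (2, 25, 102), (3, 20, 103);
""")

HAVING_FORM = "SELECT MIN(E.age) FROM Employee E GROUP BY E.dno HAVING E.dno = ?"
WHERE_FORM = "SELECT MIN(E.age) FROM Employee E WHERE E.dno = ?"

# Department 102 exists: both forms return the same single row.
with_having = conn.execute(HAVING_FORM, (102,)).fetchall()
with_where = conn.execute(WHERE_FORM, (102,)).fetchall()

# Department 999 has no employees: the GROUP BY form yields no group at all,
# while the plain aggregate yields one row containing NULL.
empty_having = conn.execute(HAVING_FORM, (999,)).fetchall()
empty_where = conn.execute(WHERE_FORM, (999,)).fetchall()

print(with_having, with_where)
print(empty_having, empty_where)
```

The two forms agree whenever the department has at least one employee; the empty-department difference is worth keeping in mind if an application depends on the exact shape of the result.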
Guidelines for Query Tuning (Contd.)
• Avoid using intermediate relations:
SELECT * INTO Temp
FROM Emp E, Dept D
WHERE E.dno=D.dno
AND D.mgrname='Joe'
and
SELECT T.dno, AVG(T.sal)
FROM Temp T
GROUP BY T.dno
vs.
SELECT E.dno, AVG(E.sal)
FROM Emp E, Dept D
WHERE E.dno=D.dno
AND D.mgrname='Joe'
GROUP BY E.dno
• The single query does not materialize the intermediate relation Temp.
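The two formulations can be compared on a toy database. The sketch below assumes SQLite via Python's sqlite3; SQLite has no SELECT ... INTO, so CREATE TABLE ... AS is used instead, the intermediate relation is named TempRel (a hypothetical stand-in for Temp), and only the two columns needed later are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Dept (dno INTEGER, mgrname TEXT);
    CREATE TABLE Emp  (eid INTEGER, sal REAL, dno INTEGER);
    INSERT INTO Dept VALUES (1, 'Joe'), (2, 'Joe'), (3, 'Sue');
    INSERT INTO Emp  VALUES (1, 100.0, 1), (2, 200.0, 1), (3, 300.0, 2), (4, 400.0, 3);
""")

# Two-step version: materialize the join result, then aggregate over it.
conn.execute("""
    CREATE TABLE TempRel AS
    SELECT E.dno AS dno, E.sal AS sal
    FROM Emp E, Dept D
    WHERE E.dno = D.dno AND D.mgrname = 'Joe'
""")
two_step = sorted(conn.execute(
    "SELECT T.dno, AVG(T.sal) FROM TempRel T GROUP BY T.dno"
))

# One-step version: the optimizer sees the whole query and nothing is stored.
one_step = sorted(conn.execute("""
    SELECT E.dno, AVG(E.sal)
    FROM Emp E, Dept D
    WHERE E.dno = D.dno AND D.mgrname = 'Joe'
    GROUP BY E.dno
"""))

print(two_step)
print(one_step)
```

Both versions produce the same averages, but only the two-step version writes an extra table that must be created, populated, and eventually dropped.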
Choices in Tuning The Conceptual
Schema
– Consider the following schema
• Contracts(cid: integer, supplierid: integer, projectid: integer, deptid: integer, partid: integer, qty: integer, value: real)
• Departments(did: integer, budget: real, annualreport: varchar)
• Parts(pid: integer, cost: integer)
• Projects(jid: integer, mgr: char(20))
• Suppliers(sid: integer, address: char(50))
Choices in Tuning The Conceptual
Schema contd…
• The relation Contracts is denoted as CSJDPQV
– The meaning of a tuple in this relation is that the contract with cid C is an agreement that supplier S (with sid equal to supplierid) will supply Q items of part P (with pid equal to partid) to project J (with jid equal to projectid) associated with department D (with did equal to deptid), and that the value V of this contract is equal to value
Choices in Tuning The Conceptual
Schema contd…
• There are two known integrity constraints with respect to Contracts
• 1. A project purchases a given part using a single contract
• JP → C
• 2. A department purchases at most one part from any given supplier
• SD → P
Settling for a Weaker Normal Form
• Consider the Contracts relation and determine which normal form it is in
• Candidate keys for this relation include C and JP
• The only nonkey dependency is SD → P, and P is a prime attribute because it is part of candidate key JP
• Hence Contracts is in 3NF, but not in BCNF, because SD is not a superkey
• We can decompose it to obtain a BCNF design
• Using SD → P to guide the decomposition gives SDP and CSJDQV; adding CJP yields a lossless-join, dependency-preserving decomposition into BCNF: CJP, SDP, and CSJDQV
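The normal-form argument can be checked mechanically with attribute closures. The sketch below is an illustration only; it assumes exactly three dependencies (C determines every attribute because C is a key, JP → C, and SD → P) and finds minimal keys by brute force. All function names are made up for this example.

```python
from itertools import combinations

ATTRS = set("CSJDPQV")
# Dependencies taken from the slides: C is a key, JP -> C, SD -> P.
FDS = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

def closure(attrs, fds):
    """Attribute closure of attrs under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attrs, fds):
    """Brute-force search for minimal attribute sets whose closure is attrs."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == attrs:
                keys.append(candidate)
    return keys

keys = candidate_keys(ATTRS, FDS)
prime = set().union(*keys)  # attributes that appear in some candidate key

def violates_bcnf(lhs, rhs):
    """True for a nontrivial FD whose left side is not a superkey."""
    return not set(rhs) <= set(lhs) and closure(lhs, FDS) != ATTRS

def violates_3nf(lhs, rhs):
    """True for a BCNF violation whose right side has a non-prime attribute."""
    return violates_bcnf(lhs, rhs) and not (set(rhs) - set(lhs)) <= prime

print(sorted("".join(sorted(k)) for k in keys))
print(violates_bcnf("SD", "P"), violates_3nf("SD", "P"))
```

Under these three dependencies alone, the search reports C and JP as keys and additionally JSD (since SD → P and then JP → C). The conclusion is unchanged: SD → P violates BCNF but not 3NF, because P is a prime attribute.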
Horizontal Decompositions
• Usual Def. of decomposition: Relation is replaced by
collection of relations that are projections. Most
important case.
– We will talk about this at length as part of Conceptual DB
Design
• Sometimes, might want to replace relation by a
collection of relations that are selections.
– Each new relation has same schema as original, but subset
of rows.
– Collectively, new relations contain all rows of the original.
– Typically, the new relations are disjoint.
Horizontal Decompositions (Contd.)
• Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val)
• Suppose that contracts with value > 10000 are
subject to different rules.
– So queries on Contracts will often say WHERE val>10000.
• One approach: clustered B+ tree index on the val
field.
• Second approach: replace contracts by two new
relations, LargeContracts and SmallContracts, with
the same attributes (CSJDPQV).
– Performs like index on such queries, but no index overhead.
– Can build clustered indexes on other attributes, in addition!
Masking Conceptual Schema Changes
• The horizontal decomposition from above can be masked by a view.
– NOTE: queries with condition val>10000 must be asked wrt
LargeContracts for efficiency: so some users may have to
be aware of change.
• I.e. the users who were having performance problems
• Arguably that’s OK -- they wanted a solution!
CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val)
AS SELECT *
FROM LargeContracts
UNION
SELECT *
FROM SmallContracts
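The decomposition and the masking view can be exercised end to end. The following is a sketch assuming SQLite via Python's sqlite3; the CHECK constraints, the routing rule in the insert loop, and the sample rows are illustrative assumptions rather than part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = ("cid INTEGER, sid INTEGER, jid INTEGER, did INTEGER, "
        "pid INTEGER, qty INTEGER, val REAL")
conn.executescript(f"""
    -- Same schema in both partitions; the CHECKs keep them disjoint.
    CREATE TABLE LargeContracts ({cols}, CHECK (val > 10000));
    CREATE TABLE SmallContracts ({cols}, CHECK (val <= 10000));
    CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val) AS
        SELECT * FROM LargeContracts
        UNION
        SELECT * FROM SmallContracts;
""")

sample = [
    (1, 10, 100, 5, 7, 3, 25000.0),
    (2, 11, 101, 5, 8, 1, 500.0),
    (3, 10, 102, 6, 7, 9, 12000.0),
]
for row in sample:
    # Route each contract to the partition that matches its value.
    table = "LargeContracts" if row[6] > 10000 else "SmallContracts"
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?)", row)

# Existing queries keep working against the view ...
all_cids = sorted(r[0] for r in conn.execute("SELECT cid FROM Contracts"))
# ... while performance-sensitive queries go straight to one partition.
large_cids = sorted(r[0] for r in conn.execute("SELECT cid FROM LargeContracts"))

print(all_cids, large_cids)
```

Queries that only care about high-value contracts read LargeContracts directly and never touch the rows in SmallContracts.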
Impact of Concurrency
• In a system with many concurrent users,
several additional points must be considered
• A transaction obtains locks on the pages that it reads or writes, and other transactions may be blocked
• Two specific ways to reduce blocking:
– Reduce the time that transactions hold locks
– Reduce hot spots
Reducing Lock Durations
• Delay lock requests
– Tune transactions by writing to local program variables and deferring changes to the database until the end of the transaction
• Make transactions faster
– Tune indexes and rewrite queries
– Carefully partition the tuples in a relation and its associated indexes across a collection of disks
• Replace long transactions by short ones
– Rewrite a long transaction as two or more smaller transactions
Reducing Lock Durations contd…
• Build a warehouse
– Complex queries, such as statistical analyses of business trends, can hold shared locks for a long time
– Such queries can often run on a copy of the data that is a little out of date
• Consider a lower isolation level
– Appropriate in many situations, such as queries generating aggregate information or statistical summaries
– Use a lower SQL isolation level such as REPEATABLE READ or READ COMMITTED
Reducing Hot Spots
• Delay operations on hot spots
– Especially important for requests involving frequently used objects
• Optimize access patterns
– The pattern of updates to a relation can be significant
• Partition operations on hot spots
– E.g., batch appends
• Choice of index
– In a frequently updated relation, B+ tree indexes can become a bottleneck because the root and index pages become hot spots
– Specialized locking protocols help (fine-granularity locks)
– An ISAM index can be a better choice (only leaf pages need locks)
DBMS Benchmarking
• Includes benchmarks for measuring the
performance of a certain class of applications
(e.g., the TPC benchmarks) and
• benchmarks for measuring how well a DBMS
performs various operations (e.g., the Wisconsin
benchmark)
– Benchmarks should be portable, easy to understand,
and scale naturally to larger problem instances. They
should measure peak performance (e.g., transactions
per second, or tps) as well as price/performance ratios
(e.g., $/tps) for typical workloads in a given application
domain
• The Transaction Processing Performance Council (TPC) was created to define benchmarks for transaction processing and database systems
• Well-known DBMS benchmarks
– The TPC-A and TPC-B benchmarks constitute the standard definitions of the tps and $/tps measures
– TPC-A measures the performance and price of a computer network in addition to the DBMS, whereas the TPC-B benchmark considers the DBMS by itself
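The two measures mentioned above reduce to simple ratios. The sketch below uses made-up numbers and hypothetical function names; the actual TPC specifications define in detail which transactions count and which costs are included in the system price.

```python
def throughput_tps(transactions_completed, elapsed_seconds):
    """Transactions per second over a measurement interval."""
    return transactions_completed / elapsed_seconds

def price_per_tps(total_system_price, tps):
    """Price/performance ratio: currency units per unit of throughput."""
    return total_system_price / tps

# Hypothetical run: 90,000 transactions in 10 minutes on a $300,000 system.
tps = throughput_tps(90_000, 600)
cost = price_per_tps(300_000.0, tps)
print(f"{tps:.1f} tps, ${cost:.2f}/tps")
```

A cheaper system with lower peak throughput can still come out ahead on $/tps, which is why benchmarks report both numbers.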
DBMS Benchmarking
– The TPC-C benchmark is a more complex suite of
transactional tasks than TPC-A and TPC-B
– It models a warehouse that tracks items supplied to
customers and involves five types of transactions
– Much more expensive than TPC-A and TPC-B
– exercises a much wider range of system capabilities
– TPC-D represents a broad range of decision support (DS) applications that require complex, long-running queries against large, complex data structures.
DBMS Benchmarking
• The TPC Benchmark™H (TPC-H) is a decision support
benchmark.
• It consists of a suite of business oriented ad-hoc queries and
concurrent data modifications.
• The queries and the data populating the database have been
chosen to have broad industry-wide relevance.
• This benchmark illustrates decision support systems that
examine large volumes of data, execute queries with a high
degree of complexity, and give answers to critical business
questions.
Points to Remember
• Indexes must be chosen to speed up important
queries (and perhaps some updates!).
– Index maintenance overhead on updates to key fields.
– Choose indexes that can help many queries, if possible.
– Build indexes to support index-only strategies.
– Clustering is an important decision; only one index on a
given relation can be clustered!
– Order of fields in composite index key can be important.
• Static indexes may have to be periodically re-built.
• Statistics have to be periodically updated.
Points to remember (Contd.)
• Over time, indexes have to be fine-tuned (dropped,
created, re-clustered, ...) for performance.
– Should determine the plan used by the system, and adjust
the choice of indexes appropriately.
• System may still not find a good plan:
– Only left-deep plans?
– Null values, arithmetic conditions, string expressions, the
use of ORs, nested queries, etc. can confuse an optimizer.
• So, may have to rewrite the query/view:
– Avoid nested queries, temporary relations, complex
conditions, and operations like DISTINCT and GROUP BY.

More Related Content

Viewers also liked

SQL Functions and Operators
SQL Functions and OperatorsSQL Functions and Operators
SQL Functions and Operators
Mohan Kumar.R
 
Sql data types for various d bs by naveen kumar veligeti
Sql data types for various d bs by naveen kumar veligetiSql data types for various d bs by naveen kumar veligeti
Sql data types for various d bs by naveen kumar veligeti
Naveen Kumar Veligeti
 
Optimizing Queries with Explain
Optimizing Queries with ExplainOptimizing Queries with Explain
Optimizing Queries with ExplainMYXPLAIN
 
Open GIS Data
Open GIS DataOpen GIS Data
Open GIS Data
Amazon Web Services
 
CAPSTONE-FINAL-5-11-15
CAPSTONE-FINAL-5-11-15CAPSTONE-FINAL-5-11-15
CAPSTONE-FINAL-5-11-15Rusty Mooney
 
Conférence TechnoArk 2016 - 06 alpiq
Conférence TechnoArk 2016 - 06 alpiqConférence TechnoArk 2016 - 06 alpiq
Conférence TechnoArk 2016 - 06 alpiq
Laurent Borella
 
¿Hotel 7 Estrellas?
¿Hotel 7 Estrellas?¿Hotel 7 Estrellas?
¿Hotel 7 Estrellas?
Nickyto
 
Course syllabus ป.ตรี ฉ3
Course syllabus ป.ตรี ฉ3Course syllabus ป.ตรี ฉ3
Course syllabus ป.ตรี ฉ3sewnipa
 
Conférence TechnoArk 2016 - 13 hesso-genoud
Conférence TechnoArk 2016 - 13 hesso-genoudConférence TechnoArk 2016 - 13 hesso-genoud
Conférence TechnoArk 2016 - 13 hesso-genoud
Laurent Borella
 
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3Rusty Mooney
 
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
bernardcarnoy
 
HTML5 JS APIs
HTML5 JS APIsHTML5 JS APIs
HTML5 JS APIs
Remy Sharp
 
TK Polen
TK PolenTK Polen
TK Polen
BeataGyori
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
Snowflake Computing
 
Routines et improvisation, à propos de Computation and Human Experience de Ph...
Routines et improvisation, à propos de Computation and Human Experience de Ph...Routines et improvisation, à propos de Computation and Human Experience de Ph...
Routines et improvisation, à propos de Computation and Human Experience de Ph...
Alexandre Monnin
 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep DivesRush Shah
 
Ireland's Economic & Competitiveness Update - Q3 2016
Ireland's Economic & Competitiveness Update - Q3 2016Ireland's Economic & Competitiveness Update - Q3 2016
Ireland's Economic & Competitiveness Update - Q3 2016
IDA-Ireland
 
dashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systemsdashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systems
IBM Cloud Data Services
 
Plan de Accion
Plan de AccionPlan de Accion
Plan de Accion
Dimmy Durandis
 

Viewers also liked (19)

SQL Functions and Operators
SQL Functions and OperatorsSQL Functions and Operators
SQL Functions and Operators
 
Sql data types for various d bs by naveen kumar veligeti
Sql data types for various d bs by naveen kumar veligetiSql data types for various d bs by naveen kumar veligeti
Sql data types for various d bs by naveen kumar veligeti
 
Optimizing Queries with Explain
Optimizing Queries with ExplainOptimizing Queries with Explain
Optimizing Queries with Explain
 
Open GIS Data
Open GIS DataOpen GIS Data
Open GIS Data
 
CAPSTONE-FINAL-5-11-15
CAPSTONE-FINAL-5-11-15CAPSTONE-FINAL-5-11-15
CAPSTONE-FINAL-5-11-15
 
Conférence TechnoArk 2016 - 06 alpiq
Conférence TechnoArk 2016 - 06 alpiqConférence TechnoArk 2016 - 06 alpiq
Conférence TechnoArk 2016 - 06 alpiq
 
¿Hotel 7 Estrellas?
¿Hotel 7 Estrellas?¿Hotel 7 Estrellas?
¿Hotel 7 Estrellas?
 
Course syllabus ป.ตรี ฉ3
Course syllabus ป.ตรี ฉ3Course syllabus ป.ตรี ฉ3
Course syllabus ป.ตรี ฉ3
 
Conférence TechnoArk 2016 - 13 hesso-genoud
Conférence TechnoArk 2016 - 13 hesso-genoudConférence TechnoArk 2016 - 13 hesso-genoud
Conférence TechnoArk 2016 - 13 hesso-genoud
 
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3
FINAL- Fostering Socialization among Residents in Long-term Care Setting-2-3
 
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
Réussir la transition énergétique - Présentation pour CapitalatWork au Châtea...
 
HTML5 JS APIs
HTML5 JS APIsHTML5 JS APIs
HTML5 JS APIs
 
TK Polen
TK PolenTK Polen
TK Polen
 
Introducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data WarehouseIntroducing the Snowflake Computing Cloud Data Warehouse
Introducing the Snowflake Computing Cloud Data Warehouse
 
Routines et improvisation, à propos de Computation and Human Experience de Ph...
Routines et improvisation, à propos de Computation and Human Experience de Ph...Routines et improvisation, à propos de Computation and Human Experience de Ph...
Routines et improvisation, à propos de Computation and Human Experience de Ph...
 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep Dives
 
Ireland's Economic & Competitiveness Update - Q3 2016
Ireland's Economic & Competitiveness Update - Q3 2016Ireland's Economic & Competitiveness Update - Q3 2016
Ireland's Economic & Competitiveness Update - Q3 2016
 
dashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systemsdashDB: the GIS professional’s bridge to mainstream IT systems
dashDB: the GIS professional’s bridge to mainstream IT systems
 
Plan de Accion
Plan de AccionPlan de Accion
Plan de Accion
 

Similar to Hpd 1

Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)Top 10 tips for Oracle performance (Updated April 2015)
Top 10 tips for Oracle performance (Updated April 2015)
Guy Harrison
 
Algorithms and Data Structures
Algorithms and Data StructuresAlgorithms and Data Structures
Algorithms and Data Structures
sonykhan3
 
Predicting the NBA MVP
Predicting the NBA MVPPredicting the NBA MVP
Predicting the NBA MVP
Thinkful
 
Optimising Queries - Series 1 Query Optimiser Architecture
Optimising Queries - Series 1 Query Optimiser ArchitectureOptimising Queries - Series 1 Query Optimiser Architecture
Optimising Queries - Series 1 Query Optimiser Architecture
DAGEOP LTD
 
Lecture2 (1).ppt
Lecture2 (1).pptLecture2 (1).ppt
Lecture2 (1).ppt
Minakshee Patil
 
algo 1.ppt
algo 1.pptalgo 1.ppt
algo 1.ppt
example43
 
Informational Referential Integrity Constraints Support in Apache Spark with ...
Informational Referential Integrity Constraints Support in Apache Spark with ...Informational Referential Integrity Constraints Support in Apache Spark with ...
Informational Referential Integrity Constraints Support in Apache Spark with ...
Databricks
 
Statistics and Indexes Internals
Statistics and Indexes InternalsStatistics and Indexes Internals
Statistics and Indexes Internals
Antonios Chatzipavlis
 
BI Apps Architecture
BI Apps ArchitectureBI Apps Architecture
BI Apps Architecture
Dylan Wan
 
Intro_2.ppt
Intro_2.pptIntro_2.ppt
Intro_2.ppt
MumitAhmed1
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
SharabiNaif
 
Intro.ppt
Intro.pptIntro.ppt
Intro.ppt
Anonymous9etQKwW
 
Evolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.comEvolving the Optimal Relevancy Ranking Model at Dice.com
Evolving the Optimal Relevancy Ranking Model at Dice.com
Simon Hughes
 
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.comEnhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Enhancing Enterprise Search with Machine Learning - Simon Hughes, Dice.com
Simon Hughes
 
Predict oscars (4:17)
Predict oscars (4:17)Predict oscars (4:17)
Predict oscars (4:17)
Thinkful
 
Geek Sync | Why Did My Clever Index Change Backfire?
Geek Sync | Why Did My Clever Index Change Backfire?Geek Sync | Why Did My Clever Index Change Backfire?
Geek Sync | Why Did My Clever Index Change Backfire?
IDERA Software
 
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Evolving The Optimal Relevancy Scoring Model at Dice.com: Presented by Simon ...
Lucidworks
 
Unit 2_DBMS_10.2.22.pptx
Unit 2_DBMS_10.2.22.pptxUnit 2_DBMS_10.2.22.pptx
Unit 2_DBMS_10.2.22.pptx
MaryJoseph79
 
Data Structure and Algorithms
Data Structure and AlgorithmsData Structure and Algorithms
Data Structure and Algorithms
iqbalphy1
 
Machine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An IntroMachine Learning 2 deep Learning: An Intro
Machine Learning 2 deep Learning: An Intro
Si Krishan
 

Hpd 1

  • 1. Physical Database Design and Tuning R&G - Chapter 20
  • 2. Contents • Physical Database Design • Database Workloads • Physical Design and tuning Decisions • Need for Tuning • Guidelines for Index Selection • Clustering & indexing tools for index selection • Database Tuning: Tuning index • Tuning Conceptual schema • Tuning queries and views • Impact of Concurrency • Benchmarking
  • 3. Physical Database Design • Process of producing a description of the implementation of the database on secondary storage. • It describes the base relations, file organizations, and indexes used to achieve efficient access to the data, and any associated integrity constraints and security measures.
  • 4. Physical Database Design • We will describe the plan for how to build the tables, including appropriate data types, field sizes, attribute domains, and indexes. • The plan should have enough detail that if someone else were to use the plan to build a database, the database they build is the same as the one you are intending to create. • The conceptual design and logical design were independent of physical considerations. Now, we not only know that we want a relational model, we have selected a database management system (DBMS) such as Access or Oracle, and we focus on those physical considerations.
  • 5. Logical vs. Physical Design: • Logical database design is concerned with what to store; • physical database design is concerned with how to store it.
  • 6. Introduction • We will be talking at length about “database design” – Conceptual Schema: info to capture, tables, columns, views, etc. – Physical Schema: indexes, clustering, etc. • Physical design linked tightly to query optimization – So we’ll study this “bottom up” – But note: DB design is usually “top-down” • conceptual then physical. Then iterate. • We must begin by understanding the workload: – The most important queries and how often they arise. – The most important updates and how often they arise. – The desired performance for these queries and updates.
  • 7. Understanding the Workload • For each query in the workload: – Which relations does it access? – Which attributes are retrieved? – Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? • For each update in the workload: – Which attributes are involved in selection/join conditions? How selective are these conditions likely to be? – The type of update (INSERT/DELETE/UPDATE), and the attributes that are affected. – For the Update command, the fields that are modified by update
  • 8. Creating an ISUD Chart (Insert, Select, Update, Delete Frequencies) • Employee Table
      Transaction    Frequency   % table   Name   Salary   Address
      Payroll Run    monthly     100       S      S        S
      Add Emps       daily       0.1       I      I        I
      Delete Emps    daily       0.1       D      D        D
      Give Raises    monthly     10        S      U        NA
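The ISUD chart can also be kept as data, which makes it easy to see which columns are mostly read and which are written. The following is a minimal Python sketch, assuming only the four transactions and operation codes shown on the slide; the names `ISUD_CHART` and `column_profile` are illustrative and not part of the original material.

```python
from collections import defaultdict

# ISUD chart from the slide: transaction -> (frequency, % of table touched,
# per-column code: I=insert, S=select, U=update, D=delete, NA=not accessed).
ISUD_CHART = {
    "Payroll Run": ("monthly", 100.0, {"Name": "S", "Salary": "S", "Address": "S"}),
    "Add Emps":    ("daily",   0.1,   {"Name": "I", "Salary": "I", "Address": "I"}),
    "Delete Emps": ("daily",   0.1,   {"Name": "D", "Salary": "D", "Address": "D"}),
    "Give Raises": ("monthly", 10.0,  {"Name": "S", "Salary": "U", "Address": "NA"}),
}

def column_profile(chart):
    """Group transactions per column into readers (S) and writers (I, U, D)."""
    profile = defaultdict(lambda: {"readers": [], "writers": []})
    for txn, (_frequency, _pct_of_table, ops) in chart.items():
        for column, op in ops.items():
            if op == "S":
                profile[column]["readers"].append(txn)
            elif op in ("I", "U", "D"):
                profile[column]["writers"].append(txn)
            # "NA" means the transaction does not touch the column.
    return dict(profile)

profile = column_profile(ISUD_CHART)
for column, usage in profile.items():
    print(column, usage)
```

A column with many writers and few readers is a weak candidate for an index, since every write must also maintain the index.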
  • 9. Physical Design and tuning Decisions • Choice of indexes to create – Which relations to index, and on which fields – What field(s) should be the search key – Should we build several indexes – For each index, should it be clustered or unclustered • Tuning the conceptual schema – Alternative normalization, – Denormalization, – Vertical partitioning, – Views • Query and transaction tuning – Frequently executed queries are rewritten to run faster.
  • 10. Need for Database Tuning • It is hard to get a detailed workload at initial design time. • The distinction between design and tuning is somewhat arbitrary. • We can consider the design process over once the conceptual schema and a set of clustering and indexing decisions have been made. • Tuning is then any subsequent change to the conceptual schema or the indexes.
  • 11. Index Selection • One approach: – Consider most important queries. – Consider best plan using the current indexes, and see if better plan is possible with an additional index. – If so, create it. • Before creating an index, must also consider the impact on updates in the workload. – Trade-off of slowing some updates in order to speed up some queries.
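The trade-off on this slide (faster queries against slower updates) can be made concrete with a back-of-the-envelope calculation. The sketch below is only an illustration: all cost figures, query names, and candidate index names are hypothetical, and it scores each candidate index independently, whereas a real optimizer-driven approach would re-evaluate plans against the full set of existing indexes.

```python
# Hypothetical workload; costs are page I/Os per execution and are made up
# for illustration, not measured or taken from the slides.
QUERIES = [
    # (name, executions per day, cost without index, {candidate index: cost with it})
    ("find_by_dname", 500, 200, {"dept(dname)": 2}),
    ("emps_in_dept", 300, 1000, {"emp(dno)": 12}),
    ("emps_by_age", 5, 1000, {"emp(age)": 400}),
]
UPDATES = [
    # (name, executions per day, {candidate index: extra maintenance cost})
    ("insert_emp", 2000, {"emp(dno)": 2, "emp(age)": 2}),
]

def net_benefit(index):
    """Daily I/Os saved on queries minus daily I/Os added to updates."""
    saved = sum(
        freq * (base - with_index[index])
        for _name, freq, base, with_index in QUERIES
        if index in with_index
    )
    added = sum(
        freq * extra[index]
        for _name, freq, extra in UPDATES
        if index in extra
    )
    return saved - added

def choose_indexes(candidates):
    """Keep only the candidates whose query savings outweigh update overhead."""
    return [index for index in candidates if net_benefit(index) > 0]

chosen = choose_indexes(["dept(dname)", "emp(dno)", "emp(age)"])
print(chosen)
```

With these made-up numbers the index on `emp(age)` is rejected because the rarely run age query saves less than the frequent inserts would cost.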
  • 12. Whether to index (Guideline 1) • Do not build an index unless some query, including the query components of updates, benefits from it. • Whenever possible, choose indexes that speed up more than one query.
  • 13. Multi-attribute Search Keys (Guideline 3) – Indexes with multi-attribute search keys should be considered in two situations: 1. A WHERE clause includes conditions on more than one attribute of a relation. 2. They enable index-only evaluation strategies (i.e., accessing the relation can be avoided) for important queries.
  • 14. Whether to cluster (Guideline 4) • As a rule of thumb, range queries are likely to benefit the most from clustering • If an index enables an index-only evaluation strategy for the query it is intended to speed up, the index need not be clustered
  • 15. Hash versus Tree Index (Guideline 5) – A hash index is better in the following situations • The index is intended to support an index nested loops join; the indexed relation is the inner relation, and the search key includes the join columns • There is a very important equality query, and no range queries, involving the search key attributes
  • 16. Balancing the cost of Index Maintenance (Guideline 6) • If maintaining an index slows down frequent update operations, consider dropping the index • Keep in mind, however, that adding an index may well speed up a given update operation. – E.g. an index on employee IDs could speed up the operation of increasing the salary of an employee (specified by ID)
  • 17. Example 1: SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname='Toy' AND E.dno=D.dno • Hash index on D.dname supports 'Toy' selection. – Given this, index on D.dno is not needed. • Hash index on E.dno allows us to get matching (inner) Emp tuples for each selected (outer) Dept tuple. • What if WHERE included: "... AND E.age=25"? – Could retrieve Emp tuples using index on E.age, then join with Dept tuples satisfying dname selection. Comparable to strategy that used E.dno index. – So, if E.age index is already created, this query provides much less motivation for adding an E.dno index.
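The plan described in Example 1 is an index nested loops join: select Dept tuples through a hash index on dname, then probe a hash index on E.dno for each of them. Here is a minimal in-memory sketch in Python, where dictionaries stand in for the hash indexes; the sample rows and the helper names `build_hash_index` and `toy_department_employees` are made up for illustration.

```python
from collections import defaultdict

# Illustrative rows only; they are not from the slides.
DEPT = [
    {"dno": 1, "dname": "Toy", "mgr": "Alice"},
    {"dno": 2, "dname": "Shoe", "mgr": "Bob"},
]
EMP = [
    {"ename": "Carl", "dno": 1, "age": 25},
    {"ename": "Dana", "dno": 2, "age": 31},
    {"ename": "Eve", "dno": 1, "age": 40},
]

def build_hash_index(rows, key):
    """Map each key value to the list of rows carrying it (a stand-in hash index)."""
    index = defaultdict(list)
    for row in rows:
        index[row[key]].append(row)
    return index

dept_by_dname = build_hash_index(DEPT, "dname")  # supports D.dname = 'Toy'
emp_by_dno = build_hash_index(EMP, "dno")        # probed once per outer Dept tuple

def toy_department_employees(dname):
    """Index nested loops join: Dept is the outer relation, Emp the inner."""
    result = []
    for dept in dept_by_dname.get(dname, []):
        for emp in emp_by_dno.get(dept["dno"], []):
            result.append((emp["ename"], dept["mgr"]))
    return result

matches = toy_department_employees("Toy")
print(matches)
```

Note that no index on D.dno is consulted anywhere, which is the point the slide makes.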
  • 18. Example 2: SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE E.sal BETWEEN 10000 AND 20000 AND E.hobby='Stamps' AND E.dno=D.dno • All selections are on Emp so it should be the outer relation in any Index NL join. – Suggests that we build a B+ tree index on D.dno. • What index should we build on Emp? – B+ tree on E.sal could be used, OR an index on E.hobby could be used. Only one of these is needed, and which is better depends upon the selectivity of the conditions. • As a rule of thumb, equality selections are more selective than range selections. • As both examples indicate, our choice of indexes is guided by the plan(s) that we expect an optimizer to consider for a query. Have to understand optimizers!
  • 19. Clustering and Indexing • Clustered indexes can be especially important while accessing the inner relation in an index nested loops join • Revisit the same example: SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname='Toy' AND E.dno=D.dno • Should the indexes used be clustered? • The index on dname can be unclustered • On the other hand, Emp is the inner relation in an index NL join and dno is not a candidate key • So the index on dno should be clustered
  • 20. Examples of Clustering • B+ tree index on E.age can be used to get qualifying tuples for SELECT E.dno FROM Emp E WHERE E.age>40 – How selective is the condition? – Is the index clustered? • Consider the GROUP BY query SELECT E.dno, COUNT (*) FROM Emp E WHERE E.age>10 GROUP BY E.dno – If many tuples have E.age > 10, using E.age index and sorting the retrieved tuples may be costly. – Clustered E.dno index may be better! • Equality queries and duplicates, as in SELECT E.dno FROM Emp E WHERE E.hobby='Stamps' – Clustering on E.hobby helps!
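Why clustering matters for range queries can be seen by counting the data pages that hold qualifying tuples under two file orderings. The sketch below is a toy model under stated assumptions: 1000 records, 20 records per page, and a synthetic age column; none of these numbers come from the slides, and the count is of distinct pages, so it is a lower bound on the I/O of an unclustered index (which may fetch the same page more than once).

```python
PAGE_SIZE = 20   # records per data page (assumed)
N = 1000         # number of Emp records (assumed)

# Each record is (eid, age). The age is a deterministic scramble of eid so that
# the eid order of the file is unrelated to the age order.
records = [(eid, (eid * 37) % 100) for eid in range(N)]

def pages_touched(layout, predicate):
    """Count distinct data pages that hold at least one qualifying record."""
    return len({pos // PAGE_SIZE for pos, rec in enumerate(layout) if predicate(rec)})

# File stored in eid order: an index on age would be unclustered.
heap_order = records
# File stored in age order: an index on age would be clustered.
age_order = sorted(records, key=lambda rec: rec[1])

def older_than_90(rec):
    return rec[1] > 90

unclustered = pages_touched(heap_order, older_than_90)
clustered = pages_touched(age_order, older_than_90)
print("unclustered pages:", unclustered, "clustered pages:", clustered)
```

In the clustered layout the 90 qualifying records sit on adjacent pages, while in the eid-ordered layout they are scattered across the file.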
  • 22. Co-clustering Two Relations • It can speed up joins, in particular key-foreign key joins corresponding to 1:N relationships • A sequential scan of either relation becomes slower. • All inserts, deletes and updates that alter record lengths become slower, because of the overhead involved in maintaining the clustering
  • 23. Index-Only Plans • A number of queries can be answered without retrieving any tuples from one or more of the relations involved if a suitable index is available.
      – SELECT D.mgr FROM Dept D, Emp E WHERE D.dno=E.dno (index: <E.dno>)
      – SELECT D.mgr, E.eid FROM Dept D, Emp E WHERE D.dno=E.dno (index: <E.dno,E.eid>)
      – SELECT E.dno, COUNT(*) FROM Emp E GROUP BY E.dno (index: <E.dno>)
      – SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno (index: <E.dno,E.sal>, B-tree trick!)
      – SELECT AVG(E.sal) FROM Emp E WHERE E.age=25 AND E.sal BETWEEN 3000 AND 5000 (index: <E.age,E.sal> or <E.sal,E.age>)
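One way to see an index-only plan in practice is to ask a DBMS for its query plan. The sketch below uses SQLite through Python's standard `sqlite3` module purely as a convenient stand-in; SQLite is not one of the systems discussed in the slides, it reports index-only access as a "COVERING INDEX", and the exact wording of its plan output varies between versions. The index name `emp_dno_sal` and the helper `plan` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Emp (eid INTEGER PRIMARY KEY, ename TEXT, dno INTEGER, sal REAL, age INTEGER);
    -- Composite index matching the <E.dno, E.sal> suggestion on the slide.
    CREATE INDEX emp_dno_sal ON Emp (dno, sal);
""")

def plan(sql):
    """Return the textual plan steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the description

# Every column the query needs (dno, sal) is in the index.
index_only = plan("SELECT E.dno, MIN(E.sal) FROM Emp E GROUP BY E.dno")
# ename is not in the index, so the table itself must be visited.
needs_table = plan("SELECT E.ename FROM Emp E WHERE E.dno = 5")

print(index_only)
print(needs_table)
```

The second query can still use the index to locate rows, but it has to fetch each matching tuple to read `ename`, so it is not an index-only plan.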
  • 24. Tools to Assist in Index Selection • First generation of such tools: – Index tuning wizards or – Index advisors • Drawback of these systems – They had to replicate the database query optimizer's cost model
  • 25. Tools to Assist in Index Selection • The DB2 Index Advisor – Tool for automatic index recommendation given a workload – Workload table: ADVISE_WORKLOAD – It is populated either • By SQL stmts from the DB2 dynamic SQL stmt cache for recently executed SQL stmts, OR • With SQL stmts from packages (statically compiled stmts), OR • With SQL stmts from an online monitor called Query Patroller • Output: SQL DDL statements whose execution creates recommended indexes
  • 26. Tools to Assist in Index Selection • The Microsoft SQL Server 2000 Index Tuning Wizard – Tuning wizard integrated with the database query optimizer – 3 tuning modes that permit the user to trade off running time of the analysis against the no. of candidate index configurations examined: fast, medium and thorough, with fast having the lowest running time and thorough examining the largest no. of configurations – Max space allowed for indexes can be specified – Reduces running time by a sampling mode – Allows table scaling
  • 27. Overview of Database Tuning • Actual use of the DB provides a valuable source of detailed information that can be used to refine the initial design • Original assumptions are replaced by observed usage • Initial workload is validated • Initial guesses about size of data can be replaced with actual statistics • Tuning is important to get the best possible performance • 3 kinds of tuning: tuning indexes, tuning the conceptual schema, and tuning queries
  • 28. Tuning indexes • Queries and updates considered important in the initial workload may turn out not to be very frequent • Observed workload may also identify some new queries and updates • Initial choice of indexes has to be reviewed in light of this new information • Some original indexes may be dropped and new ones added
  • 29. Tuning indexes continues… • Query: SELECT E.ename, D.mgr FROM Emp E, Dept D WHERE D.dname='Toy' AND E.dno=D.dno • The intended plan uses an index-only scan with Emp as the inner relation • If this query takes an unexpectedly long time to execute, consider replacing the index behind that plan with a clustered index on the dno field
  • 30. Tuning indexes continues… • In addition, we have to periodically reorganize indexes – E.g. a static index (ISAM index) may have developed long overflow chains; dropping and rebuilding it, if feasible, improves access time through this index – Dynamic structure (B+ tree): if the implementation does not merge pages on deletes, space occupancy can decrease considerably in some situations. This in turn makes the size of the index (in pages) larger than necessary, and could increase the height and therefore the access time
  • 31. Tuning conceptual Schema • If the initial schema does not meet our performance objectives for the given workload under any set of physical design choices, we may have to redesign the conceptual schema • Such a change is called schema evolution • Issues involved in tuning the conceptual schema: – We may decide to settle for a 3NF design instead of BCNF – The choice among 3NF or BCNF designs should be guided by the workload – Sometimes we might decide to further decompose a relation that is already in BCNF – We might denormalize – We might consider partitioning
  • 32. Tuning Queries and Views • If a query runs slower than expected, check if an index needs to be re-built, or if statistics are too old and need to be refreshed. • Sometimes, the DBMS may not be executing the plan you had in mind. Common areas of optimizer weakness: – Selections involving null values (bad selectivity estimates) – Selections involving arithmetic or string expressions (ditto) – Selections involving OR conditions (ditto) – Complex, correlated subqueries – Lack of evaluation features like index-only strategies or certain join methods, or poor size estimation. • Check the plan that is being used! Then adjust the choice of indexes or rewrite the query/view. – E.g. check via the POSTGRES EXPLAIN command – Some systems rewrite for you under the covers (e.g. DB2) • Can be confusing and/or helpful!
  • 33. More Guidelines for Query Tuning • Minimize the use of DISTINCT: don’t need it if duplicates are acceptable, or if answer contains a key. • Minimize the use of GROUP BY and HAVING: SELECT MIN (E.age) FROM Employee E GROUP BY E.dno HAVING E.dno=102 can be written as SELECT MIN (E.age) FROM Employee E WHERE E.dno=102 • Consider DBMS use of index when writing arithmetic expressions: E.age=2*D.age will benefit from index on E.age, but might not benefit from index on D.age!
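The GROUP BY/HAVING rewrite on this slide can be checked on a small table. The sketch below runs both forms in SQLite via Python's standard `sqlite3` module; the table contents are made-up sample rows, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (eid INTEGER PRIMARY KEY, dno INTEGER, age INTEGER);
    -- Illustrative rows only.
    INSERT INTO Employee (dno, age) VALUES (102, 45), (102, 29), (101, 23), (103, 51);
""")

# Original form: group every department, then discard all groups but one.
with_having = conn.execute(
    "SELECT MIN(E.age) FROM Employee E GROUP BY E.dno HAVING E.dno = 102"
).fetchall()

# Rewritten form: filter first, so only department 102's rows are aggregated.
with_where = conn.execute(
    "SELECT MIN(E.age) FROM Employee E WHERE E.dno = 102"
).fetchall()

print(with_having, with_where)
```

One caveat worth keeping in mind: the two forms differ when no employee has the requested dno. The GROUP BY version then returns no rows, while the WHERE version returns a single row containing NULL.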
  • 34. Guidelines for Query Tuning (Contd.) • Avoid using intermediate relations: SELECT * INTO Temp FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname='Joe' followed by SELECT T.dno, AVG(T.sal) FROM Temp T GROUP BY T.dno vs. SELECT E.dno, AVG(E.sal) FROM Emp E, Dept D WHERE E.dno=D.dno AND D.mgrname='Joe' GROUP BY E.dno • The second form does not materialize the intermediate relation Temp.
  • 35. Choices in Tuning The Conceptual Schema – Consider the following schema • Contracts(cid: integer, supplierid: integer, projectid: integer, deptid: integer, partid: integer, qty: integer, value: real) • Departments(did: integer, budget: real, annualreport: varchar) • Parts(pid: integer, cost: integer) • Projects(jid: integer, mgr: char(20)) • Suppliers(sid: integer, address: char(50))
  • 36. Choices in Tuning The Conceptual Schema contd… • the relation Contracts, denoted as CSJDPQV – The meaning of a tuple in this relation is that the contract with cid C is an agreement that supplier S (with sid equal to supplierid) will supply Q items of part P (with pid equal to partid) to project J (with jid equal to projectid) associated with department D (with deptid equal to did), and that the value V of this contract is equal to value
  • 37. Choices in Tuning The Conceptual Schema contd… • There are two known integrity constraints with respect to Contracts • 1. A project purchases a given part using a single contract: JP → C • 2. A department purchases at most one part from any given supplier: SD → P
  • 38. Settling for a Weaker Normal Form • Consider the Contracts relation and determine which normal form it is in • The candidate keys for this relation are C and JP • The only nonkey dependency is SD → P, and P is a prime attribute because it is part of candidate key JP • So the relation is in 3NF but not in BCNF • To reach BCNF we decompose it: the schemas CJP, SDP, and CSJDQV give a lossless-join and dependency-preserving decomposition into BCNF
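The normal-form argument for Contracts can be checked mechanically with attribute closures. The following is a small Python sketch, assuming the dependencies are exactly C → CSJDPQV (C is a key), JP → C, and SD → P as stated on the slides; the function names are illustrative, and the checks look only at the listed dependencies.

```python
from itertools import combinations

ATTRS = frozenset("CSJDPQV")
# Functional dependencies as (left side, right side).
FDS = [
    (frozenset("C"), ATTRS),            # C is a key
    (frozenset("JP"), frozenset("C")),  # one contract per (project, part)
    (frozenset("SD"), frozenset("P")),  # one part per (supplier, department)
]

def closure(attrs, fds):
    """All attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(attrs, fds):
    """Minimal attribute sets whose closure is the whole relation (brute force)."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            subset = frozenset(combo)
            if closure(subset, fds) == attrs and not any(k <= subset for k in keys):
                keys.append(subset)
    return keys

def is_bcnf(attrs, fds):
    """Every listed dependency is trivial or has a superkey on the left."""
    return all(rhs <= lhs or closure(lhs, fds) == attrs for lhs, rhs in fds)

def is_3nf(attrs, fds):
    """As BCNF, but a right-side attribute may instead be prime."""
    prime = set().union(*candidate_keys(attrs, fds))
    return all(
        closure(lhs, fds) == attrs or all(a in lhs or a in prime for a in rhs)
        for lhs, rhs in fds
    )

keys = candidate_keys(ATTRS, FDS)
print(sorted("".join(sorted(k)) for k in keys), is_3nf(ATTRS, FDS), is_bcnf(ATTRS, FDS))
```

Under these three dependencies the brute-force enumeration also reports SDJ as a key (SD gives P, and JP then gives C). That does not change the conclusion: SD → P violates BCNF, and the relation is in 3NF because P is prime.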
  • 39. Horizontal Decompositions • Usual Def. of decomposition: Relation is replaced by collection of relations that are projections. Most important case. – We will talk about this at length as part of Conceptual DB Design • Sometimes, might want to replace relation by a collection of relations that are selections. – Each new relation has same schema as original, but subset of rows. – Collectively, new relations contain all rows of the original. – Typically, the new relations are disjoint.
  • 40. Horizontal Decompositions (Contd.) • Contracts (Cid, Sid, Jid, Did, Pid, Qty, Val) • Suppose that contracts with value > 10000 are subject to different rules. – So queries on Contracts will often say WHERE val>10000. • One approach: clustered B+ tree index on the val field. • Second approach: replace contracts by two new relations, LargeContracts and SmallContracts, with the same attributes (CSJDPQV). – Performs like index on such queries, but no index overhead. – Can build clustered indexes on other attributes, in addition!
  • 41. Masking Conceptual Schema Changes • Horizontal Decomposition from above • Masked by a view: CREATE VIEW Contracts(cid, sid, jid, did, pid, qty, val) AS SELECT * FROM LargeContracts UNION SELECT * FROM SmallContracts – NOTE: queries with condition val>10000 must be asked wrt LargeContracts for efficiency: so some users may have to be aware of change. • I.e. the users who were having performance problems • Arguably that’s OK -- they wanted a solution!
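The horizontal decomposition and the masking view can be tried out end to end. The sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in DBMS; the CHECK constraints, the routing helper `insert_contract`, and the sample rows are assumptions added for illustration and do not appear in the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cols = ("cid INTEGER PRIMARY KEY, sid INTEGER, jid INTEGER, did INTEGER, "
        "pid INTEGER, qty INTEGER, val REAL")
conn.executescript(f"""
    -- The CHECK constraints (an added assumption) keep the two fragments disjoint.
    CREATE TABLE LargeContracts ({cols}, CHECK (val > 10000));
    CREATE TABLE SmallContracts ({cols}, CHECK (val <= 10000));
    -- View from the slide: existing queries on Contracts keep working.
    CREATE VIEW Contracts (cid, sid, jid, did, pid, qty, val) AS
        SELECT * FROM LargeContracts
        UNION
        SELECT * FROM SmallContracts;
""")

def insert_contract(row):
    """Route a row to the fragment that matches its value (last field)."""
    table = "LargeContracts" if row[-1] > 10000 else "SmallContracts"
    conn.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?, ?, ?)", row)

# Illustrative rows only.
for row in [
    (1, 10, 100, 5, 7, 3, 25000.0),
    (2, 11, 101, 5, 8, 40, 900.0),
    (3, 10, 102, 6, 7, 1, 12000.0),
]:
    insert_contract(row)

via_view = conn.execute(
    "SELECT cid FROM Contracts WHERE val > 10000 ORDER BY cid"
).fetchall()
direct = conn.execute("SELECT cid FROM LargeContracts ORDER BY cid").fetchall()
total = conn.execute("SELECT COUNT(*) FROM Contracts").fetchone()[0]
print(via_view, direct, total)
```

Both queries return the same contracts, but the query through the view may read both fragments, whereas the direct query reads only LargeContracts, which is the efficiency point made in the NOTE on the slide.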
  • 42. Impact of Concurrency • In a system with many concurrent users, several additional points must be considered • Transaction obtains locks on the pages that it reads or writes and others may be blocked • 2 specific ways to reduce blocking – Reduce the time that transactions hold locks – Reducing hot spots
  • 43. Reducing Lock Durations • Delay lock requests – Tune transaction by writing to local prog. variables and deferring changes to database until the end of transaction • Make transaction Faster – Tuning indexing and rewriting queries – Careful partitioning of the tuples in relation and associated indexes across a collection of discs • Replace long transactions by short ones – Rewriting into two or more smaller transactions
  • 44. Reducing Lock Durations contd… • Build a warehouse – Complex queries, such as those involving statistical analysis of business trends, can hold shared locks for a long time – They can run on a copy of the data that is a little out of date • Consider a lower Isolation Level – Appropriate in many situations, such as queries generating aggregate info or statistical summaries – Use a lower SQL isolation level such as REPEATABLE READ or READ COMMITTED
  • 45. Reducing Hot Spots • Delay operations on Hot Spots – Requests using frequently used objects • Optimize Access Patterns – Pattern of updates • Partitioning operations on Hot Spots – Batch append • Choice of Index – In a frequently updated relation, B+ tree indexes can become a bottleneck because the root and index pages become hot spots – Specialized locking protocols help (fine-granularity locks) – An ISAM index may be preferable (only leaf pages get locked)
  • 46. DBMS Benchmarking • Includes benchmarks for measuring the performance of a certain class of applications (e.g., the TPC benchmarks) and • benchmarks for measuring how well a DBMS performs various operations (e.g., the Wisconsin benchmark) – Benchmarks should be portable, easy to understand, and scale naturally to larger problem instances. They should measure peak performance (e.g., transactions per second, or tps) as well as price/performance ratios (e.g., $/tps) for typical workloads in a given application domain
  • 47. DBMS Benchmarking • The Transaction Processing Performance Council (TPC) was created to define benchmarks for transaction processing and database systems • Well-Known DBMS Benchmarks – The TPC-A and TPC-B benchmarks constitute the standard definitions of the tps and $/tps measures – TPC-A measures the performance and price of a computer network in addition to the DBMS, – whereas the TPC-B benchmark considers the DBMS by itself
  • 48. DBMS Benchmarking – The TPC-C benchmark is a more complex suite of transactional tasks than TPC-A and TPC-B – It models a warehouse that tracks items supplied to customers and involves five types of transactions – Much more expensive than TPC-A and TPC-B – Exercises a much wider range of system capabilities – TPC-D represents a broad range of decision support (DS) applications that require complex, long running queries against large complex data structures.
  • 49. DBMS Benchmarking • The TPC Benchmark™H (TPC-H) is a decision support benchmark. • It consists of a suite of business oriented ad-hoc queries and concurrent data modifications. • The queries and the data populating the database have been chosen to have broad industry-wide relevance. • This benchmark illustrates decision support systems that examine large volumes of data, execute queries with a high degree of complexity, and give answers to critical business questions.
  • 50. Points to Remember • Indexes must be chosen to speed up important queries (and perhaps some updates!). – Index maintenance overhead on updates to key fields. – Choose indexes that can help many queries, if possible. – Build indexes to support index-only strategies. – Clustering is an important decision; only one index on a given relation can be clustered! – Order of fields in composite index key can be important. • Static indexes may have to be periodically re-built. • Statistics have to be periodically updated.
  • 51. Points to remember (Contd.) • Over time, indexes have to be fine-tuned (dropped, created, re-clustered, ...) for performance. – Should determine the plan used by the system, and adjust the choice of indexes appropriately. • System may still not find a good plan: – Only left-deep plans? – Null values, arithmetic conditions, string expressions, the use of ORs, nested queries, etc. can confuse an optimizer. • So, may have to rewrite the query/view: – Avoid nested queries, temporary relations, complex conditions, and operations like DISTINCT and GROUP BY.