Physical Design: Indexes 2

DATA WAREHOUSING
Physical Design

 Provide efficient access to relevant records
 Based on values of particular attribute(s)
 Same idea as index in back of a book
 An index is a “thin” copy of a relation
 Not all columns from the relation are included
 The index is sorted in a particular way
 Index supports efficient lookup
 Useful when filters are selective
 Avoid scanning rows that will be filtered out

 Indexes organized based on some search key
 Column (or set of columns) whose values are used to access the index
 Organization can be sorting or hashing
 Index is built for some relation
 One index entry per record in the relation
 Index consists of <Value, RID> pairs
 Value = value of the search key for this record
 RID = record identifier
▪ Tells the DBMS where the record is stored
▪ Usually (page number, offset in page)
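
The <Value, RID> structure described above can be sketched in a few lines. This is a minimal illustration, not how any particular DBMS implements it: the `records` dictionary, the integer RIDs (standing in for a page number and offset), and the helper names `build_index` and `lookup` are all assumptions made for the example.

```python
from bisect import bisect_left, bisect_right

# Hypothetical relation: RID -> record. A real RID would usually be
# (page number, offset in page); a plain integer stands in for it here.
records = {
    0: {"name": "Ann", "age": 31},
    1: {"name": "Bob", "age": 25},
    2: {"name": "Cy", "age": 31},
    3: {"name": "Dee", "age": 40},
}

def build_index(records, key):
    """Build one <value, RID> entry per record, sorted by search-key value."""
    return sorted((rec[key], rid) for rid, rec in records.items())

def lookup(index, value):
    """Binary-search the sorted entries and return the matching RIDs."""
    lo = bisect_left(index, (value, float("-inf")))
    hi = bisect_right(index, (value, float("inf")))
    return [rid for _, rid in index[lo:hi]]

age_index = build_index(records, "age")
matching_rids = lookup(age_index, 31)
matching_names = [records[rid]["name"] for rid in matching_rids]
```

Only the records whose RIDs come back from `lookup` are fetched, which is the point of the index: rows that the filter would discard are never read.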

 Traditional Access Methods
 B-trees, hash tables, R-trees, grids, …
 Popular in Warehouses
 Covering indexes
 Multi-column indexes
 Join indexes
 Bitmap indexes

 Idea behind fact index:
 Thinner version of fact table
 Index takes up less space than fact table
 Fewer I/Os required to scan it

 Index has 1 index entry per fact table row
 Regardless of how many columns are in the index

 Sometimes an index has all the data you need
 Allows index-only query plan
 Not necessary to access the actual tuples
 Such an index is called a covering index

 SELECT COUNT(*) FROM R WHERE A=5
 Use index on A
 Count number of <5,RID> entries
 No need to look up records referenced by RIDs
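
A rough sketch of the index-only plan for this COUNT query follows. The index contents and the function name `count_equal` are made up for illustration; the point is only that the answer comes from the index entries and no RID is ever dereferenced.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index on R.A: sorted <value, RID> pairs.
index_on_a = [(3, 7), (5, 0), (5, 4), (5, 9), (8, 2)]

def count_equal(index, value):
    """Count the <value, RID> entries for `value` without touching the base table."""
    lo = bisect_left(index, (value, float("-inf")))
    hi = bisect_right(index, (value, float("inf")))
    return hi - lo

# SELECT COUNT(*) FROM R WHERE A=5
count_a5 = count_equal(index_on_a, 5)
```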

 Multi-column indexes are very useful in data warehousing
 We say such an index has a composite key
 Example: B-Tree index on (A,B)
 Search key is (A,B) combination
 Index entries sorted by A value
 Entries with same A value are sorted by B value
 Called a lexicographic sort
 SELECT SUM(B) FROM R WHERE A=5
 Our (A,B) index covers this query!
 Coverage vs. size trade-off
 More attributes in search key → index covers more queries
 More attributes in search key → index takes up more disk space
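
The lexicographic ordering and the covered SUM query can be illustrated with a small sketch. The sample rows, the third column C, and the helper name `sum_b_where_a` are assumptions for the example; a sorted Python list stands in for the B-Tree leaf level.

```python
from bisect import bisect_left

# Hypothetical relation R(A, B, C); the list position serves as the RID.
rows = [(5, 10, "x"), (3, 7, "y"), (5, 2, "z"), (8, 1, "w"), (5, 4, "v")]

# Composite-key index on (A, B): entries sorted by A, ties broken by B.
index_ab = sorted((a, b, rid) for rid, (a, b, _c) in enumerate(rows))

def sum_b_where_a(index, a):
    """Answer SELECT SUM(B) FROM R WHERE A=a from the index entries alone."""
    start = bisect_left(index, (a,))  # first entry whose A value is >= a
    total = 0
    for key_a, key_b, _rid in index[start:]:
        if key_a != a:
            break  # entries are sorted, so the matching run has ended
        total += key_b
    return total

total_b = sum_b_where_a(index_ab, 5)
```

Column C never appears in the index, so a query that also needed C would have to follow the RIDs back to the table. That is the coverage side of the trade-off; adding C to the search key would cover such a query at the cost of a larger index.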

 Advantages
 efficient computation of joins involving the leading index columns (or all columns)
 Disadvantages
 useful only for specific join combinations
▪ for general usage, it is necessary to store a high number of indices
 required space may be significant
▪ joins always involve the fact table

Base table               Index on Region                  Index on Type
Cust  Region   Type      RecID  Asia  Europe  America     RecID  Retail  Dealer
C1    Asia     Retail    1      1     0       0           1      1       0
C2    Europe   Dealer    2      0     1       0           2      0       1
C3    Asia     Dealer    3      1     0       0           3      0       1
C4    America  Retail    4      0     0       1           4      1       0
C5    Europe   Dealer    5      0     1       0           5      0       1

Query:
Get customers with region = 'Asia' AND type = 'Dealer'
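
The query can be worked through with the five customers from the table. This is a sketch only: each bit vector is held in a Python integer (bit 0 corresponds to RecID 1), and the helper name `build_bitmap_index` is an assumption for the example.

```python
# Base table from the example: (Cust, Region, Type).
customers = [
    ("C1", "Asia", "Retail"),
    ("C2", "Europe", "Dealer"),
    ("C3", "Asia", "Dealer"),
    ("C4", "America", "Retail"),
    ("C5", "Europe", "Dealer"),
]

def build_bitmap_index(rows, col):
    """Map each distinct value of column `col` to a bit vector (one bit per row)."""
    index = {}
    for pos, row in enumerate(rows):
        index[row[col]] = index.get(row[col], 0) | (1 << pos)
    return index

region_index = build_bitmap_index(customers, 1)
type_index = build_bitmap_index(customers, 2)

# region = 'Asia' AND type = 'Dealer' becomes a single bitwise AND.
hits = region_index["Asia"] & type_index["Dealer"]
result = [customers[i][0] for i in range(len(customers)) if (hits >> i) & 1]
```

Here the Asia vector has bits set for C1 and C3, the Dealer vector for C2, C3 and C5, and the AND leaves only C3.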

 Good if the domain cardinality is small
 Most useful for attributes with low or medium cardinality
▪ Not good for something like LastName

 Index intersection plans with bitmap indexes are fast
 Just perform bitwise AND!
 Index intersection with B-Trees requires a join

 Save space for low-cardinality attributes
 As compared to a B-Tree or Hash index

 Bit vectors can be compressed
 Compression Pros and Cons
 Reduce storage space → reduce number of I/Os required
 Need to compress/decompress → increase CPU work required
 Each compression scheme negotiates this trade-off differently
 Operate directly on compressed bitmap → improved performance
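
To make the trade-off concrete, here is a sketch using plain run-length encoding. RLE is chosen only because it is easy to read; it is not claimed to be the scheme any particular system uses, and the function names are assumptions for the example. The last function shows the idea of operating directly on the compressed form: it counts set bits without decompressing.

```python
from itertools import groupby

def rle_encode(bits):
    """Compress a bit vector into (bit, run_length) pairs."""
    return [(bit, len(list(run))) for bit, run in groupby(bits)]

def rle_decode(runs):
    """Expand (bit, run_length) pairs back into the original bit vector."""
    return [bit for bit, length in runs for _ in range(length)]

def count_ones(runs):
    """Count set bits directly on the compressed form, with no decompression."""
    return sum(length for bit, length in runs if bit == 1)

# A sparse vector, typical when a value matches few rows.
bits = [0] * 6 + [1] * 2 + [0] * 8
runs = rle_encode(bits)
ones = count_ones(runs)
```

Sixteen bits shrink to three runs here, but a vector that alternates 0 and 1 would grow under the same encoding, which is why different schemes balance space against CPU work differently.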

 Bit matrix which precomputes the join between a dimension and the fact table
 one column for each dimension RID
 one row for each fact table RID
 cell (i,j) is 1 if fact table tuple i joins dimension tuple j, 0 otherwise
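
The bit matrix described above might be built and used as follows. The tiny fact and dimension tables, the join column `cust`, and the helper names are all hypothetical; list positions play the role of RIDs.

```python
# Hypothetical dimension and fact tables; list position serves as the RID.
dimension = [
    {"cust": "C1", "region": "Asia"},
    {"cust": "C2", "region": "Europe"},
    {"cust": "C3", "region": "Asia"},
]
fact = [
    {"cust": "C2", "amount": 10},
    {"cust": "C1", "amount": 5},
    {"cust": "C3", "amount": 7},
    {"cust": "C1", "amount": 3},
]

def build_join_matrix(fact, dimension, key):
    """Cell (i, j) is 1 if fact tuple i joins dimension tuple j, else 0."""
    return [[1 if f[key] == d[key] else 0 for d in dimension] for f in fact]

def fact_rids_for(matrix, dim_rids):
    """Fact RIDs that join at least one of the given dimension RIDs."""
    return [i for i, row in enumerate(matrix) if any(row[j] for j in dim_rids)]

matrix = build_join_matrix(fact, dimension, "cust")

# Total amount for customers in Asia, using the precomputed join.
asia_rids = [j for j, d in enumerate(dimension) if d["region"] == "Asia"]
asia_fact_rids = fact_rids_for(matrix, asia_rids)
asia_total = sum(fact[i]["amount"] for i in asia_fact_rids)
```

The join itself is paid for once, when the matrix is built. At query time the dimension predicate selects columns and the set bits in those columns identify the fact rows directly.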

 Indexing dimensions
 attributes frequently involved in selection predicates
 if domain cardinality is high, then B-tree index
 if domain cardinality is low, then bitmap index
 Indices for join
 indexing only foreign keys in the fact table is rarely appropriate
 star join index should be used with caution (column order issue)
 bitmapped join index is suggested (if available)
 Indices for group by
 use materialized views
