Primary Index Mechanics
After completing this module, you will be able to:
• Explain the role of the hashing algorithm and the hash map in
locating a row.
• Explain the makeup of the Row ID and its role in row storage.
• Describe the sequence of events for locating a row given its PI
value.
Hashing Primary Index Values

Flow:
  SQL with primary index values and data
    -> PARSER
    -> Hashing Algorithm (PI values in, Row Hash out)
    -> Message Passing Layer (Hash Maps), which receives the DSW and data
    -> AMP 0, AMP 1, ... AMP x ... AMP n - 1, AMP n
    -> Data Table on the target AMP (RH + Data)

For example: Assume PI value is 38
  PI value = 38 -> Hashing Algorithm -> 1177 7C3C
  DSW (1177) -> Hash Maps -> AMP #

Data Table on the target AMP:
  Row ID                        Row Data
  Row Hash        Uniq Value
  x'00000000'
  x'1177 7C3C'    0000 0001     38
  x'FFFFFFFF'

Summary
The MPL takes the DSW, 1177, and uses it to locate bucket #1177 in the
Hash Map.
Bucket #1177 contains the number of the AMP that owns this hash value,
which is effectively the AMP with this row.
Hashing Down to the AMPs
Index value(s) -> hashing algorithm -> Row Hash -> DSW or Hash Bucket # -> Hash Map -> AMP #

The hashing algorithm is designed to ensure even distribution of
unique values across all AMPs.
Different hashing algorithms are used for different international
character sets.
A Row Hash is the 32-bit result of applying a hashing algorithm to
an index value.
The DSW (Destination Selection Word), or Hash Bucket number, is the
high-order 16 bits of the Row Hash.
A Hash Map is uniquely configured for each system size.
It is an array of 65,536 entries (buckets) that associates bucket
numbers with specific AMPs.
Two systems with the same number of AMPs will have the same
Hash Map.
Changing the number of AMPs in a system requires a change to
the Hash Map.
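The relationship between the Row Hash, the DSW, and the Hash Map can be sketched in a few lines of Python. This is a minimal illustration, not the actual implementation: `build_hash_map` and its round-robin assignment of buckets to AMPs are assumptions invented for the example, since the slides do not say how buckets are assigned.

```python
BUCKETS = 65_536  # one Hash Map entry per possible 16-bit DSW value


def dsw(row_hash: int) -> int:
    """Return the DSW (hash bucket number): the high-order 16 bits of a 32-bit row hash."""
    return (row_hash >> 16) & 0xFFFF


def build_hash_map(amp_count: int) -> list[int]:
    """Hypothetical map: buckets assigned to AMPs round-robin (an assumption, for illustration)."""
    return [bucket % amp_count for bucket in range(BUCKETS)]


def target_amp(row_hash: int, hash_map: list[int]) -> int:
    """Look up the AMP that owns the bucket selected by the row hash."""
    return hash_map[dsw(row_hash)]


map_16 = build_hash_map(16)
map_20 = build_hash_map(20)
print(len(map_16))                    # 65536
print(hex(dsw(0x11777C3C)))           # 0x1177
print(map_16 == build_hash_map(16))   # True: same AMP count, same map
print(map_16 == map_20)               # False: a different AMP count changes the map
```

Under this assumed scheme the map depends only on the AMP count, which mirrors the two statements above: equal-sized systems share a map, and resizing a system forces a new one.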
A Hashing Example

Order table (Order Number is the PK and the UPI):

  Order Number   Customer Number   Order Date   Order Status
  (PK, UPI)
  7325           2                 4/13         O
  7324           3                 4/13         O
  7415           3                 4/13         O
  7415           1                 4/13         C
  7103           1                 4/10         O
  7225           2                 4/15         C
  7384           1                 4/12         C
  7402           3                 4/12         C
  7188           1                 4/13         C
  7202           2                 4/09         C

SELECT * FROM order
WHERE order_number = 7202;

7202 -> Hashing Algorithm -> 691B 14AE

32-bit Row Hash:
  Destination Selection Word (first 16 bits):  0110 1001 0001 1011  =  6 9 1 B
  Remaining 16 bits:                           0001 0100 1010 1110  =  1 4 A E
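The split of the row hash 691B 14AE can be checked with plain integer arithmetic. The sketch below takes the row hash from the slide as a given value (the hashing algorithm itself is not shown), and the helper name `nibbles` is made up for the example.

```python
row_hash = 0x691B14AE  # row hash for order_number 7202, as given on the slide

dsw = row_hash >> 16            # high-order 16 bits: the Destination Selection Word
remainder = row_hash & 0xFFFF   # low-order 16 bits


def nibbles(value: int, bits: int) -> str:
    """Format an integer as binary, grouped into 4-bit nibbles."""
    digits = format(value, f"0{bits}b")
    return " ".join(digits[i:i + 4] for i in range(0, bits, 4))


print(nibbles(row_hash, 32))   # 0110 1001 0001 1011 0001 0100 1010 1110
print(f"{dsw:04X}")            # 691B
print(f"{remainder:04X}")      # 14AE
```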
The Hash Map

7202 -> Hashing Algorithm -> 691B 14AE (hexadecimal)

32-bit Row Hash:
  Destination Selection Word (first 16 bits):  0110 1001 0001 1011  =  6 9 1 B
  Remaining 16 bits:                           0001 0100 1010 1110  =  1 4 A E

HASH MAP

         0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
  690   07  06  07  06  07  04  05  06  05  05  14  09  14  13  03  04
  691   15  08  02  04  01  00  14  14  03  02  03  09  01  00  02  15
  692   01  00  15  11  14  14  13  13  14  14  08  09  15  10  09  09
  693   07  06  15  13  11  06  15  08  15  15  08  08  11  07  05  10
  694   04  12  11  13  05  10  07  07  03  02  11  04  01  00  11  13
  695   11  11  12  10  03  02  06  13  01  00  06  05  07  06  05  12

Bucket 691B (row 691, column B) -> AMP 9
Row on AMP 9:  7202  2  4/09  C

Note: This partial Hash Map is based on a 16-AMP system, and AMPs are shown in decimal format.
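The lookup on this slide can be reproduced in Python. The sketch assumes the six rows of AMP numbers belong to bucket prefixes 690 through 695 in the order listed; `PARTIAL_HASH_MAP` and `amp_for` are names invented for the example.

```python
# Partial Hash Map from the slide: each key is the first three hex digits of the
# bucket number, and each list holds the AMP numbers for final digits 0 through F.
PARTIAL_HASH_MAP = {
    0x690: [7, 6, 7, 6, 7, 4, 5, 6, 5, 5, 14, 9, 14, 13, 3, 4],
    0x691: [15, 8, 2, 4, 1, 0, 14, 14, 3, 2, 3, 9, 1, 0, 2, 15],
    0x692: [1, 0, 15, 11, 14, 14, 13, 13, 14, 14, 8, 9, 15, 10, 9, 9],
    0x693: [7, 6, 15, 13, 11, 6, 15, 8, 15, 15, 8, 8, 11, 7, 5, 10],
    0x694: [4, 12, 11, 13, 5, 10, 7, 7, 3, 2, 11, 4, 1, 0, 11, 13],
    0x695: [11, 11, 12, 10, 3, 2, 6, 13, 1, 0, 6, 5, 7, 6, 5, 12],
}


def amp_for(row_hash: int) -> int:
    """Return the AMP number for a row hash whose bucket falls inside the partial map."""
    bucket = row_hash >> 16                   # DSW: high-order 16 bits
    row, column = bucket >> 4, bucket & 0xF   # e.g. 0x691B -> row 0x691, column 0xB
    return PARTIAL_HASH_MAP[row][column]      # raises KeyError outside rows 690-695


print(amp_for(0x691B14AE))  # 9
```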
Identifying Rows

Consideration #1
A Row Hash = 32 bits = 4.2 billion possible values.
Because there is an infinite number of possible data values, some data
values will have to share the same row hash.

  Data values input -> Hash Algorithm -> 1254 7769
                                         10A2 2936
                                         10A2 2936   Hash Synonyms

Consideration #2
A Primary Index may be non-unique (NUPI).
Different rows may have the same PI value and thus the same row hash.

  (John) 'Smith' -> Hash Algorithm -> 0016 5557
  (Dave) 'Smith' -> Hash Algorithm -> 0016 5557   NUPI Duplicates (rows have same hash)

Conclusion
A row hash is not adequate to uniquely identify a row.
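Consideration #1 is a pigeonhole argument, and it can be demonstrated at small scale. The sketch below uses a stand-in 8-bit hash, `toy_hash`, which is an assumption for illustration and has nothing to do with the real hashing algorithm: with 256 possible outputs, any 257 distinct inputs must contain at least one pair of synonyms.

```python
import hashlib

print(2 ** 32)  # 4294967296 possible row hash values ("4.2 billion")


def toy_hash(value: str) -> int:
    """Stand-in 8-bit hash (not the real algorithm): the first byte of SHA-256."""
    return hashlib.sha256(value.encode()).digest()[0]


# 257 distinct inputs into 256 possible outputs: at least two must collide.
seen: dict[int, str] = {}
synonyms = None
for i in range(257):
    key = f"value-{i}"
    h = toy_hash(key)
    if h in seen:
        synonyms = (seen[h], key)   # two different inputs, same hash
        break
    seen[h] = key

print(synonyms is not None)  # True, by the pigeonhole principle
```

The same reasoning applies to a 32-bit row hash; it simply takes more distinct values before a collision is guaranteed.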
The Row ID

To uniquely identify a row, we add a 32-bit uniqueness value.
The combined row hash and uniqueness value is called a Row ID.

  Row ID = Row Hash (32 bits) + Uniqueness Id (32 bits)

Each stored row has a Row ID as a prefix.
Rows are logically maintained in Row ID sequence.

  Row ID                     Row Data
  Row Hash     Unique ID     Emp_No   Last_Name   First_Name
  3B11 5032    0000 0001     1018     Reynolds    Jane
  3B11 5032    0000 0002     1020     Davidson    Evan
  3B11 5032    0000 0003     1031     Green       Jason
  3B11 5033    0000 0001     1014     Jacobs      Paul
  3B11 5034    0000 0001     1012     Chevas      Jose
  3B11 5034    0000 0002     1021     Carnet      Jean
  :            :             :        :           :
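Row ID sequence can be illustrated by combining the two 32-bit parts into one sortable key. Packing them into a single 64-bit integer is an assumption made for this sketch (the slides say only that the two values are combined), and the `Row` type is hypothetical. The sample rows are the ones listed above, entered out of order.

```python
from typing import NamedTuple


class Row(NamedTuple):
    row_hash: int     # 32 bits
    uniqueness: int   # 32 bits
    emp_no: int
    last_name: str
    first_name: str

    @property
    def row_id(self) -> int:
        """Row hash in the high 32 bits, uniqueness value in the low 32 bits."""
        return (self.row_hash << 32) | self.uniqueness


rows = [
    Row(0x3B115034, 2, 1021, "Carnet", "Jean"),
    Row(0x3B115032, 3, 1031, "Green", "Jason"),
    Row(0x3B115033, 1, 1014, "Jacobs", "Paul"),
    Row(0x3B115032, 1, 1018, "Reynolds", "Jane"),
    Row(0x3B115034, 1, 1012, "Chevas", "Jose"),
    Row(0x3B115032, 2, 1020, "Davidson", "Evan"),
]

rows.sort(key=lambda r: r.row_id)   # restore Row ID sequence
for r in rows:
    print(f"{r.row_id:016X}", r.emp_no, r.last_name, r.first_name)
```

Sorting on the packed value orders rows by row hash first and by uniqueness value second, which reproduces the order shown in the table.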
Storing Rows (1 of 2)

Assumptions:
  Last_Name is defined as a NUPI.
  All rows in this example hash to the same AMP.

Add a row for 'John Smith'
  'Smith' -> Hash Algorithm -> 0016 5557 -> Hash Map -> AMP #3

  Row ID                     Row Data
  Row Hash     Unique ID     Last_Name   First_Name   Etc.
  0016 5557    0000 0001     Smith       John

Add a row for 'Sam Adams'
  'Adams' -> Hash Algorithm -> 1058 9829 -> Hash Map -> AMP #3

  Row ID                     Row Data
  Row Hash     Unique ID     Last_Name   First_Name   Etc.
  0016 5557    0000 0001     Smith       John
  1058 9829    0000 0001     Adams       Sam
Storing Rows (2 of 2)

Add a row for 'Fred Smith' (NUPI Duplicate)
  'Smith' -> Hash Algorithm -> 0016 5557 -> Hash Map -> AMP #3

  Row ID                     Row Data
  Row Hash     Unique ID     Last_Name   First_Name   Etc.
  0016 5557    0000 0001     Smith       John
  0016 5557    0000 0002     Smith       Fred
  1058 9829    0000 0001     Adams       Sam

Add a row for 'Dan Jones' (Hash Synonym)
  'Jones' -> Hash Algorithm -> 0016 5557 -> Hash Map -> AMP #3

  Row ID                     Row Data
  Row Hash     Unique ID     Last_Name   First_Name   Etc.
  0016 5557    0000 0001     Smith       John
  0016 5557    0000 0002     Smith       Fred
  0016 5557    0000 0003     Jones       Dan
  1058 9829    0000 0001     Adams       Sam

Given the row hash, what other information would be needed to find the 'Dan Jones' row?
... The 'Fred Smith' row?
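The four inserts above can be replayed in Python. This is a sketch under stated assumptions: the row hashes are copied from the slides through a lookup table, `SLIDE_HASHES`, because the hashing algorithm is not shown, and issuing the next uniqueness value as a per-row-hash counter is inferred from the 0001, 0002, 0003 pattern in the tables rather than documented.

```python
from collections import defaultdict

# Row hashes copied from the slides; a lookup table stands in for the hashing algorithm.
SLIDE_HASHES = {"Smith": 0x00165557, "Adams": 0x10589829, "Jones": 0x00165557}

table = []                     # rows as (row_hash, uniqueness, last_name, first_name)
last_uniq = defaultdict(int)   # highest uniqueness value issued so far, per row hash


def insert(last_name: str, first_name: str) -> None:
    row_hash = SLIDE_HASHES[last_name]
    last_uniq[row_hash] += 1   # assumed rule: next value in sequence for this row hash
    table.append((row_hash, last_uniq[row_hash], last_name, first_name))
    table.sort(key=lambda row: row[:2])   # keep rows in Row ID sequence


def find(last_name: str) -> list:
    """PI lookup: match on row hash, then compare the PI value to drop hash synonyms."""
    row_hash = SLIDE_HASHES[last_name]
    return [row for row in table if row[0] == row_hash and row[2] == last_name]


for name in [("Smith", "John"), ("Adams", "Sam"), ("Smith", "Fred"), ("Jones", "Dan")]:
    insert(*name)

print(find("Jones"))   # one row: the PI value separates 'Jones' from the 'Smith' synonyms
print(find("Smith"))   # two rows: NUPI duplicates need a further condition to tell apart
```

One reading of the closing questions follows from this: the PI value is enough to single out 'Dan Jones' among rows sharing the row hash, while 'Fred Smith' also needs some other column value, such as First_Name, because both Smith rows share the same PI value.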
Locating a Row On An AMP Using a PI

Locating a row on an AMP requires three input elements:
1. The Table ID
2. The Row Hash of the PI
3. The PI value itself

Structures on the AMP (AMP #3 in this example):
  Master Index -> Cylinder Indexes (Cyl 1 Index ... Cyl 7 Index) -> DATA BLOCK -> Data Rows

  START WITH:                       APPLY TO:         FIND:
  Table ID, Row Hash                Master Index      Cylinder #
  Table ID, Row Hash, Cylinder #    Cylinder Index    Data Block Address
  Row Hash, PI Value                Data Block        Data Row
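The three lookup steps can be sketched as a chain of dictionary lookups. Everything in this example is a hypothetical stand-in: the table ID, cylinder number, and block address are made-up values, and plain dictionaries replace the real index structures, whose layout the slides do not describe.

```python
# Hypothetical in-memory stand-ins for the structures on one AMP.
TABLE_ID = 100          # made-up table ID
ROW_HASH = 0x00165557   # row hash for 'Smith' and 'Jones' from the earlier slides

master_index = {(TABLE_ID, ROW_HASH): 4}                     # -> cylinder number
cylinder_indexes = {4: {(TABLE_ID, ROW_HASH): "block-17"}}   # -> data block address
data_blocks = {
    "block-17": [
        (0x00165557, 1, "Smith", "John"),
        (0x00165557, 2, "Smith", "Fred"),
        (0x00165557, 3, "Jones", "Dan"),
    ]
}


def locate(table_id: int, row_hash: int, pi_value: str) -> list:
    """Follow Master Index -> Cylinder Index -> Data Block, then match the PI value."""
    cylinder = master_index[(table_id, row_hash)]                      # step 1
    block_address = cylinder_indexes[cylinder][(table_id, row_hash)]   # step 2
    block = data_blocks[block_address]                                 # step 3
    return [row for row in block if row[0] == row_hash and row[2] == pi_value]


print(locate(TABLE_ID, ROW_HASH, "Jones"))
```

The PI value is only needed in the last step, where it filters out hash synonyms that share the data block with the requested row.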
Review Questions
Fill in the Blanks
1. The output of the hashing algorithm is called the _____ _____.
2. To determine the target AMP, the Message Passing Layer must look up an entry in the Hash Map based on the ________ number.
3. Two different PI values which hash to the same value are called Hash ___________.
4. A Row ID consists of a row hash plus a ____________ value.
5. A uniqueness value is required to produce a unique Row ID because of _______ _________ and ______ ___________.
6. Once the target AMP has been determined for a PI search, the _______ ________ for that AMP must be consulted.
7. The Cylinder Index points us to the address and length of the data _______.
Review Question Answers
Fill in the Blanks
1. The output of the hashing algorithm is called the Row Hash.
2. To determine the target AMP, the Message Passing Layer must look up an entry in the Hash Map based on the DSW or bucket number.
3. Two different PI values which hash to the same value are called Hash Synonyms.
4. A Row ID consists of a row hash plus a uniqueness value.
5. A uniqueness value is required to produce a unique Row ID because of hash synonyms and NUPI duplicates.
6. Once the target AMP has been determined for a PI search, the Master Index for that AMP must be consulted.
7. The Cylinder Index points us to the address and length of the data block.
