Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Introduction to Teradata And How Teradata Works

3,188 views

Published on

Watch How Teradata works with Introduction to teradata ,How Teradata Visual Explain Works,teradata database and tools,teradata database model,teradata hardware and software architecture,teradata database security,teradata storage based on primary index

Published in: Education
  • Be the first to comment

Introduction to Teradata And How Teradata Works

  1. 1. 1 Introduction to Teradata
  2. 2. 2 How Teradata Works
  3. 3. 3 How DoesTeradata Store Rows?  Teradata uses hashing algorithm to randomly and evenly distribute data across all AMPs.  The rows of every table are distributed among all AMPs - and ideally will be evenly distributed among all AMPs.  Each AMP is responsible for a subset of the rows of each table.  Evenly distributed tables result in evenly distributed workloads.  The data is not placed in any particular order The benefits of unordered data include:  No maintenance needed to preserve order, and  It is independent of any query being submitted. The benefits of automatic data placement include:  Distribution is the same regardless of data volume  Distribution is based on row content, not data demographics
  4. 4. 4 Primary Indexes  The mechanism used to assign a row to an AMP  A table must have a Primary Index  The Primary Index cannot be changed UPI  If the index choice of column(s) is unique, we call this a UPI (Unique Primary Index).  A UPI choice will result in even distribution of the rows of the table across all AMPs. NUPI  If the index choice of column(s) isn’t unique, we call this a NUPI (Non- Unique Primary Index).  A NUPI choice will result in even distribution of the rows of the table proportional to the degree of uniqueness of the index. UPI’s guarantee even data distribution and eliminate duplicate row checking. Why would you choose an Index that is different from the Primary Key? • Join performance • Known access paths
  5. 5. 5 Data Storage based on Primary Index  The value of the Primary Index for a specific row determines its AMP assignment.  This is done using the hashing algorithm. PIValue AMP AMP AMP PE Row assignment Row access Hashing algorithm Accessing the row by its Primary Index value is:  Always a one-AMP operation  The most efficient way to access a row
  6. 6. 6 Row Distribution Using a UPI Order Number Customer Number Order Date Order Status PK UPI 7325 7324 7415 7103 7225 7384 7402 7188 7202 2 3 1 1 2 1 3 1 2 4/13 4/13 4/13 4/10 4/15 4/12 4/16 4/13 4/09 O O C O C C C C C The PK column(s) will often be used as a UPI. PI values for Order_Number are known to be unique (it’s a PK). Teradata will distribute different index values evenly across AMPs. Resulting row distribution among AMPs is uniform. AMP 1 AMP 2 AMP 3 AMP 4 7202 2 4/09 C 7402 3 4/16 C 7325 2 4/13 C 7225 2 4/15 C 7188 1 4/13 C 7384 1 4/12 C 7324 3 4/13 C 7103 1 4/10 C 7415 1 4/13 C Order
  7. 7. 7 Row Distribution Using a NUPI Order Number Customer Number Order Date Order Status PK NUPI 7325 7324 7415 7103 7225 7384 7402 7188 7202 2 3 1 1 2 1 3 1 2 4/13 4/13 4/13 4/10 4/15 4/12 4/16 4/13 4/09 O O C O C C C C C Order Customer_Number may be the referred access column for ORDER table, thus a good index candidate. Values for Customer_Number are non-unique and therefore a NUPI. Rows with the same PI value distribute to the same AMP causing row distribution to be less uniform or skewed. 7225 2 4/15 C 7325 2 4/13 0 7415 1 4/13 C 7384 1 4/12 C 7324 3 4/13 0 7402 3 4/16 C7103 1 4/10 C AMP 1 AMP 2 AMP 4 7202 2 4/09 C 7188 1 4/13 C AMP 3
  8. 8. 8 Secondary Indexes Three general ways to access a table:  Primary index access (one-AMP access)  Secondary index access (two-or all-AMP access)  FullTable Scan (all-AMP access) A secondary index is an alternate path to the rows of a table. A table can have from 0 to 32 secondary indexes. Secondary indexes: • Do not affect table distribution. • Add overhead, both in terms of disk space and maintenance. • May be added or dropped dynamically as needed. • Are chosen to improve table performance.
  9. 9. 9 Customer table Id = 100 USI Value = 56 Table ID Row Hash USI Value 100 602 56 Hashing Algorithm PE CREATE UNIQUE INDEX (cust) on customer; SELECT * FROM customer WHERE cust = 56; Create USI Access via USI - * - AMP 1 AMP 2 AMP 3 AMP 4 RowID Cust RowID RowID Cust RowID RowID CustRowID RowID Cust RowID BYNET AMP 2 Table ID 100 Row Hash 778 Unique Val 7 USI Subtable USI Subtable USI SubtableUSI Subtable BYNET AMP 1 AMP 3 AMP 4 74 77 51 27 884, 1 639, 1 915, 9 388, 1 244, 1 505, 1 744, 4 757, 1 84 98 56 49 536, 5 555, 6 778, 7 147, 1 296, 1 135, 1 602, 1 969, 1 31 40 45 95 638, 1 640, 1 471, 1 778, 3 288, 1 339, 1 372, 2 588, 1 175, 1 37 107, 1 489, 1 72 717, 2 838, 1 12 147, 2 919, 1 62 822, 1 Adams Smith Rice White555-4444 111-2222 222-3333 666-5555 31 37 40 84 107, 1 536, 5 638, 1 640, 1 RowIDCustName Phone NUPI Base Table US I RowIDCust Name Phone NUPI Base Table US I Base Table Base Table Adams Smith Brown Adams444-6666 666-7777 555-6666 333-9999 72 45 74 98 471, 1 555, 6 717, 2 884, 1 Jones Black Young Smith111-6666 222-8888 444-5555 777-4444 27 49 62 12 147, 1 147, 2 388, 1 822, 1 RowIDCustName Phone NUPIUS I Smith Marsh Peters Jones 777-6666 555-7777 888-2222 555-7777 56 77 51 95 639, 1 778, 3 778, 7 915, 9 RowIDCustName Phone NUPIUS I Unique Secondary Index (USI) Access
  10. 10. 10 Non-Unique Secondary Index (NUSI) Access Table ID 100 Row Hash 567 NUSI Value ‘Adams’ Hashing Algorithm Customer table Id = 100 BYNET AMP 2 NUSI Value = ‘Adams’ PE CREATE INDEX (name) on customer; SELECT * FROM customer WHERE name = ‘Adams’; Create NUSI Access via NUSI AMP 1 Brown Adams Smith 555, 6 471, 1 717, 2 884, 1 852, 1 567, 2 432, 3 RowID Name RowID White Rice Adams Smith 107, 1 536, 5 638, 1 640, 1 448, 1 656, 1 567, 3 432, 8 RowID Name RowID NUSI Subtable NUSI Subtable Smith Young Jones Black 147, 1 147, 2 338, 1 822, 1 432, 1 770, 1 567, 6 448, 4 RowID Name RowID NUSI Subtable Jones Peters Smith Marsh 639, 1 778, 3 778, 7 915, 9 262, 1 396, 1 432, 5 155, 1 RowID Name RowID NUSI Subtable AMP 4AMP 3 Adams Smith Rice White 555-4444 111-2222 222-3333 666-5555 31 37 40 84 107, 1 536, 5 638, 1 640, 1 RowIDCust Name Phone NUPI Base Table RowIDCust Name Phone NUPI Base Table Base Table Base Table Adams Smith Brown Adams444-6666 666-7777 555-6666 333-9999 72 45 74 98 471, 1 555, 6 717, 2 884, 1 Jones Black Young Smith 111-6666 222-8888 444-5555 777-4444 27 49 62 12 147, 1 147, 2 388, 1 822, 1 RowIDCust Name Phone NUPI Smith Marsh Peters Jones 777-6666 555-7777 888-2222 555-7777 56 77 51 95 639, 1 778, 3 778, 7 915, 9 RowIDCust Name Phone NUPINUSI NUSI NUSI NUSI
  11. 11. 11 Comparison of Primary and Secondary Indexes Index Feature Primary Secondary Required? Yes No Number per Table 1 0-32 Max Number of Columns 16 16 Unique or Non-Unique? Both Both Affects Row Distribution Yes No Created/Dropped Dynamically No Yes Improves Access Yes Yes Multiple Data Types Yes Yes Separate Physical Structure None Sub-table Extra Processing Overhead No Yes
  12. 12. 12 Primary Keys and Primary Indexes Primary Key Logical concept of data modeling Teradata doesn’t need to recognize No limit on column numbers Documented in data model (Optional in CREATETABLE) Must be unique Uniquely identifies each row Values should not change May not be NULL—requires a value Does not imply an access path Chosen for logical correctness Primary Index Physical mechanism for access and storage Each table must have exactly one 16-column limit Defined in CREATETABLE statement May be unique or non-unique Used to place and locate each row on an AMP Values may be changed (Del+ Ins) May be NULL Defines most efficient access path Chosen for physical performance Indexes are conceptually different from keys: A PK is a relational modeling convention which uniquely identified each row. A PI is aTeradata convention which determines how the rows are stored and accessed. A significant percentage of tables may use the same columns for both the PK and PI. A well-designed database will use a PI that is different from the PK for some tables.
  13. 13. LearnTeradata Online Contact: USA: +1 732 325 1626 India: +91 800 811 4040 Mail: info@bigclasses.com /bigclasses /bigclasses /bigclasses http://bigclasses.com/teradata-online-training.html
  14. 14. 1 4 WatchTeradata DEMOVideo OnYouTube www.youtube.com/user/bigclassescom

×