Introduction to Teradata Warehouse
Published in: Technology

Transcript

  • 1. Teradata Database: Introduction to Teradata Warehouse. Release V2R6.2 (Teradata Database) / Release 8.2 (Teradata Warehouse). B035-1091-096A. September 2006
  • 2. The product described in this book is a licensed product of Teradata, a division of NCR Corporation.
    NCR, Teradata and BYNET are registered trademarks of NCR Corporation.
    Adaptec and SCSISelect are registered trademarks of Adaptec, Inc.
    EMC, PowerPath, SRDF, and Symmetrix are registered trademarks of EMC Corporation.
    Engenio is a trademark of Engenio Information Technologies, Inc.
    Ethernet is a trademark of Xerox Corporation.
    GoldenGate is a trademark of GoldenGate Software, Inc.
    Hewlett-Packard and HP are registered trademarks of Hewlett-Packard Company.
    IBM, CICS, DB2, MVS, RACF, OS/390, Tivoli, and VM are registered trademarks of International Business Machines Corporation.
    Intel, Pentium, and XEON are registered trademarks of Intel Corporation.
    KBMS is a registered trademark of Trinzic Corporation.
    Linux is a registered trademark of Linus Torvalds.
    LSI, SYM, and SYMplicity are registered trademarks of LSI Logic Corporation.
    Active Directory, Microsoft, Windows, Windows Server, and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
    Novell is a registered trademark of Novell, Inc., in the United States and other countries. SUSE is a trademark of SUSE LINUX Products GmbH, a Novell business.
    QLogic and SANbox are registered trademarks of QLogic Corporation.
    SAS and SAS/C are registered trademark of SAS Institute Inc.
    Sun Microsystems, Sun Java, Solaris, SPARC, and Sun are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. or other countries.
    Unicode is a registered trademark of Unicode, Inc.
    UNIX is a registered trademark of The Open Group in the US and other countries.
    NetVault is a trademark and BakBone is a registered trademark of BakBone Software, Inc.
    NetBackup and VERITAS are trademarks of VERITAS Software Corporation.
    Other product and company names mentioned herein may be the trademarks of their respective owners.
    THE INFORMATION CONTAINED IN THIS DOCUMENT IS PROVIDED ON AN “AS-IS” BASIS, WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO THE ABOVE EXCLUSION MAY NOT APPLY TO YOU. IN NO EVENT WILL NCR CORPORATION (NCR) BE LIABLE FOR ANY INDIRECT, DIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS OR LOST SAVINGS, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
    The information contained in this document may contain references or cross references to features, functions, products, or services that are not announced or available in your country. Such references do not imply that NCR intends to announce such features, functions, products, or services in your country. Please consult your local NCR representative for those features, functions, products, or services available in your country.
    Information contained in this document may contain technical inaccuracies or typographical errors. Information may be changed or updated without notice. NCR may also make improvements or changes in the products or services described in this information at any time without notice.
    To maintain the quality of our products and services, we would like your comments on the accuracy, clarity, organization, and value of this document. Please e-mail: teradata-books@lists.ncr.com
    Any comments or materials (collectively referred to as “Feedback”) sent to NCR will be deemed non-confidential. NCR will have no obligation of any kind with respect to Feedback and will be free to use, reproduce, disclose, exhibit, display, transform, create derivative works of and distribute the Feedback and derivative works thereof without limitation on a royalty-free basis. Further, NCR will be free to use any ideas, concepts, know-how or techniques contained in such Feedback for any purpose whatsoever, including developing, manufacturing, or marketing products or services incorporating Feedback.
    Copyright © 2002–2006 by NCR Corporation. All Rights Reserved.
  • 3. Preface
    Purpose
    This book provides an introduction to the Teradata Warehouse covering the following broad topics:
      • The relational model and Teradata Database architecture
      • Teradata Database hardware and software architecture
      • Teradata RASUI (reliability, availability, serviceability, usability, and installability)
      • Communication between the client and the Teradata Database
      • Data definitions and data manipulation using Structured Query Language (SQL)
      • Data distribution and data access methods
      • Concurrency control and transaction recovery
      • International character support
      • Query and database analysis tools
      • Database security and system administration
      • Managing and monitoring the Teradata Database
    Audience
    This book is intended for users who interface with the Teradata Warehouse. Such individuals may include database users or administrators.
    Supported Software Release
    This book supports Teradata® Database Release V2R6.2 and Teradata® Warehouse Release 8.2.
    Prerequisites
    To gain an understanding of Teradata Warehouse, you should be familiar with the following:
      • Basic computer technology
      • System hardware
      • Teradata Tools and Utilities
  • 4. Preface
    Changes to This Book
    This book includes the following changes to support the current release.
    Date: September 2006
    Description:
      • Reorganized parts of the book.
      • Revised / amplified the chapter on:
          • The data warehouse
          • The Teradata Warehouse
          • Database objects, database and users
          • International language support
          • Query and database analysis tools
          • Teradata database security
      • Updated the book for V2R6.2-specific features, including Write-Ahead Logging (WAL).
    Date: November 2005
    Description:
      • Reorganized the book.
      • Revised the chapter on database security.
      • Added information on the following features to appropriate chapters:
          • User-defined types (UDTs)
          • User-defined methods (UDMs)
          • Teradata Active System Management (TASM)
          • Partitioned Primary Index (PPI) for global temporary and volatile tables
          • Enhancements to Teradata System Emulation Tool (SET)
    Additional Information
    Additional information that supports this product and the Teradata Database is available at the following Web sites.
  • 5. Preface
    Type of information: Overview of the release; information too late for the manuals
    Description: The Release Definition provides the following information:
      • Overview of all the products in the release
      • Information received too late to be included in the manuals
      • Operating systems and Teradata Database versions that are certified to work with each product
      • Version numbers of each product and the documentation for each product
      • Information about available training and support center
    Source: http://www.info.ncr.com/ Click General Search. In the Publication Product ID field, enter 1725 and click Search to bring up the following Release Definition:
      • Base System Release Definition B035-1725-096K

    Type of information: Additional information related to this product
    Description: Use the NCR Information Products Publishing Library site to view or download the most recent versions of all manuals. Specific manuals that supply related or additional information to this manual are listed.
    Source: http://www.info.ncr.com/ Click General Search, and select Software - Teradata Database for a list of all of the publications for this release.

    Type of information: CD-ROM images
    Description: This site contains a link to a downloadable CD-ROM image of all customer documentation for this release. Customers are authorized to create CD-ROMs for their use from this image.
    Source: http://www.info.ncr.com/ Click General Search. In the Title or Keyword field, enter CD-ROM, and click Search.

    Type of information: Ordering information for manuals
    Description: Use the NCR Information Products Publishing Library site to order printed versions of manuals.
    Source: http://www.info.ncr.com/ Click How to Order under Print & CD Publications.

    Type of information: General information about Teradata
    Description: The Teradata home page provides links to numerous sources of information about Teradata. Links include:
      • Executive reports, case studies of customer experiences with Teradata, and thought leadership
      • Technical information, solutions, and expert advice
      • Press releases, mentions and media resources
    Source: Teradata.com

    References to Microsoft Windows
    This book refers to “Microsoft Windows.” For Teradata Database V2R6.2, such references mean Microsoft Windows Server 2003 32-bit and Microsoft Windows Server 2003 64-bit.
  • 7. Table of Contents
    Preface
      Purpose
      Audience
      Supported Software Release
      Prerequisites
      Changes to This Book
      Additional Information
      References to Microsoft Windows
    Chapter 1: The Data Warehouse
      What is a Data Warehouse?
      What is an Active Data Warehouse?
      Strategic Queries
      Tactical Queries
      The Active Teradata Warehouse
      Active Load
      Active Access
      Active Events
      Active Workload Management
      Active Enterprise Integration
      Active Availability
    SECTION 1 Teradata Warehouse Overview
    Chapter 2: The Teradata Warehouse
      The Teradata Database
      Attachment Methods
      Teradata Structured Query Language
  • 8. Table of Contents
      Character Support
      Teradata Database Capabilities
      Teradata Database as a “Single Data Store”
      Teradata Database Server Software
      Software Installation
      Teradata Tools and Utilities
      Mainframe Utilities
      Teradata Utility Pack for Network-Attached Clients
      Database Management and Query Analysis Tools
      Load and Unload Utilities
      Teradata Meta Data Services
      Preprocessors
      Teradata Query Director
      Storage Management Utilities
      For More Information
    Chapter 3: The Teradata Database Model
      What is a Relational Model?
      What is a Relational Database?
      Set Theory and Relational Database Terminology
      Tables, Rows, and Columns
      Table Constraints
      Permanent and Temporary Tables
      Global Temporary Tables
      Volatile Tables
      Derived Tables
      Rows and Columns
      For More Information
    SECTION 2 Teradata Architecture
    Chapter 4: Teradata Database Hardware and Software Architecture
      SMP and MPP Platforms
      The BYNET
  • 9. Table of Contents
      Boardless BYNET
      Disk Arrays
      Logical Units
      Pdisks and Vdisks
      Cliques
      Hot Standby Nodes
      Virtual Processors
      Parsing Engine
      Access Module Processor
      AMP Clusters
      Request Processing
      The Dispatcher
      The AMPs
      Example: SQL Statement
      Parallel Database Extensions
      Trusted Parallel Applications
      PDE and MPP Systems
      Start and Stop PDE
      The Teradata File System
      Workstation Types and Available Platforms
      System Console
      Administration Workstation
      Teradata Graphical User Interface
      How the Teradata GUI Communicates with the Teradata Database
      Running the Teradata GUI
      Teradata General Security Service
      For More Information
    Chapter 5: Teradata RASUI
      Software Fault Tolerance
      Vproc Migration
      Fallback Tables
      AMP Clusters
      One-Cluster Configuration
      Smaller Cluster Configuration
      Journaling
      Teradata Archive/Recovery
      Table Rebuild Utility
  • 10. Table of Contents
      Hardware Fault Tolerance
      Teradata Replication Solutions
      For More Information
    Chapter 6: Communication Between the Client and the Teradata Database
      Attachment Methods
      CLIv2 for Channel-Attached Systems
      What CLIv2 for Channel-Attached Clients Does
      Teradata Director Program
      Server
      CLIv2 for Network-Attached Systems
      What CLIv2 for Network-Attached Clients Does
      Micro Teradata Director Program
      Micro Operating System Interface
      Other Types of Data Communications
      WinCLI
      ODBC
      JDBC
      For More Information
    SECTION 3 Using the Teradata Database
    Chapter 7: Database Objects, Databases and Users
      Tables
      Queue Tables
      Queue Tables and Base Tables
      Event Processing
      Views
      What is in a View?
      Why Use Views?
      Restrictions on Using Views
      Teradata Stored Procedures
      Why Use Teradata Stored Procedures?
      Elements of a Teradata Stored Procedure
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .62x Introduction to Teradata Warehouse
  • 11. Table of Contents
      External Stored Procedures
        Creating External Stored Procedures
      Macros
        SQL Statements Related to Macros
        Single-User and Multi-User Macros
        Macro Processing
      Triggers
        Types of Triggers
        When Do Triggers Fire?
        ANSI-Specified Order
        Why Use a Trigger?
      User-Defined Functions
        Creating User-Defined Functions
        Table Function
      User-Defined Types
        Functions That Operate on UDTs
      User-Defined Methods
        Instance Methods
        Constructor Methods
      Databases and Users
        Databases and Users
        Creating a Finance and Administration Database
      For More Information
    Chapter 8: Structured Query Language
      Why SQL?
      Types of SQL Statements
        Data Definition Language Statements
        Data Control Language Statements
        Data Manipulation Language Statements
      SQL Statement Syntax
      Statement Execution
      Statement Punctuation
      The SELECT Statement
        SELECT Statement and Set Operators
        SELECT Statement and Joins
      SQL Data Types
        Data Type Phrase
        Teradata and ANSI-Compliant Data Types
  • 12. Table of Contents
        Data Type Attributes
      Teradata Recursive Query
      SQL Functions
        Scalar Functions
        Aggregate Functions
        Ordered Analytical Functions
      Cursors
      For More Information
    Chapter 9: SQL Application Development
      Embedded SQL Applications
        How Does an Application Program Use Embedded SQL?
        Supported Languages and Platforms
      Macros as SQL Applications
        SQL Used to Create a Macro
        Macro Usage
        SQL Used to Modify a Macro
        SQL Used to Delete a Macro
      Teradata Stored Procedures as SQL Applications
        SQL Used to Create Stored Procedures
        Stored Procedure Example
        SQL Used to Execute a Stored Procedure
        DDL Statements with Stored Procedures
      The EXPLAIN Statement
        How Is EXPLAIN Useful?
        EXPLAIN With Simple Join Index Example
      Third-Party Development
        TS/API Products
        Compatible Third-Party Software Products
        Performance Monitor/Application Programming Interface
      For More Information
    Chapter 10: Data Distribution and Data Access Methods
      Teradata Database Indexes
      Primary Indexes
        Primary Indexes and Data Distribution
        Primary Key
  • 13. Table of Contents
        Foreign Key
        How Are Primary Indexes and Primary Keys Related?
      Partitioned Primary Indexes
        How Do Partitioned and Non-Partitioned Primary Indexes Compare?
      Secondary Indexes
        Secondary Index Subtables
        How Do Primary and Secondary Indexes Compare?
      Join Indexes
        Single-Table Join Indexes
        Multi-Table Join Indexes
        Aggregate Join Indexes
        Sparse Join Indexes
      Hash Indexes
      Index Specification
        Creating Indexes
        Strengths and Weaknesses of Various Types of Indexes
      Hashing
      Identity Column
      Normalization
        Normal Forms
        First, Second, and Third Normal Forms
      Referential Integrity
        Referential Integrity Terminology
        Referencing (Child) Table
        Referenced (Parent) Table
        Why Is Referential Integrity Important?
      For More Information
    Chapter 11: Concurrency Control and Transaction Recovery
      What is Concurrency Control?
      Transactions
        Definition of a Transaction
        Definition of Serializability
        Transaction Semantics
      ANSI Mode Transactions
      Teradata Mode Transactions
      Locks
        Overview of Teradata Database Locking
  • 14. Table of Contents
        Why Do Database Management Systems Require Locking?
        Lock Levels
        Levels of Lock Types
        Automatic Database Lock Levels
        Deadlocks and Deadlock Resolution
      Host Utility Locks
        HUT Lock Types
        HUT Lock Characteristics
      Recovery for Transactions
      System and Media Recovery
        System Restarts
        Transaction Recovery
        Down AMP Recovery
      Two-Phase Commit Protocol
        Definition of Participant
        Definition of Coordinator
      For More Information
    Chapter 12: The Data Dictionary
      What is the Data Dictionary?
        Data Dictionary Content
        What is in a Data Dictionary Table?
      Data Dictionary Views
        Who Uses Data Dictionary Views?
      SQL Access to the Data Dictionary
      For More Information
    Chapter 13: International Language Support
      Character Set Overview
        What Is a Repertoire?
        Character Representation
      External and Internal Character Sets
        Character Data Translation
        What Teradata Database Supports
      Teradata Database Character Data Storage
        Internal Server Character Sets
        User Data
  • 15. Table of Contents
        System Dictionary Data
      Language Support Modes
        Default Character Set for User Data
        Character Set for System Dictionary Data
        Character Set for Dictionary Data Other Than Object Names
      Standard Language Support Mode
        LATIN Character Set
        Compatible Languages
      Japanese Language Support Mode
        Advantages (over Latin) of Storing System Dictionary Data Using KANJI1
        Advantages (over Latin or Kanji1) of Storing User Data Using UNICODE
      Extended Support
      For More Information
    Chapter 14: Query and Database Analysis Tools
      Teradata Visual EXPLAIN
      Teradata System Emulation Tool
      Teradata Index Wizard
        Demographics
      Teradata Statistics Wizard
      Query Capture Facility
        QCD Schema Improvement
        Teradata Index Wizard Support
      Database Query Log
      Target Level Emulation
      Database Object Use Count
      For More Information
    SECTION 4 Managing and Monitoring Teradata
    Chapter 15: Teradata Database Security
      Security Library
      Security Features
      Security Mechanisms
  • 16. Table of Contents
      User Authentication
        Logon Formats
        Logon Controls
        Password Format
        Password Controls
        External Authentication
      User Authorization
        Roles
        Profiles
      Encryption
        Logon Encryption
        Message Encryption
      Data Integrity Checking
      Directory Management of Users
        Supported Directories
        Directory User Logons
        Integrating Directory Users
        Directory Managed Roles
        Profiles for Directory Users
        Directory Tools
      Monitoring Access to the Database
      Defining a Security Policy
      Publishing a Security Policy
      For More Information
    Chapter 16: System Administration
      Roles and Profiles
      Session Management
        Session Requests
        Establishing a Session
        Logon Operands
      Maintenance Utilities
      For More Information
    Chapter 17: Database Management Tools and Utilities
      Data Archiving Utilities
        Teradata Archive/Recovery Utility
  • 17. Table of Contents
        Open Teradata Backup
      Data Load and Export Utilities
        Teradata MultiLoad
        Teradata FastLoad
        Teradata Parallel Data Pump
        Teradata FastExport Utility
      Session and Configuration Management Tools
      System Resource and Workload Management Tools and Protocols
        Write Ahead Logging (WAL)
        Ferret Utility
        Priority Scheduler
        Teradata MultiTool
        Teradata Active System Management (TASM)
      Teradata SQL Assistant
      For More Information
    Chapter 18: Aspects of System Monitoring
      Teradata Manager
      Teradata Graphical User Interface (GUI)
      Resource Usage (ResUsage) Monitoring
        Resource Usage Tables and Views
        Resource Usage Data Categories
        Resource Usage Data Handling
        Resource Usage Macros
        How to Control Collection and Logging of Resource Usage Data
        Summary Mode
      Performance Monitoring
        Account String Expansion (ASE)
        The TDPTMON
        System Management Facility
        The Performance Monitor/Application Programming Interface
      For More Information
    Chapter 19: Teradata Meta Data Services
      What is Metadata?
      Types of Metadata
      Teradata Meta Data Services
  • 18. Table of Contents Creating the Teradata Meta Data Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .183 Connecting to the Teradata Meta Data Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .183 For More Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .184 Glossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .185 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .189xviii Introduction to Teradata Warehouse
  • 19. CHAPTER 1 The Data Warehouse

This chapter presents an overview of the Teradata Warehouse. Topics include:
• What is a data warehouse?
• What is an active data warehouse?
• The active Teradata Warehouse

What is a Data Warehouse?

Initially, the data warehouse was a historical database, enterprise-wide and centralized, containing data derived from an operational database. The data in the data warehouse was:
• Subject-oriented
• Integrated
• Usually identified by a timestamp
• Nonvolatile, that is, nothing was added or removed

Rows in the tables supporting the operational database were loaded into the data warehouse (the historical database) after they exceeded some well-defined date. Data could be queried, but the responses returned only reflected historical information. In this sense, a data warehouse was initially static, and even if a historical data warehouse contained data that was being updated, it would still not be an active data warehouse.

What is an Active Data Warehouse?

An active data warehouse:
• Provides a single up-to-date view of the enterprise on one platform.
• Represents a logically consistent store of detailed data available for strategic, tactical, and event-driven business decision making.
• Relies on timely updates to the critical data, as close to real time as needed.
• Supports short, tactical queries that return in seconds, alongside traditional decision support.
  • 20. Strategic Queries

Strategic queries represent business questions that are intended to draw strategic advantage from large stores of data. Strategic queries are often complex queries, involving aggregations and joins across multiple tables in the database. They are sometimes long-running and tend not to have a strict service level expectation. Strategic queries are sometimes ad hoc. They may require significant database resources to execute, and they are often submitted from third-party tools.

Tactical Queries

Tactical queries are short, highly tuned queries that facilitate action-taking or decision-making in a time-sensitive environment. They usually come with a clear service level expectation and consume a very small percentage of the overall system resources. Tactical queries are usually executed repetitively and take advantage of techniques such as request (query plan) caching and session pooling.

The Active Teradata Warehouse

As an active data warehouse, the Teradata Warehouse provides both Strategic Intelligence and Operational Intelligence.
• Strategic Intelligence entails delivering intelligence through tools, utilities, and query mechanisms that support strategic decision-making. This includes, for example, providing users with simple as well as complex reports throughout the day that indicate the business trends that have occurred and that are occurring, show why such trends occurred, and predict whether they will continue to occur.
• Operational Intelligence entails delivering intelligence through tools, utilities, and query mechanisms that support front-line or operational decision-making. This includes, for example, ensuring aggressive Service Level Goals (SLGs) with respect to high performance, data freshness, and system availability.

Active Load

The Teradata Warehouse is able to load data actively and in a non-disruptive manner and, at the same time, process other workloads. The Teradata Warehouse delivers Active Load through methods that support continuous data loading. These include streaming from a queue, more frequent batch updates (referred to as mini-batches), and moving changed data from another database platform to the Teradata Warehouse. These methods exercise such Teradata Database features as queue tables and triggers, and use the FastLoad, MultiLoad, and TPump utilities.
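The mini-batch idea described under Active Load can be illustrated with a small sketch. This is not how FastLoad, MultiLoad, or TPump work internally; it only shows the general pattern of loading incoming rows in small, frequently committed batches. It assumes Python's built-in sqlite3 module as a stand-in for the warehouse, and the table `sales` and the helpers `mini_batches` and `load_in_mini_batches` are hypothetical names invented for this illustration.

```python
import sqlite3
from itertools import islice


def mini_batches(rows, batch_size):
    """Yield successive lists of at most batch_size rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def load_in_mini_batches(conn, rows, batch_size=3):
    """Insert rows batch by batch, committing after each mini-batch."""
    loaded = 0
    for batch in mini_batches(rows, batch_size):
        with conn:  # one transaction per mini-batch: commit on success
            conn.executemany(
                "INSERT INTO sales (sale_id, amount) VALUES (?, ?)", batch
            )
        loaded += len(batch)
    return loaded


# SQLite in-memory database used only as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL)")

incoming = [(i, 10.0 * i) for i in range(1, 8)]  # seven hypothetical rows
total_loaded = load_in_mini_batches(conn, incoming, batch_size=3)
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
```

Because each mini-batch is committed separately, rows become visible to queries shortly after they arrive instead of waiting for one large nightly load.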
  • 21. The Teradata Warehouse can effectively manage a complex workload environment on a “single version of the truth.”

Active Access

The Teradata Warehouse is able to access analytical intelligence quickly and consistently in support of operational business processes. But the benefit of Active Access entails more than just speeding up user and customer requests. Active Access provides intelligence for operational and customer interactions consistently. Active Access queries, also referred to as tactical queries, support tactical decision-making at the front line. Such queries can be informational, such as simply retrieving a customer record or transaction, or they may include complex analytics.

Active Events

The Teradata Warehouse is able to detect a business event automatically, apply business rules against current and historical data, and initiate operational actions when appropriate. This enables enterprises to reduce the latency between the identification of an event and taking action with respect to it. Active Events entails more than event detection. When notified of “something important,” the Teradata Warehouse presents users with recommendations for appropriate action. The analysis done for users may prescribe the best course of action or give them alternatives from which to choose.

Active Workload Management

The Teradata Warehouse is able to manage mixed workloads dynamically and to optimize system resource utilization to meet business goals. Teradata Active System Management (TASM) is a portfolio of products that enables the real-time system management required for delivering Active Enterprise Intelligence. TASM assists the database administrator in analyzing and establishing workloads and resource allocation to meet business needs. TASM facilitates monitoring workload requests to ensure that resources are used efficiently and that dynamic workloads are prioritized automatically. TASM also provides state-of-the-art techniques to visualize the current operational environment and to analyze long-term trends. TASM enables database administrators to set SLGs, to monitor adherence to them, and to take any necessary steps to reallocate resources to meet business objectives.

Active Enterprise Integration

The Teradata Warehouse is able to integrate itself into enterprise business and technical architectures, especially those that support business users, partners, and customers. This simplifies the task of coordinating enterprise applications and business processes. For example, a Teradata event, generated from a database trigger, calls an external stored procedure. The procedure publishes a message via a WebSphere MQ-Series message bus. The message is delivered to a JMS queue on a WebLogic application server. The Teradata Application Platform (TAP) receives the JMS message, notifies the user via their UI, and activates a TAP service or schedules a job for later execution.

  • 22. Active Availability

The Teradata Warehouse is able to meet business objectives for its own “uptime.” Moreover, it assists customers in identifying application-specific availability, recoverability, and performance requirements based on the impact of enterprise downtime. The Teradata Warehouse recommends strategies for evolving business continuity. Such strategies range, for example, from Teradata's own “single system” availability through its support for large cliques, hot standby nodes, and fallback
  • 23. SECTION 1 Teradata Warehouse Overview
  • 25. CHAPTER 2 The Teradata Warehouse

This chapter presents an overview of the Teradata Warehouse and its components. Topics include:
• The Teradata Database
• Teradata Database capabilities
• Teradata Database as “single data store”
• Teradata Database server software
• Software installation
• Teradata Tools and Utilities

The Teradata Database

The Teradata Database is an information repository supported by tools and utilities that make it, as part of the Teradata Warehouse, a complete and active relational database management system.

Attachment Methods

To support its role in the active warehouse environment, the Teradata Database can use either of two attachment methods to connect to other operational computer systems, as illustrated in the following table.

This attachment method… | Allows the system to be attached…
channel | directly to an I/O channel of a mainframe computer.
network | to intelligent workstations and other computers and devices through a Local Area Network (LAN).

Teradata Structured Query Language

Structured Query Language (SQL) is the language of relational database communication. Teradata SQL, which is broadly compatible with ANSI SQL, extends the capabilities of SQL by adding Teradata-specific extensions to the generic SQL statements. To manipulate data in the Teradata Database, you issue appropriate Teradata SQL statements. You can access, store, and operate on data using Teradata SQL.
  • 26. When you develop applications for the Teradata Database, you should use the most current Teradata SQL syntax because it is the most ANSI-compliant. Teradata SQL still supports older applications written in previous non-ANSI-compliant versions of Teradata SQL. You can run transactions in either Teradata or ANSI mode, and these modes can be set or changed. For more information about SQL and Teradata SQL, see Chapter 8: “Structured Query Language.”

Character Support

Teradata has an international customer base. To accommodate communications in different languages, Teradata supports non-Latin character sets, including, among others, Japanese and Chinese. For detailed information about international character set support, see Chapter 13: “International Language Support.”

Teradata Database Capabilities

Teradata has designed a system that allows users to view and manage large amounts of data as a collection of related tables. Some of the capabilities of the Teradata Database are listed in the following table.

Teradata Database provides… | That…
capacity | includes scaling from gigabytes to terabytes of detailed data stored in billions of rows, and scaling to thousands of millions of instructions per second (MIPS) to process data.
parallel processing | makes Teradata Database faster than other relational systems.
single data store | can be accessed by network-attached and channel-attached systems, and supports the requirements of many diverse clients.
fault tolerance | automatically detects and recovers from hardware failures.
data integrity | ensures that transactions either complete or roll back to a stable state if a fault occurs.
scalable growth | allows expansion without sacrificing performance.
SQL | serves as a standard access language that permits users to control data.
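The data integrity entry in the table above (a transaction either completes or rolls back to a stable state) can be sketched as follows. This is only an illustration of the all-or-nothing behavior, run against Python's built-in sqlite3 module rather than the Teradata Database. The `account` table and the `transfer` and `balances` helpers are hypothetical names introduced for the example.

```python
import sqlite3

# SQLite in-memory database used only as a stand-in (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates succeed or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(conn):
    """Return the balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]


rejected = transfer(conn, 1, 2, 500)   # would overdraw account 1
balances_after_reject = balances(conn)
accepted = transfer(conn, 1, 2, 30)    # valid transfer
balances_after_accept = balances(conn)
```

The credit is deliberately applied before the debit so that the failed transfer shows the earlier, already executed update being undone along with the failing one.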
  • 27. Teradata developers designed the Teradata Database from mostly off-the-shelf hardware components. The result was an inexpensive, high-quality system that exceeded the performance of conventional relational database management systems. The hardware components of the Teradata Database evolved from those of a simple database machine into those of a general-purpose, massively parallel computer running the database software as a Trusted Parallel Application (TPA). The architecture includes both single-node, Symmetric Multi-Processing (SMP) systems and multi-node, Massively Parallel Processing (MPP) systems in which the distributed functions communicate by means of a fast interconnect structure. The interconnect structure in the current architecture is the BYNET for MPP systems and the boardless BYNET for SMP systems.

Teradata Database as a “Single Data Store”

A design goal of the Teradata Database was to provide a single data store for a variety of client architectures. This approach greatly reduces data duplication and the inaccuracies that can creep into data that is maintained in multiple stores. This approach to data storage is known as the single version of the truth, and Teradata implements it through heterogeneous client access. Clients can access a single copy of enterprise data, and Teradata takes care of such things as data type translation, connections, concurrency, and workload management.

The following figure illustrates the idea of heterogeneous client access, where mainframe clients, network-attached workstations, and personal computers can access and manipulate the same database simultaneously. In this figure, the mainframes are attached via channel connections and other systems are attached via network connections.

[Figure 1091G001: the Teradata Database (single data store) accessed over a channel connection by an IBM mainframe and over a Local Area Network by Windows, UNIX, and Linux workstations.]
  • 28. Teradata Database Server Software

Teradata Database software resides on the server and implements the relational database environment. The server software includes the following functional modules.

This module… | Provides…
Database Window | a tool that you can use to control the operation of the Teradata Database.
Teradata Gateway | communications support. The server-resident program provides a pathway for applications running on network-attached clients to access the Teradata Database. The Teradata Gateway runs as a separate operating system task. The Gateway software validates messages from clients that generate sessions over the network, and it controls encryption.
Parallel Database Extensions (PDE) | a software interface layer on top of the operating system that enables the database to operate in a parallel environment. For more information about PDE, see “Parallel Database Extensions” on page 34.
Teradata Database management software | the following modules: the Parsing Engine (PE), which includes the session controller, parser, Optimizer, step generator, and dispatcher; the access module processor (AMP); and the Teradata file system. For more information about the Teradata file system, see “The Teradata File System” on page 35.

Software Installation

The Parallel Upgrade Tool (PUT) automates much of the installation process for Teradata Database software. There are two major operational modes for PUT.
  • 29. The operational mode… | Does the following…
Major upgrade | upgrades one or more software products to the next version.
Patch upgrade | applies patch packages to one or more software products.

Teradata Tools and Utilities

Teradata Tools and Utilities is a comprehensive suite of tools and utilities designed to operate in the client environment. Using them, users of client systems can access the Teradata Database.

Note: Teradata Database runs with or without a channel- or network-attached client. Moreover, the computer on which the utilities are installed can be running Teradata Database software as well.

Mainframe Utilities

The following table describes Teradata Tools and Utilities on channel-attached clients and what each tool or utility provides.
  • 30. This utility… | Provides… | For…
Basic Teradata Query (BTEQ) | an interactive and batch query processor/report generator | channel-attached clients.
IBM Customer Information Control System (CICS) Interface for Teradata | an interface that enables CICS macro or command-level application programs to access Teradata Database resources | channel-attached clients.
Host Utility Consoles (HUTCNS) | access to a number of AMP-based utilities | channel-attached clients.
IBM IMS Interface for Teradata | an Information Management System (IMS) interface to the Teradata Database | channel-attached clients.
Teradata Archive/Recovery (Teradata ARC) | a means to save and restore data | channel-attached clients.
Teradata Call-Level Interface Version 2 (CLIv2) | a collection of callable service routines that provide the interface between applications and the Teradata Gateway. The Gateway is the interface between CLI and the server | channel-attached clients.
Teradata Director Program (TDP) | a high-performance interface for messages sent between the client and the Teradata Database | channel-attached clients.
Teradata Preprocessor2 (PP2) for Embedded SQL | a method of accessing data stored in the Teradata Database. Preprocessor2 interprets and expands Teradata SQL statements incorporated into an application program | channel-attached clients.
Teradata Transparency Series/Application Programming Interface (TS/API) | gateway services allowing products that access either DB2 or SQL/DS databases to access data stored on the Teradata Database | channel-attached clients.

Teradata Utility Pack for Network-Attached Clients

The following table describes utilities in the Teradata Utility Pack for network-attached clients and what each utility provides.

This utility… | Provides… | For…
Basic Teradata Query (BTEQ) | an interactive and batch query processor/report generator | network-attached clients.
ODBC Driver for Teradata | access to the Teradata Database from various tools, increasing the portability of access | network-attached clients.
  • 31. This utility… | Provides… | For…
OLE DB Provider for Teradata | an interface for accessing and manipulating all types of data | network-attached clients.
Teradata Administrator | an interface that you can use to perform database administration tasks | network-attached clients.
Teradata Call-Level Interface Version 2 (CLIv2) | callable service routines that provide the interface between applications and the Teradata Gateway. Teradata Gateway is the interface between CLI and the server | network-attached clients.
Teradata Data Connector | a block-level I/O interface to one or more access modules that interface to a data storage device | network-attached clients.
Teradata Driver for JDBC Interface | platform-independent, Java-application access to the Teradata Database from various tools, increasing portability of data | network-attached clients.
Teradata MultiTool | an interface to various Teradata Database utilities | network-attached clients.
Teradata SQL Assistant | a means of retrieving data from any ODBC-compliant database server and of manipulating and storing the data on your desktop PC | network-attached clients.
Windows Call-Level Interface Developer’s Kit | a means to write applications using Dynamic Data Exchange (DDE) or Dynamic Link Library (DLL) to access the Teradata Database | network-attached clients.

Database Management and Query Analysis Tools

The following table describes database management and query analysis tools and what each tool provides.

This tool… | Provides… | For…
Teradata Dynamic Workload Manager (Teradata DWM) | a means to set up rules that manage database access, increase database efficiency, and enhance workload capacity. The rules include workload limits on accounts, users, and objects, such as databases, tables, and more | network-attached clients.
Teradata Index Wizard | analyses of various SQL query workloads and suggestions of candidate indexes to enhance performance of those queries | network-attached clients.
  • 32. This tool… | Provides… | For…
Teradata Manager | a graphical-based systems management platform containing a suite of specialized tools and applications for monitoring and controlling Teradata Database resource usage on one or more systems | network-attached clients.
Teradata Performance Monitor | an orderly presentation of performance, usage, status, contention, and availability data for Teradata Database at the overall, resource, and session levels |
Teradata Query Scheduler | a means to manage requests input to the Teradata Database and keep the database running at optimum performance levels. The product consists of client, server, and administrator components, plus a separate database within the Teradata Database called tdwm | network-attached clients.
Teradata Statistics Wizard | automation for collecting workload statistics, or selecting recommended indexes or columns for statistics collection or re-collection | network-attached clients.
Teradata System Emulation Tool (Teradata SET) | the capability to examine the query plans generated by the test system Optimizer as if the queries were processed on the production system | network-attached clients.
Teradata Visual Explain (Teradata VE) | a simplified depiction of the execution plan of complex SQL statements | network-attached clients.
Teradata Workload Analyzer (Teradata WA) | a means for DBAs to identify classes of queries (workloads), with recommendations on workload definitions and operating rules to ensure that database performance meets Service Level Goals (SLGs). Teradata WA helps DBAs effectively manage distribution of resources using graphical displays and supports the conversion of existing Priority Definitions (PD) sets into new workloads | network-attached clients.

Load and Unload Utilities

The following table describes utilities to be used to load data into and unload data from the Teradata Database.
  • 33. This utility… | Provides… | For…
Teradata FastExport | a means of reading large volumes of data from the Teradata Database | channel- and network-attached clients.
Teradata FastLoad | high-performance data loading from client files into empty tables | channel- and network-attached clients.
Teradata MultiLoad | high-performance data maintenance, including inserts, updates, and deletions to existing tables | channel- and network-attached clients.
Teradata Tools and Utilities Access Modules | a block-level I/O interface to data residing on a specific external data storage device | channel- and network-attached clients.
Teradata Parallel Data Pump (TPump) | continuous update of tables; performs insert, update, and delete operations or a combination of these operations on multiple tables using the same source feed | channel- and network-attached clients.
Teradata Parallel Transporter | a means to load data into and export data from any accessible database in the Teradata Database or other data store for which an access operator or an access module exists | channel- and network-attached clients.

Teradata Meta Data Services

The following table describes Teradata Meta Data Services (Teradata MDS).

This utility… | Provides… | For…
Teradata MDS | a set of object-oriented Application Programming Interfaces (APIs) that can be used to create Application Information Metamodels (AIMs) that define how metadata is stored in the MDS repository, and to add, update, and delete metadata objects and collections in the MDS repository. MDS also includes a set of utilities and GUIs that provide the infrastructure for a metadata repository. | network-attached clients.

Preprocessors

The following table describes utilities that enable applications to access the Teradata Database by interpreting SQL statements in C, COBOL, or Programming Language 1 (PL/I) programs.
  • 34. This preprocessor… | Provides a mechanism for… | For…
Teradata C Preprocessor | embedding SQL in C programs | channel- and network-attached clients.
Teradata COBOL Preprocessor | embedding SQL in COBOL programs | channel-attached and some network-attached clients.
Teradata PL/I Preprocessor | embedding SQL in PL/I programs | channel-attached clients.

Teradata Query Director

The following table describes Teradata Query Director (Teradata QD).

This utility… | Provides… | For…
Teradata QD | a program that routes sessions for high availability purposes | network-attached clients.

Storage Management Utilities

The following table describes storage management utilities and what each utility provides.

This utility… | Provides… | For…
Archive/Recovery (Teradata ARC) | a means of archiving data to tape and restoring tape data to the Teradata Database | channel- and network-attached clients.
Teradata ASF2 Tape Reader | | channel-attached clients.
Tivoli Storage Manager for Teradata | | network-attached clients.
Open Teradata Backup (OTB), which includes BakBone NetVault and Symantec VERITAS NetBackup | open architecture products for backup and restore functions for Microsoft® Windows® clients | channel- and network-attached clients.

Note: Contact Teradata Global Sales Support for information about the controlled distribution of NetBackup.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.
  • 35. IF you want to learn more about… | THEN see…
Archive utilities | Teradata Archive/Recovery Utility Reference; Teradata ASF2 Tape Reader User Guide; Teradata Access Module for Tivoli Installation and User Guide
BTEQ | Basic Teradata Query Reference
Communication using CLIv2 | Teradata Call-Level Interface Version 2 Reference for Channel-Attached Systems; Teradata Call-Level Interface Version 2 Reference for Network-Attached Systems; Teradata Call-Level Interface Version 2 Developers Kit for Microsoft Windows
CICS Interface for Teradata | IBM CICS Interface for Teradata Reference
Embedded SQL | Teradata Preprocessor2 for Embedded SQL Programmer Guide; SQL Reference: Stored Procedures and Embedded SQL
General Teradata Database architecture | Database Design
Host utility consoles | Utilities; Database Administration
IMS Interface for Teradata | IBM IMS/DC Interface for Teradata Reference
JDBC | Teradata Driver for the JDBC Interface User Guide
Load and unload utilities | Teradata FastExport Reference; Teradata FastLoad Reference; Teradata MultiLoad Reference; Teradata Parallel Data Pump Reference
ODBC | Teradata ODBC Driver User Guide
OLE DB Provider for Teradata | OLE DB Provider for Teradata Installation and User Guide
SQL | SQL Reference: Fundamentals
Teradata Administrator | Teradata Administrator User Guide
Teradata Data Connector | Teradata Tools and Utilities Access Module Reference
Teradata Director Program | Teradata Director Program Reference
Teradata Dynamic Workload Manager (DWM) | Teradata Dynamic Workload Manager User Guide
Teradata Index Wizard | Teradata Index Wizard User Guide
  • 36. IF you want to learn more about… | THEN see…
Teradata Manager | Teradata Manager User Guide; Teradata Manager Installation Guide
Teradata Meta Data Services (MDS) | Teradata Meta Data Services Installation and Administration Guide; Teradata Meta Data Services Programmer Guide
Teradata MultiTool | Graphical User Interfaces: Database Window and Teradata MultiTool
Teradata Performance Monitor | Teradata Manager User Guide; Resource Usage Macros and Tables; Database Administration
Teradata PT | Teradata Parallel Transporter Application Programming Interface Programmer Guide; Teradata Parallel Transporter Operator Programmer Guide; Teradata Parallel Transporter Operator Reference; Teradata Parallel Transporter Reference; Teradata Parallel Transporter User Guide
Teradata Query Director | Teradata Query Director User Guide
Teradata Query Scheduler | Teradata Query Scheduler Administrator Guide; Teradata Query Scheduler User Guide
Teradata SQL Assistant | Teradata SQL Assistant for Microsoft Windows User Guide
Teradata Statistics Wizard | Teradata Statistics Wizard User Guide
Teradata System Level Emulation (SET) | Database Design; Teradata System Emulation Tool User Guide
Teradata Tools and Utilities Access Modules | Teradata Tools and Utilities Access Module Reference
Teradata Visual Explain (VE) | Teradata Visual Explain User Guide
Teradata Workload Analyzer (WA) | Teradata Workload Analyzer User Guide
TS/API products | Teradata Transparency Series/Application Programming Interface User Guide
  • 37. CHAPTER 3 The Teradata Database Model

This chapter describes the concepts on which relational databases are modeled and discusses some of the objects that are part of a relational database. Topics include:
• What is a relational model?
• What is a relational database?
• Tables, rows, and columns

What is a Relational Model?

The relational model for database management is based on concepts derived from the mathematical theory of sets. Roughly speaking, set theory defines a table as a relation. The number of rows is the cardinality of the relation, and the number of columns is the degree. Any manipulation of a table in a relational database has a consistent, predictable outcome, because the mathematical operations on relations are well-defined.

By way of comparison, database management products based on hierarchical, network, or object-oriented architectures are not built on rigorous theoretical foundations. Therefore, the behavior of such products is not as predictable as that of relational products.

The SQL Optimizer in the database uses relational algebra to build the most efficient access path to requested data. The Optimizer can readily adapt to changes in system variables by rebuilding access paths without programmer intervention. This adaptability is necessary because database definitions can change from time to time.

What is a Relational Database?

Users perceive a relational database as a collection of objects, that is, as tables, views, macros, stored procedures, and triggers that are easily manipulated using SQL directly or through specifically developed applications.

Set Theory and Relational Database Terminology

Relational databases are a generalization of the mathematics of set theory relations. Thus, the correspondences between set theory and relational databases are not always direct. The following table shows the correspondences between set theory and relational database terms.
  • 38.
Set Theory Term     Relational Database Term
Relation            Table
Tuple               Row (or record)
Attribute           Column

Tables, Rows, and Columns

Tables are two-dimensional objects consisting of rows and columns. Data is organized in table format and presented to the users of a relational database. References between tables define the relationships and constraints of data inside the tables themselves.

Table Constraints

You can define conditions that must be met before the Teradata Database writes a given value to a column in a table. These conditions are called constraints. Constraints can include value ranges, equality or inequality conditions, and intercolumn dependencies. The Teradata Database supports constraints at both the column and table levels.

During table creation and modification, you can specify constraints on single-column values as part of a column definition, or on multiple columns, using the CREATE TABLE and ALTER TABLE statements.

Permanent and Temporary Tables

To manipulate tabular data, you must submit a query in a language that the database understands. In the case of the Teradata Database, the language is SQL. You can store the results of multiple SQL queries in tables. Permanent storage of tables is necessary when different sessions and users must share table contents.

When tables are required for only a single session, you can request that the system create temporary tables. Using this type of table, you can save query results for use in subsequent queries within the same session. You can also break down complex queries into smaller queries by storing results in a temporary table for use during the same session. When the session ends, the system automatically drops the temporary table.

Global Temporary Tables

Global temporary tables are tables that exist only for the duration of the SQL session in which they are used. The contents of these tables are private to the session, and the system automatically drops the table at the end of that session. However, the system saves the global temporary table definition permanently in the Data Dictionary. The saved definition may be shared by multiple users and sessions, with each session getting its own instance of the table.
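The session-scoped behavior described above can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in and is not Teradata SQL: a SQLite TEMP table is private to the connection that creates it and disappears when that connection closes, which loosely mirrors a temporary table that lives for one session. Unlike a Teradata global temporary table, SQLite keeps no shared definition in a dictionary, so this sketch only shows the "private contents, dropped at session end" half of the picture. The table names, column names, and the run_demo helper are invented for illustration.

```python
import os
import sqlite3
import tempfile


def run_demo():
    """Show a permanent table shared by two sessions and a temp table private to one."""
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    session_a = sqlite3.connect(db_path)
    session_b = sqlite3.connect(db_path)

    # Permanent table: its contents are visible to every session.
    session_a.execute(
        "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)"
    )
    session_a.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "A", 100), (2, "A", 200), (3, "B", 300)],
    )
    session_a.commit()

    # Temporary table: session A saves an intermediate result for later queries.
    session_a.execute(
        "CREATE TEMP TABLE dept_total AS "
        "SELECT dept, SUM(salary) AS total FROM employee GROUP BY dept"
    )
    totals = session_a.execute(
        "SELECT dept, total FROM dept_total ORDER BY dept"
    ).fetchall()

    # Session B can read the permanent table but cannot see A's temp table.
    shared_count = session_b.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
    try:
        session_b.execute("SELECT * FROM dept_total")
        visible_to_b = True
    except sqlite3.OperationalError:
        visible_to_b = False

    session_a.close()  # closing the session discards dept_total
    session_b.close()
    return totals, shared_count, visible_to_b
```

Calling run_demo() returns the per-department totals computed in session A, the row count that session B reads from the shared table, and a flag showing that the temporary table was not visible outside the session that created it.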
  • 39. Volatile Tables

If you need a temporary table for a single use only, you can define a volatile table. The definition of a volatile table resides in memory and does not survive a system restart.

Using volatile tables improves performance even more than using global temporary tables because the system does not store the definitions of volatile tables in the Data Dictionary. Access-rights checking is not necessary because only the creator can access the volatile table.

Derived Tables

A special type of temporary table is the derived table. You can specify a derived table in an SQL SELECT statement. A derived table is obtained from one or more other tables as the result of a subquery. A derived table is visible only at the level of the SELECT statement that contains the subquery.

Using derived tables avoids having to use the CREATE TABLE and DROP TABLE statements for storing retrieved information, and it assists in coding more sophisticated, complex queries.

Rows and Columns

A column always contains the same kind of information. For example, a table that has information about employees would have columns for the first name and last name, and nothing other than the employee names should be placed in those columns.

A row is one instance of all the columns in a table. For example, each row in the employee table would contain, among other things, the first name and the last name for that employee.

The rows and columns in a table represent entities or relationships. An entity is a person, place, or thing about which the table contains information. The table mentioned in the previous paragraphs contains information about the employee entity. Each table holds only one kind of row.

The relational model requires that each row in a table be uniquely identified. To accomplish this, you define a primary key to identify each row in the table. For more information about primary keys, see “How Are Primary Indexes and Primary Keys Related?” on page 95.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database books.

If you want to learn more about…     THEN see…
Relational model                      Database Design
Tables, rows, and columns             SQL Reference: Fundamentals; Database Design
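As a recap of three table concepts from this chapter (constraints, primary keys, and derived tables), here is a minimal sketch. It again uses Python's built-in sqlite3 module as a stand-in, so the SQL is SQLite's dialect and not Teradata SQL, and a Teradata primary key is not the same thing as a SQLite INTEGER PRIMARY KEY. The employee table, its columns, its value ranges, and the run_demo helper are assumptions made up for this example.

```python
import sqlite3

INSERT_SQL = "INSERT INTO employee VALUES (?, ?, ?, ?, ?, ?)"


def run_demo():
    """Return (ids of rejected rows, rows found through a derived table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE employee (
            emp_id     INTEGER PRIMARY KEY,            -- uniquely identifies each row
            first_name TEXT NOT NULL,
            last_name  TEXT NOT NULL,
            dept       TEXT NOT NULL,
            salary     INTEGER CHECK (salary BETWEEN 0 AND 500000),  -- column-level range
            bonus      INTEGER DEFAULT 0,
            CHECK (bonus <= salary)                     -- table-level, intercolumn dependency
        )
        """
    )
    conn.executemany(
        INSERT_SQL,
        [
            (1, "Ann", "Lee", "A", 100, 10),
            (2, "Bob", "Ng", "A", 300, 0),
            (3, "Cy", "Ortiz", "B", 200, 20),
        ],
    )

    # Each of these rows violates one rule, so the database refuses to write it.
    rejected = []
    bad_rows = [
        (1, "Dup", "Key", "A", 100, 0),    # primary key value already used
        (4, "Neg", "Pay", "B", -5, 0),     # salary outside the allowed range
        (5, "Big", "Bonus", "B", 40, 50),  # bonus larger than salary
    ]
    for row in bad_rows:
        try:
            conn.execute(INSERT_SQL, row)
        except sqlite3.IntegrityError:
            rejected.append(row[0])

    # Derived table: the subquery in the FROM clause exists only for this SELECT,
    # so no CREATE TABLE or DROP TABLE is needed for the intermediate averages.
    above_average = conn.execute(
        """
        SELECT e.last_name, e.salary, d.avg_salary
        FROM employee AS e
        JOIN (SELECT dept, AVG(salary) AS avg_salary
              FROM employee
              GROUP BY dept) AS d
          ON d.dept = e.dept
        WHERE e.salary > d.avg_salary
        ORDER BY e.last_name
        """
    ).fetchall()
    conn.close()
    return rejected, above_average
```

The design point is that the rules live in the table definition, so every application writing to the table is held to them, while the derived table keeps the intermediate result scoped to a single query.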
  • 41. SECTION 2 Teradata Architecture
  • 43. CHAPTER 4 Teradata Database Hardware and Software Architecture

This chapter briefly describes the Teradata Database hardware components and software architecture.

The hardware that supports Teradata Database software is based on Symmetric Multi-Processing (SMP) technology. The hardware can be combined with a communications network that connects the SMP systems to form Massively Parallel Processing (MPP) systems.

Topics include:
    • SMP and MPP platforms
    • Disk arrays
    • Cliques
    • Hot standby nodes
    • Virtual processors
    • Request processing
    • Parallel Database Extensions
    • Teradata file system
    • Workstation types and available platforms
    • Teradata Graphical User Interface
    • Teradata General Security Service (TDGSS)

SMP and MPP Platforms

The components of the SMP and MPP hardware platforms include the following.
  • 44.
Component: Processor Node
    Description: A hardware assembly containing several tightly coupled central processing units (CPUs) in an SMP configuration. An SMP node is connected to one or more disk arrays, with the following installed on the node:
        • Teradata Database software
        • Client interface software
        • Operating system
        • Multiple processors with shared memory
        • Failsafe power provisions
    An MPP configuration is a configuration of two or more loosely coupled SMP nodes.
    Function: Serves as the hardware platform upon which the database software operates.

Component: BYNET
    Description: Hardware interprocessor network to link nodes on an MPP system.
    Note: Single-node SMP systems use a software-configured virtual BYNET driver to implement BYNET services.
    Function: Implements broadcast, multicast, or point-to-point communication between processors, depending on the situation.

These platforms use virtual processors (vprocs) that run a set of software processes on a node under the Parallel Database Extensions (PDE). For information about PDE, see “Parallel Database Extensions” on page 34.

Vprocs provide the parallel environment that enables the Teradata Database to run on SMP and MPP systems. Vprocs come in two types:
    • Access Module Processors (AMPs)
    • Parsing Engines (PEs)

For more detailed information on vprocs, see “Virtual Processors” on page 29.

The BYNET

At the most elementary level, you can look at the BYNET as a switched fabric that loosely couples all the SMP nodes in a multinode system. But the BYNET has capabilities that range far beyond those of a simple system bus.

The BYNET possesses high-speed logic that provides bi-directional broadcast, multicast, and point-to-point communication and merge functions.

A multinode system has at least two BYNETs. This creates a fault-tolerant environment and enhances interprocessor communication. Load-balancing software optimizes the transmission of messages over the BYNETs. If one BYNET should fail, the second can handle the traffic.
  • 45. Boardless BYNET

Single-node SMP systems use Boardless BYNET (or virtual BYNET) software to simulate the BYNET hardware driver.

Disk Arrays

Teradata employs Redundant Array of Independent Disks (RAID) storage technology to provide data protection at the disk level. You use the RAID management software to group disk drives into RAID LUNs (logical units) to ensure that data is available in the event of a disk failure.

Redundant implies that data, functions, or components are duplicated in the architecture of the array.

Logical Units

The RAID Manager uses drive groups. A drive group is a set of drives that have been configured into one or more LUNs. Each LUN is uniquely identified.

The operating system recognizes a LUN as a disk and is not aware that it is actually writing to spaces on multiple disk drives. This technique allows RAID technology to provide data availability without affecting the operating system.

The PDE translates LUNs into virtual disks (vdisks) using slices (in UNIX MP-RAS) or partitions (in Microsoft Windows or Linux).

Pdisks and Vdisks

A pdisk is the portion of a LUN that is assigned to an AMP. For information about the role that AMPs play in the Teradata Database architecture, see “Virtual Processors” on page 29.

Each pdisk is uniquely identified and independently addressable. The group of pdisks assigned to an AMP is collectively identified as a vdisk.

Using vdisks instead of direct connections to physical disk drives permits the use of RAID technology without affecting the Teradata Database.

Cliques

The clique is a feature of some MPP systems that physically groups nodes together by multiported access to common disk array units. Inter-node disk array connections are made using Fibre Channel (FC) buses.

FC paths provide redundancy to ensure that the loss of a processor node or disk controller does not limit data availability. The nodes do not share data. They only share access to the disk arrays.

The following figure illustrates a four-node standard clique.
  • 46. [Figure 1091B002: a 4-node clique. Nodes 1 through 4 (1 to 4 nodes) connect through a point-to-point Fibre Channel interconnect to Arrays 1 through 4 (1 to 4 disk arrays).]

A clique is the mechanism that supports the migration of vprocs under PDE following a node failure. If a node in a clique fails, then vprocs migrate to other nodes in the clique and continue to operate while recovery occurs on their home node. For more detailed information on vprocs, see “Virtual Processors” on page 29.

PEs for channel-attached hardware cannot migrate because they are dependent on the hardware that is physically attached to the node to which they are assigned.

PEs for LAN-attached connections do migrate when a node failure occurs, as do all AMPs.

Hot Standby Nodes

The Hot Standby Node feature allows spare nodes to be incorporated into the production environment so that the Teradata Database can take advantage of the presence of the spare nodes to improve availability and maintain performance levels.

A hot standby node is a node that:
    • Is a member of a clique.
    • Does not normally participate in the trusted parallel application (TPA).
    • Can be brought into the TPA to compensate for the loss of a node in the clique.

Configuring a hot standby node can eliminate the system-wide performance degradation associated with the loss of a single node in a single clique. When a node fails, the Hot Standby Node feature migrates all AMP and PE vprocs on the failed node to the node that you have designated as the hot standby.

  • 47. The hot standby node becomes a production node. When the failed node returns to service, it becomes the new hot standby node.

Configuring hot standby nodes eliminates:
    • Restarts that are required to bring a failed node back into service.
    • The degraded service period when vprocs have migrated to other nodes in a clique.

Virtual Processors

The versatility of the Teradata Database is based on virtual processors (vprocs) that eliminate dependency on specialized physical processors. Vprocs are a set of software processes that run on a node under the Teradata Parallel Database Extensions (PDE) within the multitasking environment of the operating system.

The following table contains information about the two types of vprocs.

Type    Description
PE      The PE performs session control and dispatching tasks as well as parsing functions.
AMP     The AMP performs database functions to retrieve and update data on the vdisks.

A single system can support a maximum of 16,384 vprocs. The maximum number of vprocs per node can be as high as 128, but is typically between 6 and 12.

Each vproc is a separate, independent copy of the processor software, isolated from other vprocs, but sharing some of the physical resources of the node, such as memory and CPUs. Multiple vprocs can run on an SMP platform or a node.

Vprocs and the tasks running under them communicate using unique-address messaging, as if they were physically isolated from one another. This message communication is done using the Boardless BYNET Driver software on single-node platforms, or BYNET hardware and BYNET Driver software on multinode platforms.

Parsing Engine

The PE is the vproc that communicates with the client system on one side and with the AMPs (via the BYNET) on the other side.
Each PE executes the database software that manages sessions, decomposes SQL statements into steps (which may execute in parallel), and returns the answer rows to the requesting client. The PE software consists of the following elements.
  • 48.
Parsing Engine Element    Process
Parser                    Decomposes SQL into relational data management processing steps.
Optimizer                 Determines the most efficient path to access data.
Generator                 Generates and packages steps.
Dispatcher                Receives processing steps from the parser and sends them to the appropriate AMPs. Monitors the completion of steps and handles errors encountered during processing.
Session Control           Manages session activities, such as logon, password validation, and logoff. Recovers sessions following client or server failures.

Access Module Processor

The AMP is the heart of the Teradata Database. The AMP is a vproc that controls the management of the Teradata Database and the disk subsystem, with each AMP being assigned to a vdisk.

AMP functions include…          For example…
Database management tasks       • Accounting
                                • Journaling
                                • Locking tables, rows, and databases
                                • Output data conversion
                                During query processing:
                                • Sorting
                                • Joining data rows
                                • Aggregation
File-system management          Disk space management

Each AMP, as represented in the following figure, manages a portion of the physical disk space. Each AMP stores its portion of each database table within that space.
  • 49. [Figure 1091B022: two Parsing Engines connected through the BYNET to four AMPs, each AMP with its own disk storage.]

AMP Clusters

AMPs are grouped into logical clusters to enhance the fault-tolerant capabilities of the Teradata Database. For more information on this method of creating additional fault tolerance in a system, see Chapter 5: “Teradata RASUI.”

Request Processing

SQL is the language that you use to make requests of the Teradata Database; that is, you use SQL to query Teradata.

The SQL parser handles all incoming SQL requests in the following sequence:

1. The Parser looks in the Request cache to determine if the request is already there.
    IF the request is in the Request cache, THEN the Parser reuses the plastic steps found in the cache and passes them to gncApply. Go to step 8 after checking access rights (step 4). Plastic steps are directives to the database management system that do not contain data values.

  • 50. IF the request is not in the Request cache, THEN the Parser begins processing the request with the Syntaxer.

2. The Syntaxer checks the syntax of an incoming request.
    IF there are no errors, THEN the Syntaxer converts the request to a parse tree and passes it to the Resolver.
    IF there are errors, THEN the Syntaxer passes an error message back to the requestor and stops.

3. The Resolver adds information from the Data Dictionary (or a cached copy of the information) to convert database, table, view, stored procedure, and macro names to internal identifiers.

4. The Security module checks access rights in the Data Dictionary.
    IF the access rights are valid, THEN the Security module passes the request to the Optimizer.
    IF the access rights are not valid, THEN the Security module aborts the request, passes an error message, and stops.

5. The Optimizer determines the most effective way to implement the SQL request.

6. The Optimizer scans the request to determine where to place locks, then passes the optimized parse tree to the Generator.

7. The Generator transforms the optimized parse tree into plastic steps and passes them to gncApply.

8. gncApply takes the plastic steps produced by the Generator and transforms them into concrete steps. Concrete steps are directives to the AMPs that contain any needed user- or session-specific values and any needed data parcels.

9. gncApply passes the concrete steps to the Dispatcher.

The Dispatcher

The Dispatcher controls the sequence in which steps are executed. It also passes the steps to the BYNET to be distributed to the AMP database management software as follows:

1. The Dispatcher receives concrete steps from gncApply.

2. The Dispatcher places the first step on the BYNET; tells the BYNET whether the step is for one AMP, several AMPs, or all AMPs; and waits for a completion response.
  • 51. Whenever possible, the Teradata Database performs steps in parallel to enhance performance. If there are no dependencies between a step and the following step, the following step can be dispatched before the first step completes, and the two execute in parallel. If there is a dependency (for example, the following step requires as input the data produced by the first step), then the following step cannot be dispatched until the first step completes.

3. The Dispatcher receives a completion response from all expected AMPs and places the next step on the BYNET. It continues to do this until all the AMP steps associated with a request are done.

The AMPs

AMPs obtain the rows required to process the requests (assuming that the AMPs are processing a SELECT statement). The BYNET transmits messages to and from the AMPs. An AMP step can be sent to one of the following:
    • One AMP
    • A selected set of AMPs, called a dynamic BYNET group
    • All AMPs in the system

The following figure is based on the example in the next section. If access is through a primary index, and a request is for a single row, the PE transmits steps to a single AMP, as shown at PE 1. If the request is for many rows (an all-AMP request), the PE makes the BYNET broadcast the steps to all AMPs, as shown at PE 2. To minimize system overhead, the PE can send a step to a subset of AMPs, when appropriate.

[Figure HD14A001: PE 1 and PE 2 connect through the BYNET (or Boardless BYNET) to AMP 1 through AMP 4, each with its own disk. AMP 1 stores rows R1, R5, R9; AMP 2 stores R2, R6, R10; AMP 3 stores R3, R7, R11; AMP 4 stores R4, R8, R12.]
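The three routing options above (one AMP, a selected set of AMPs, or all AMPs) can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not how the Teradata Database is implemented: the placement rule below simply reproduces the row layout in the figure (row Rn on AMP ((n - 1) mod 4) + 1), whereas the real system places rows by hashing the primary index value. The function names are hypothetical.

```python
NUM_AMPS = 4


def amp_for_row(row_number):
    """Toy placement rule that matches the figure: R1, R5, R9 land on AMP 1, and so on."""
    return (row_number - 1) % NUM_AMPS + 1


def build_amps(row_numbers):
    """Return a dict mapping each AMP number to the list of row labels it stores."""
    amps = {amp: [] for amp in range(1, NUM_AMPS + 1)}
    for n in row_numbers:
        amps[amp_for_row(n)].append(f"R{n}")
    return amps


def route_step(amps, target_row=None, subset=None):
    """Return the AMPs that receive a step.

    target_row: single-row primary index access, sent point-to-point to one AMP.
    subset:     a selected set of AMPs (a dynamic BYNET group in the text).
    neither:    an all-AMP request, broadcast to every AMP.
    """
    if target_row is not None:
        return [amp_for_row(target_row)]
    if subset is not None:
        return sorted(subset)
    return sorted(amps)


amps = build_amps(range(1, 13))
single_amp = route_step(amps, target_row=9)   # the row lives on exactly one AMP
amp_group = route_step(amps, subset={2, 4})   # only some AMPs are involved
all_amps = route_step(amps)                   # every AMP must scan its rows
```

The point of the sketch is the cost difference: a single-row primary index request touches one AMP, while a request that cannot use the primary index has to be sent to every AMP.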
  • 52. Example: SQL Statement

As an example, consider the following Teradata SQL statements using a table containing checking account information. The example assumes that the AcctNo column is the unique primary index for Table_01. For information about the types of indexes used by Teradata, see Chapter 10: “Data Distribution and Data Access Methods.”

    1. SELECT * FROM Table_01 WHERE AcctNo = 129317 ;
    2. SELECT * FROM Table_01 WHERE AcctBal > 1000 ;

In this example:
    • PEs 1 and 2 receive requests 1 and 2.
    • The data for account 129317 is contained in table row R9 and stored on AMP 1.
    • Information about all account balances is distributed evenly among the disks of all four AMPs.

The sample Teradata SQL statements are processed in the following sequence:

1. PE 1 determines that the request is a primary index retrieval, which calls for the access and return of one specific row.
2. The Dispatcher in PE 1 issues a message to the BYNET containing an appropriate read step and R9/AMP 1 routing information. After AMP 1 returns the desired row, PE 1 transmits the data to the client.
3. The PE 2 Parser determines that this is an all-AMPs request, then issues a message to the BYNET containing the appropriate read step to be broadcast to all four AMPs.
4. After the AMPs return the results, PE 2 transmits the data to the client.

AMP steps are processed in the following sequence:

1. Lock: Serializes access in situations where concurrent access would compromise data consistency. For some simple requests using Unique Primary Index (UPI), Non-unique Primary Index (NUPI), or Unique Secondary Index (USI) access, the lock step may be incorporated into step 2. For information about indexes and their uses, see Chapter 10: “Data Distribution and Data Access Methods.”
2. Operation: Performs the requested task. For complicated queries, there may be hundreds of operation steps.
3. End transaction: Causes the locks acquired in step 1 to be released. The end transaction step tells all AMPs that worked on the request that processing is complete.

Parallel Database Extensions

Parallel Database Extensions (PDE) is a software interface layer on top of the operating system. The operating system can be UNIX MP-RAS, Linux, or Microsoft Windows. PDE provides the Teradata Database with the ability to:
  • 53.
    • Run the Teradata Database in a parallel environment.
    • Execute vprocs.
    • Apply a flexible priority scheduler to Teradata Database sessions.
    • Debug the operating system kernel and the Teradata Database using resident debugging facilities.

Trusted Parallel Applications

The PDE provides a series of parallel operating system services to a special class of tasks called a trusted parallel application (TPA). On an SMP or MPP system, the TPA is the Teradata Database.

TPA services include:
    • Facilities to manage parallel execution of the TPA on multiple nodes.
    • Dynamic distribution of execution processes.
    • Coordination of all execution threads, whether on the same or on different nodes.
    • Balancing of the TPA workload within a clique.
    • Resident debugging facilities in addition to kernel and application debuggers.

PDE and MPP Systems

The PDE also enables an MPP system to:
    • Take advantage of hardware features such as the BYNET and shared disk arrays.
    • Process user applications that were written on non-Trusted Parallel Application (non-TPA) nodes and disks.

Start and Stop PDE

You can start, reset, and stop the PDE on Windows and Linux systems using the Teradata MultiTool utility, and on UNIX MP-RAS systems using the xctl utility. For information about the ctl and xctl utilities, see “Maintenance Utilities” on page 157. For more information on MultiTool, see “Teradata MultiTool” on page 167.

The Teradata File System

The Teradata file system is a layer of software between the Teradata Database layer and the PDE layer. Teradata file system service calls allow the Teradata Database to store and retrieve data efficiently and with integrity, without being concerned about the specific low-level operating system interfaces.

The data block is a disk-resident structure that contains one or more rows from the same table and is the smallest physical I/O unit for the Teradata file system. Data blocks are stored in physical disk space units called sectors, which are logically grouped together in cylinders.
  • 54. Workstation Types and Available Platforms

Workstations provide a window into the inner workings of the Teradata Database. The following types of workstations are available:
    • System console
    • Administration Workstation

Some of the workstation types are available only on specific platforms. The following table shows which workstations are appropriate for the different platforms.

Type of Workstation           Platform
System console                SMP
Administration Workstation    MPP

System Console

The role of the system console is to:
    • Provide an input mechanism for the system and database administrators.
    • Display system status.
    • Display current system configuration.
    • Display performance statistics.
    • Allow you to control various utilities.

Administration Workstation

The Administration Workstation (AWS) performs many of the functions of a system console for MPP systems. The AWS is an intelligent workstation whose primary roles are to:
    • Provide an input mechanism for the system and database administrator.
    • Provide a single-system view in the multinode environment.
    • Monitor system performance.

Teradata Graphical User Interface

The Teradata Graphical User Interface (GUI) allows database or system administrators to control the operation of the Teradata Database. Running in a graphical X Windows or Microsoft Windows environment, the Teradata GUI is also the primary vehicle for starting and controlling the operation of the Teradata Database utilities.
  • 55. How the Teradata GUI Communicates with the Teradata Database

The Teradata GUI communicates with the Teradata Database through the console subsystem (CNS), which is part of the PDE software. Because the CNS software manages this communication, you might see CNS messages from the system.

From the Teradata GUI, you can access the following subwindows.

From this subwindow…        You can…
Applications 1 through 4    run one Teradata Database utility or program at a time in each of the four subwindows.
DBS I/O                     view messages from Teradata Database programs that are not running in Teradata GUI application subwindows. For example, some SQL diagnostics appear here.
Supervisor                  issue commands and invoke utilities.

Running the Teradata GUI

You can run the Teradata GUI from the following locations:
    • System Console
    • AWS
    • Remote workstation or PC

Teradata General Security Service

Network security for Teradata is provided by a software product called Teradata General Security Service (TDGSS). It provides for secure communication between a client and the Teradata server.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database books.

IF you want to learn more about…    THEN see…
Cliques                              Utilities; Database Administration
Disk arrays                          Database Administration
Hot standby nodes                    Utilities; Database Administration
  • 56.
IF you want to learn more about…      THEN see…
Teradata file system                   Database Design
Teradata General Security Service      Security Administration
Teradata Graphical User Interface      Graphical User Interfaces: Database Window and Teradata MultiTool
Request processing                     SQL Reference: Statement and Transaction Processing
Virtual processors (AMPs and PEs)      Database Design; Database Administration; SQL Reference: Statement and Transaction Processing; Utilities
  • 57. CHAPTER 5 Teradata RASUI

The Teradata Database addresses the critical requirements of reliability, availability, serviceability, usability, and installability (RASUI) by combining the following elements:
    • Multiple microprocessors in a symmetric multi-processing (SMP) arrangement.
    • RAID disk storage technology.
    • Protection of the Teradata Database from operating anomalies of the client platform.

Both hardware and software provide fault tolerance, some of which is mandatory and some of which is optional.

Topics include:
    • Software fault tolerance
    • Hardware fault tolerance
    • Teradata Replication Solutions

Software Fault Tolerance

This section explains the following Teradata Database facilities for software fault tolerance:
    • Vproc migration
    • Fallback tables
    • AMP clusters
    • Journaling
    • Archive/Recovery
    • Table Rebuild utility

Vproc Migration

Because the Parsing Engine (PE) and Access Module Processor (AMP) are vprocs and, therefore, software entities, they can migrate from their home node to another node within the same hardware clique if the home node fails for any reason. Although the system normally determines which vprocs migrate to which nodes, a user can configure preferred migratory destinations.

Vproc migration permits the system to function completely during a node failure, with some degradation of performance due to the non-functional hardware.

The following figure illustrates vproc migration, where the large X indicates a failed node, and arrows pointing to nodes still running indicate the migration of AMP3, AMP4, and PE2.
  • 58. [Figure GG01A027: placement of PE1 through PE3 and AMP1 through AMP6 across nodes attached to a shared disk array, shown in normal operation and during recovery after a node failure.]

Note: PEs for channel-attached connections cannot migrate during a node failure, because they depend on the channel hardware physically attached to their node.

Fallback Tables

A fallback table is a duplicate copy of a primary table. Each fallback row in a fallback table is stored on an AMP different from the one to which the primary row hashes. This storage technique maintains availability should the system lose an AMP and its associated disk storage in a cluster. In that event, the system would access data in the fallback rows.

The disadvantage of fallback is that this method doubles the storage space and the I/O (on INSERTs, UPDATEs, and DELETEs) for tables. The advantage is that data is almost never unavailable because of one down AMP. Data is fully available during an AMP or disk outage, and recovery is automatic after repairs have been made.

The Teradata Database permits the definition of fallback for individual tables. As a general rule, you should run all tables critical to your enterprise in fallback mode. You can run other, non-critical tables in non-fallback mode in order to maximize resource usage.

Even though RAID disk array technology may provide access to data even when you have not specified fallback, neither RAID1 nor RAID5 provides the same level of protection as fallback does.
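The availability trade-off described above can be sketched with a small Python model. It is a simplified illustration under stated assumptions, not the Teradata Database algorithm: the place_rows rule below is invented purely so that every row gets a primary AMP and a different fallback AMP, whereas the real system derives placement from row hashing. All function names are hypothetical.

```python
def place_rows(row_ids, num_amps):
    """Give each row a primary AMP and a fallback AMP that is always a different AMP."""
    placement = {}
    for row_id in row_ids:
        primary = row_id % num_amps
        # The offset is between 1 and num_amps - 1, so the fallback never equals the primary.
        offset = 1 + (row_id // num_amps) % (num_amps - 1)
        fallback = (primary + offset) % num_amps
        placement[row_id] = (primary, fallback)
    return placement


def readable_rows(placement, down_amps, fallback_enabled=True):
    """Return the rows that can still be read while the AMPs in down_amps are offline."""
    readable = []
    for row_id, (primary, fallback) in placement.items():
        if primary not in down_amps:
            readable.append(row_id)
        elif fallback_enabled and fallback not in down_amps:
            readable.append(row_id)
    return readable


placement = place_rows(range(24), num_amps=4)

# One AMP down: with fallback every row is still readable, without it some rows are not.
with_fallback = readable_rows(placement, down_amps={2})
without_fallback = readable_rows(placement, down_amps={2}, fallback_enabled=False)

# Two AMPs down: rows whose primary and fallback copies are both offline are lost.
two_down = readable_rows(placement, down_amps={2, 3})
```

The model also makes the cost visible: every row is stored twice, which is the doubling of storage and write I/O that the text mentions.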
  • 59. You specify whether a table is fallback or not using the CREATE TABLE (or ALTER TABLE) statement. The default is not to create tables with fallback.

AMP Clusters

A cluster is a group of 2 to 16 AMPs that provide fallback capability for each other. A copy of each row is stored on a separate AMP in the same cluster. In a large system, you would probably create many AMP clusters. However, whether large or small, the concept of a cluster exists even if all the AMPs are in one cluster.

One-Cluster Configuration

Pictures best explain AMP clustering. The following figure illustrates a situation in which fallback is present with one cluster, which is essentially an unclustered system.

[Figure FG10A001: one cluster of eight AMPs]
AMP     Primary copy area    Fallback copy area
AMP1    1, 9, 17             21, 22, 15
AMP2    2, 10, 18            1, 23, 8
AMP3    3, 11, 19            9, 2, 16
AMP4    4, 12, 20            17, 10, 3
AMP5    5, 13, 21            18, 11, 4
AMP6    6, 14, 22            19, 12, 24
AMP7    7, 15, 23            20, 5, 6
AMP8    8, 16, 24            13, 14, 7

Note that the fallback copy of any row is always located on an AMP different from the AMP that holds the primary copy. This is an entry-level fault tolerance strategy.

In this example, which shows only a few rows, the data on AMP3 is fallback protected on AMPs 4, 5, and 6. In practice, however, some of the data on AMP3 would be fallback protected on each of the other AMPs in the system. The system becomes unavailable if two AMPs in a cluster go down.

Smaller Cluster Configuration

The following figure illustrates smaller clusters. Decreasing cluster size reduces the likelihood that two AMP failures will occur in the same cluster. The illustration shows the same 8-AMP configuration now partitioned into two AMP clusters of four AMPs each.
  • 60. [Figure FG10A002: two clusters of four AMPs each]
Cluster      AMP     Primary copy area    Fallback copy area
Cluster A    AMP1    1, 9, 17             2, 3, 4
Cluster A    AMP2    2, 10, 18            1, 11, 12
Cluster A    AMP3    3, 11, 19            9, 10, 20
Cluster A    AMP4    4, 12, 20            17, 18, 19
Cluster B    AMP5    5, 13, 21            6, 7, 8
Cluster B    AMP6    6, 14, 22            5, 15, 16
Cluster B    AMP7    7, 15, 23            13, 14, 24
Cluster B    AMP8    8, 16, 24            21, 22, 23

Compare this clustered configuration with the earlier illustration of an unclustered AMP configuration. In the example, the (primary) data on AMP3 is backed up on AMPs 1, 2, and 4, and the data on AMP6 is backed up on AMPs 5, 7, and 8.

If AMPs 3 and 6 fail at the same time, the system continues to function normally. Only if two failures occur within the same cluster does the system halt.

Performance is the primary factor that determines cluster size. While 2-AMP clusters provide maximum protection against system loss, because the likelihood of both AMPs in a cluster going down simultaneously is very small, this configuration also suffers from a higher workload per AMP in the event of a failure.

Typically, a cluster size is four to eight AMPs. For most applications, a cluster size of four provides a good balance between data availability and system performance.

Journaling

The Teradata Database supports tables that are devoted to journaling. A journal is a record of some kind of activity. The Teradata Database supports several kinds of journaling. The system does some journaling on its own, while you can specify whether to perform other journaling.

The following table explains the capabilities of the different Teradata Database journals.

This type of journal: Down AMP recovery
    Does the following:
        • Is active during an AMP failure only
        • Journals fallback tables only
        • Is used to recover the AMP after the AMP is repaired, then is discarded
    And occurs: always.
  • 61. Chapter 5: Teradata RASUI Software Fault Tolerance
    This type of journal… / Does the following… / And occurs…
      Transient journal (occurs always):
        • Logs BEFORE images for transactions
        • Is used by the system to roll back failed transactions aborted either by the user or by the system
        • Captures: Begin/End Transaction indicators; "before" row images for UPDATE and DELETE statements; row IDs for INSERT statements; control records for CREATE, DROP, DELETE, and ALTER statements
        • Keeps each image on the same AMP as the row it describes
        • Discards images when the transaction or rollback completes
      Permanent journal (occurs as specified by the user):
        • Is available for tables or databases
        • Can contain "before" images, which permit rollback, or "after" images, which permit rollforward, or both before and after images
        • Provides rollforward recovery
        • Provides rollback recovery
        • Provides full recovery of nonfallback tables
        • Reduces need for frequent, full-table archives
    Teradata Archive/Recovery
    The Teradata Archive/Recovery utility backs up and restores data for channel-attached and network-attached clients.
    If you want to… / Then…
      Archive data: copy all or select one of the following:
        • Tables (or partitions of a table)
        • Databases
        • Data Dictionary tables
        Note: If your system is used only for decision support and is updated regularly with data loads, you may not want to archive the data.
      Restore data: copy an archive from the client or server back to the database, and restore data to all AMPs, to clusters of AMPs, or to a specific AMP (as long as the Data Dictionary contains the definitions of the table or database you want to restore).
        Note: If the table does not have a definition in the Data Dictionary because of a DROP or RENAME statement, you can still restore data using the COPY statement.
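The transient journal behavior listed on this slide (before images for UPDATE and DELETE, row IDs for INSERT, images discarded when the transaction or rollback completes) can be modeled with a small in-memory sketch. The class and method names below are hypothetical and the "table" is just a Python dictionary; it illustrates the idea of rollback from before images, not how the Teradata Database implements it.

```python
class TransientJournalDemo:
    """Toy table with a journal of before images for one open transaction."""

    def __init__(self, rows):
        self.rows = dict(rows)   # row id -> value
        self.journal = []        # entries for the current transaction

    def begin(self):
        self.journal = [("BEGIN", None, None)]

    def insert(self, row_id, value):
        # For an INSERT only the row ID is journaled, no image.
        self.journal.append(("INSERT", row_id, None))
        self.rows[row_id] = value

    def update(self, row_id, value):
        # Journal the "before" image, then change the row.
        self.journal.append(("BEFORE", row_id, self.rows[row_id]))
        self.rows[row_id] = value

    def delete(self, row_id):
        self.journal.append(("BEFORE", row_id, self.rows[row_id]))
        del self.rows[row_id]

    def commit(self):
        # Images are discarded once the transaction completes.
        self.journal = []

    def rollback(self):
        # Undo in reverse order, then discard the images.
        for kind, row_id, before in reversed(self.journal):
            if kind == "INSERT":
                self.rows.pop(row_id, None)
            elif kind == "BEFORE":
                self.rows[row_id] = before
        self.journal = []


table = TransientJournalDemo({1: "a", 2: "b"})
table.begin()
table.update(1, "x")
table.delete(2)
table.insert(3, "c")
table.rollback()
print(table.rows == {1: "a", 2: "b"})  # True
```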
  • 62. Chapter 5: Teradata RASUI Hardware Fault Tolerance
    Similar restore and recovery capabilities are available for systems running the Microsoft Windows operating system using the Windows NetVault and NetBackup. For more information, see “Open Teradata Backup” on page 162.
    Note: Contact Teradata Global Sales Support for information about the controlled distribution of NetBackup.
    Table Rebuild Utility
    Use the Table Rebuild utility to recreate a table, database, or entire disk on a single AMP under the following conditions:
      • The table structure or data is damaged because of a software problem, head crash, power failure, or other malfunction.
      • The affected tables are enabled for fallback protection.
    Table rebuild can create all of the following on an AMP-by-AMP basis:
      • Primary or fallback portions of a table.
      • An entire table (both primary and fallback portions).
      • All tables in a database.
      • All tables on an individual AMP.
    The Table Rebuild utility can also remove inconsistencies in stored procedure tables in a database. An NCR System Engineer, Field Engineer, or System Support Representative usually runs the Table Rebuild utility.
    Hardware Fault Tolerance
    The Teradata Database provides the following facilities for hardware fault tolerance.
    Facility / Description
      Multiple BYNETs: Multi-node Teradata Database servers are equipped with at least two BYNETs. Interprocessor traffic is never stopped unless both BYNETs fail. Within a BYNET, traffic can often be rerouted around failed components.
  • 63. Chapter 5: Teradata RASUI Hardware Fault Tolerance
    Facility / Description
      RAID disk units:
        • Teradata Database servers use Redundant Arrays of Independent Disks (RAIDs) configured for use as RAID1, RAID5, or RAIDS. Non-array storage cannot use RAID technology.
        • RAID1 arrays offer mirroring, the method of maintaining identical copies of data.
        • RAID5 or RAIDS protects data from single-disk failures with a 25 percent increase in disk storage to provide parity.
        • RAID1 provides better performance and data protection than RAID5/RAIDS, but is more expensive.
      Multiple-channel and -network connections: In a client-server environment, multiple channel connections between mainframe and network-based clients ensure that most processing continues even if one or several connections between the clients and server are not working. Vproc migration is a software feature supporting this hardware issue.
      Isolation from client hardware defects: In a client-server environment, a server is isolated from many client hardware defects and can continue processing in spite of such defects.
      Battery backup: All cabinets have battery backup in case of building power failures.
      Power supplies and fans: Each cabinet in a configuration has redundant power supplies and fans to ensure fail-safe operation.
      Hot swap capability for node components: The Teradata Database can allow some components to be removed and replaced while the system is running. This process is known as hot swap. Teradata Database offers hot swap capability for the following:
        • Disks within RAID arrays
        • Fans
        • Power supplies
  • 64. Chapter 5: Teradata RASUI Teradata Replication Solutions
    Facility / Description
      Cliques:
        • A clique is a group of nodes sharing access to the same disk arrays. The nodes and disks are interconnected through shared FC buses and each node can communicate directly to all disks. This architecture provides and balances data availability in the case of a node failure.
        • A clique supports the migration of vprocs following a node failure. If a node in a clique fails, then its vprocs migrate to another node in the clique and continue to operate while recovery occurs on their home node. Migration minimizes the performance impact on the system.
        • PEs for channel-attached hardware cannot migrate, because they depend on the hardware that is physically attached to the assigned node.
        • PEs for LAN-attached connections do migrate when a node failure occurs, as do all AMP vprocs.
        • To ensure maximum fault tolerance, no more than one node in a clique is placed in the same cabinet. Usually the battery backup feature makes this precaution unnecessary, but if you want maximum fault tolerance, then plan your cliques so the nodes are never in the same cabinet.
    Teradata Replication Solutions
    Teradata Replication Solutions allow you to back up changes made to a specific set of tables on your primary production system. If your primary production system fails, you may continue critical application processing on a secondary system using data that was backed up by Teradata Replication Solutions.
    GoldenGate for Teradata helps facilitate replication solutions by serving as the intermediary for a primary Teradata Database server and a subscriber server, which may or may not be another Teradata Database server. Database updates are applied to the subscriber system only if the source transaction completes successfully. You can also create user exits for further data processing.
    For more information on GoldenGate for Teradata, see GoldenGate Operations Guide for Windows and Unix, version 7.3 or later, and Teradata Replication Solutions Overview.
    To replicate changes, you must define the group of tables that contain changes as a Replication Group with the CREATE REPLICATION GROUP statement. For more information on the SQL syntax for defining, changing, and using replication groups, see SQL Reference: Data Definition Statements.
    Note: The intermediary software is not part of a Teradata release. To purchase and set up Teradata Replication Solutions, contact your customer representative.
  • 65. Chapter 5: Teradata RASUI For More Information
    For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.
    IF you want to learn more about… / THEN see…
      Clusters: Teradata Archive/Recovery Utility Reference; Database Administration
      Database design: Database Design
      Fallback: Database Administration; Utilities
      Journaling: Teradata Archive/Recovery Utility Reference; Database Administration; SQL Reference: Data Definition Statements
      Restore/Recovery: Teradata Archive/Recovery Utility Reference
      Table Rebuild: Utilities
      Teradata Replication Solutions: GoldenGate Operations Guide for Windows and Unix; Teradata Replication Solutions Overview
  • 67. CHAPTER 6 Communication Between the Client and the Teradata Database
    This chapter describes various ways the client applications can communicate with the Teradata Database. Teradata uses the Call Level Interface (CLI), which provides the service routines needed by applications. In addition to CLI, Teradata supports other industry standard communications protocols. Topics in this chapter include:
      • Attachment methods
      • CLIv2 for channel-attached systems
      • CLIv2 for network-attached systems
      • Other types of data communications
    Attachment Methods
    Clients can connect to the Teradata Database using one of the following methods:
      • Channel-attached through an IBM mainframe
      • Network-attached through a Local Area Network (LAN)
    Client applications that manipulate data on the Teradata Database server communicate with the database directly or indirectly by means of communications interfaces:
      • Call Level Interface Version 2 (CLIv2) for channel-attached systems
      • Call Level Interface Version 2 (CLIv2) for network-attached systems
    Both versions provide the same functions. The CLIv2 is a library of service routines that act as subroutines of the application. The modules in the CLIv2 library vary based on whether the client is channel-attached or network-attached.
    Other types of communications interfaces are available, including interfaces for systems running Microsoft Windows and interfaces for systems running NCR UNIX MP-RAS. The interfaces include:
      • Windows Call Level Interface (WinCLI) (Windows-based systems)
      • Open Database Connectivity (ODBC) (Windows and UNIX MP-RAS-based systems)
      • Java Database Connectivity (JDBC) (Windows and UNIX MP-RAS-based systems)
  • 68. Chapter 6: Communication Between the Client and the Teradata Database CLIv2 for Channel-Attached Systems
    The data communications interfaces are discussed in the following sections.
    CLIv2 for Channel-Attached Systems
    CLIv2 is a collection of callable service routines that provide the interface between applications and the Teradata Director Program (TDP) on an IBM mainframe client. TDP is the interface between CLIv2 and the Teradata Database server. CLIv2 can operate with all versions of IBM operating systems, including Multiple Virtual Storage (MVS), OS/390, Customer Information Control System (CICS), Information Management System (IMS), and Virtual Machine (VM).
    What CLIv2 for Channel-Attached Clients Does
    By way of TDP, CLIv2 sends requests to the server, and provides the application with a response returned from the server by way of TDP. CLIv2 provides support for:
      • Managing multiple serially executed requests in a session
      • Managing multiple simultaneous sessions to the same or different servers
      • Using cooperative processing so that the application can perform operations on the client and the server at the same time
      • Communicating with two-phase commit coordinators for CICS and IMS transactions
      • Generally insulating the application from the details of communicating with a server
    Teradata Director Program
    TDP manages communications between CLIv2 and a server. The program executes on the same mainframe as CLIv2, but runs as a different job or virtual machine. An individual TDP is associated with one logical server; note, however, that any number of TDPs may operate, and be accessed by CLIv2, simultaneously on the same mainframe. Each TDP is referred to by the application with an identifier called the TDPid (TDP2, for example) that is unique in a mainframe.
    Functions of TDP include the following:
      • Session initiation and termination
      • Logging, verification, recovery, and restart
      • Physical input to and output from the server, including session balancing and queue maintenance
      • Security
    Server
    A server implements the actual relational database that processes requests received from CLIv2 by way of TDP.
  • 69. Chapter 6: Communication Between the Client and the Teradata Database CLIv2 for Network-Attached Systems
    The following figure illustrates the logical structure of the client-server interface.
    Figure: an application program exchanges requests and responses with CLIv2, which communicates through three TDPs, each connected to its own Teradata Database server.
    CLIv2 for Network-Attached Systems
    CLIv2 is a collection of callable service routines that provide the interface between applications on a LAN-connected client and the Teradata Database server.
    What CLIv2 for Network-Attached Clients Does
    CLI is the interface between the application program and the Micro Teradata Director Program (MTDP). CLIv2 can:
      • Build parcels that MTDP packages for sending to the Teradata Database using the Micro Operating System Interface (MOSI).
      • Provide the application with a pointer to each of the parcels returned from the Teradata Database.
  • 70. Chapter 6: Communication Between the Client and the Teradata Database CLIv2 for Network-Attached Systems
    Micro Teradata Director Program
    The MTDP must be linked to applications that will be network-connected to the Teradata Database. The MTDP performs many of the same functions as the channel-based TDP, including:
      • Session initiation and termination
      • Physical input to and output from the server
      • Logging, verification, recovery, and restart
    Unlike TDP, MTDP does not control session balancing. (Session balancing is controlled by the Teradata Gateway on the server.)
    Micro Operating System Interface
    MTDP is the interface between CLI and MOSI. MOSI is a library of service routines that provides operating system independence among the clients that access the Teradata Database. By implementing the MOSI, only one version of MTDP is required to run on all network-connected platforms. These modules and the relationships among them are illustrated in the following figure.
  • 71. Chapter 6: Communication Between the Client and the Teradata Database Other Types of Data Communications
    Figure: an application program exchanges requests and responses with CLI, which sits above MTDP and MOSI, which in turn connect to the Teradata Database server.
    Other Types of Data Communications
    Other types of communications interfaces are available for systems running Windows or UNIX MP-RAS.
    WinCLI
    WinCLI is a call-level interface for MS-DOS and Windows-based applications. CLI routines are provided as object modules that have been compiled or assembled according to standard linkage conventions. WinCLI uses the Dynamic Data Exchange (DDE) protocol to communicate with application programs.
  • 72. Chapter 6: Communication Between the Client and the Teradata Database For More Information
    ODBC
    The Open Database Connectivity (ODBC) Driver for the Teradata Database provides an alternate interface to Teradata Databases using the industry standard ODBC Application Programming Interface (API). The ODBC Driver for the Teradata Database provides Core-level SQL and Extension-level 1 (with some Extension-level 2) function call capability using the Windows Sockets (WinSock) Transmission Control Protocol/Internet Protocol (TCP/IP) communications software interface. ODBC operates independently of CLI and WinCLI.
    JDBC
    Teradata developed the Teradata JDBC Driver, which enables you to access the Teradata Database using the Java language. Java Database Connectivity (JDBC) is a specification for an API. The API allows platform-independent Java applications to access database management systems using SQL. The JDBC API provides a standard set of interfaces for:
      • Opening connections to databases
      • Executing SQL statements
      • Processing results
    The driver is a set of Java classes that use the TCP/IP communications software to connect to the Teradata JDBC Gateway, which is constantly listening on the network port for connection requests. For each gateway connection, a new session is created. The Java program can select different gateways by using different URLs. All JDBC function requests are routed to the gateway, which in turn accesses the Teradata Database using Teradata CLIv2. More than one gateway can run on the same host if the gateways are configured to use different network ports.
    For More Information
    For more information on the topics presented in this chapter, see the following Teradata Tools and Utilities books.
    To learn more about… / THEN see…
      Call-Level Interface programming: Teradata Call-Level Interface Version 2 Reference for Channel-Attached Systems; Teradata Call-Level Interface Version 2 Reference for Network-Attached Systems
      JDBC: Teradata Driver for the JDBC Interface User Guide
  • 73. Chapter 6: Communication Between the Client and the Teradata Database For More Information
    To learn more about… / THEN see…
      Micro Teradata Director Program (MTDP): Database Administration; Teradata Call-Level Interface Version 2 Reference for Network-Attached Systems
      ODBC: Teradata ODBC Driver User Guide
      Teradata Director Program (TDP): Teradata Director Program Reference
      WinCLI: Teradata Call-Level Interface Version 2 Developers Kit for Microsoft Windows
  • 75. SECTION 3 Using the Teradata Database
  • 77. CHAPTER 7 Database Objects, Databases and Users
    This chapter provides information about database objects stored in the Teradata Database and space allocation for databases and users. Topics include:
      • Tables
      • Queue tables
      • Views
      • Teradata stored procedures
      • External stored procedures
      • Macros
      • Triggers
      • User-Defined Functions
      • User-Defined Types
      • User-Defined Methods
      • Databases and users
    Tables
    Tables are two-dimensional objects consisting of rows and columns. Data is organized in table format and presented to the users of a relational database. For information on permanent tables, global temporary tables, volatile tables, and derived tables, see Chapter 3: “The Teradata Database Model.”
    Queue Tables
    A queue table is a persistent database table with the properties of an asynchronous first-in-first-out (FIFO) queue.
    Queue Tables and Base Tables
    The queue table is different from a standard base table in that a queue table always contains a user-defined Queue Insertion Timestamp (QITS) as the first column of the table.
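To make the FIFO property concrete, the sketch below models a queue whose rows are consumed in order of an insertion timestamp held in the first field. It is a toy in-memory analogy, not the Teradata queue table implementation: the class name is hypothetical, and the extra sequence number used to break ties between equal timestamps is an assumption made for the illustration.

```python
import heapq
import itertools


class QueueTableDemo:
    """Toy FIFO queue ordered by an insertion timestamp (QITS-like field)."""

    def __init__(self):
        self._heap = []
        self._sequence = itertools.count()  # assumed tie-breaker

    def insert(self, qits, payload):
        # The timestamp comes first, mirroring QITS as the first column.
        heapq.heappush(self._heap, (qits, next(self._sequence), payload))

    def consume(self):
        """Remove and return the payload with the oldest timestamp."""
        qits, _, payload = heapq.heappop(self._heap)
        return payload


queue = QueueTableDemo()
queue.insert(10, "a")
queue.insert(5, "b")
queue.insert(10, "c")   # same timestamp as "a", inserted later
order = [queue.consume() for _ in range(3)]
print(order)  # ['b', 'a', 'c']
```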
  • 78. Chapter 7: Database Objects, Databases and Users Views
    The QITS contains the time the row is inserted into the queue table as a way to establish FIFO ordering. Even though the QITS value may not be unique, the Teradata Database ensures uniqueness of the rowid in the queue.
    Event Processing
    In the past, to implement event processing in the Teradata Database, customer applications had to poll a table periodically for rows inserted there by another application at an earlier time. The Queue Table feature provides a non-polling solution that makes information immediately available to database applications instead of making them wait for a polling interval to complete.
    Views
    Database views are virtual tables that you can use as if they were physical tables to retrieve data. A view defines its columns from underlying views and/or tables. A view does not contain data and is not materialized until an SQL statement references it. View definitions are stored in the Data Dictionary.
    What is in a View?
    A view is created from one or more base tables or from other views. In fact, you can create hierarchies of views in which views are created from other views. This can be useful, but be aware that deleting any of the lower-level views invalidates dependencies of higher-level views in the hierarchy. A view usually presents only a subset of the columns and rows in the base table or tables. Moreover, some view columns do not exist in the underlying base tables. For example, a view can present data summaries (such as an average) that are not stored as columns in any base table.
    Why Use Views?
    There are at least four reasons to use views. Views provide:
      • A user’s view of data in the database
      • Security for restricting table access and updates
      • Well-defined, well-tested, high-performance access to data
      • Logical data independence
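As a small illustration of a view that presents a summary column not stored in any base table, the sketch below uses Python's built-in sqlite3 module. SQLite stands in for the Teradata Database here purely so the example is runnable; the table, view, and column names are hypothetical, and Teradata SQL syntax may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name TEXT, dept TEXT, salary REAL);
    INSERT INTO employee VALUES ('Ann', 'sales', 100.0);
    INSERT INTO employee VALUES ('Bob', 'sales', 200.0);
    INSERT INTO employee VALUES ('Cy',  'ops',   400.0);

    -- The view stores only a definition; avg_salary exists in no base table.
    CREATE VIEW dept_avg AS
        SELECT dept, AVG(salary) AS avg_salary
        FROM employee
        GROUP BY dept;
""")


def dept_averages(connection):
    """Query the view; the result is computed when the view is referenced."""
    return connection.execute(
        "SELECT dept, avg_salary FROM dept_avg ORDER BY dept"
    ).fetchall()


before = dept_averages(conn)
conn.execute("INSERT INTO employee VALUES ('Di', 'ops', 200.0)")
after = dept_averages(conn)

print(before)  # [('ops', 400.0), ('sales', 150.0)]
print(after)   # [('ops', 300.0), ('sales', 150.0)]
```

Because the view holds no data of its own, the second query reflects the newly inserted row without any change to the view.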
  • 79. Chapter 7: Database Objects, Databases and Users Teradata Stored Procedures
    Restrictions on Using Views
    You can use views as if they were tables in SELECT statements. Views are subject to some restrictions regarding the INSERT, UPDATE, MERGE, and DELETE statements. For more information, see “SQL Access to the Data Dictionary” on page 122.
    Teradata Stored Procedures
    The Teradata stored procedure database object is executed on the Teradata Database server space. It is a combination of procedural control statements, SQL statements, and control declarations that provide a procedural interface to the Teradata Database.
    Why Use Teradata Stored Procedures?
    Using stored procedures, you can build large and complex database applications. In addition to a set of SQL control statements and condition handling statements, a Teradata stored procedure can contain the following:
      • Multiple input and output parameters.
      • Local variables and cursors.
      • SQL DDL, DCL, DML, and SELECT statements, including dynamic SQL, with a few exceptions.
    Dynamic SQL is a method of invoking an SQL statement by creating and submitting it at runtime from within a stored procedure.
    Applications based on stored procedures provide the following benefits:
      • They reduce network traffic in the client-server environment because stored procedures reside and execute on the server.
      • They allow encapsulation and enforcement of business rules on the server, contributing to improved application maintenance.
      • They provide better transaction control.
      • They provide better security by granting the user access to the procedures rather than to the data tables.
      • They provide an exception handling mechanism to handle the runtime conditions generated by the application.
      • All the SQL and SQL control statements embedded in a stored procedure are executed by submitting one CALL statement. Nested CALL statements further extend the versatility.
  • 80. Chapter 7: Database Objects, Databases and Users External Stored Procedures
    Elements of a Teradata Stored Procedure
    A Teradata stored procedure contains some or all of the following elements.
    This element… / Includes…
      SQL control statements: nested or non-nested compound statements.
      Control declarations:
        • Condition handlers in DECLARE HANDLER statements for completion and exception conditions. Condition handlers can be of CONTINUE or EXIT type, and can be defined for a specific SQLSTATE code, the generic exception condition SQLEXCEPTION, or the generic completion conditions NOT FOUND and SQLWARNING.
        • Cursor declarations in DECLARE CURSOR statements. Note: Cursors can be either updatable or read-only. They can also be declared in FOR iteration statements.
        • Local variable declarations in DECLARE statements.
      SQL transaction statements: DDL, DCL, DML, and SELECT statements, including dynamic SQL statements, with a few exceptions.
      LOCKING modifiers: with all supported SQL statements except CALL.
      Comments: bracketed and simple comments. Note: Nested bracketed comments are not allowed.
    For more information, see “Teradata Stored Procedures as SQL Applications” on page 86.
    External Stored Procedures
    You can use the C or C++ programming language to write external stored procedures. You invoke the procedure using the SQL CALL statement, just like a Teradata stored procedure. Although external stored procedures cannot contain SQL statements, they can make a call to a C library, and the library can call either another external stored procedure or a Teradata stored procedure.
    Creating External Stored Procedures
    Teradata provides a CREATE EXTERNAL PROCEDURE statement that is similar to the existing CREATE PROCEDURE statement that creates a Teradata stored procedure.
    However, the CREATE EXTERNAL PROCEDURE statement is submitted directly like any other SQL statement, while the compile statement for a Teradata stored procedure is submitted using the COMPILE command in Basic Teradata Query Facility (BTEQ) and BTEQ for Microsoft Windows systems (BTEQWIN).
  • 81. Chapter 7: Database Objects, Databases and Users Macros
    Macros
    The macro database object consists of one or more SQL statements that can be executed by performing a single statement. Each time the macro is performed, one or more rows of data may be returned.
    SQL Statements Related to Macros
    The following table lists the basic SQL statements that you can use with macros.
    Use this statement… / To…
      CREATE MACRO: incorporate a frequently used SQL statement or series of statements into a macro.
      EXECUTE: run a macro. Note: A macro can also contain an EXECUTE statement that executes another macro.
      DROP MACRO: delete a macro.
    Single-User and Multi-User Macros
    You can create a macro for your own use, or grant execution authorization to others. For example, your macro might enable a user in another department to perform operations on the data in the Teradata Database. When executing the macro, the user need not be aware of the database access, the tables affected, or even the results.
    Macro Processing
    Regardless of the number of statements in a macro, the Teradata Database treats it as a single request. When you execute a macro, the system processes either all of the SQL statements or none of them. If a macro fails, the system aborts it, backs out any updates, and returns the database to its original state.
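The all-or-nothing processing described for macros can be imitated with an ordinary transaction. The sketch below uses Python's built-in sqlite3 module as a stand-in; it is an analogy only, not Teradata macro syntax, and the run_macro helper, the account table, and its CHECK constraint are all hypothetical names invented for the example.

```python
import sqlite3


def run_macro(conn, statements):
    """Run all statements as one unit; undo everything if any one fails."""
    try:
        with conn:  # commits on success, rolls back on an exception
            for sql, params in statements:
                conn.execute(sql, params)
        return True
    except sqlite3.Error:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.execute("INSERT INTO account VALUES (2, 0)")
conn.commit()

# The first statement succeeds; the second violates the CHECK constraint.
transfer = [
    ("UPDATE account SET balance = balance + ? WHERE id = 2", (150,)),
    ("UPDATE account SET balance = balance - ? WHERE id = 1", (150,)),
]
succeeded = run_macro(conn, transfer)
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()

print(succeeded)  # False
print(balances)   # [(100,), (0,)]
```

The credit applied by the first statement is backed out when the second statement fails, so the table returns to its original state.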
  • 82. Chapter 7: Database Objects, Databases and Users Triggers
    Triggers
    The trigger defines events that happen when some other event, called a triggering event, occurs. This database object is essentially a stored SQL statement associated with a table called a subject table. Teradata implementation complies with ANSI SQL specifications. Triggers execute when any of the following modifies a specified column or columns in the subject table:
      • DELETE
      • INSERT
      • UPDATE
    Typically, the stored SQL statements perform a DELETE, INSERT, or UPDATE on a table different from the subject table.
    Types of Triggers
    The Teradata Database supports two types of triggers.
    This type of trigger… / Fires for each…
      Statement: statement that modifies the subject table.
      Row: row modified in the subject table.
    When Do Triggers Fire?
    You can specify when triggers fire.
    WHEN you specify… / THEN the triggered action…
      BEFORE: executes before the completion of the triggering event. As specified in the ANSI SQL standard, a BEFORE trigger cannot have data changing statements in the triggered action.
      AFTER: executes after completion of the triggering event.
    Note: To support stored procedures, the CALL statement is supported in the body of an AFTER trigger. Both row and statement triggers can call a stored procedure.
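A row trigger that fires AFTER an UPDATE and writes to a table other than the subject table can be sketched with Python's built-in sqlite3 module. SQLite is used only so the example runs as written; it supports row triggers but not statement triggers, its syntax is not Teradata's, and the table and trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Subject table and a separate log table (hypothetical names).
    CREATE TABLE employee (id INTEGER PRIMARY KEY, salary REAL);
    CREATE TABLE salary_log (emp_id INTEGER, old_salary REAL, new_salary REAL);

    -- Row trigger: fires once for each row the UPDATE modifies.
    CREATE TRIGGER log_salary_change
    AFTER UPDATE OF salary ON employee
    FOR EACH ROW
    BEGIN
        INSERT INTO salary_log VALUES (OLD.id, OLD.salary, NEW.salary);
    END;

    INSERT INTO employee VALUES (1, 1000.0), (2, 2000.0);
""")

# One triggering statement that modifies two rows.
conn.execute("UPDATE employee SET salary = salary * 1.5")
log = conn.execute(
    "SELECT emp_id, old_salary, new_salary FROM salary_log ORDER BY emp_id"
).fetchall()

print(log)  # [(1, 1000.0, 1500.0), (2, 2000.0, 3000.0)]
```

A single UPDATE statement produces two log rows here because the trigger is a row trigger; a statement trigger would fire once for the whole statement.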
  • 83. Chapter 7: Database Objects, Databases and Users User-Defined Functions
    Sometimes a statement fires a trigger, which in turn fires another trigger. Thus the outcome of one triggering event can itself become another trigger. The Teradata Database processes and optimizes the triggered and triggering statements in parallel to maximize system performance.
    ANSI-Specified Order
    When you specify multiple triggers on a subject table, both BEFORE and AFTER triggers execute in the order in which they were created, as determined by the timestamp of each trigger. Triggers are sorted according to the preceding ANSI rule, unless you use the Teradata extension, ORDER. This extension allows you to specify the order in which the triggers execute, regardless of creation timestamp.
    Why Use a Trigger?
    You can use triggers to do various things:
      • Define a trigger on the subject table to ensure that UPDATEs and DELETEs performed on the subject table are propagated to another table.
      • Use triggers for auditing. For example, you can define a trigger that causes INSERTs in a log table when an employee receives a raise higher than 10%.
      • Use a trigger to disallow massive UPDATEs, INSERTs, or DELETEs during business hours.
      • Use a trigger to set a threshold. For example, you can use triggers to set thresholds for inventory of each item by store, to create a purchase order when the inventory drops below a threshold, or to change a price if the daily volume does not meet expectations.
      • Use a trigger to call Teradata stored procedures and external stored procedures.
    User-Defined Functions
    You can create User-Defined Functions (UDFs) to address your particular data needs and to fill the void where system-provided SQL functions are lacking. These special functions can translate into time-saving measures by preprocessing data, or by optimizing query processing. You can use UDFs to map and manipulate non-text data, such as images, in a way that is impossible with SQL constructs.
    You can write, for example, new:
      • Scalar functions similar to the LOG, SQRT, ABS, and TRIM functions
      • Aggregate functions similar to SUM, MAX, MIN, and AVG
    Creating User-Defined Functions
    You create the source code for UDFs using the C or C++ programming language. Then you can simply use the CREATE FUNCTION statement and provide the location of the UDF source code. The Teradata Database will do all of the work, including validating the CREATE FUNCTION statement and compiling the C or C++ source. The source may be on the client system or on the server.
  • 84. Chapter 7: Database Objects, Databases and Users User-Defined Types
    Any compilation errors are reported. If no errors occur, the Teradata Database links the function object into a Dynamically Linked Library (DLL) and distributes it to all nodes in the system. The UDF is usable as soon as CREATE FUNCTION completes.
    Teradata customers can purchase precompiled UDFs from third-party vendors. To protect their intellectual property, vendors may not wish to make their source available. In those instances, they can simply provide a package in the form of a DLL. The DLL code does not have to be written in C or C++, but the code must use C parameter-passing conventions. Teradata customers can use an option in the CREATE FUNCTION statement to provide just the object. The Teradata Database distributes the object automatically to all nodes. Installing just the object is also useful for sites that develop UDFs on a development system and then transfer the object to the production system.
    Table Function
    The table function is a form of UDF whose purpose is to return a table one row at a time with each invocation. The function is treated as a derived table subquery and can only be specified in the FROM clause of a SELECT statement. As with other UDFs, you create the source code using the C programming language. Then you can use the CREATE FUNCTION statement and provide the location of the UDF source code. The UDF code is compiled and linked, and the library is distributed as required to all nodes.
    User-Defined Types
    Teradata Database supports two types of User-Defined Types (UDTs):
      • Distinct
      • Structured
    UDT Type / Description / Example
      Distinct: A UDT that is based on a single predefined data type, such as INTEGER or VARCHAR. Example: a distinct UDT named euro that is based on a DECIMAL(8,2) data type can store monetary data.
Structured A UDT that consists of one or more A structured UDT named circle can attributes that are based on predefined consist of x-coordinate, y-coordinate, and data types or other UDTs. radius attributes. Distinct and structured UDTs can define methods that operate on instances of the UDT. For example, a distinct UDT named euro can define a user-defined method (UDM) that converts the value to a US dollar amount. Similarly, a structured UDT named circle can define a UDM that computes the area of the circle using the radius attribute.66 Introduction to Teradata Warehouse
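The two UDT kinds described above can be sketched in Teradata SQL roughly as follows. This is a minimal illustration, not a definitive definition: the attribute types chosen for circle and the price_list table are assumptions, and the methods mentioned in the text (currency conversion, area) are left out because they also require external C source.

```sql
-- Hypothetical sketch: attribute types and the price_list table are assumptions.

-- Distinct UDT: based on a single predefined data type.
CREATE TYPE euro AS DECIMAL(8,2) FINAL;

-- Structured UDT: one or more attributes based on predefined types.
CREATE TYPE circle AS (
    x      INTEGER,
    y      INTEGER,
    radius INTEGER
) NOT FINAL;

-- A UDT used where a system-supplied type could otherwise be specified.
CREATE TABLE price_list (
    item_id INTEGER,
    price   euro
);
```

Keeping the monetary column typed as euro rather than DECIMAL(8,2) lets the database distinguish it from other decimal values, which is the usual motivation for a distinct type.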
  • 85. Functions That Operate on UDTs

You can use an instance of a UDT in most places where an instance of a system-supplied type can be specified.

User-Defined Methods

A User-Defined Method (UDM) is a special kind of user-defined function (UDF) that is associated with a UDT. The term method and the acronym UDM are interchangeable. Teradata Database supports two types of UDMs:
• Instance
• Constructor

Instance Methods

An instance method operates on a specific instance of a distinct or structured UDT. For example, an instance method named area might calculate and return the area of an instance of a structured UDT named circle that contains attributes x, y, and radius. Instance methods can also provide transform, ordering, and cast functionality for a distinct or structured UDT. The Teradata Database uses this functionality during certain operations involving the UDT.

Constructor Methods

A constructor method initializes an instance of a structured UDT. A structured UDT can have more than one constructor method, each one providing different initialization options.

Databases and Users

While the Teradata Database is a collection of related tables, views, stored procedures, macros, and so on, it also contains databases and users. That is, the Teradata Database contains an allotment of space from which users can create and maintain their own tables, views, stored procedures, macros, or other databases or users.

A database and a user are almost identical in the Teradata Database. The difference is that a user can log on to the system whereas a database cannot.

When the Teradata Database is first installed on a server, only one user exists on the system, that is, User DBC.
  • 86. The database administrator typically manages this user and assigns space from User DBC to all other organizations. User DBC owns all other databases and users in the system.

To protect the security of system tables within the Teradata Database, the database administrator typically creates User System Administrator from User DBC. The usual procedure is to assign all database disk space that system tables do not require to User System Administrator. The database administrator then uses this user as a resource from which to allocate space to other databases and users of the system.

For information on how to create databases and users, see Database Administration.

Creating a Finance and Administration Database

Consider the following example: the database administrator needs to create a Finance and Administration (F&A) department database with User Jones as a supervisory user, that is, as database administrator within the F&A department.

The database administrator first creates F&A Database and then allocates space from it to User Jones to act as the F&A database administrator. The database administrator also allocates space from F&A to Jones for his personal use and for creating a Personnel Database, as well as other databases and other user space allocations.

The following figure illustrates the hierarchy implicit in these relationships.

[Figure 1091A004: hierarchy from top to bottom: User DBC; User System Administrator; F&A Database and other department databases; Personnel Database, User Jones, and other users and databases for the department.]
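The allocation steps in this example could be written along the following lines. This is a sketch under stated assumptions, not the procedure from Database Administration: the names SysAdmin and Finance_Admin (standing in for User System Administrator and F&A Database), the space figures in bytes, and the password are all hypothetical.

```sql
-- Hypothetical names, sizes (in bytes), and password; adjust for a real system.

-- Department database, allocated from the administrator's space.
CREATE DATABASE Finance_Admin FROM SysAdmin
AS PERMANENT = 100000000,
   SPOOL = 200000000;

-- Supervisory user, allocated from the department database.
-- Unlike a database, a user has a password and can log on.
CREATE USER Jones FROM Finance_Admin
AS PERMANENT = 20000000,
   PASSWORD = change_me_1;

-- Personnel database, also allocated from the department database.
CREATE DATABASE Personnel FROM Finance_Admin
AS PERMANENT = 10000000;
```

The FROM clause names the immediate owner, so each statement adds one level to the ownership hierarchy.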
  • 87. F&A Database owns Personnel Database and all the other department databases. F&A Database also owns User Jones and all other users within the department.

Because User DBC ultimately owns all databases and users, it is the final owner of all the databases and user space belonging to the organization. This hierarchical ownership structure provides the owner of a database or user space with complete control over the security of owned data. The owner can archive the database or control access to it by granting or revoking privileges on it.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database books.

If you want to learn more about… | THEN see…
Space Allocation for Databases and Users | Database Administration
Tables, queue tables, views, triggers, macros | Database Design; SQL Reference: Fundamentals; SQL Reference: Data Definition Statements; SQL Reference: Data Manipulation Statements; SQL Reference: Statement and Transaction Processing
Teradata stored procedures and external stored procedures | SQL Reference: Fundamentals; SQL Reference: Stored Procedures and Embedded SQL
User-Defined Functions | SQL Reference: UDF, UDM, and External Stored Procedure Programming
User-Defined Methods | SQL Reference: UDF, UDM, and External Stored Procedure Programming
User-Defined Types | SQL Reference: Data Types and Literals; SQL Reference: Fundamentals
  • 89. CHAPTER 8 Structured Query Language

This chapter describes Structured Query Language (SQL), the ANSI standard language for relational database management. All application programming facilities ultimately make queries against the Teradata Database using SQL because it is the only language the Teradata Database understands. To enhance the capabilities of SQL, Teradata has added extensions that are unique to Teradata. This comprehensive language is referred to as Teradata SQL.

Topics include:
• Why SQL?
• Types of SQL statements
• SQL statement syntax
• Statement punctuation
• The SELECT statement
• SQL data types
• Teradata recursive query
• SQL functions
• Cursors

Why SQL?

SQL has the advantage of being the most commonly used language for relational database management systems. Because of this, both the data structures in the Teradata Database and the commands for manipulating those structures are controlled using SQL.

In addition, all applications, including those written in a client language with embedded SQL, macros, and ad hoc SQL queries, are written and executed using the same set of instructions and syntax. Other database management systems use different languages for data definition and data manipulation and may not permit ad hoc queries of the database. Teradata Database lets you use one language to define, query, and update your data.

Types of SQL Statements

The SQL language allows you, using SQL statements, to define database objects, to define user access to those objects, and to manipulate the data stored.
  • 90. These functions form the principal functional families of SQL statements:
• Data Definition Language (DDL) statements
• Data Control Language (DCL) statements
• Data Manipulation Language (DML) statements

In addition, SQL provides HELP and SHOW statements that provide help about database object definitions, sessions and statistics, and SQL statement syntax, and that display the SQL used to create tables.

The following sections contain information about the functional families of Teradata SQL.

Data Definition Language Statements

You use DDL statements to define the structure and instances of a database. DDL provides statements for the definition and description of database objects. The following table lists some basic DDL statements. The list is not exhaustive.

Statement | Action
CREATE | Defines a new database object, such as a database, user, table, view, trigger, index, macro, stored procedure, user-defined type, user-defined function, user-defined macro, depending on the object of the CREATE statement.
DROP | Removes a database object, such as a database, user, table, view, trigger, index, macro, stored procedure, user-defined type, user-defined function, user-defined macro, depending on the object of the DROP statement.
ALTER | Changes, for example, a table, column, referential constraint, trigger, or index.
ALTER PROCEDURE | Recompiles a stored procedure.
MODIFY | Changes a database or user definition.
RENAME | Changes, for example, the names of tables, triggers, views, stored procedures, and macros.
REPLACE | Replaces, for example, macros, triggers, stored procedures, and views.
SET | Specifies, for example, time zones or the collation or character set for a session.
COLLECT | Collects statistics on, for example, a column, group of columns, or index.
DATABASE | Specifies a default database.
COMMENT | Inserts a text comment for a database object.

Successful execution of a DDL statement automatically creates, updates, or removes entries in the Data Dictionary.
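Several of the DDL statements in the table above can be seen together in a short sequence. The following is only a sketch: the personnel database is assumed to exist already, and the table, column, and comment text are hypothetical.

```sql
-- Hypothetical objects, for illustration only.
DATABASE personnel;                            -- DATABASE: set the default database

CREATE TABLE employee (                        -- CREATE: define a new table
    empno  INTEGER NOT NULL,
    name   VARCHAR(12),
    deptno INTEGER
)
UNIQUE PRIMARY INDEX (empno);

ALTER TABLE employee ADD salary DECIMAL(8,2);          -- ALTER: add a column
COMMENT ON TABLE employee IS 'One row per employee';   -- COMMENT: attach a text comment
COLLECT STATISTICS ON employee COLUMN deptno;          -- COLLECT: gather column statistics
RENAME TABLE employee TO staff;                        -- RENAME: change the table name
SHOW TABLE staff;                                      -- SHOW: display the SQL that creates the table
DROP TABLE staff;                                      -- DROP: remove the table
```

Each successful statement in the sequence also changes the corresponding Data Dictionary entries, so no separate catalog maintenance is needed.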
  • 91. For information about the contents of the Data Dictionary, see Chapter 12: “The Data Dictionary.”

Data Control Language Statements

You use DCL statements to grant and revoke access to database objects and change ownership of those objects from one user or database to another. The results of DCL statement processing also are recorded in the Data Dictionary. The following table lists some basic DCL statements. The list is not exhaustive.

Statement | Action
GRANT/REVOKE | Controls access rights of the users on an object.
GRANT LOGON/REVOKE LOGON | Controls logon rights to a host (client) or host group (if the special security user is enabled).
GIVE | Gives a database object to another database object.

Data Manipulation Language Statements

You use DML statements to manipulate and process database values. You can insert new rows into a table, update one or more values in stored rows, or delete a row. The following table lists some basic DML statements. The list is not exhaustive.

Statement | Action
INSERT | Inserts new rows into a table. For more information about a special case of INSERT, see Atomic Upsert later in this table.
UPDATE | Modifies data in one or more rows of a table. For more information about a special case of UPDATE, see Atomic Upsert later in this table.
Atomic Upsert | The upsert form of the UPDATE DML statement is a Teradata extension of the ANSI SQL standard designed to enhance the performance of the Teradata TPump utility by allowing the statement to support atomic upsert. For more information about how TPump operates, see “Teradata Tools and Utilities” on page 11. This feature allows Teradata TPump and all other CLIv2-, ODBC-, and JDBC-based applications to perform single-row upsert operations using an optimally efficient single-pass strategy. This single-pass upsert is called atomic to emphasize that its component UPDATE and INSERT SQL statements are grouped together and performed as a single, or atomic, SQL statement.
DELETE | Removes a row (or rows) from a table.
  • 92. Statement | Action
MERGE | Combines both UPDATE and INSERT in a single SQL statement. Supports primary index operations only, similar to Atomic Upsert but with fewer constraints.
ABORT, ROLLBACK, COMMIT, BEGIN TRANSACTION, END TRANSACTION | Allow you to manage transactions.
CHECKPOINT | Checkpoints a journal. CHECKPOINT is a function that writes records to a restart log table that you can use to restart in case of a hardware or software system failure.
ECHO | Echoes a string or command to a client.

SQL Statement Syntax

A typical SQL statement consists of the following:
• A statement keyword
• One or more column names
• A database name
• A table name
• One or more optional clauses introduced by keywords

For example, in the following single-statement request, the statement keyword is SELECT:

    SELECT deptno, name, salary
    FROM personnel.employee
    WHERE deptno IN(100, 500)
    ORDER BY deptno, name ;

The select list and FROM clause for this statement are made up of the names:
• Deptno, name, and salary (the column names)
• Personnel (the database name)
• Employee (the table name)

The search condition, or WHERE clause, is introduced by the keyword WHERE:

    WHERE deptno IN(100, 500)

The sort ordering, or ORDER BY clause, is introduced by the keywords ORDER BY:
  • 93.
    ORDER BY deptno, name

Statement Execution

Teradata offers the following ways to invoke an executable statement:
• Interactively from a terminal
• Embedded within an application program
• Dynamically created within an embedded application
• Embedded within a stored procedure
• Dynamically created within a stored procedure
• Via a trigger
• Embedded within a macro

Statement Punctuation

You can use punctuation to separate or identify the parts of an SQL statement.

This syntax element… | Named… | Performs this function in an SQL statement…
. | period | separates database names from table names and table names from a particular column name (for example, personnel.employee.deptno).
, | comma | separates and distinguishes column names in the select list, or column names or parameters in an optional clause.
' | apostrophe | delimits the boundaries of character string constants.
( ) | left and right parentheses | groups expressions or defines the limits of a phrase.
; | semicolon | separates statements in multi-statement requests and terminates requests submitted via certain utilities such as BTEQ.
" | quotation marks | identifies user names that might otherwise conflict with SQL reserved words or that would not be valid names in the absence of the quotation marks.
: | colon | prefixes reference parameters or client system variables.

To include an apostrophe or show possession in a title, double the apostrophes.
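Most of the punctuation marks in the table above can be seen in a few short statements. This is a minimal sketch that reuses the personnel.employee table from the earlier SELECT example; the inserted values and the seating table with its "table" column are hypothetical.

```sql
-- Hypothetical values and objects, for illustration only.

-- Period: qualifies the table with its database name.
-- Comma: separates column names and values.
-- Apostrophes delimit the string constant; the doubled apostrophe stores O'Brien.
INSERT INTO personnel.employee (deptno, name, salary)
VALUES (100, 'O''Brien', 42000.00);

-- Doubled apostrophe inside a title to show possession.
-- Parentheses group the TITLE phrase and the IN list.
SELECT name, salary (TITLE 'Employee''s Salary')
FROM personnel.employee
WHERE deptno IN (100, 500);

-- Quotation marks: a column name that would otherwise conflict with a reserved word.
SELECT "table"
FROM personnel.seating;
```

The semicolon ends each statement here, as it does for requests submitted through BTEQ.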
  • 94. The SELECT Statement

The SELECT statement is probably the most frequently used SQL statement. It specifies the table columns from which to obtain the data you want, the corresponding database (if different from the current default database), and the table or tables that you need to reference within that database.

The SELECT statement further specifies how, in what format, and in what order the system returns the set of result data.

You can use the following options, lists, and clauses with the SELECT statement to request data from the Teradata Database. The list is not exhaustive.
• DISTINCT option
• FROM list
• WHERE clause, including subqueries
• GROUP BY clause
• HAVING clause
• QUALIFY clause
• ORDER BY clause
• CASESPECIFIC option
• International sort orders
• WITH clause
• Query expressions and set operators

Another variation is the SELECT INTO statement, which is used in embedded SQL and stored procedures. This statement selects at most one row from a table and assigns the values in that row to host variables in embedded SQL or to local variables or parameters in Teradata stored procedures.

SELECT Statement and Set Operators

The SELECT statement can use the set operators UNION, INTERSECT, and MINUS/EXCEPT. These set operators allow you to manipulate the answers to two or more queries by combining the results of each query into a single result set.

You can use the set operators within, for example, the following operations:
• View definitions
• Derived tables
• Subqueries

SELECT Statement and Joins

A SELECT statement can reference data in two or more tables and the relational join combines the data from the referenced tables.
  • 95. In this way, the SELECT statement defines a join of specified tables to retrieve data more efficiently than without defining a join of tables.

You can specify both inner joins and outer joins:
• An inner join selects data from two or more tables that meets specific join conditions. Each source must be named, and the join condition, that is, the common relationship among the tables to be joined, can be in an ON clause or a WHERE clause.
• The outer join is an extension of the inner join that includes rows that qualify for a simple inner join, as well as a specified set of rows that do not match the join conditions expressed by the query.

SQL Data Types

You must specify a data type for each column when you use SQL to create a table because the Teradata Database does not provide a default data type. You can include a data type to specify data conversions in expressions.

Data Type Phrase

A data type phrase does the following:
• Determines how data is stored on the Teradata Database.
• Specifies how data is presented to the user.

Teradata and ANSI-Compliant Data Types

The Teradata Database supports two forms of data types:
• ANSI
• Teradata

ANSI data types adhere to the ANSI SQL standard. Teradata data types were written in older non-ANSI-compliant versions of Teradata SQL.

The Teradata Database supports the following SQL data types.

Teradata supports… | Including, for example…
Teradata data types | Byte; Graphic
ANSI-compliant data types | Large Objects (LOBs): Binary Large Objects (BLOBs) and Character Large Objects (CLOBs); Character; DateTime; Interval; Numeric
User-defined Types (UDTs) | Distinct; Structured

Data Type Attributes

You can use Teradata SQL to define, for example, the attributes of a data value. Data type attributes control, among other things:
• Import format (internal representation of stored data).
• Export format (how data is presented for a column or an expression result).

You must define data type attributes when you define a column. You can override the default values of data type attributes. For example, when you create a table, you can use a FORMAT phrase to override the output format of a data type.

The following table summarizes data type attributes.

Data Type Attribute   (ANSI / Teradata Extension to ANSI)
NOT NULL   X
UPPERCASE   X
[NOT] CASESPECIFIC   X
FORMAT quote_string   X
TITLE quote_string   X
NAMED name   X
DEFAULT number   X
DEFAULT USER   X
DEFAULT DATE   X
DEFAULT TIME   X
DEFAULT NULL   X
WITH DEFAULT   X
CHARACTER SET   X
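As a rough illustration of how several of these attributes appear in practice, the sketch below attaches them to columns in a CREATE TABLE statement and then overrides an output format in a query. The table name, column names, titles, and format strings are hypothetical and chosen only for the example.

```sql
-- Hypothetical table; names, titles, and format strings are assumptions.
CREATE TABLE personnel.employee_demo (
    empno  INTEGER NOT NULL,
    name   VARCHAR(12) CHARACTER SET LATIN NOT CASESPECIFIC
           TITLE 'Employee Name',
    deptno INTEGER DEFAULT 100,
    hired  DATE FORMAT 'YYYY-MM-DD',
    salary DECIMAL(8,2) DEFAULT NULL
)
UNIQUE PRIMARY INDEX (empno);

-- A FORMAT phrase in the select list overrides the default output format
-- of the salary column for this result only.
SELECT name, salary (FORMAT 'ZZZ,ZZ9.99')
FROM personnel.employee_demo;
```

Attributes declared on the column become its defaults, while a phrase in a query affects only how that one result is presented.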
  • 97. Teradata Recursive Query

A recursive query is a named query expression that is allowed to reference itself in its own definition. The self-referencing capability gives the user a simple way to search a table using iterative self-join and set operations. The recursive query feature benefits the user by reducing the complexity of the queries and allowing a certain class of queries to execute more efficiently.

Recursive queries are implemented using the WITH RECURSIVE clause in the statement and the RECURSIVE clause in the CREATE VIEW statement.

SQL Functions

SQL is a nonprocedural language. That means you use SQL statements to tell the Teradata Database what you want. You do not include instructions about how to get what you want. In procedural languages, such as C++, BASIC, or COBOL, you write instructions that define how to get what you want. This is a simple, but important, distinction.

Procedural languages contain functions that perform complex operations. The usual SQL statements do not support many functions. However, to reduce the reliance on ancillary application code, SQL does support the following standard functions:
• Scalar
• Aggregate
• Ordered analytical

In addition, you can create scalar, aggregate, and table functions to meet specific needs.

Scalar Functions

A scalar function works on input parameters to create a result. When it is part of an expression, the function is invoked as needed whenever expressions are evaluated for an SQL statement. When a function completes, its result is used by the expression in which the function was referenced.

For example, the following statement returns the current date plus 13 years.

    SELECT ADD_MONTHS (CURRENT_DATE, 12*13);

The following statement returns the date 6 months ago.

    SELECT ADD_MONTHS (CURRENT_DATE, -6);

Aggregate Functions

Sometimes the information you want can only be derived from data in a set of rows, instead of individual rows.
  • 98. Aggregate functions produce results from sets of relational data that you have grouped (optionally) using a GROUP BY or ORDER BY clause. Aggregate functions process each set and produce one result for each set.

The following table lists a few examples of aggregate functions.

The function… | Returns the…
AVG | arithmetic average of the values in a specified column.
COUNT | number of qualified rows.
MAX | maximum column value for the specified column.
MIN | minimum column value for the specified column.
SUM | arithmetic sum of a specified column.

Ordered Analytical Functions

Ordered analytical functions work over a range of data for a particular set of rows in some specific order to produce a result for each row in the set. Like aggregate functions, ordered analytical functions are called for each item in a set. But unlike an aggregate function, an ordered analytical function produces a result for each detail item.

Ordered analytical functions allow you to perform sophisticated data mining on the information in your databases to get the answers to questions that SQL otherwise cannot provide.

The following table lists a few examples of ordered analytical functions.

The following function… | Returns the…
MSUM | sum using the current row and a number of preceding rows that you specify. This is called a moving sum.
RANK | ordered ranking of rows based on the value of the column being ranked.

Cursors

Traditional application development languages cannot deal with results tables without some kind of intermediary mechanism because SQL is a set-oriented language. The intermediary mechanism is the cursor.
  • 99. A cursor is a pointer that the application program uses to move through a results table.

You declare a cursor for a SELECT statement, and then open the named cursor. The act of opening the cursor executes the SQL statement.

You use the FETCH... INTO... statement to individually fetch and write the rows into host variables. The application can then use the host variables to do computations.

Teradata Preprocessor2 uses cursors to mark or tag the first row accessed by an SQL query. Preprocessor2 then increments the cursor as needed.

Stored procedures use cursors to fetch one result row at a time and then execute SQL and SQL control statements as required for each row. Local variables or parameters from the stored procedure can be used for computations.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database books.

If you want to learn more about… | THEN see…
Cursors | SQL Reference: Stored Procedures and Embedded SQL
Teradata SQL, including types of SQL statements, statement syntax, statement punctuation, the SELECT statement, data types, recursive query, functions and operators | SQL Reference: Fundamentals; SQL Reference: Data Definition Statements; SQL Reference: Data Manipulation Statements; SQL Reference: Data Types and Literals; SQL Reference: Functions and Operators
  • 101. CHAPTER 9 SQL Application Development

This chapter describes the tools used to develop applications for the Teradata Database and the interfaces used to establish communications between the applications and the Teradata Database.

Topics include:
• Embedded SQL applications
• Macros as SQL applications
• Teradata stored procedures as SQL applications
• The EXPLAIN statement
• Third-party development

Embedded SQL Applications

When you write applications using embedded SQL, you insert SQL statements into your application program, which must be written in one of the supported programming languages shown in “Supported Languages and Platforms” on page 84.

Because third-generation application development languages do not have facilities for dealing with results sets, embedded SQL contains extensions to executable SQL that permit declarations. Embedded SQL declarations include:
• Code to encapsulate the SQL from the application language
• Cursor definition and manipulation

A cursor is a pointer device that you use to read through a results table one record/row at a time. For more information about cursors, see “Cursors” on page 80.

How Does an Application Program Use Embedded SQL?

The client application languages that support embedded SQL are all compiled languages. SQL is not defined for any of them. For this reason, you must precompile your embedded SQL code to translate the SQL into native code before you can compile the source using a native compiler.

The precompiler tool is called Preprocessor2, and you use it to:
• Read your application source code to look for the defined SQL code fragments.
• Interpret the intent of the code after it isolates all the SQL code in the application and translates it into Call Level Interface (CLI) calls.
  • 102. • Comment out all the SQL source.

The output of the precompiler is native language source code with CLI calls substituted for the SQL source. After the precompiler generates the output, you can process the converted source code with the native language compiler.

For information about the Call Level Interface communications interface, see Chapter 6: “Communication Between the Client and the Teradata Database.”

Supported Languages and Platforms

Preprocessor2 supports the following application development languages on the specified platforms.

Application Development Language | Platform
C | IBM mainframe clients; UNIX clients and some other workstation clients
COBOL | IBM mainframe clients; some workstation clients
PL/I | IBM mainframes

Macros as SQL Applications

Teradata macros are SQL statements that the server stores and executes. Macros provide an easy way to execute frequently used SQL operations. Macros are particularly useful for enforcing data integrity rules, providing data security, and improving performance.

SQL Used to Create a Macro

You use the CREATE MACRO statement to create Teradata macros. For example, suppose you want to define a macro for adding new employees to the Employee table and incrementing the EmpCount field in the Department table. The CREATE MACRO statement looks like this:

    CREATE MACRO NewEmp (name   VARCHAR(12),
                         number INTEGER NOT NULL,
                         dept   INTEGER DEFAULT 100
                        )
    AS (INSERT INTO Employee (Name, EmpNo, DeptNo)
        VALUES (:name, :number, :dept) ;
        UPDATE Department
        SET EmpCount=EmpCount+1
        WHERE DeptNo=:dept ;
       ) ;

This macro defines parameters that users must fill in each time they execute the macro. A leading colon (:) indicates a reference to a parameter within the macro.

Macro Usage

The following example shows how to use the NewEmp macro to insert data into the Employee and Department tables. The information to be inserted is the name, employee number, and department number for employee H. Goldsmith. The EXECUTE macro statement looks like this:

    EXECUTE NewEmp ('Goldsmith H', 10015, 600);

SQL Used to Modify a Macro

The following example shows how to modify a macro. Suppose you want to change the NewEmp macro so that the default department number is 300 instead of 100. The REPLACE MACRO statement looks like this:

    REPLACE MACRO NewEmp (name   VARCHAR(12),
                          number INTEGER NOT NULL,
                          dept   INTEGER DEFAULT 300
                         )
    AS (INSERT INTO Employee (Name, EmpNo, DeptNo)
        VALUES (:name, :number, :dept) ;
        UPDATE Department
        SET EmpCount=EmpCount+1
        WHERE DeptNo=:dept ;
       ) ;

SQL Used to Delete a Macro

The example that follows shows how to delete a macro. Suppose you want to drop the NewEmp macro from the database. The DROP MACRO statement looks like this:

    DROP MACRO NewEmp;
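While the NewEmp macro still exists (that is, before a DROP MACRO such as the one above), the DEFAULT declared for the dept parameter can be exercised by naming the parameters and leaving dept out. The sketch below assumes the macro definition from this section; the employee name and number are hypothetical.

```sql
-- Sketch only; assumes the NewEmp macro defined in this section still exists.
-- Parameters are passed by name, so their order does not matter.
-- dept is omitted, so the DEFAULT from the macro definition applies.
EXECUTE NewEmp (number = 10016, name = 'Rivera M');

-- Display the stored definition of the macro.
SHOW MACRO NewEmp;
```

Passing parameters by name makes a call easier to read when a macro has many parameters, at the cost of tying the caller to the parameter names.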
  • 104. Teradata Stored Procedures as SQL Applications

Teradata stored procedures are database applications created by combining SQL control statements with other SQL elements and condition handlers. They provide a procedural interface to the Teradata Database and many of the same benefits as embedded SQL.

Teradata stored procedures conform to the ANSI SQL standard with some exceptions.

SQL Used to Create Stored Procedures

Teradata SQL supports creating, modifying, dropping, renaming, and controlling access rights of stored procedures through DDL and DCL statements.

You can create or replace a stored procedure through the COMPILE command in Basic Teradata Query (BTEQ) and BTEQ for Microsoft Windows systems (BTEQWIN). You must specify a source file as input for the COMPILE command.

You can also create or modify a stored procedure using the CREATE PROCEDURE or REPLACE PROCEDURE statement from CLIv2, ODBC, and JDBC applications. DDL statements are not, however, supported from PP2, so you cannot create or modify stored procedures from PP2.

Stored Procedure Example

Assume you want to create a stored procedure named NewProc that you can use to add new employees to the Employee table and retrieve the department name of the department to which the employee belongs. You can also report an error, in case the row that you are trying to insert already exists, and handle that error condition.

The following stored procedure definition includes nested, labeled compound statements. The compound statement labeled L3 is nested within the outer compound statement L1. Note that the compound statement labeled L2 is the handler action clause of the condition handler.

This stored procedure defines parameters that must be filled in each time it is called (executed).
    CREATE PROCEDURE NewProc (IN name CHAR(12),
                              IN num INTEGER,
                              IN dept INTEGER,
                              OUT dname CHAR(10),
                              INOUT p1 VARCHAR(30))
    L1: BEGIN
      DECLARE CONTINUE HANDLER FOR SQLSTATE VALUE '23505'
      L2: BEGIN
        SET p1='Duplicate Row';
      END L2;
      L3: BEGIN
        INSERT INTO Employee (EmpName, EmpNo, DeptNo)
        VALUES (name, num, dept);
        SELECT DeptName
        INTO dname FROM Department
        WHERE DeptNo = :dept;
        IF SQLCODE <> 0 THEN LEAVE L3;
        ...
      END L3;
    END L1;

  • 105. SQL Used to Execute a Stored Procedure

After compilation, stored procedures are stored as objects in the Teradata Database. You can execute stored procedures from Teradata client utilities using the SQL CALL statement. Arguments for all input (IN or INOUT) parameters of the stored procedure must be submitted with the CALL statement.

BTEQ and other Teradata client utilities support stored procedure execution and DDL. These include:
• CLIv2
• JDBC
• ODBC
• PP2
• Teradata SQL Assistant
• BTEQWIN (BTEQ for Windows)

DDL Statements with Stored Procedures

You can use the following DDL statements with stored procedures.

Use This Statement… | To…
CREATE PROCEDURE | direct the stored procedure compiler to create a procedure from the SQL statements in the remainder of the statement text.
ALTER PROCEDURE | direct the stored procedure compiler to recompile a stored procedure created in an earlier version of Teradata Database without executing SHOW PROCEDURE and REPLACE PROCEDURE statements.
DROP PROCEDURE | drop a stored procedure.
RENAME PROCEDURE | rename a procedure.
REPLACE PROCEDURE | direct the stored procedure compiler to replace the definition of an existing stored procedure. If the specified stored procedure does not exist, create a new procedure by that name from the SQL statements in the remainder of the source text.
HELP PROCEDURE … ATTRIBUTES | view all the parameters and parameter attributes of a procedure, or the creation time attributes of a procedure.
HELP 'SPL' | display a list of all DDL and control statements associated with stored procedures.
HELP 'SPL command_name' | display help about the command you have named.
SHOW PROCEDURE | view the current definition (source text) of a procedure. The text is returned in the same format as defined by the creator.

  • 106. The EXPLAIN Statement

Teradata SQL supplies a very powerful EXPLAIN statement that allows you to see the execution plan of a query. The EXPLAIN modifier in front of any SQL statement displays the execution plan for that statement, which is parsed and optimized in the usual fashion, but is not submitted for execution.

How Is EXPLAIN Useful?

The EXPLAIN statement not only explains how a statement will be processed, but provides an estimate of the number of rows involved and the performance impact of the request.

When you perform an EXPLAIN against any SQL statement, that statement is parsed and optimized. The access and join plans generated by the Optimizer are returned in the form of a text file that explains the (possibly parallel) steps used in the execution of the statement. Also included is the relative time required to complete the statement given the statistics with which the Optimizer had to work. If the statistics are not reasonably accurate, the time estimate may not be accurate.

EXPLAIN helps you to evaluate complex queries and to develop alternative, more efficient, processing strategies.

You may be able to get a better plan by collecting more statistics on more columns, or by defining additional secondary indexes. Your knowledge of the actual demographics information may allow you to identify row count estimates that seem badly wrong, and help to pinpoint areas where additional statistics would be helpful.

EXPLAIN With Simple Join Index Example

The EXPLAIN example results from joining tables with the following table definitions.
    CREATE TABLE customer
      (c_custkey INTEGER,
       c_name CHAR(26),
       c_address VARCHAR(41),
       c_nationkey INTEGER,
       c_phone CHAR(16),
       c_acctbal DECIMAL(13,2),
       c_mktsegment CHAR(21),
       c_comment VARCHAR(127))
    UNIQUE PRIMARY INDEX( c_custkey );

    CREATE TABLE orders
      (o_orderkey INTEGER NOT NULL,
       o_custkey INTEGER,
       o_orderstatus CHAR(1),
       o_totalprice DECIMAL(13,2) NOT NULL,
       o_orderdate DATE FORMAT 'yyyy-mm-dd' NOT NULL,
       o_orderpriority CHAR(21),
       o_clerk CHAR(16),
       o_shippriority INTEGER,
       o_comment VARCHAR(79))
    UNIQUE PRIMARY INDEX(o_orderkey);

  • 107. Chapter 9: SQL Application Development

    CREATE TABLE lineitem
      (l_orderkey INTEGER NOT NULL,
       l_partkey INTEGER NOT NULL,
       l_suppkey INTEGER,
       l_linenumber INTEGER,
       l_quantity INTEGER NOT NULL,
       l_extendedprice DECIMAL(13,2) NOT NULL,
       l_discount DECIMAL(13,2),
       l_tax DECIMAL(13,2),
       l_returnflag CHAR(1),
       l_linestatus CHAR(1),
       l_shipdate DATE FORMAT 'yyyy-mm-dd',
       l_commitdate DATE FORMAT 'yyyy-mm-dd',
       l_receiptdate DATE FORMAT 'yyyy-mm-dd',
       l_shipinstruct VARCHAR(25),
       l_shipmode VARCHAR(10),
       l_comment VARCHAR(44))
    PRIMARY INDEX( l_orderkey );

The following statement defines a join index on these tables.

    CREATE JOIN INDEX order_join_line AS
    SELECT ( l_orderkey, o_orderdate, o_custkey, o_totalprice ),
           ( l_partkey, l_quantity, l_extendedprice, l_shipdate )
    FROM lineitem
    LEFT JOIN orders ON l_orderkey = o_orderkey
    ORDER BY o_orderdate
    PRIMARY INDEX (l_orderkey);

The following EXPLAIN shows that the Optimizer used the newly created join index, order_join_line, even though there is no reference to the index in the SQL text.

    EXPLAIN
    SELECT o_orderdate, o_custkey, l_partkey, l_quantity, l_extendedprice
    FROM lineitem, orders
    WHERE l_orderkey = o_orderkey;

    Explanation
    --------------------------------------------------------------
    1) First, we lock a distinct LOUISB."pseudo table" for read on a
       Row Hash to prevent global deadlock for LOUISB.order_join_line.
    2) Next, we lock LOUISB.order_join_line for read.
    3) We do an all-AMPs RETRIEVE step from join index table
       LOUISB.order_join_line by way of an all-rows scan with a
       condition of ("NOT (LOUISB.order_join_line.o_orderdate IS
       NULL)") into Spool 1, which is built locally on the AMPs. The
       input table will not be cached in memory, but it is eligible
       for synchronized scanning. The result spool file will not be
       cached in memory. The size of Spool 1 is estimated to be
       1,000,000 rows. The estimated time for this step is 4 minutes
       and 27 seconds.
    4) Finally, we send out an END TRANSACTION step to all AMPs
       involved in processing the request.
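The EXPLAIN discussion notes that collecting more statistics may lead to a better plan. A minimal sketch of that workflow follows, assuming the lineitem and orders tables defined for this example. Which columns are worth collecting on, and whether the plan or its estimates change at all, depends on the data and the system, so treat this as an illustration rather than a recommendation.

```sql
-- Sketch only: collect statistics on the join columns of the example
-- tables so the Optimizer has current demographics to work with.
COLLECT STATISTICS ON lineitem COLUMN l_orderkey;
COLLECT STATISTICS ON orders COLUMN o_orderkey;

-- Request the plan again and compare the row and time estimates
-- with the earlier EXPLAIN output. The statement itself is not executed.
EXPLAIN
SELECT o_orderdate, o_custkey, l_partkey, l_quantity, l_extendedprice
FROM lineitem, orders
WHERE l_orderkey = o_orderkey;
```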
  • 108. Chapter 9: SQL Application Development

For information about the types of indexes that Teradata supports, see Chapter 10: “Data Distribution and Data Access Methods.”

Third-Party Development

The Teradata Database supports many third-party software products. The two general components of supported products include those of the transparency series and the native interface products.

TS/API Products

The Transparency Series/Application Program Interface (TS/API) product provides a gateway between the IBM mainframe relational database products DB2 (MVS/TSO) and SQL/DS (VM/CMS) and the Teradata Database. TS/API permits an SQL statement formulated for either DB2 or SQL/DS to be translated into Teradata SQL to allow DB2 or SQL/DS applications to access data stored in a Teradata Database.

Compatible Third-Party Software Products

Many third-party, interactive query products operate in conjunction with the Teradata Database, permitting queries formulated in a native query language to access a Teradata Database. The list of supported third-party products changes frequently. For a current list, contact your NCR sales office.

Performance Monitor/Application Programming Interface

The Performance Monitor/Application Programming Interface (PM/API) provides a way for third-party performance monitoring programs to access Performance Monitor and Production Control (PM and PC) functions resident within Teradata Database. PM and PC data is available using a specialized PM/API subset of the Call-Level Interface Version 2 (CLIv2).

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.
  • 109. Chapter 9: SQL Application Development

IF you want to learn more about… | THEN see…
Embedded SQL | SQL Reference: Stored Procedures and Embedded SQL; Teradata Preprocessor2 for Embedded SQL Programmer Guide
PM/API | PM/API Reference
Teradata Director Program | Teradata Director Program Reference
Teradata SQL, including macros as SQL applications, the EXPLAIN statement | SQL Reference: Fundamentals; SQL Reference: Data Definition Statements; SQL Reference: Data Manipulation Statements; SQL Reference: Statement and Transaction Processing
Teradata stored procedures as SQL applications | SQL Reference: Fundamentals; SQL Reference: Stored Procedures and Embedded SQL
TS/API products | Teradata Transparency Series/Application Programming Interface User Guide
  • 111. CHAPTER 10 Data Distribution and Data Access Methods

This chapter describes how the Teradata Database handles data distribution, including the normalization of data and referential integrity, and data access methods.

Topics include:
• Teradata database indexes
• Primary indexes
• Partitioned primary indexes
• Secondary indexes
• Join indexes
• Hash indexes
• Index specification
• Hashing
• Identity column
• Normalization
• Referential integrity

Teradata Database Indexes

An index is a physical mechanism used to store and access the rows of a table. Indexes on tables in a relational database function much like indexes in books: they speed up information retrieval.

In general, the Teradata Database uses indexes to:
• Distribute data rows.
• Locate data rows.
• Improve performance. Indexed access is usually more efficient than searching all rows of a table.
• Ensure uniqueness of the index values. Only one row of a table can have a particular value in the column or columns defined as a unique index.

The Teradata Database supports the following types of indexes:
• Primary
• Secondary
• Join
• Hash
• Special indexes for referential integrity

These indexes are discussed in the following sections.

  • 112. Chapter 10: Data Distribution and Data Access Methods

Primary Indexes

The Teradata Database requires one Primary Index (PI) for each table in the database, except for some data dictionary tables and global temporary tables.
• If unique, the PI is a column, or columns, that has no duplicate values.
• If non-unique, the PI is a column, or columns, that may have non-unique, or duplicate, values.

Primary Indexes and Data Distribution

Unique Primary Indexes (UPIs) guarantee uniform distribution of table rows. Non-Unique Primary Indexes (NUPIs) can cause skewed data. Although a NUPI does not guarantee uniform row distribution, the degree of uniqueness of the index determines the degree of uniformity of the distribution. Because all rows with the same PI value end up on the same AMP, columns with a small number of distinct values that are repeated frequently do not make good PI candidates.

The most efficient access method to data in a table is through the PI. For this reason, choosing a PI should take the following design goal into consideration: choosing a PI that gives good distribution of data across the AMPs must be balanced against choosing a PI that reflects the most common usage pattern of the table.

Primary Key

A Primary Key (PK), a term that comes from data modeling, defines a column, or columns, that uniquely identify a row in a table. Because it is used for identification, a PK cannot be null. There must be something in that column, or columns, that uniquely identifies the row. Moreover, PK values cannot be changed. Historical information, as well as relationships with other tables, may be lost if a PK is changed or re-used.

A PK is a logical relational database concept. It may or may not be the best column, or columns, to choose as a table’s PI.

Foreign Key

Foreign Keys (FKs) identify table relationships. They model the relationship between data values across tables. Relational databases, like the Teradata Database, permit data values to associate across more than one table. Thus each FK that a table has must exist somewhere as a PK. That is, there must be referential integrity between FKs and PKs.

  • 113. Chapter 10: Data Distribution and Data Access Methods

How Are Primary Indexes and Primary Keys Related?

The following table describes some of the relationships between PKs and PIs.

Primary Key | Primary Index
Identifies a row uniquely. | Distributes rows.
Does not imply access path. | Defines most common access path.
Must be unique. | May be unique or non-unique.
May not be null. | May be null.
Causes a Unique Primary Index (UPI) or Unique Secondary Index (USI) to be created. | N/A
Constraint used to ensure referential integrity. | Physical access mechanism.
Required by the Teradata Database only if referential integrity checks are to be performed. | Required by Teradata Database for most tables.
IF the Teradata Database performs referential integrity checks, THEN the column limit is 64. IF it performs no referential integrity checks, THEN there is no arbitrary limit. | 64-column limit.
Values should not be changed if you want to maintain data integrity and preserve historical relations among tables. | Values can be changed.

The columns chosen for the UPI of a table are frequently the same columns identified as the PK during the data modeling process, but no hard-and-fast rule makes this so. In fact, physical database design considerations often lead to a choice of columns other than those of the primary key for the PI of a table.

Partitioned Primary Indexes

Both Unique Primary Indexes (UPIs) and Non-Unique Primary Indexes (NUPIs) can be partitioned, though a non-partitioned PI is the default Teradata Database PI.
  • 114. Chapter 10: Data Distribution and Data Access Methods

A Partitioned Primary Index (PPI), like a non-partitioned PI, provides an access path, through the PI values, to the rows in base tables, global temporary tables, volatile tables, and non-compressed join indexes.

When a table or join index is created with a PPI, the rows are hashed to the appropriate AMPs based on the PI columns and assigned to an appropriate partition based on the value of a partitioning expression that you define when you create or alter the table. Once assigned to a partition, the rows are stored in row hash order.

How Do Partitioned and Non-Partitioned Primary Indexes Compare?

PPIs are designed to optimize range queries while also providing efficient PI join strategies. A range query requests data that falls within specified boundaries.

The following table provides a comparison of PPI and PI index capabilities.

Capabilities | Partitioned | Non-Partitioned
Hash partitioned, that is, distributed to the AMPs by the hash of the PI columns. | Yes | Yes
Partitioned on each AMP on some set of columns. | Yes | No
Ordered by hash of the PI columns on each AMP. | Yes (within each partition) | Yes

Secondary Indexes

Secondary Indexes (SIs) allow access to information in a table by alternate, less frequently used paths and improve performance by avoiding full table scans. Although SIs add to table overhead, in terms of disk space and maintenance, you can drop and recreate SIs as needed.

SIs:
• Do not affect the distribution of rows across AMPs.
• Can be unique or non-unique.
• Are used by the Optimizer when the indexes can improve query performance.

Secondary Index Subtables

The system builds subtables for all SIs. The subtable contains the index rows that associate the SI value with one or more rows in the base table. When column values change, the system updates the rows in the subtable. When you drop the SI, the system removes the subtable and the disk space it used becomes available.
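Because SIs can be dropped and recreated as needed, a short sketch of that cycle may help. It assumes the Employee table (EmpName, EmpNo, DeptNo) from the stored procedure example in Chapter 9; the index name dept_idx is hypothetical and chosen only for illustration.

```sql
-- Create an unnamed non-unique secondary index (NUSI) on EmpName.
-- The system builds a subtable for the index.
CREATE INDEX (EmpName) ON Employee;

-- Create a named NUSI on DeptNo (dept_idx is a made-up name).
CREATE INDEX dept_idx (DeptNo) ON Employee;

-- Drop the indexes when they are no longer useful. The system
-- removes each subtable and the disk space becomes available.
DROP INDEX (EmpName) ON Employee;
DROP INDEX dept_idx ON Employee;
```

Running EXPLAIN on a representative query before and after creating an index is one way to see whether the Optimizer actually chooses it.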
  • 115. Chapter 10: Data Distribution and Data Access Methods

How Do Primary and Secondary Indexes Compare?

The following table provides a brief comparison of PI and SI features.

Feature | Primary | Secondary
Is required | Yes | No
Can be unique or non-unique | Both | Both
Affects row distribution | Yes | No
Create and drop dynamically | No | Yes
Improves access | Yes | Yes
Create using multiple data types | Yes | Yes
Requires separate physical structure | No | Yes, a subtable
Requires extra processing overhead | No | Yes

Join Indexes

A Join Index (JI) is an indexing structure containing columns from one or more base tables. Some queries can be satisfied by examining only the JI when all referenced columns are stored in the index. Such queries are said to be covered by the JI. Other queries may use the JI to qualify a few rows, then refer to the base tables to obtain requested columns that are not stored in the JI. Such queries are said to be partially covered by the index.

Because the Teradata Database supports multi-table, partially-covering JIs, all types of JIs, except the aggregate JI, can be joined to their base tables to retrieve columns that are referenced by a query but are not stored in the JI. Aggregate JIs can be defined for commonly used aggregation queries.

Much like SIs, JIs impose additional processing on insert and delete operations, and on update operations that change the value of columns stored in the JI. The performance trade-off considerations are similar to those for SIs.

Single-Table Join Indexes

Join indexes are similar to base tables in that they support a primary index, which can be used for direct access to one or a few rows. A single-table JI is an index structure that contains rows from only a single table. Teradata customers have found this type of structure very useful because it provides an alternative approach (primary index) to directly accessing data.
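As a rough illustration of a single-table JI that offers an alternative primary index, the sketch below assumes the orders table from the EXPLAIN example in Chapter 9, whose own PI is o_orderkey. The index name orders_by_cust and the choice of columns are assumptions made for this example.

```sql
-- The base table is distributed by o_orderkey. This single-table
-- join index stores a subset of its columns, distributed by
-- o_custkey instead, giving a second direct-access path.
CREATE JOIN INDEX orders_by_cust AS
SELECT o_custkey, o_orderkey, o_orderdate, o_totalprice
FROM orders
PRIMARY INDEX (o_custkey);

-- A query that supplies o_custkey and references only the stored
-- columns is a candidate to be covered by the join index. The
-- Optimizer decides whether to use it; EXPLAIN shows the choice.
EXPLAIN
SELECT o_orderkey, o_orderdate, o_totalprice
FROM orders
WHERE o_custkey = 1001;
```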
  • 116. Chapter 10: Data Distribution and Data Access Methods

Multi-Table Join Indexes

When queries frequently request a particular join, it may be beneficial to predefine the join with a multi-table JI. The Optimizer can use the predefined join instead of performing the same join repetitively.

Aggregate Join Indexes

When query performance is of utmost importance, aggregate JIs offer an extremely efficient, cost-effective method of resolving queries that frequently specify the same aggregate operations on the same column or columns. When aggregate JIs are available, the system does not have to repeat aggregate calculations for every query.

You can define an aggregate JI on two or more tables, or on a single table. A single-table aggregate JI includes a summary table with:
• A subset of columns from a base table.
• Additional columns for the aggregate summaries of the base table columns.

You can create an aggregate JI using:
• SUM function
  A SUM aggregate JI contains a hidden column containing the row count, so that AVERAGE can be calculated from the JI.
• COUNT function
• GROUP BY clause

Sparse Join Indexes

Another capability of the JI allows you to index a portion of the table using the WHERE clause in the CREATE JOIN INDEX statement to limit the rows indexed. You can limit the rows that are included in the JI to a subset of the rows in the table based on an SQL query result. Any JI, whether simple or aggregate, multi-table or single-table, can be sparse.

For example, the following DDL creates J1, which is an aggregate JI containing only the sales records from 2005:

    CREATE JOIN INDEX J1 AS
    SELECT storeid, deptid, SUM(sales_dollars)
    FROM sales
    WHERE EXTRACT(year FROM sales_date) = 2005
    GROUP BY storeid, deptid;

When you enter a query, the Optimizer determines whether accessing J1 gives the correct answer and is more efficient than accessing the base tables. This sparse JI would be selected by the Optimizer only for queries that restricted themselves to data from the year 2005.
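To make the restriction concrete, the two queries below sketch the difference, assuming the sales table and the J1 definition from the example. Whether the Optimizer actually picks J1 for the first query depends on its cost estimates, so the comments describe eligibility only.

```sql
-- Restricted to 2005: J1 holds every row this query needs, so the
-- Optimizer may consider answering it from the sparse join index.
SELECT storeid, deptid, SUM(sales_dollars)
FROM sales
WHERE EXTRACT(YEAR FROM sales_date) = 2005
GROUP BY storeid, deptid;

-- No year restriction: J1 lacks the rows from other years, so it
-- cannot give the correct answer and the base table must be used.
SELECT storeid, deptid, SUM(sales_dollars)
FROM sales
GROUP BY storeid, deptid;
```

Prefixing either query with EXPLAIN shows which access path the Optimizer chose.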
  • 117. Chapter 10: Data Distribution and Data Access Methods

Hash Indexes

The hash index provides an index structure that can be hash-distributed to AMPs in various ways. The index has characteristics similar to a single-table JI with a row identifier that provides transparent access to the base table. A hash index may be simpler to create than a corresponding JI.

The hash index has been designed to improve query performance in a manner similar to a single-table JI. In particular, you can specify a hash index to:
• Cover columns in a query so that the base table does not need to be accessed.
• Serve as an alternate access method to the base table in a join or retrieval operation.

Index Specification

All tables, with a few exceptions, require a PI. If you do not specify a column or set of columns as the PI for the table, then CREATE TABLE specifies a PI by default.

Creating Indexes

The following table provides general information about creating indexes.

To specify a… | Use the following statement… | And the following clause…
Unique Primary Index (UPI) | CREATE TABLE | UNIQUE PRIMARY INDEX
Non-Unique Primary Index (NUPI) | CREATE TABLE | PRIMARY INDEX
Unique Secondary Index (USI) | CREATE TABLE | UNIQUE INDEX
Non-Unique Secondary Index (NUSI) | CREATE TABLE | INDEX
Non-Unique Secondary Index (NUSI) | CREATE INDEX | N/A
Join Index (JI) | CREATE JOIN INDEX | N/A
Hash Index | CREATE HASH INDEX | N/A

Note: A JI can provide an index across multiple tables.

Indexes are also created when the PRIMARY KEY and UNIQUE constraints are specified.

Strengths and Weaknesses of Various Types of Indexes

The Teradata Database does not require or allow users to explicitly dictate how indexes should be used for a particular query. The Teradata Database Optimizer costs all of the reasonable alternatives and selects the least expensive.
  • 118. Chapter 10: Data Distribution and Data Access Methods

The object of any query plan is to return accurate results as quickly as possible. Therefore, the Optimizer uses an index or indexes only if the index speeds up query processing. In some cases, the Optimizer processes the query without using any index.

Selection of indexes:
• Can have a direct impact on overall Teradata performance.
• Is not always a straightforward process.
• Is based partly on usage expectations.

The following summary assumes execution of a simple SELECT statement and explains the strengths and weaknesses of some of the various indexing methods.

Unique Primary Index (UPI)
Strengths:
• is the most efficient access method when the SQL statement contains the PI value
• involves one AMP and one row
• requires no spool file (for a simple SELECT)
• can obtain the most granular locks
Weaknesses:
• none, provided that the column or columns making up the index are well chosen.

Non-unique Primary Index (NUPI)
Strengths:
• provides efficient access when the SQL statement contains the PI value
• involves one AMP
• can obtain granular locks
• may not require a spool file as long as the number of rows returned is small
Weaknesses:
• may slow down INSERTs for a SET table with no USIs.
• may decrease the efficiency of SELECTs containing the PI value when some values are repeated in many rows.

Unique Secondary Index (USI)
Strengths:
• provides efficient access when the SQL statement contains the USI values, and you do not specify PI values
• involves two AMPs and one row
• requires no spool file (for a simple SELECT)
Weaknesses:
• requires additional overhead for INSERTs, UPDATEs, MERGEs, and DELETEs.

Non-unique Secondary Index (NUSI)
Strengths:
• provides efficient access when the number of rows per value in the table is relatively small
• involves all AMPs and probably multiple rows
• provides access using information that may be more readily available than a UPI value, such as employee last name, compared to an employee number
• may require a spool file
Weaknesses:
• requires additional overhead for INSERTs, UPDATEs, MERGEs, and DELETEs.
• will not be used by the Optimizer if the number of data blocks accessed is a significant percentage of the data blocks in the table because the Optimizer will determine that a full table scan is cheaper.
  • 119. Chapter 10: Data Distribution and Data Access Methods

Full table scan
Strengths:
• accesses each row only once
• provides access using any arbitrary set of column conditions
Weaknesses:
• examines every row.
• usually requires a spool file possibly as large as the base table.

Multi-table join index (JI)
Strengths:
• can eliminate the need to perform certain joins and aggregates repetitively
• may be able to satisfy a query without referencing the base tables
• can have a different PI from that of the base table
• can replace an NUSI or a USI
Weaknesses:
• requires additional overhead for INSERTs, UPDATEs, MERGEs, and DELETEs for any of the base tables that contribute to the multi-table JI.
• usually is not suitable for data in tables subjected to a large number of daily INSERTs, UPDATEs, MERGEs, and DELETEs.
• imposes some restrictions on operations performed on the base table.

Single-table join index (JI) or hash index
Strengths:
• can isolate frequently used columns (or their aggregates for JIs only) from those that are seldom used
• can reduce number of physical I/Os when only commonly used columns are referenced
• can have a different PI from that of the base table
Weaknesses:
• requires additional overhead for INSERTs, UPDATEs, MERGEs, and DELETEs.
• imposes some restrictions on operations performed on the base table.

Sparse join index (JI)
Strengths:
• can be stored in less space than an ordinary JI
• reduces the additional overhead associated with INSERTs, UPDATEs, MERGEs, and DELETEs to the base table when compared with an ordinary JI
• can exclude common values that occur in many rows to help ensure that the Optimizer chooses to use the JI to access less common values
Weaknesses:
• requires additional overhead for INSERTs, UPDATEs, MERGEs, and DELETEs to the base table.
• imposes some restrictions on operations performed on the base table.

Hashing

The Teradata Database uses hashing to distribute data to disk storage and uses indexes to access the data. Because the architecture of the Teradata Database is massively parallel, it requires an efficient means of distributing and retrieving its data. That efficient method is hashing. Virtually all Teradata indexes are based on (or partially based on) row hash values rather than table column values.

For PIs, the Teradata Database obtains a row hash by hashing the values of the PI columns. The row hash and a sequence number, which is assigned to distinguish between rows with the same row hash within a table, are collectively called a row identifier and uniquely identify each row in a table. A partition identifier is also part of the row identifier in the case of PPI tables. For more information on PPI, see “Partitioned Primary Indexes” on page 95.

  • 120. Chapter 10: Data Distribution and Data Access Methods

For SIs, the Teradata Database implements the index as a row identifier based on the:
• Hash of the values of the SI columns.
• Actual values of the SI columns.
• List of row identifiers for rows with that value (or those values).

Identity Column

Identity Column is a column attribute option defined in the ANSI standard. When associated with a column, this attribute causes the system to generate a unique, table-level number for every row that is inserted into the table.

Identity columns have many applications, including the automatic generation of UPIs, USIs, and primary keys. For example, an identity column can serve as a UPI to ensure even data distribution when you import data from a system that does not have a PI.

For more information about indexes, see “Teradata Database Indexes” on page 93.

Normalization

Normalization is the process of reducing a complex data structure into a simple, stable one. Generally this process involves removing redundant attributes, keys, and relationships from the conceptual data model.

Normal Forms

Normalization theory is constructed around the concept of normal forms that define a system of constraints. If a relation meets the constraints of a particular normal form, we say that relation is “in normal form.”

By definition, a relational database is always normalized to some degree, because the column values are always atomic. That is, a column can contain one and only one value. Null is, of course, a value. But to simply leave it at that invites a number of problems, including redundancy and potential update anomalies. The higher normal forms were developed to correct those problems.

First, Second, and Third Normal Forms

First, second, and third normal forms are stepping stones to the Boyce-Codd normal form and, when appropriate, the higher normal forms.
  • 121. Chapter 10: Data Distribution and Data Access Methods

First Normal Form

First normal form (1NF) is definitive of a relational database. If we are to consider a database relational, then all relations in the database must be in 1NF.

We say a relation is in 1NF if all fields within that relation are atomic. We sometimes refer to this concept as the elimination of repeating groups from a relation. Furthermore, first normal form allows no hierarchies of data values.

Second Normal Form

Second normal form (2NF) deals with the elimination of circular dependencies from a relation. We say a relation is in 2NF if it is in 1NF and if every non-key attribute is fully dependent on the entire Primary Key.

A non-key attribute is any attribute that is not part of the Primary Key for the relation.

Third Normal Form

Third normal form (3NF) deals with the elimination of non-key attributes that do not describe the Primary Key.

For a relation to be in 3NF, the relationship between any two non-Primary Key columns, or groups of columns, in a relation must not be one-to-one in either direction.

We say attributes are mutually independent if none of them is functionally dependent on any combination of the others. This mutual independence ensures that we can update individual attributes without any danger of affecting any other attribute in a row.

The following list of benefits summarizes the advantages of implementing a normalized logical model in 3NF.
• Greater number of relations
• More PI choices
• Optimal distribution of data
• Fewer full table scans
• More joins possible

Referential Integrity

Traditional referential integrity is the concept of relationships between tables, based on the definition of a primary key and a foreign key. The concept states that a row cannot exist in a table with a non-null value for a referencing column if an equal value does not exist in a referenced column.

Using referential integrity, you can specify columns within a referencing table that are foreign keys for columns in some other referenced table. You must define referenced columns as either primary key columns or unique columns.

Referential integrity is a reliable mechanism that prevents accidental database inconsistencies when you perform INSERTs, UPDATEs, and DELETEs.
  • 122. Chapter 10: Data Distribution and Data Access Methods

Referential Integrity Terminology

We use the following terms to explain the referential integrity concept.

Term | Definition
Parent Table | The table referred to by a Child table. Also called the “referenced table.”
Child Table | A table in which the referential constraints are defined. Also called the “referencing table.”
Parent Key | A primary or secondary key in the parent table.
Primary Key | With respect to referential integrity, a primary key is a parent table column set that is referred to by a foreign key column set in a child table.
Foreign Key | With respect to referential integrity, a foreign key is a child table column set that refers to a primary key column set in a parent table.

Referencing (Child) Table

We call the referencing table the Child table, and we call the specified Child table columns the referencing columns. Referencing columns should be of the same number and have the same data type as the referenced table key.

Referenced (Parent) Table

A Child table must have a parent table, and the referenced table is referred to as the Parent table. The parent key columns are the referenced columns.
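The parent and child roles can be sketched in DDL as follows. The table names part and order_part, their columns, and the sample values are all hypothetical and chosen only to illustrate the terminology; the exact error returned for a rejected statement is not shown because it depends on the system.

```sql
-- Parent (referenced) table: part_number is the parent key.
CREATE TABLE part
  (part_number INTEGER NOT NULL PRIMARY KEY,
   part_name   VARCHAR(30));

-- Child (referencing) table: part_number is the referencing column.
-- It matches the parent key in number of columns and data type.
CREATE TABLE order_part
  (order_number INTEGER NOT NULL,
   part_number  INTEGER NOT NULL,
   quantity     INTEGER,
   PRIMARY KEY (order_number, part_number),
   FOREIGN KEY (part_number) REFERENCES part (part_number));

INSERT INTO part VALUES (1, 'bolt');
INSERT INTO order_part VALUES (1, 1, 110);

-- Expected to be rejected: no row in part has part_number 2.
INSERT INTO order_part VALUES (1, 2, 275);

-- Expected to be rejected: a child row still references part 1.
DELETE FROM part WHERE part_number = 1;
```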
  • 123. Chapter 10: Data Distribution and Data Access Methods

Why Is Referential Integrity Important?

Referential integrity is important, because it keeps you from introducing errors into your database. Suppose you have an Order Parts table like the following.

Order Number | Part Number | Quantity
PK | PK | Not Null
FK | FK |
1 | 1 | 110
1 | 2 | 275
2 | 1 | 152

Part number and order number, each foreign keys in this relation, also form the composite primary key.

Suppose you were to delete the row defined by the primary key value 1 in the PART NUMBER table. The foreign key for the first and third rows in the ORDER PART table would now be inconsistent, because there would be no row in the PART NUMBER table with a primary key of 1 to support it. Such a situation shows a loss of referential integrity.

Teradata provides referential integrity to prevent this from happening. If you try to delete a row from the PART NUMBER table for which you have specified referential integrity, the database management system will not allow you to remove the row.

Besides data integrity and data consistency, referential integrity provides these benefits.

Benefit | Description
Increases development productivity | You do not need to code SQL statements to enforce referential integrity constraints, because the Teradata Database automatically enforces referential integrity.
Requires fewer written programs | All update activities are programmed to ensure that referential integrity constraints are not violated, because the Teradata Database enforces referential integrity in all environments. Additional programs are not required.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database books.
  • 124. Chapter 10: Data Distribution and Data Access Methods
    If you want to learn more about…: THEN see…
    • Identity column:
        • SQL Reference: Data Definition Statements
        • SQL Reference: Data Manipulation Statements
    • Indexes and hashing:
        • Database Design
        • SQL Reference: Data Definition Statements
        • SQL Reference: Statement and Transaction Processing
    • Normalization, the relational model of database management, and referential integrity: Database Design
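The delete that this chapter's Order Parts example warns about can be tried out in a small sketch. This is only an illustration, not Teradata code: it assumes Python's standard `sqlite3` module as a stand-in database, the table and column names (`part`, `order_part`) are made up for the example, and only the Part Number foreign key is modeled.

```python
import sqlite3

# SQLite stands in for the database; names below are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Parent (referenced) table.
conn.execute("CREATE TABLE part (part_number INTEGER PRIMARY KEY)")

# Child (referencing) table: part_number refers to the parent key.
conn.execute(
    """
    CREATE TABLE order_part (
        order_number INTEGER NOT NULL,
        part_number  INTEGER NOT NULL REFERENCES part (part_number),
        quantity     INTEGER NOT NULL,
        PRIMARY KEY (order_number, part_number)
    )
    """
)
conn.executemany("INSERT INTO part VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO order_part VALUES (?, ?, ?)",
    [(1, 1, 110), (1, 2, 275), (2, 1, 152)],
)
conn.commit()


def try_delete_part(part_number):
    """Return True if the parent row was deleted, False if the constraint blocked it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("DELETE FROM part WHERE part_number = ?", (part_number,))
        return True
    except sqlite3.IntegrityError:
        return False


deleted = try_delete_part(1)
remaining_parts = [
    row[0] for row in conn.execute("SELECT part_number FROM part ORDER BY part_number")
]
```

Because rows 1 and 3 of the child table still refer to part number 1, the delete is rejected and both parent rows remain.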
  • 125. CHAPTER 11 Concurrency Control and Transaction Recovery
    This chapter describes concurrency control in relational database management systems and how to use transaction journaling (permanent journaling) to recover lost data or restore an inconsistent database to a consistent state.
    Topics include:
    • What is concurrency control?
    • Transactions
    • ANSI mode transactions
    • Teradata mode transactions
    • Locks
    • Host utility locks
    • Recovery and transactions
    • System and media recovery
    • Two-Phase Commit protocol

    What is Concurrency Control?
    Concurrency control involves preventing concurrently running processes from improperly inserting, deleting, or updating the same data. A system maintains concurrency control through two mechanisms:
    • Transactions
    • Locks

    Transactions
    Transactions are a mandatory facility for maintaining the integrity of a database while running multiple, concurrent operations.

    Definition of a Transaction
    A transaction is a logical unit of work and the unit of recovery. The statements nested within a transaction must either all happen or not happen at all. Transactions are atomic. A partial transaction cannot exist.
  • 126. Chapter 11: Concurrency Control and Transaction Recovery
    Definition of Serializability
    A set of transactions is serializable if the set produces the same result as some arbitrary serial execution of those same transactions for arbitrary input. A set of transactions is correct only if it is serializable.
    Using a Two-Phase Locking (2PL) protocol is one way to serialize transactions. The two phases are the growing phase and the shrinking phase.

    In the…: A transaction must…
    • growing phase: first acquire a lock on an object before operating on it.
    • shrinking phase: never acquire any more locks after it has released a lock. Lock release is an all-or-none operation.

    For information on the two-phase commit (2PC) protocol, which is a separate mechanism, see “Two-Phase Commit Protocol” later in this chapter.

    Transaction Semantics
    The Teradata Database supports both ANSI transaction semantics and Teradata transaction semantics. A system parameter specifies the default transaction mode for a site. However, you can override the default for a session.
    The Teradata Database returns a failure when a transaction operating in Teradata semantics mode issues a COMMIT statement. The Teradata Database supports the ANSI COMMIT statement in ANSI transaction mode.

    ANSI Mode Transactions
    All ANSI transactions are implicitly opened. Either of the following events opens an ANSI transaction:
    • Execution of the first SQL statement in a session.
    • Execution of the first statement following the close of a previous transaction.
    Transactions close when the application performs a COMMIT, ROLLBACK, or ABORT statement.
    When the transaction contains a DDL statement (including DATABASE and SET SESSION, which are considered DDL statements in this context), that statement must be the last statement in the transaction other than the transaction closing statement.
    A session executing under ANSI transaction semantics does not allow the BEGIN TRANSACTION statement, the END TRANSACTION statement, or the two-phase commit protocol. When an application submits these statements in ANSI mode, the database software generates an error.
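The growing and shrinking phases described under Definition of Serializability can be illustrated with a toy model. This is a minimal sketch, assuming a hypothetical `TwoPhaseTransaction` class invented for this example; it only checks the two rules from the table (lock before operating, no new locks after release) and does not model conflicts between transactions or any Teradata internals.

```python
class TwoPhaseTransaction:
    """Toy transaction that enforces the two-phase locking rules."""

    def __init__(self, name):
        self.name = name
        self.held = set()       # resources currently locked
        self.shrinking = False  # becomes True once locks have been released

    def lock(self, resource):
        # Growing phase only: no new locks after any release.
        if self.shrinking:
            raise RuntimeError(f"{self.name}: cannot acquire a lock after releasing")
        self.held.add(resource)

    def operate(self, resource):
        # A lock must be held before the object is touched.
        if resource not in self.held:
            raise RuntimeError(f"{self.name}: must lock {resource!r} before operating on it")
        return f"{self.name} operated on {resource}"

    def release_all(self):
        # Lock release is all-or-none: every held lock is dropped together.
        self.held.clear()
        self.shrinking = True


t1 = TwoPhaseTransaction("T1")
t1.lock("Employee")
t1.lock("Department")
result = t1.operate("Employee")
t1.release_all()

try:
    t1.lock("Orders")  # violates the shrinking-phase rule
    late_lock_allowed = True
except RuntimeError:
    late_lock_allowed = False
```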
  • 127. Chapter 11: Concurrency Control and Transaction Recovery
    In ANSI mode, the system rolls back the entire transaction if the current request:
    • Results in a deadlock.
    • Performs a DDL statement that aborts.
    • Executes an explicit ROLLBACK or ABORT statement.
    Teradata Database accepts the ABORT and ROLLBACK statements in ANSI mode, including conditional forms of those statements.
    If the system detects an error for either a single or multistatement request, it only rolls back that request, and the transaction remains open, except in special circumstances. Application-initiated, asynchronous aborts also cause full transaction rollback in ANSI mode.

    Teradata Mode Transactions
    Teradata mode transactions can be either implicit or explicit. An explicit, or user-generated, transaction is a single set of BEGIN TRANSACTION/END TRANSACTION statements surrounding one or more requests. All other requests are implicit transactions.
    Consider the following transaction:

        BEGIN TRANSACTION;
        DELETE FROM Employee WHERE Name = 'Smith T';
        UPDATE Department SET EmpCount=EmpCount-1 WHERE DeptNo=500;
        END TRANSACTION;

    If an error occurs during the processing of either the DELETE or UPDATE statement within the BEGIN TRANSACTION and END TRANSACTION statements, the system restores both the Employee and Department tables to the states they were in before the transaction began.
    If an error occurs during a Teradata transaction, then the system rolls back the entire transaction.

    Locks
    A lock is a means of claiming usage rights to some resource. The Teradata Database can lock several different types of resources in several different ways.

    Overview of Teradata Database Locking
    Most locks used on Teradata resources are obtained automatically. Users can override some locks by making certain lock specifications, but the Teradata Database only allows overrides when it can assure data integrity.
  • 128. Chapter 11: Concurrency Control and Transaction Recovery
    The data integrity requirement of a request decides the type of lock that the system uses.
    A request for a locked resource by another user is queued (in the case of a conflicting lock level) until the process using the resource releases its lock on that resource.

    Why Do Database Management Systems Require Locking?
    The lost update anomaly best explains why database management systems, in which multiple processes are accessing the same database, require locks. The following figure provides an example of this anomaly.

    Execution of transaction T1        Database     Execution of transaction T2
                                       $500.00
    READ Balance ($500.00)             $500.00      READ Balance ($500.00)
    Add $1,000.00 (gives $1,500.00)    $500.00      Add $2,000.00 (gives $2,500.00)
    WRITE result to database           $1,500.00
                                       $2,500.00    WRITE result to database

    This example shows a nonserialized set of transactions. Each transaction reads the same starting balance, so whichever result is written last overwrites the other. If locking had been in effect, the database would not have been able to add $3,000.00 to $500.00 and get two different and wrong results; the correct final balance is $3,500.00.
    The example demonstrates the most common problem encountered in a transaction processing system without locks. Although several other problems arise when locking is not in effect, the lost update problem sufficiently illustrates the need for locking.

    Lock Levels
    The Teradata lock manager implicitly locks the following objects.

    Object Locked: Description
    • Database: Locks rows of all tables in the database.
  • 129. Chapter 11: Concurrency Control and Transaction Recovery
    Object Locked: Description (continued)
    • Table: Locks all rows in the table and any index and fallback subtables.
    • Row hash: Locks the primary copy of a row and all rows that share the same hash code within the same table.

    Levels of Lock Types
    Users can apply four different types of locking on Teradata Database resources. The following table explains these types.

    Lock Type: Description
    • Exclusive: The requester has exclusive rights to the locked resource. No other process can read from, write to, or access the locked resource in any way.
    • Write: The requester has exclusive rights to the locked resource except for readers not concerned with data consistency.
    • Read: Several users can hold Read locks on a resource, during which the system permits no modification of that resource. Read locks ensure consistency during read operations such as those that occur during a SELECT statement.
    • Access: The requester is willing to accept minor inconsistencies of the data while accessing the database (an approximation is good enough). An access lock permits modifications on the underlying data while the SELECT operation is in progress.

    This same information is illustrated in the following table.

                     Lock Type Held
    Lock Request     None       Access     Read       Write      Exclusive
    Access           Granted    Granted    Granted    Granted    Queued
    Read             Granted    Granted    Granted    Queued     Queued
    Write            Granted    Granted    Queued     Queued     Queued
    Exclusive        Granted    Queued     Queued     Queued     Queued

    Automatic Database Lock Levels
    The Teradata Database applies most of its locks automatically. The following table illustrates how the Teradata Database applies different locks for various types of SQL statements.
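The Granted/Queued table under Levels of Lock Types can be expressed as a small lookup. This is a sketch only: the `COMPATIBLE` mapping and `request_lock` function are names invented for this example, and the code encodes the table above rather than any actual lock manager behavior.

```python
# For each requested lock type, the held lock types under which the
# request is granted (taken from the Granted/Queued table).
COMPATIBLE = {
    "access": {"access", "read", "write"},
    "read": {"access", "read"},
    "write": {"access"},
    "exclusive": set(),
}


def request_lock(requested, held=None):
    """Return 'Granted' or 'Queued' for a request against one held lock."""
    if held is None:  # nothing is holding the resource
        return "Granted"
    return "Granted" if held in COMPATIBLE[requested] else "Queued"


# A reader that tolerates inconsistency gets through a Write lock,
# while an ordinary Read request waits behind it.
access_vs_write = request_lock("access", held="write")
read_vs_write = request_lock("read", held="write")
```

Every request is granted when no lock is held, and every request is queued behind an Exclusive lock.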
  • 130. Chapter 11: Concurrency Control and Transaction Recovery
                                 Locking Level by Access Type
    Type of SQL Statement        UPI/NUPI/USI      NUSI/Full Table Scan    Locking Mode
    SELECT                       Row Hash          Table                   Read
    UPDATE                       Row Hash          Table                   Write
    DELETE                       Row Hash          Table                   Write
    INSERT                       Row Hash          Not applicable          Write
    CREATE DATABASE,
    DROP DATABASE,
    MODIFY DATABASE              Not applicable    Database                Exclusive
    CREATE TABLE,
    DROP TABLE,
    ALTER TABLE                  Not applicable    Table                   Exclusive

    Deadlocks and Deadlock Resolution
    A deadlock occurs when transaction 1 places a lock on resource A, and then needs to lock resource B. But resource B has already been locked by transaction 2, which in turn needs to place a lock on resource A. This state of affairs is called a deadlock or a deadly embrace. To resolve a deadlock, Teradata Database aborts one of the transactions and performs a rollback.
    If you used BTEQ to submit the transaction, the database reports the deadlock abort to BTEQ. BTEQ resubmits only the request that caused the error (the default behavior), not the complete transaction. Because this can result in partially committed transactions, you must take care when writing a BTEQ script to ensure that the transaction is one request.
    For example, a statement in BTEQ ends with a semicolon (;) as the last non-blank character in the line. Thus, BTEQ sees the following example as two requests:

        sel * from x;
        sel * from y;

    However, if you write these same statements in the following way, BTEQ sees them as only one request:

        sel * from x
        ; sel * from y;

    Host Utility Locks
    The locking operation that the client-resident Teradata Archive/Recovery utility uses is different from the locking operation that the Teradata Database performs.
    The Teradata Database documentation and utilities frequently refer to archive locks as HUT (Host Utility) locks.
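The BTEQ rule given under Deadlocks and Deadlock Resolution (a request ends at a line whose last non-blank character is a semicolon) can be sketched as a line grouper. This is a simplification for illustration, not BTEQ's actual parser: `split_requests` is a hypothetical helper, and the single-request script assumes the semicolon is placed at the start of the following line.

```python
def split_requests(script):
    """Group script lines into requests.

    Rule sketched here: a request ends at a line whose last
    non-blank character is a semicolon.
    """
    requests, current = [], []
    for line in script.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        current.append(line.strip())
        if line.rstrip().endswith(";"):
            requests.append(" ".join(current))
            current = []
    if current:  # trailing text with no closing semicolon
        requests.append(" ".join(current))
    return requests


# Each line ends with a semicolon: two separate requests.
two = split_requests("sel * from x;\nsel * from y;")

# The first line does not end with a semicolon, so both statements
# travel together as one multistatement request.
one = split_requests("sel * from x\n; sel * from y;")
```

Under this rule the first script yields two requests and the second yields one, which is why the second form is the safer way to keep a transaction within a single request.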
  • 131. Chapter 11: Concurrency Control and Transaction Recovery
    HUT Lock Types
    Teradata Database places HUT locks as follows.

    Lock Type: Object Locked
    • Read: Any object being archived.
    • Group Read: Rows of a table being archived if and only if the table is defined for an after-image permanent journal and if you select the appropriate option on the ARCHIVE command.
    • Write: Permanent journal table being restored.
    • Write: All tables in a ROLLFORWARD or ROLLBACKWARD during recovery operations.
    • Write: Journal table being deleted.
    • Exclusive: Any object being restored.

    HUT Lock Characteristics
    HUT locks have the following characteristics:
    • Associated with the currently logged-on user who entered the statement rather than with a job or transaction.
    • Placed only on objects on the AMPs that are participating in a utility operation.
    • Placed at the cluster level during a CLUSTER dump.
    • Never conflict with a utility lock at another level that was placed on the same object for the same user.
    • Remain active until they are released either by the RELEASE LOCK option of the utility command or by the execution of a Teradata SQL RELEASE LOCK statement after a utility operation completes.
    • Automatically reinstated following a Teradata Database restart if they had not been released.

    Recovery for Transactions
    Recovery is a process by which an inconsistent database is brought back to a consistent state. Transactions play the critical role in this process because they are used to “play back” (using the term in its most general sense) a series of updates to the database, either taking it back to some earlier state or bringing it forward to a current state.
  • 132. Chapter 11: Concurrency Control and Transaction Recovery
    System and Media Recovery
    This section describes the conditions under which the Teradata Database performs the following:
    • An unscheduled restart
    • A transaction recovery
    • Down AMP recovery

    System Restarts
    Unscheduled restarts occur for one of the following reasons:
    • AMP or disk failure
    • Software failure
    • Parity error
    Failures and errors affect all software recovery in the same way. Hardware failures take the affected component offline, and it remains offline until repaired or replaced.

    Transaction Recovery
    Two types of automatic transaction recovery can occur:
    • Single transaction recovery
    • Database recovery
    The following table details what happens when the two automatic recovery mechanisms take place:

    This recovery type…: Happens when the Teradata Database…
    • single transaction: aborts a single transaction because of:
        • Transaction deadlock
        • User error
        • User-initiated abort command
        • An inconsistent data table
      Single transaction recovery uses the transient journal to effect its data restoration.
    • database: performs a restart for one of the following reasons:
        • Hardware failure
        • Software failure
        • User command
  • 133. Chapter 11: Concurrency Control and Transaction Recovery
    Down AMP Recovery
    When an AMP fails to come online during system recovery, the Teradata Database continues to process requests using fallback data. When the down AMP comes back online, down AMP recovery procedures begin to bring the data for the AMP up-to-date as follows.

    IF there are…: THEN the AMP recovers…
    • a large number of rows to be processed: offline.
    • only a few rows to be processed: online.

    After all updates are made, we consider the AMP to be fully recovered.

    Two-Phase Commit Protocol
    Two-phase commit (2PC) is a protocol for assuring update consistency across distributed databases in which each participant in the transaction commit operation votes to either commit or abort the changes. The participants wait before committing the change until they know that all participants can commit.
    By voting to commit, the participant guarantees that it can either commit or roll back its part of the transaction, even if it crashes before receiving the result of the vote.
    The 2PC protocol allows the development of Customer Information Control System (CICS) and Information Management System (IMS) applications that can update one or more Teradata Database databases and/or databases under some other DBMS in a synchronized manner. The result is that the updates requested in a defined unit of work either all succeed or all fail.

    Definition of Participant
    A participant is a database manager that performs some work on behalf of the transaction, and that commits or aborts changes to the database. A participant can also be a coordinator of participants at a lower level. In such cases, the coordinator/participant relays a vote request to its participants, and sends its vote to the coordinator only after determining the outcome of its participants.
    Any number of participants can engage in a two-phase commit operation.
    A participant is defined as being in doubt from the time it votes to commit or abort until the time it receives a commit or abort instruction from the coordinator, which is the controlling database manager with respect to the distributed transaction. A transaction is in doubt if any of the participants are in doubt.

    Definition of Coordinator
    The coordinator is never in doubt. Selection of the coordinator is arbitrary. However, with respect to the Teradata Database, it is always either IMS or CICS. There can be only one coordinator per transaction at any given time.
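The voting flow described in the Two-Phase Commit Protocol section can be sketched with a toy coordinator. This is an illustration under simplifying assumptions: the `Participant` class and `two_phase_commit` function are invented for the example, there is no networking, crash handling, or nesting of coordinators, and nothing here reflects the CICS or IMS interfaces.

```python
class Participant:
    """Toy database manager taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def vote(self):
        # From the vote until the coordinator's instruction arrives,
        # the participant is "in doubt".
        self.state = "in doubt"
        return "commit" if self.can_commit else "abort"

    def finish(self, decision):
        self.state = "committed" if decision == "commit" else "aborted"


def two_phase_commit(participants):
    """Coordinator logic: commit only if every participant votes to commit."""
    # Phase 1: collect one vote from each participant.
    votes = [p.vote() for p in participants]
    decision = "commit" if all(v == "commit" for v in votes) else "abort"
    # Phase 2: send the same instruction to every participant.
    for p in participants:
        p.finish(decision)
    return decision


all_ready = [Participant("teradata"), Participant("other_dbms")]
outcome_ok = two_phase_commit(all_ready)

one_fails = [Participant("teradata"), Participant("other_dbms", can_commit=False)]
outcome_fail = two_phase_commit(one_fails)
```

A single abort vote is enough to abort the whole unit of work, so the participants always end in the same state.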
  • 134. Chapter 11: Concurrency Control and Transaction Recovery
    For More Information
    For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.

    IF you want to learn more about…: THEN see…
    • Host utility locks:
        • Utilities
        • Database Administration
    • Locks: Database Administration
    • Transaction processing, including information on ANSI and Teradata modes: SQL Reference: Statement and Transaction Processing
    • Two-phase commit (2PC):
        • Teradata Director Program Reference
        • IBM CICS Interface for Teradata Reference
        • IBM IMS/DC Interface for Teradata Reference
  • 135. CHAPTER 12 The Data Dictionary
    This chapter provides information about the Data Dictionary.
    Topics include:
    • What is the Data Dictionary?
    • Data Dictionary views
    • Who Uses Data Dictionary views?
    • SQL access to the Data Dictionary

    What is the Data Dictionary?
    The Data Dictionary is a set of system tables that contain data about databases and properties of those databases in addition to a great deal of administrative information about the Teradata Database.
    Data Dictionary tables and views reside in the system database called DBC. These tables and views are reserved for use by the system and contain information, called metadata, about the data associated with the Teradata Database.

    Data Dictionary Content
    Data Dictionary system tables include current definitions, control information, and general information about the following:
    • Databases
    • End users
    • Roles
    • Profiles
    • Accounts
    • Tables
    • Views
    • Columns
    • Indexes
    • Constraints
    • Sessions and session attributes
    • Triggers
    • Access rights
  • 136. Chapter 12: The Data Dictionary
    • Journal tables
    • Disk space
    • Events
    • Resource usage
    • Macros
    • Stored procedures
    • Logs
    • Rules
    • Translations
    • Character sets
    • Statistics
    • User-defined functions
    • External stored procedures
    • Authorization
    • User-defined types
    • User-defined methods

    What is in a Data Dictionary Table?
    The following table contains information about what is stored in the Data Dictionary when you create some of the most important objects.

    WHEN you create…: THEN the definition of the object is stored along with the following details…
    a table:
    • Table name
    • Database name, creator name, and user names of all owners in the hierarchy
    • Each column in the table, including column name, data type, length, and phrases
    • User/creator access privileges on the table
    • Indexes defined for the table
    • Constraints defined for the table
    • Table backup and protection, including fallback status and permanent journals
    • Date and time the object was created
  • 137. Chapter 12: The Data Dictionary
    WHEN you create…: THEN the definition of the object is stored along with the following details…
    a database:
    • Database name, creator name, owner name, and account name
    • Space allocation including:
        • Permanent
        • Spool
        • Temporary
    • Number of fallback tables
    • Collation type
    • Creation time stamp
    • The date and time the database was last altered and the name that altered it
    • Role and profile names
    • A unique identifier for the name of the UDF library
  • 138. Chapter 12: The Data Dictionary
    WHEN you create…: THEN the definition of the object is stored along with the following details…
    a user:
    • User name, creator name, and owner name
    • Password string and password change date
    • Space allocation including:
        • Permanent
        • Spool
        • Temporary
    • Default account, database, collation, character type, and date form
    • Creation time stamp
    • Name and time stamp of the last alteration made to the user
    • Role and profile name

    WHEN you create a…: THEN the following details are entered in the Data Dictionary…
    view or macro:
    • The text of the view or macro
    • Creation time attributes
    • User and creator access privileges
    stored procedure:
    • Creation time attributes
    • Parameters including parameter name, parameter type, data type, and default format
    • User and creator access privileges
    trigger:
    • The IDs of the:
        • Table
        • Trigger
        • Database and subject table database
        • User who created the trigger
        • User who last updated the trigger
    • Time stamp for the last update
    • Indexes
    • Trigger name and:
        • Whether the trigger is enabled
        • The event that fires the trigger
        • The order in which triggers fire
    • Default character set
    • Creation text and time stamp
    • Overflow text, that is, trigger text that exceeds a specified limit
    • Fallback tables
  • 139. Chapter 12: The Data Dictionary
    WHEN you create a…: THEN the following details are entered in the Data Dictionary…
    User-defined function:
    • Function name, database name, specific name
    • Number, data type, and style of parameters
    • Function ID, function type, and external name
    • Source file language
    • Character type
    • External file reference
    • Platform
    User-defined method:
    • Method name, database name, specific name
    • Number, data type, and style of parameters
    • Function ID, function type, and external name
    • Source file language
    • Character type
    • External file reference
    • Platform
    User-defined type:
    • Type name, database name, specific name
    • Number, data type, and style of parameters
    • Function ID, function type, and external name
    • Source file language
    • Character type
    • External file reference
    • Platform

    Data Dictionary Views
    You can examine the information in the system tables in database DBC directly or through a series of views. Typically, you use views to obtain information on the objects in the Data Dictionary rather than querying the actual tables, which can be very large.
    The database administrator controls who has access to views.

    Who Uses Data Dictionary Views?
    Some Data Dictionary views may be restricted to special types of users, while others are accessible by all users. The database administrator controls access to views by granting access rights.
    The following table defines the information needs of various types of users.
  • 140. Chapter 12: The Data Dictionary
    This type of user…: Needs to know…
    End:
    • Objects to which the user has access
    • Types of access available to the user
    • Access rights the user has granted to other users
    Supervisory:
    • How to create and organize databases
    • How to monitor space usage
    • How to define new users
    • How to allocate access privileges
    • How to create indexes
    • How to perform archiving operations
    Database administrator:
    • Performance
    • Status and statistics
    • Errors
    • Accounting
    Security administrator:
    • Access logging rules generated by the execution of BEGIN LOGGING statements
    • Results of access checking events, logged as specified by the access logging rules
    Operations control:
    • Archive and recovery activities

    SQL Access to the Data Dictionary
    Every time you log on to the Teradata Database, perform an SQL query, or type a password, you are using the Data Dictionary.
    For security and data integrity reasons, the only SQL DML command you can use on the Data Dictionary is the SELECT statement. You cannot use the INSERT, UPDATE, MERGE, or DELETE SQL statements to alter the Data Dictionary, except for some Data Dictionary tables, such as the AccessLog table or the EventLog table.
    You can use SELECT to examine any view in the Data Dictionary to which your database administrator has granted you access. For example, if you need to access information in the Personnel database, then you can query the DBC.Databases view as shown:

        SELECT Databasename, Creatorname, Ownername, Permspace
        FROM DBC.Databases
        WHERE Databasename='Personnel' ;

    The query above produces a report like this:
  • 141. Chapter 12: The Data Dictionary
    Databasename    Creatorname    Ownername    Permspace
    Personnel       Jones          Jones        1,000,000

    For More Information
    For more information on the topics presented in this chapter, see the following Teradata Database book.

    IF you want to learn more about…: THEN see…
    • Data Dictionary: Data Dictionary
  • 143. CHAPTER 13 International Language Support
    This chapter describes the capabilities of Teradata international language support.
    Topics include:
    • Character set overview
    • External and internal character sets
    • Teradata Database character data storage
    • Language support modes
    • Standard language support mode
    • Japanese language support mode
    • Extended support

    Character Set Overview
    A character set (sometimes called a code page) is simply a way of representing characters on a computer. There are many ways to represent characters on a computer, so there are many character sets in use today. Because different characters are needed for different languages, character sets are often designed to support a particular language. Even for the same language, many different character sets may exist.
    When computers or computer applications exchange character data, it is important that they either use the same character set or properly convert the data from one character set to the other during the transfer process. Otherwise, the data received by one machine may no longer have the same meaning as it had before the transfer (the same issue exists for numeric data, but there are fewer ways used to represent numbers, so it is not as big a problem).
    A character set has a repertoire of characters it supports, a representation for character strings, and an implied collation based on this representation.

    What Is a Repertoire?
    Consider English for example. To write English, you need the alphabetic characters, A–Z, the digits, 0–9, and various punctuation characters. Many applications also commonly require the characters a–z, the lower case counterparts of A–Z.
    If an application is written in French, you need the alphabetic characters that are required for English, plus accented characters, for example, the é.
    However, some applications may need accented characters for English as well. The word résumé, borrowed from French, is often displayed in its accented form in English text. Similarly, ö may be used in English text to spell coördinate.
  • 144. Chapter 13: International Language Support
    You can see that a repertoire comprises the characters that we need to write a language, and clearly, what we include in our repertoire determines what we can write, and how we must write it.
    A character string from one character set can only be translated correctly to another character set if every character in the source string exists in the repertoire of the target character set.

    Character Representation
    Representing strings of characters is essentially a two-step process:
    • Creating a mapping between each character required and an integer.
    • Devising an encoding scheme for placing a sequence of numbers into memory.
    The simplest systems map the required characters to small integers between 0 and 255, and encode sequences of characters as sequences of bytes with the appropriate numeric values.
    Representing characters for repertoires that require more than 256 characters, such as Japanese, Chinese, and Korean, requires more complex schemes.

    External and Internal Character Sets
    Client systems communicate with the Teradata Database using their own external format for numbers and character strings. The Teradata Database converts numbers and strings to its own internal format when importing the data, and converts numbers and strings back to the appropriate form for the client when exporting the data. This approach allows data to be exchanged between mutually incompatible client data formats.
    Take, for example, channel-attached clients using EBCDIC-based character sets and network-attached clients using ASCII-based character sets. Both clients can access and modify the same data in the Teradata Database.

    Character Data Translation
    The Teradata Database translates the characters:
    • Received from a client system into a form suitable for storage and processing on the server.
    • Returned to a client into a form suitable for storage, display, printing, and processing on that client.
    Thus, the server communicates with each client in the language of that client without the need for additional software. For this process to work properly, it is essential that the database be informed of the correct character set used by each client.
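The translation between external and internal character sets described above can be imitated with Python's built-in codecs. This is only an analogy, not Teradata's translation tables: the sketch assumes `cp037` as a stand-in for an EBCDIC-based client character set and `latin-1` for an ASCII-based one, and the helper names are invented for the example.

```python
def to_internal(client_bytes, client_charset):
    """Convert bytes received from a client into the internal string form."""
    return client_bytes.decode(client_charset)


def to_client(text, client_charset):
    """Convert an internal string into the byte form a client expects."""
    return text.encode(client_charset)


def in_repertoire(text, charset):
    """Check whether every character of text exists in the target character set."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# An EBCDIC-based client stores the word; an ASCII-based client reads it back.
ebcdic_bytes = "résumé".encode("cp037")
stored = to_internal(ebcdic_bytes, "cp037")
latin_bytes = to_client(stored, "latin-1")

# The two clients see different bytes for the same stored characters.
same_bytes = ebcdic_bytes == latin_bytes

# Translation only works if the target repertoire contains every character.
accent_fits_ascii = in_repertoire("résumé", "ascii")
kanji_fits_latin = in_repertoire("日本語", "latin-1")
```

The byte sequences differ between the two clients while the stored text is identical, and the repertoire checks fail where the target character set lacks a character.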
  • 145. Chapter 13: International Language Support
    What Teradata Database Supports
    The Teradata Database supports many external client character sets and allows each application to choose the internal server character set best suited to each column of character data in the Teradata Database. Because of automatic translation, it is normally the repertoire of characters required that determines the server character set to use.
    No matter which server character set you choose, communication with the client is always in the client character set (also known as the session charset).

    Teradata Database Character Data Storage
    The Teradata Database uses internal server character sets to represent user data and data in the Data Dictionary within the system.

    Internal Server Character Sets
    Server character sets include:
    • LATIN
    • UNICODE
    • KANJI1
    • KANJISJIS
    • GRAPHIC

    User Data
    User data refers to character data that you store in a character data type column on the Teradata Database.

    System Dictionary Data
    The term system dictionary data refers to the names of the following objects as they are stored in the Data Dictionary on the Database:
    • Tables
    • Databases
    • Users
    • Columns
    • Views
    • Macros
    • Triggers
    • Join indexes
    • Hash indexes
    • Stored procedures
  • 146. Chapter 13: International Language Support
    • User-defined functions
    • User-defined types
    • User-defined methods
    There are also many other character fields in the Data Dictionary, and many of these are Unicode.

    Language Support Modes
    During system initialization (sysinit) the database administrator can optimize the Teradata Database for one of two language support modes:
    • Standard
    • Japanese
    The language support mode determines the:
    • Character set that Teradata Database uses to store system dictionary data.
    • Default character set for user data.

    IF you enable this          THEN Teradata Database stores system    AND sets the user data
    language support mode…      dictionary data using this character    default character set to…
                                set…
    Standard                    LATIN                                   LATIN.
    Japanese                    KANJI1                                  UNICODE.

    Default Character Set for User Data
    The language support mode sets the default server character set for a user if the DEFAULT CHARACTER SET clause does not appear in the CREATE USER statement.
    To override the default character set for a user, you can use the DEFAULT CHARACTER SET clause in a CREATE USER statement. You can also choose the server character set specifically for each column as part of the SQL.

    Character Set for System Dictionary Data
    The character set that Teradata Database uses to store system dictionary data cannot be changed after you enable the language support mode during the sysinit process without again going through the sysinit process.
  • 147. Chapter 13: International Language Support
    The language support mode also determines which characters the names of objects stored in the Data Dictionary can contain:
    • IF you optimize the Teradata Database for Standard, THEN object names can contain only western European characters. Characters outside the ASCII range (all accented characters, for example) cannot appear in a regular identifier. They can only occur in a delimited identifier (one that is enclosed in double quotes).
    • IF you optimize the Teradata Database for Japanese, THEN object names can contain Japanese characters, but only if you use the Teradata-supplied Japanese client character sets. Japanese characters are stored using the KANJI1 server character set. KANJI1 data cannot necessarily be shared between clients with differing client character sets. If you use other multibyte client character sets, such as UTF8, Korean, or Chinese, only characters in the ASCII range can appear in an object name. Accented characters cannot be used.

    Character Set for Dictionary Data Other Than Object Names
    The items described as System Dictionary Data above are a subset known as “object names”. Object names are only part of the character data that Teradata Database stores in the Data Dictionary. For many character fields, the Teradata Database always uses the UNICODE server character set to store character data in the Data Dictionary, no matter which language support mode you enable.

    Standard Language Support Mode
    If you choose the standard language support mode, then Teradata Database stores system dictionary data and user data using the LATIN character set.

    LATIN Character Set
    Standard language support provides Teradata Database internal coding for the entire set of printable characters from the ISO 8859-1 (Latin1) and ISO 8859-15 (Latin9) standards, including diacritical marks such as ä, ñ, Ÿ, Œ, and œ, though the Z with caron in Latin9 is not supported. ASCII control characters are also supported for the standard language set.
    Note: The ASCII referred to in this chapter is based on Standard ASCII (X’00’ to X’7F’) with Teradata extensions to cover ISO 8859-1 (Latin1) and ISO 8859-15 (Latin9). ASCII, as used here, represents the characters that can be stored as the LATIN server character set, referred to as Teradata LATIN. The EBCDIC referred to in this chapter is the Teradata extended ASCII mapped to the corresponding EBCDIC code points.
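The identifier rule for standard language support mode (characters outside the ASCII range may only appear in a delimited identifier) can be illustrated with a small helper. This is a minimal sketch, not Teradata code: the function names are hypothetical, "ASCII range" is taken to mean the Standard ASCII range X'00' to X'7F' mentioned in the note, and doubling an embedded double quote follows the usual SQL convention, which is an assumption here. Other restrictions on regular identifiers (spaces, reserved words, and so on) are not checked.

```python
def needs_delimiting(name: str) -> bool:
    """True if any character lies outside the 7-bit ASCII range X'00'..X'7F'."""
    return any(ord(ch) > 0x7F for ch in name)


def as_identifier(name: str) -> str:
    """Return the name unchanged, or wrapped in double quotes when it contains
    characters that cannot appear in a regular identifier."""
    if needs_delimiting(name):
        # Doubling embedded quotes is the common SQL convention (assumed here).
        return '"' + name.replace('"', '""') + '"'
    return name


if __name__ == "__main__":
    for candidate in ("Kunde", "Größe", "año_fiscal"):
        print(candidate, "->", as_identifier(candidate))
```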
  • 148. Chapter 13: International Language Support
    Compatible Languages
    The LATIN server character set that Teradata Database uses in standard language support mode is sufficient for you to use client character sets that support the international languages listed below.
    International languages that are compatible with standard language support:
    Albanian, Basque, Breton, Catalonian, Celtic, Cornish, Danish, Dutch, English, Estonian, Faroese, Finnish, French, Frisian, Galician, German, Germanic, Greenlandic, Icelandic, Irish Gaelic, Italian, Latin, Luxemburgish, Norwegian, Portuguese, Rhaeto-Romanic, Romance, Samoan (new orthography), Scottish Gaelic, Spanish, Swahili, Swedish.

    Japanese Language Support Mode
    If you enable the Japanese language support mode during the sysinit process, Teradata Database, by default, stores user data using the UNICODE server character set and stores system dictionary data using the KANJI1 server character set.

    Advantages (over LATIN) of Storing System Dictionary Data Using KANJI1
    The KANJI1 server character set is compatible with the Teradata-supplied Japanese client character sets, allowing you to use object names containing Kanji characters, Hiragana, Zenkaku (fullwidth) and Hankaku (halfwidth) Katakana, Zenkaku Romaji (Latin), and various other characters. You can also use the ASCII characters from other client character sets to name objects that are stored in the Data Dictionary.

    Advantages (over LATIN or KANJI1) of Storing User Data Using UNICODE
    Unicode is a 16-bit encoding of virtually all characters in all current languages in the world. The Teradata UNICODE server character set supports Unicode 4.1 and is designed eventually to store all character data on the server. UNICODE may be used to store all characters from all single- and multibyte client character sets. User data stored as UNICODE can be shared among heterogeneous clients.
  • 149. Chapter 13: International Language Support
    Extended Support
    Extended support allows you to customize the Teradata Database to provide additional support for local character set usage. A sufficiently privileged user can create single-byte and multibyte client character sets that support, with certain constraints, any subset of the Unicode repertoire. Moreover, such a user can customize a collation for the entire Unicode repertoire. Extended support is available on systems that have been enabled with standard language support or Japanese language support.

    For More Information
    For more information on the topics presented in this chapter, see the following Teradata Database books.
    • IF you want to learn more about International Language Support, THEN see International Character Set Support.
  • 151. CHAPTER 14 Query and Database Analysis Tools
    Although the Optimizer executes complex strategic and tactical queries efficiently, a query plan can make it difficult to understand how and why the Optimizer chose that plan for a given query. This chapter discusses:
    • Tools found in the Teradata Analyst Pack, a set of tools designed to help automate and simplify query plan analysis. These include:
      • Teradata Visual EXPLAIN (VE)
      • Teradata System Emulation Tool (SET)
      • Teradata Index Wizard
      • Teradata Statistics Wizard
    • Teradata Database Query Analysis Tools (DBQAT), tools designed to improve the overall performance analysis capabilities of the Teradata Database. These include:
      • Query Capture Facility
      • Database Query Log
      • Target Level Emulation (TLE)
      • Database Object Use Count

    Teradata Visual EXPLAIN
    Teradata Visual EXPLAIN (VE) is a tool that visually depicts the execution plan of complex SQL statements in a simplified manner. Teradata VE presents a graphical view of the statement broken down into discrete steps showing the flow of data during execution. Because comparing optimized queries is easier with Teradata VE, application developers and database administrators can fine-tune SQL statements so that the Teradata Database can access data in the most effective manner.
    To view an execution plan using Teradata VE, the execution plan information must first be captured in the Query Capture Database (QCD) by means of the Query Capture Facility (QCF). Teradata VE reads the execution plan stored in a QCD and turns it into a series of icons.
  • 152. Chapter 14: Query and Database Analysis Tools
    Teradata System Emulation Tool
    When Target Level Emulation (TLE) information is stored on a test system, Teradata System Emulation Tool (SET) allows you to generate and examine the query plans using the test system Optimizer as if the plans were processed on the production system. Using Teradata SET you can:
    • Change system configuration details, including DBC Control fields and table demographics, and model the impact of various changes on SQL statement performance.
    • Determine the source of various Optimizer-based production problems.
    • Provide an environment in which Teradata Index Wizard can produce recommendations for a production system workload.

    Teradata Index Wizard
    The Teradata Index Wizard analyzes SQL queries and suggests candidate indexes to enhance their performance. The workload definitions, supporting statistical and demographic data, and index recommendations are stored in various QCD tables. Using data from a QCD or the Database Query Log (DBQL), the Teradata Index Wizard:
    • Recommends secondary indexes for the tables based on workload details, including data demographics, that are captured using the QCF.
    • Allows you to validate index recommendations before implementing the new indexes.
    • Allows you to perform what-if analysis on the workload. The Teradata Index Wizard allows you to determine whether your recommendations actually improve query performance.
    • Interfaces with other Teradata Tools and Utilities, such as Teradata SET, to perform offline query analysis by importing the workload of a production system to a test system.
    • Uses the Teradata Visual EXPLAIN and Compare (VEComp) tools to provide a comparison of the query plans with and without the index recommendations.
    Teradata Index Wizard can be started from Teradata Visual EXPLAIN, Teradata SET, Teradata Statistics Wizard, and Teradata Manager. Teradata Index Wizard can also open these applications, except for Teradata Manager, to help in your evaluation of recommended indexes.

    Demographics
    The Teradata Index Wizard needs demographic information to perform index analysis and to make recommendations. You can collect the following types of data demographics using SQL:
    • Query demographics
  • 153. Chapter 14: Query and Database Analysis Tools
      Use the INSERT EXPLAIN statement with the WITH STATISTICS and DEMOGRAPHICS clauses to collect table cardinality and column statistics.
    • Table demographics
      Use the COLLECT DEMOGRAPHICS statement to collect the row count and the average row size in each of the subtables in each AMP on the system.

    Teradata Statistics Wizard
    The Teradata Statistics Wizard is a graphical tool that was developed to improve the performance of queries and the entire database. The Statistics Wizard automates the process of collecting statistics for a particular workload or selecting arbitrary indexes or columns for collection or re-collection. In addition, the Statistics Wizard permits you to validate the proposed statistics on a production system. The validation capability enables you to verify the performance of the proposed statistics before applying the recommendations.
    The capabilities of Teradata Statistics Wizard are as follows.
    • You can select a workload for analysis and receive recommendations based on the results.
    • You can select a database, or select several tables, indexes, or columns, for analysis and receive recommendations based on the results.
    • You can defer the schedule for the collection or recollection of statistics.
    • You can display and modify statistics for a column or index.
    • You can receive recommendations for analysis that are based on table demographics and general heuristics.
    As changes are made within a database, the Statistics Wizard identifies those changes and recommends which tables should have statistics collected, based on age of data and table growth, and which columns and indexes would benefit from having statistics defined and collected for a specific workload. The administrator is then given the opportunity to accept or reject the recommendations.

    Query Capture Facility
    The Query Capture Facility (QCF) is available on the Teradata Database. The QCF captures the data pertaining to an execution plan and stores the data in a set of relational tables in a QCD.
  • 154. Chapter 14: Query and Database Analysis Tools
    Applications of QCF and QCD:
    • Provide the foundation for the Teradata Index Wizard utility.
    • Can store all query plans for customer queries. You can then compare and contrast queries as a function of software release, hardware platform, and hardware configuration.
    • Provide the foundation for the Visual EXPLAIN tool, which displays EXPLAIN output graphically.
    • Provide data so that you can generate your own detailed analyses of captured query steps using standard SQL DML statements and third-party query management tools.
    You can execute the COLLECT, DROP, and HELP STATISTICS SQL statements against a QCD.

    QCD Schema Improvement
    The QCD schema is designed to:
    • Minimize the number of tables required by capturing information in a generic fashion.
    • Promote usability.
    • Improve the overall performance of data storage and retrieval.

    Teradata Index Wizard Support
    A QCD is the central repository for the information used in the analyses performed by the Teradata Index Wizard. A QCD supports the Teradata Index Wizard by capturing and storing the data demographics and index wizard-related information that you specify. The workload definitions, supporting statistical and demographic data, and index recommendations are stored in various QCD tables. The Teradata Index Wizard analyzes various SQL query workloads and suggests candidate indexes to enhance the performance of those queries in the context of the defined workloads.
    For information about features and capabilities of Teradata Index Wizard, see “Teradata Index Wizard”.

    Database Query Log
    The Database Query Log (DBQL) is a Teradata Database tool that provides a series of predefined tables that can store, based on rules you specify, historical records of queries and their duration, performance, and target activity. DBQL is flexible enough to log information on the variety of SQL requests that run on the Teradata Database, from short transactions to longer-running analysis and mining queries.
    After implementing DBQL, you use simple SQL statements to control the start, extent, and duration of the logging activity. You can define rules, for instance, that log the first 4000 SQL characters of any query that runs during a session invoked by a specific user under a specific account, if the time to complete that query exceeds the specified time threshold.
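The example rule above (log the first 4000 SQL characters for one user and account when a time threshold is exceeded) can be modelled in Python to make its logic concrete. This is a conceptual sketch only: it is not DBQL syntax, and the class names, field names, and the `sql_to_log` function are hypothetical. In the real system the rule is defined with SQL statements, as the text says.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoggingRule:
    """Hypothetical model of one logging rule."""
    user: str
    account: str
    threshold_seconds: float
    sql_limit: int = 4000  # number of leading SQL characters to keep


@dataclass
class CompletedQuery:
    """Hypothetical record of a finished query."""
    user: str
    account: str
    elapsed_seconds: float
    sql_text: str


def sql_to_log(rule: LoggingRule, query: CompletedQuery) -> Optional[str]:
    """Return the SQL text the rule would log, or None if the rule does not apply."""
    if query.user != rule.user or query.account != rule.account:
        return None
    if query.elapsed_seconds <= rule.threshold_seconds:
        return None
    return query.sql_text[: rule.sql_limit]


if __name__ == "__main__":
    rule = LoggingRule(user="etl_user", account="acct1", threshold_seconds=5.0)
    slow = CompletedQuery("etl_user", "acct1", 12.3, "SELECT " + "x, " * 2000)
    logged = sql_to_log(rule, slow)
    print(len(logged) if logged is not None else "not logged")
```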
  • 155. Chapter 14: Query and Database Analysis Tools
    You can request that DBQL log particular query information or just a count of qualified queries. You can specify that the recording criteria be a mix of:
    • Users and accounts
    • Elapsed time, where time can be expressed as:
      • A series of intervals
      • A threshold limit
    • Processing detail, including any or all of:
      • Objects
      • Steps
      • SQL text
    In addition to the query-related data, DBQL stores the following information to help identify the query:
    • User name
    • Session number and account information
    DBQL data can also be input to TLE and to Teradata Tools and Utilities, including Teradata Manager and Teradata Visual EXPLAIN. Teradata Tools and Utilities aid in analysis and present the information in a graphic form that is easily manipulated and understood.

    Target Level Emulation
    Teradata Database supports Target Level Emulation (TLE) both on the Teradata Database server and in the client as follows.
    • Teradata supports Target Level Emulation (TLE) on the Teradata Database server.
    • Teradata supports Teradata System Emulation Tool (SET) on the client.
    For information about Teradata SET, see “Teradata System Emulation Tool”.
    The Teradata Database provides the infrastructure for TLE. You can use the standard SQL interface to capture the system configuration details and table demographics on one system and store them on another. Usually the information is obtained from a production system, then stored on a smaller test or development system. With this capability, the Optimizer can generate access plans similar to those that are generated on a production system. You can use the plans to analyze Optimizer-related production problems. This information can also be used by Teradata SET.
  • 156. Chapter 14: Query and Database Analysis Tools
    Database Object Use Count
    The database administrator and application developer can use Database Object Use Count to capture the number of times an application refers to an object. Database Object Use Count captures counts for the following:
    • Database
    • Table
    • Column
    • Index
    • View
    • Macro
    • Teradata stored procedure
    • Trigger
    • User-defined functions
    • User-defined types
    • User-defined methods
    Object use access information is not counted for EXPLAIN, INSERT EXPLAIN, or DUMP EXPLAIN statements.
    Once the counts are captured, you can use the information to identify obsolete or unused database objects, particularly those that occupy significant quantities of valuable disk space. Further, Database Object Use Count information can be useful to database query analysis tools like Teradata Index Wizard.

    For More Information
    For more information on the topics presented in this chapter, see the following Teradata Tools and Utilities books.
    • IF you want to learn more about Database Query Log, THEN see:
      • Database Administration
      • Data Dictionary
      • Performance Management
      • SQL Reference: Data Definition Statements
      • SQL Reference: Statement and Transaction Processing
    • IF you want to learn more about Object Use Count, THEN see Database Administration.
  • 157. Chapter 14: Query and Database Analysis Tools
    For More Information (continued)
    • IF you want to learn more about Query Capture Database, THEN see:
      • Database Design
      • Teradata Manager User Guide
      • SQL Reference: Data Definition Statements
      • SQL Reference: Statement and Transaction Processing
    • IF you want to learn more about Target Level Emulation, THEN see SQL Reference: Statement and Transaction Processing.
    • IF you want to learn more about Teradata Index Wizard, THEN see Teradata Index Wizard User Guide.
    • IF you want to learn more about Teradata Statistics Wizard, THEN see Teradata Statistics Wizard User Guide.
    • IF you want to learn more about Teradata System Emulation Tool (SET), THEN see:
      • Database Design
      • SQL Reference: Data Definition Statements
      • SQL Reference: Statement and Transaction Processing
      • Teradata System Emulation Tool User Guide
    • IF you want to learn more about Teradata Visual EXPLAIN, THEN see Teradata Visual Explain User Guide.
  • 159. SECTION 4 Managing and Monitoring Teradata
  • 161. CHAPTER 15 Teradata Database Security
    This chapter describes the features and methods available to establish and maintain Teradata Database security. Topics include:
    • Security library
    • Security features
    • Security mechanisms
    • User authentication
    • User authorization
    • Encryption
    • Data integrity checking
    • Directory management of users
    • Monitoring access to the database
    • Defining a security policy
    • Publishing a security policy

    Security Library
    Teradata Database security is based on the Teradata Database Generic Security Services library (TDGSS), which is included when you install Teradata Database software. TDGSS is composed of:
    • A set of pre-configured security mechanisms.
    • Editable configuration files that allow you to revise mechanism properties to meet unique security needs.
    • A set of tools and interfaces for configuring and managing network security functions.

    Security Features
    Teradata Database security includes the following features:
    • Security Mechanism: A mechanism selected at logon to set the security context for the session. Each security mechanism defines a unique security context.
  • 162. Chapter 15: Teradata Database Security
    • User Authentication: Verification of user identity at logon. The system checks user name, password, and other optional user information against user data stored in the database. Only valid users can access the database.
    • User Authorization: Authorization of users specifically granted privileges to create, alter, or delete data in the database. The system evaluates user SQL requests to perform such functions and authorizes user activity according to access privileges defined for that user in the database.
    • Encryption: Data transmitted across the network is encoded by the system to provide confidentiality.
    • Data Integrity: The system checks messages against what was sent to ensure that data has not been lost or corrupted during transmission across the network.
    • Directory Management of Users: Supported directories can be configured to authenticate users and authorize database access privileges.
    • Monitoring Access to the Database: Provides the ability to monitor database activity to identify violations, violators, and potential security hazards.

    Security Mechanisms
    Teradata Database employs security mechanisms to define the security context in which a database session will run. Each mechanism is composed of a number of properties that define the function of the mechanism. Some properties are editable. All security mechanisms are constructed using the TDGSS library.
    Users may select from among several available security mechanisms at logon. If the user does not select a mechanism at logon, the session automatically uses the default mechanism.
    Teradata Database currently provides the following standard, predefined security mechanisms.
    • Mechanism: Teradata Method 2 (TD2)
      Usage: Requests that the Teradata Database perform user authentication. Use TD2 only if the following are true:
      • The client system is at TTU8.2 or later.
      • The server is at V2R6.0 or later.
    • Mechanism: Teradata Method 1 (TD1)
      Usage: Requests that the Teradata Database perform user authentication. This mechanism is used only to maintain compatibility with legacy software. The system automatically selects TD1 when needed, even if the user selects TD2.
  • 163. Chapter 15: Teradata Database Security
    Security mechanisms (continued):
    • Mechanism: Kerberos (KRB5)
      Usage: Requests that the Microsoft Kerberos application perform user authentication (external authentication). Use KRB5 for external authentication if either one of the following is true:
      • The client system is at TTU8.2 or later.
      • The server is at V2R6.0 or later.
    • Mechanism: Kerberos (KRB5C)
      Usage: Requests that the Microsoft Kerberos application perform user authentication (external authentication). This mechanism is used only to maintain compatibility with legacy software. The system automatically selects KRB5C when needed, even if the user selects KRB5.
    • Mechanism: NTLM
      Usage: Requests that the Microsoft NT LAN Manager application perform user authentication. Use the NTLM mechanism for external user authentication if the following are true:
      • The client system is at TTU8.2 or later.
      • The server is at V2R6.0 or later.
    • Mechanism: NTLMC
      Usage: Requests that the Microsoft NT LAN Manager application perform user authentication. This mechanism is used only to maintain compatibility with legacy software. The system automatically selects NTLMC when needed, even if the user selects NTLM.
    • Mechanism: LDAP
      Usage: Requests that a supported LDAP-compliant external directory perform user authentication. Directory authentication of users is only available if the following are true:
      • The client system is at TTU8.2 or later.
      • The server is at V2R6.0 or later.
      For details on use of directory-based security functions, see Security Administration.

    User Authentication
    Users are authenticated when they log on to the database through a Teradata client application. Teradata Database provides the following features for controlling user authentication:
    • Logon formats and controls
    • Password format and controls
    • Optional user authentication by external applications
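The version requirements and legacy fallbacks in the security mechanism table can be summarised as a small decision function. The sketch below is an interpretation, not Teradata behaviour: it assumes that "when needed" means the stated version requirements are not met, it represents TTU and V2R releases as hypothetical `(major, minor)` tuples, and it leaves out KRB5 because its condition is worded differently in the table ("either one of the following").

```python
from typing import Tuple

Version = Tuple[int, int]

MIN_CLIENT_TTU: Version = (8, 2)  # "TTU8.2 or later"
MIN_SERVER_V2R: Version = (6, 0)  # "V2R6.0 or later", written here as (6, 0)

# Legacy counterparts named in the table; KRB5/KRB5C intentionally omitted.
LEGACY_COUNTERPART = {"TD2": "TD1", "NTLM": "NTLMC"}


def meets_requirements(client_ttu: Version, server_v2r: Version) -> bool:
    """True when both the client and the server satisfy the stated minimums."""
    return client_ttu >= MIN_CLIENT_TTU and server_v2r >= MIN_SERVER_V2R


def effective_mechanism(requested: str, client_ttu: Version, server_v2r: Version) -> str:
    """Return the mechanism a session would end up using (assumed logic)."""
    if requested not in LEGACY_COUNTERPART and requested != "LDAP":
        raise ValueError(f"mechanism not covered by this sketch: {requested!r}")
    if meets_requirements(client_ttu, server_v2r):
        return requested
    if requested == "LDAP":
        # The table lists no legacy counterpart for directory authentication.
        raise ValueError("directory authentication is not available for these versions")
    return LEGACY_COUNTERPART[requested]


if __name__ == "__main__":
    print(effective_mechanism("TD2", (8, 2), (6, 0)))
    print(effective_mechanism("TD2", (8, 1), (6, 0)))
```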
  • 164. Chapter 15: Teradata Database Security
    Logon Formats
    Teradata Database provides two logon formats for access to client applications:
    • Command line
    • Graphical User Interface (GUI)

    Command-Line Logon
    Users provide the following information when logging on from a network-attached client:
    • .logmech: Specifies the name of the security mechanism that defines the security context under which the session will operate. If a mechanism is not specified, the logon proceeds using the designated default mechanism.
    • .logdata: Specifies the external username and password for user authentication by external applications. Also specifies the domain name when required by the application.
    • .logon: Specifies the tdpid, database username and password, and optional account string information for user authentication by the Teradata Database.
    For more information on command-line logons, see Security Administration.

    GUI Logons
    Some Teradata client applications provide a logon GUI in the form of dialog boxes. The dialog boxes provide fields and buttons that prompt you to enter the same data required for a command-line logon. For an example of a GUI logon, see Security Administration.

    Logons from Channel-Attached Clients
    Sessions logged on from channel-attached clients do not support network security features such as security mechanisms, encryption, or directory management of users. Command-line logons from channel-attached clients use only the .logon field.

    Logon Controls
    Teradata Database automatically grants permission for all users defined in the database to log on from all connected client systems. Administrators can, however:
    • Modify current or default logon permissions for specific users.
    • Give individual users permission to log on to the database only from specific client systems.
    • Set the maximum number of times a user can submit an unsuccessful logon string.
    • Enable authentication of the user by an external application, such as Kerberos or NTLM.
    As of Teradata Database V2R6.2, administrators can also restrict access to the database based on the IP address of the machine from which a user logs on.

    Password Format
    Passwords must conform to the following general password format rules. A password can contain:
  • 165. Chapter 15: Teradata Database Security
    • 1 to 30 characters
    • Letters A through Z and/or a through z
    • Digits 0 through 9 (in single or multi-byte form). A password can be all numeric only if it is enclosed in quotes.
    • $ (dollar sign)
    • _ (underscore)
    • # (pound sign)
    A password cannot contain:
    • Katakana symbols
    • Greek or Cyrillic characters
    • Multibyte spaces
    • Special characters other than $ (dollar sign), _ (underscore), or # (pound sign) that are not enclosed in quotes
    • User-defined characters
    • Any blank characters, such as NULL, LINE FEED, or CARRIAGE RETURN
    Note: Do not try to construct a password until you read the complete password format rules in Security Administration.

    Password Controls
    Teradata Database provides controls to enable the administration of passwords. Administrators can, for example:
    • Restrict the content of password strings, for instance by defining the minimum and maximum number of password characters and whether passwords can contain digits or special characters.
    • Set the number of days for which a password is valid.
    • Assign a temporary password.
    • Set the user lockout time after the user has exceeded the maximum number of logon attempts.
    • Define the period during which a user may not reuse a previous password.

    External Authentication
    External authentication allows Teradata Database users to be authenticated by applications running on network-attached client systems. There are three types of external authentication.
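As a rough illustration of the general password format rules listed earlier on this slide, the following Python sketch checks an unquoted password. It covers only the rules quoted here, assumes single-byte letters and digits, and does not model quoted passwords or the complete rule set in Security Administration. The function name is hypothetical.

```python
import re

# 1 to 30 characters drawn from A-Z, a-z, 0-9, $, _ and #.
_UNQUOTED_PASSWORD = re.compile(r"[A-Za-z0-9$_#]{1,30}")


def is_valid_unquoted_password(password: str) -> bool:
    """Check an unquoted password against the general format rules.

    An all-numeric password is rejected because the rules allow it only
    when the password is enclosed in quotes.
    """
    if not _UNQUOTED_PASSWORD.fullmatch(password):
        return False
    return not password.isdigit()


if __name__ == "__main__":
    for candidate in ("Secret_1$", "12345", "pass word", "a" * 31):
        print(repr(candidate), is_valid_unquoted_password(candidate))
```

Site-specific password controls, such as minimum length or required digits, would be layered on top of a check like this one.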
  • 166. Chapter 15: Teradata Database Security
    Sign-on without user credentials:
    • Type: Single Sign-on
      Description: Once users have been authenticated by the client system, they do not have to resubmit a username and password to access the database.
      Requirements:
      • Client domain username and Teradata Database username must match.
      • The user must have LOGON WITH NULL PASSWORD privileges.
      • The user must select a mechanism that corresponds to the authenticating application (Kerberos or NTLM), or it must be the default mechanism.
    Sign-on using external credentials:
    • Type: Directory Sign-on
      Description: The user logs on to Teradata Database with a directory username.
      • The user is authenticated by the directory.
      • Once authenticated, the user is also authorized access according to the following rules:
        • The user automatically inherits the access privileges of the Teradata Database user to which he or she is mapped.
        • The user also takes on any roles or profiles to which he or she is mapped.
      Requirements:
      • The directory must be supported by Teradata Database.
      • The directory must be set up to map directory users to permanent Teradata Database users, roles, and profiles. Unmapped users will be limited to the default EXTUSER privileges.
      • Users must log on with their directory user name.
      • Users must select the LDAP mechanism.
    • Type: Sign-on As
      Description: The user logs on to Teradata Database with a username and password recognizable by the client domain, and is authenticated by Kerberos or NTLM.
      Requirements:
      • Client domain username and Teradata Database username must match.
      • The user must have LOGON WITH NULL PASSWORD privileges.
      • The user must select a mechanism that corresponds to the authenticating application (Kerberos or NTLM), or it must be the default mechanism.
    For complete information on external authentication and the mechanisms that support it, see Security Administration.

    User Authorization
    Once users have been authenticated, they are only authorized to take actions that are allowed by their access privileges. The following table lists the various types of database access privileges and describes how they are acquired by a user.
  • 167. Chapter 15: Teradata Database Security
    • Privilege Type: Implicit Privileges
      How Acquired: Implicit privileges are acquired by default when a user owns a database object or owns the space in which the object was created. These are sometimes referred to as ownership privileges.
    • Privilege Type: Explicit Privileges
      How Acquired: Explicit privileges on users, databases, or objects are given to a user by means of a GRANT statement. Privileges can be granted either directly to a user or to a role of which the user is a member.
    • Privilege Type: Inherited Privileges
      How Acquired: Privileges may be inherited in the following ways:
      • Being a member of a group that has privileges, as in the case of roles.
      • Subclass privileges are inherited by a user that has been GRANTed a higher level of privileges that contain them.
      • A directory user mapped to a Teradata Database user inherits the privileges of that user.
    For additional information on the full range of available privileges and how to use SQL statements to GRANT and REVOKE them, see SQL Reference: Data Definition Statements.

    Roles
    Roles define user access privileges on database objects for groups of users. In addition to a default role, an administrator can GRANT one or more roles to each user. A member of a role may access all objects on which the role has privileges. As a user, you can use SET ROLE to switch from the default role to any alternate role of which you are a member.
    The use of roles provides two administrative advantages:
    • Simplification of administration of access rights
    • Reduction of dictionary disk space
    For more information on roles, see Database Administration. For more information on the management of directory roles, see “Directory Managed Roles”.

    Profiles
    An administrator can define a profile and assign it to a group of users who share similar values for the following types of parameters:
    • Default database assignment
    • Spool space capacity
    • Temporary space capacity
    • Account strings permitted
    • Password security attributes
    The use of profiles provides two administrative advantages:
  • 168. Chapter 15: Teradata Database Security
    • Simplification of administration of parameters
    • Simplification of control for user-level password security
    For further information on profiles, see Database Administration.

    Encryption
    Teradata Database supports encryption of data transmitted between client applications and the Teradata Database. There are two types of encryption:
    • Logon encryption: Automatically provided by Teradata Database to ensure the security of the passwords used in logon strings. Passwords are encrypted by default where they are stored. Passwords are never decrypted.
    • Message encryption: Optional. Set to provide confidentiality for all data transmitted across the network between the client and the Teradata Database.

    Logon Encryption
    When operating under default conditions, the Teradata Gateway accepts only encrypted logons and rejects unencrypted ones. The administrator can control encryption using an option in the Gateway Control utility. For the gateway to accept both encrypted and unencrypted logons, the administrator must set a Gateway Control option to yes.
    The client application cannot enable or disable logon encryption. Encryption is determined by the settings of the Teradata Gateway.

    Message Encryption
    The Teradata Database supports encryption of all data transmitted between network-attached clients and the Teradata Gateway. When logging on, users may employ any one of several selectable security mechanisms to define the network security context for a session. All mechanisms support encryption.
    The encryption/decryption cycle does have an effect on system performance, especially when applied to large-scale data transmission. Most client applications allow users to enable or disable data encryption for the duration of a session.
    For further information on encryption, see Security Administration.

    Data Integrity Checking
    The Teradata Database performs an automatic check of data integrity for all message transmissions (both encrypted and non-encrypted) across the network to ensure that data has not been changed, corrupted, or lost during transmission.
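The idea behind checking a message "against what was sent" can be illustrated generically. The text does not say how Teradata Database implements its integrity check, so the SHA-256 digest used below is a stand-in chosen for illustration, and both function names are hypothetical.

```python
import hashlib
from typing import Tuple


def package(payload: bytes) -> Tuple[bytes, str]:
    """Sender side: pair the payload with a digest computed over it."""
    return payload, hashlib.sha256(payload).hexdigest()


def verify(payload: bytes, digest: str) -> bool:
    """Receiver side: recompute the digest and compare it with the one sent."""
    return hashlib.sha256(payload).hexdigest() == digest


if __name__ == "__main__":
    message, digest = package(b"SELECT * FROM orders;")
    print(verify(message, digest))               # unchanged message
    print(verify(message + b" -- x", digest))    # altered in transit
```

A plain digest like this detects accidental change or loss. Protecting against deliberate tampering would additionally require a keyed construction, which is outside what the text describes.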
  • 169. Chapter 15: Teradata Database Security
    Directory Management of Users
    Normally, users that log on to the Teradata Database have been defined in the database using a CREATE USER request. However, because many potential database users may already be defined in a directory running within the client network, the Teradata Database allows for authentication and authorization of users by supported directories. Integration of directory managed users simplifies administration by eliminating the need to create a database instance for every user.

    Supported Directories
    The Teradata Database interfaces with the following directories that conform to the Lightweight Directory Access Protocol (LDAP), version 3.
    • Teradata server operating system: Windows Server 2003
      Supported directories:
      • Active Directory (all versions)
      • Sun Java System Directory Server
      • Other directories that support OpenLDAP
    • Teradata server operating system: Windows
      Supported directories: Active Directory (all versions)
    • Teradata server operating system: MP-RAS
      Supported directories:
      • Active Directory (all versions)
      • Sun Java System Directory Server
      • Other directories that support OpenLDAP
    • Teradata server operating system: Linux
      Supported directories:
      • Active Directory (all versions)
      • Sun Java System Directory Server

    Directory User Logons
    Directory users must log on to the database using their directory user names. The logon must include the selection of the LDAP mechanism, unless it is the default mechanism.

    Integrating Directory Users
    To provide directory users more than the default EXTUSER (SELECT) privileges, the administrator must configure the directory to map the directory users to Teradata Database users, roles, and profiles.

    Directory Managed Roles
    Directory managed roles are handled differently from the way the administrator normally uses the ANSI-defined SQL statements to create, drop, grant, or revoke roles. For directory users, the administrator uses the CREATE EXTERNAL ROLE or DROP EXTERNAL ROLE statements.
  • 170. Chapter 15: Teradata Database Security

The administrator creates special external roles that are only for directory users. Once the external roles have been created, the administrator must assign (not GRANT) them to directory users.

Profiles for Directory Users

Mapped directory users inherit the profile assigned to the permanent user to which they are mapped. Additional profiles can be created and assigned directly to the directory user and will take precedence over the inherited profile. Profiles can also be assigned to individual users that are mapped to EXTUSER by default.

Directory Tools

The Teradata Database provides the following tools to search and validate directory content:
• tdsbind: Diagnostic tool that allows you to determine the mapping between the Teradata Database objects and directory users.
• tdssearch: Tool that allows you to explore directory assignments and help resolve issues with failed logon attempts or data access denials.
These tools are installed as part of TDGSS and are run to query the directory. For more information about directory tools, see Security Administration.

Monitoring Access to the Database

The Teradata Database automatically tracks all logon and logoff activity. However, you can apply certain Teradata features to specify additional audits of specific events in the Teradata Database. Monitoring can help identify the following security hazards:
• Potential break-ins.
• Attempts to gain unauthorized access to database resources.
• Attempts to alter the behavior of Teradata Database auditing functions.
The security monitoring features can help you examine or print audit data during normal operation hours, or you can archive the data for later review by means of automated views and generated reports.

Feature / Description
Data Dictionary: Provides a repository from which you may check access privileges, user access, demographics, and general database access information.
  • 171. Chapter 15: Teradata Database Security

Feature / Description
System Views: Provides information views about users, their access privileges, and historical data on grants, logons, and access activities. Views are named in the format DBC.<View_Name>; for example, DBC.AccessLog is a view of the access log information for a given user.
System View Queries: Provides query capability on system views. This minimizes having to look for specific information in large table views.
Access Logging: Provides Data Definition Language (DDL) statements that you can use to monitor database access. These access logging checks are executed using the BEGIN LOGGING and END LOGGING statements.

If you identify unauthorized or undesirable activity, you can take one or more of the following remedial actions to address the problem:
• Do additional auditing of the actions of particular users.
• Change compromised passwords.
• Modify the access rights of some users.
• Revise your security policy.
• Deny the offending users any access to Teradata Database (in extreme cases).
For more information about monitoring database access, see Security Administration.

Defining a Security Policy

Your security policy should be based on the following considerations:
• Determine your security needs to balance the need for secure data against user needs for quick and efficient data access.
• Review the Teradata Database security features to meet your needs.
• Develop a security strategy that includes both system-enforced and personnel-enforced security features.

Publishing a Security Policy

To ensure administrators and users at your site understand and follow site-specific security procedures, the administrator should create a security handbook.
  • 172. Chapter 15: Teradata Database Security

The handbook should summarize how you are using Teradata Database security features for your database. You should include the following topics in this document:
• Why security is needed.
• Benefits of adhering to the security policy for both the company and the users.
• A description of the specific implementation of Teradata Database security features at your site.
• Suggested/required security actions for users and administrators to follow.
• Who to contact when security questions arise.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.

If you want to learn more about… / Then see…
Client (TDP) security: Teradata Director Program Reference
Detailed information on all aspects of Teradata Database security: Security Administration
Detailed logon requirements for Teradata Client applications: The user guide for the respective application
DBSControl, Gateway Control and other utilities used to set security parameters: Utilities
Information on how to use security-related SQL statements such as GRANT and REVOKE: SQL Reference: Data Definition Statements
Security issues related to database administration, such as setup of roles and profiles: Database Administration
Security-related system tables and views: Data Dictionary
  • 173. CHAPTER 16 System Administration

While the system administrator is responsible for creating databases and users, he or she is also responsible for managing the database. This chapter discusses the use of roles and profiles, session management, and database maintenance utilities. Topics include:
• Roles and profiles for users
• Accounting
• Maintenance utilities
For information on creating databases and users, see Chapter 7: “Database Objects, Databases and Users.”

Roles and Profiles

The task of system administration can be simplified using the features provided by Roles and Profiles. Think of a Role as a pseudo-user with privileges on a number of database objects. Think of a Profile as a container that holds a set of parameters, such as database, spool space, temporary space, and accounts, to which the system administrator assigns certain values. After creating roles and profiles, the system administrator assigns them to users. Roles and profiles simplify system administration as follows.

Using… / Simplifies database administration because…
Roles for which privileges are granted to database objects and all users assigned to the role who inherit those privileges: changing Roles is easier than deleting old privileges and granting new ones to a user who, for example, may change jobs within his or her organization. It is easier to assign a Role to a new user than to specify all his or her privileges.
Profiles to change parameter values for users assigned to the Profile: changing a parameter value once in a Profile is easier than updating the value for each user. It is easier to assign a Profile to a new user than to specify all his or her parameters.

For information on how to make all assigned Roles and Profiles available to a user, see Database Administration.
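The administrative saving described above can be illustrated with a small model. The following Python sketch is not Teradata code; the class names, user names, and the `spool_space` parameter are hypothetical and exist only to show that one change to a role or profile reaches every user assigned to it.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    """A named set of (privilege, object) pairs, like a pseudo-user."""
    name: str
    privileges: set = field(default_factory=set)


@dataclass
class Profile:
    """A container of parameter values shared by its users."""
    name: str
    params: dict = field(default_factory=dict)


@dataclass
class User:
    name: str
    roles: list = field(default_factory=list)
    profile: Optional[Profile] = None

    def effective_privileges(self) -> set:
        # Privileges are looked up through the roles at call time,
        # so a change to a role is seen by every user holding it.
        result = set()
        for role in self.roles:
            result |= role.privileges
        return result


# Hypothetical names, for illustration only.
sales_read = Role("sales_read", {("SELECT", "sales_db")})
analyst = Profile("analyst", {"spool_space": 5_000_000_000})

alice = User("alice", [sales_read], analyst)
bob = User("bob", [sales_read], analyst)

# One change to the role reaches both users.
sales_read.privileges.add(("SELECT", "marketing_db"))

# One change to the profile reaches both users.
analyst.params["spool_space"] = 10_000_000_000
```

Without the shared role and profile, the same two changes would have to be repeated once per user.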
  • 174. Chapter 16: System Administration

To learn more about using Roles and Profiles to ensure password security, see Chapter 15: “Teradata Database Security.”

Session Management

Users must log on to the Teradata Database and establish a session before system administrators can do any system accounting.

Session Requests

A session is established after the database accepts the username, password, and account number and returns a session number to the process. Subsequent Teradata SQL requests generated by the user and responses returned from the database are identified by:
• Host id
• Session number
• Request number
The database supplies the identification automatically for its own use. The user is unaware that it exists. The context for the session also includes a default database name that is the same as the user name. When the session ends, the system discards the context and accepts no further Teradata SQL statements from the user.

Establishing a Session

To establish a session, the user logs on to the database. The procedure varies depending on the client system, the operating system, and whether the user is an application program or a user in an interactive terminal session using BTEQ or a third-party query processing product.

Logon Operands

The logon string can include any of the following operands:
• Optional identifier for the database server, called a tdpid
• User name
• Password
• Optional account number
Note: In the Windows environment, the Single Sign-On feature eliminates the need for users to re-submit usernames, passwords, and account ids after they have logged on to Windows using their authorized user names and passwords. For more information about this feature, see Chapter 15: “Teradata Database Security.”
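As a minimal sketch of how the logon operands fit together, the Python helper below composes a logon string. It assumes the BTEQ-style layout `tdpid/username,password,'account'`; the function name and the sample values are hypothetical, and other client products may accept the operands in a different form.

```python
from typing import Optional


def build_logon_string(user: str, password: str,
                       tdpid: Optional[str] = None,
                       account: Optional[str] = None) -> str:
    """Compose tdpid/user,password,'account' from the logon operands."""
    if not user or not password:
        raise ValueError("user name and password are required")
    logon = f"{user},{password}"
    if tdpid:
        # The tdpid is optional and precedes the user name.
        logon = f"{tdpid}/{logon}"
    if account:
        # The account is optional and is appended in single quotes.
        logon = f"{logon},'{account}'"
    return logon


minimal = build_logon_string("alice", "secret")
full = build_logon_string("alice", "secret", tdpid="tdprod", account="acct01")
```

Here `minimal` carries only the two required operands, while `full` adds the optional tdpid and account.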
  • 175. Chapter 16: System Administration

Maintenance Utilities

A large number of utilities are available to system administrators to perform maintenance functions on the Teradata Database. The following table lists some of the major Teradata Database utilities.

The utility … / Allows you to …
Abort Host: abort all outstanding transactions running on a failed host until the system administrator restarts the host.
CheckTable: check for inconsistencies between internal data structures, such as table headers, row identifiers, and secondary indexes.
ampload: display the load on all AMPs in a system, including the number of:
  • Available AMP worker tasks (AWTs)
  • Waiting messages (message queue length)
cnsrun: start and run a database utility from a script.
Configuration: define AMPs, PEs, and hosts and their interrelationships for a Teradata Database.
ctl: display and modify the fields of the Parallel Database Extensions (PDE) Control Parameters Globally Distributed Objects (GDOs). Note: ctl is a Windows and Linux utility.
Database Initialization Program (DIP): execute one or more of the standard DIP scripts packaged with the Teradata Database.
DBS Control: interactively display and modify the DBS Control Record fields.
Dump Unload/Load (DUL): save or restore system dump tables onto tape or disk.
Ferret: do the following:
  • Define the scope of an action, such as a range of selected tables or vprocs.
  • Display the parameters and scope of the action.
  • Perform the action by either:
    • Moving data to reconfigure data blocks and cylinders.
    • Displaying disk space and cylinder free space percent in use of the defined scope.
  Note: This utility is meant for use by NCR Field Engineers or the Teradata Support Center.
Filer: find and correct problems within the Teradata File System. Note: This utility is meant for use by NCR Field Engineers or the Teradata Support Center.
  • 176. Chapter 16: System Administration

The utility … / Allows you to …
Gateway Control: modify default values in the fields of the Gateway Control Globally Distributed Object (GDO).
Gateway Global: monitor and control the Teradata network-connected users and their sessions.
Lock Display: view a snapshot capture of all real-time database locks and their associated currently running sessions.
Locking Logger: log the following:
  • Transaction identifiers
  • Session identifiers
  • Locked object identifiers
  • Lock levels associated with executing SQL statements
modmpplist: modify the node list file (mpplist).
Priority Scheduler: prioritize process scheduling. Processes have an externally assigned priority associated with their Teradata Database session. Priority Scheduler uses the priority to allocate CPU and I/O resources.
Query Configuration: report the current Teradata Database configuration, including the node, AMP, and PE identification and status.
Query Session: monitor the state of all or selected Teradata Database sessions on all or selected logical host ids.
Reconfiguration: implement the Teradata Database system that is described in the configuration map created in a previous Configuration utility session.
Reconfiguration Estimator: estimate an elapsed time for reconfiguration based upon the number and size of tables on your current system, and provide estimates for the following phases:
  • Redistribution
  • Deletion
  • NUSI building
Recovery Manager: display information used to monitor progress of a Teradata Database recovery.
Resource Check Tools: do the following:
  • Identify a slowdown or hang of the Teradata Database.
  • Display system statistics that could point to the cause of the slowdown or hang.
RSSmon: do the following:
  • Display PDE real-time resource usage per node.
  • Select relevant data fields from a specific Resource Sampling Subsystem (RSS) table to be examined for PDE resource usage monitoring purposes.
  Note: RSSmon is a UNIX MP-RAS utility.
  • 177. Chapter 16: System Administration

The utility … / Allows you to …
Showlocks: display locks placed by Archive and Recovery and Table Rebuild operations on databases and tables.
System Initializer: do the following:
  • Initialize the Teradata Database.
  • Update the DBS Control Record and other Globally Distributed Objects (GDOs).
  • Initialize or update configuration maps.
  • Set hash function value in the DBS Control Record.
Table Rebuild: rebuild tables that the Teradata Database cannot automatically recover, including the following:
  • Primary or fallback portion of a table
  • An entire table
  • All tables in a database
  • All tables in an Access Module Processor (AMP)
  Table Rebuild can be run as an interactive or a background task.
tdlocaledef: convert the Source Specification for Data Formatting (SDF) into an internal form usable by the Teradata Database.
tdnstat: do the following:
  • Perform GetStat/ResetStat operations.
  • View, get, or clear the Teradata Network Services specific statistics.
tdntune: perform a read/write of tdn tunables. You can use the interface to view, get, or update the tunable parameters that are specific to Teradata Network Services.
Teradata MultiTool: use a Windows Graphical User Interface (GUI) to run command-line-based Teradata Database and PDE tasks.
TPCCONS: perform the following 2PC-related functions:
  • Display a list of coordinators that have in-doubt transactions.
  • Display a list of sessions that have in-doubt transactions.
  • Resolve in-doubt transactions.
tsklist: display information about PDE processes and their tasks. Note: This is a Windows and Linux utility.
Update DBC: recalculate the PermSpace and SpoolSpace values in the DBASE table for the user DBC and the MaxPermSpace and MaxSpoolSpace values of the DATABASESPACE table for all databases based on the values in the DBASE table.
Update Space: recalculate the permanent, temporary, or spool space used by a single database or by all databases in a system.
vpacd: improve the performance of systems with several CPUs and a high level of concurrency. Note: vpacd is an NCR UNIX MP-RAS utility.
  • 178. Chapter 16: System Administration

The utility … / Allows you to …
Vproc Manager: manage the virtual processors (vprocs); for example, obtain the status of all or some vprocs, initialize vprocs, force a vproc restart, and force a Teradata Database restart.
xctl: display and modify the fields of the Parallel Database Extensions (PDE) Control Parameters Globally Distributed Objects (GDOs). Note: xctl is a UNIX MP-RAS utility.
xperfstate: display real-time performance data for a PDE system, including system-wide CPU utilization, system-wide disk utilization, and more. Note: xperfstate is a UNIX MP-RAS utility.
xpsh: use a GUI front-end for performing various system-level tasks in an MPP system environment, such as debugging, analysis, monitoring, system administration, and so forth. Note: xpsh is a UNIX MP-RAS utility.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.

If you want to learn more about… / Then see…
Roles and Profiles for Users:
  • Database Administration
  • SQL Reference: Fundamentals
  • SQL Reference: Data Definition Statements
Maintenance utilities: Utilities
Session management: Database Administration
  • 179. CHAPTER 17 Database Management Tools and Utilities

Teradata offers a wide variety of utilities, management tools, and peripherals. Some of these reside on the Teradata Database and others are part of the Teradata Tools and Utilities management suite available for installation in client environments. With database management tools, you can back up and restore important data, save dumps, and investigate and control the Teradata Database configuration, user sessions, and various aspects of its operation and performance. Management and analysis tools help keep the database running at optimum performance levels. Topics include:
• Data archiving utilities
• Data load and export utilities
• Session and configuration management tools
• System resource and workload management tools
• Teradata SQL Assistant
• PocketDBA for Teradata

Data Archiving Utilities

Storing data for future retrieval is an important part of system administration. Teradata Tools and Utilities offers the following archive and recovery utilities:
• Teradata Archive/Recovery (for channel-attached and network-attached systems)
• Open Teradata Backup products for Microsoft Windows systems including:
  • NetBackup (network-attached systems)
  • NetVault (network-attached systems)

Teradata Archive/Recovery Utility

The Teradata Archive/Recovery utility (ARC) supports archiving and restoring Teradata Database databases, individual tables, or permanent journals to any of the following media:
• Client tape
• Client file
  • 180. Chapter 17: Database Management Tools and Utilities

ARC also includes recovery with rollback and rollforward functions for data tables defined with a journal option. For more information about rollback and rollforward, see Chapter 11: “Concurrency Control and Transaction Recovery.”

Open Teradata Backup

Open Teradata Backup (OTB) supports open architecture products that provide backup and restore functions for Microsoft Windows clients. The following products are available:
• NetVault: The NetVault Teradata Module is a backup system that allows you to graphically select databases and tables and specify the kinds of backups (distributed, online, and so forth) you want to perform.
• NetBackup: NetBackup for Teradata supports parallel backups and restores coordinated across multiple hosts connected to a single Teradata Database. The full functional capabilities of the NetBackup server and the multiple media servers are realized in this product. In addition, NetBackup uses an Administrative Host, which contains a Graphical User Interface (GUI) to provide object browsing and selection, automatic script generation, and centralized job monitoring.
Note: Contact Teradata Global Sales Support for information about the controlled distribution of NetBackup.

Data Load and Export Utilities

Teradata Tools and Utilities data load utilities are designed to operate following one of two philosophies:

Utilities operate either… / For example… / And are typically used…
as fast as possible, with little regard for the impact on system users / Teradata MultiLoad, Teradata FastLoad / in a decision support environment where transactions for the day are loaded during a nightly batch window when there are few interactive users.
or, in the background and limit the impact on interactive users / Teradata TPump / to process a continuous feed of near-realtime updates while interactive users require rapid responses.

Teradata MultiLoad

The Teradata MultiLoad utility supports bulk INSERTs, UPDATEs, and DELETEs against initially unpopulated or populated tables. Both the client and server environments support Teradata MultiLoad.
  • 181. Chapter 17: Database Management Tools and Utilities

Teradata MultiLoad can:
• Run against multiple tables.
• Perform block transfers with multi-session parallelism.
• Load data from multiple input source files.
• Pack multiple SQL statements and associated data into a request.
• Perform data upserts.

Teradata FastLoad

The Teradata FastLoad utility loads data into unpopulated tables only. Both the client and server environments support Teradata FastLoad. Teradata FastLoad can:
• Load data into an empty table. FastLoad loads data into one table per job. If you want to load data into more than one table, you can submit multiple FastLoad jobs.
• Perform block transfers with multi-session parallelism.

Teradata Parallel Data Pump

The Teradata Parallel Data Pump (TPump) utility uses standard SQL/DML (not block transfers) to maintain data in tables. It also contains a resource governing method whereby you can control the use of system resources by specifying how many INSERTs and UPDATEs occur minute-by-minute. This allows background maintenance for INSERT, DELETE, and UPDATE operations to take place at any time of day while the Teradata Database is in use. TPump provides the following capabilities:
• Has no limitations on the number of instances running concurrently.
• Uses conventional row hash locking, which provides some amount of concurrent read and write access to the target tables.
• Supports the same restart, portability, and scalability as Teradata MultiLoad.
• Performs data upserts.

Teradata FastExport Utility

To export data, Teradata Tools and Utilities provides the Teradata FastExport utility. The Teradata FastExport utility exports data in parallel. The utility exports large quantities of data from the Teradata Database to a client and is the functional complement of the FastLoad and MultiLoad utilities. Teradata FastExport can:
• Export tables to client files.
• Export data to an Output Modification (OUTMOD) routine. You can write an OUTMOD routine to select, validate, and preprocess exported data.
• Perform block transfers with multi-session parallelism.
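The distinctions among the three load utilities can be condensed into a simple decision rule. The Python function below is a simplified sketch built only from the characteristics stated in this chapter; the function name and its three inputs are hypothetical, and a real choice would weigh more factors (data volume, batch window, locking requirements).

```python
def choose_load_utility(target_is_empty: bool,
                        needs_update_or_delete: bool,
                        continuous_feed: bool) -> str:
    """Pick a load utility from the characteristics described above."""
    if continuous_feed:
        # Background SQL/DML maintenance with a limit on statements
        # per minute, so interactive users keep rapid responses.
        return "TPump"
    if target_is_empty and not needs_update_or_delete:
        # Block-transfer load into one unpopulated table per job.
        return "FastLoad"
    # Bulk INSERT/UPDATE/DELETE against populated or unpopulated tables.
    return "MultiLoad"


nightly_initial_load = choose_load_utility(True, False, False)
nightly_maintenance = choose_load_utility(False, True, False)
trickle_feed = choose_load_utility(False, True, True)
```

The three sample calls correspond to an initial load into an empty table, a nightly batch update of a populated table, and a near-realtime feed.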
  • 182. Chapter 17: Database Management Tools and Utilities

Session and Configuration Management Tools

Database management tools include utilities for investigating active sessions and the state of the Teradata Database configuration, such as:
• Query Session
• Query Configuration
• Gateway Global
The following table contains information about the capabilities of each utility.

This utility… / Does the following…
Query Session:
  • Provides information about active Teradata Database sessions.
  • Monitors the state of all or selected sessions on selected logical host IDs attached to the Teradata Database.
  • Provides information about the state of each session including session details for Teradata Index Wizard. For more information about Teradata Index Wizard, see “Teradata Index Wizard” on page 134.
Query Configuration: Provides reports on the current Teradata Database configuration, including:
  • Node
  • AMP
  • PE identification and status
Gateway Global:
  • Allows you to monitor and control the sessions of Teradata Database network-connected users.
  • Runs as a separate operating system task and is the interface between the network and the Teradata Database.
  • Supports up to 1200 sessions per gateway, depending on available system resources and the number of allotted PEs. Note: At least one PE that can support up to 120 sessions is required for each logical network attachment.
  • Allows client programs that communicate through the gateway to the Teradata Database to be installed and running on either:
    • The Teradata Database server
    • Network-attached workstations
  Client programs that run on a channel-attached host bypass the gateway completely.
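The session limits quoted for Gateway Global imply a simple sizing calculation. The Python sketch below uses only the two figures stated above (up to 120 sessions per PE, up to 1200 sessions per gateway); the function name is hypothetical, and actual capacity also depends on available system resources.

```python
SESSIONS_PER_PE = 120            # stated upper bound per PE
MAX_SESSIONS_PER_GATEWAY = 1200  # stated upper bound per gateway


def min_pes_for_sessions(sessions: int) -> int:
    """Smallest number of PEs that can carry `sessions` on one gateway."""
    if sessions < 0:
        raise ValueError("sessions must be non-negative")
    if sessions > MAX_SESSIONS_PER_GATEWAY:
        raise ValueError("exceeds the per-gateway session limit")
    # Ceiling division; at least one PE is required per attachment.
    return max(1, -(-sessions // SESSIONS_PER_PE))


pes_for_full_gateway = min_pes_for_sessions(1200)
pes_for_121_sessions = min_pes_for_sessions(121)
```

Under these figures, reaching the full 1200 sessions on one gateway takes ten PEs, and a single session beyond 120 already requires a second PE.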
  • 183. Chapter 17: Database Management Tools and Utilities

System Resource and Workload Management Tools and Protocols

Teradata provides the Teradata Database with specific tools, protocols, and tool architectures for system resource and workload management. Among these are:
• Write Ahead Logging (WAL)
• Ferret utility
• Priority Scheduler
• Teradata MultiTool
• Teradata Active System Management (TASM)

Write Ahead Logging (WAL)

The Teradata Database uses a Write Ahead Logging (WAL) protocol introduced in Teradata Database V2R6.2. Under this protocol, updates to permanent data are written to a log file as records that represent the updates. The log file is forced to disk at key moments, such as at transaction commit. Modifications to permanent data from different transactions, all written to the WAL Log, can also be batched. This achieves a significant reduction in I/O write operations, because one I/O operation can represent multiple updates to permanent data.

The WAL Log is conceptually similar to a table, but the log has a simpler structure than a table. Log data is a sequence of WAL records, different from normal row structure and not accessible via SQL. The WAL Log includes the following:
• Redo Records for updating disk blocks and ensuring file system consistency during restarts, based on operations performed in cache during normal operation.
• Transient Journal (TJ) records used for transaction rollback.

WAL protects all permanent tables and all system tables, except Transient Journal (TJ) tables, user journal tables, and restartable spool tables (global temporary tables). Furthermore, WAL allows the Teradata Database to be reconstructed from the WAL Log in the event of a system failure. WAL stages in-place writes through a disk area called the DEPOT, a collection of cylinders. Staging in-place writes through the DEPOT ensures that either the old or the new copy is available after a restart.
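The batching effect described for WAL can be shown with a toy model. The Python sketch below illustrates the general write-ahead logging technique, not the Teradata implementation: the class, its in-memory "disk", and the record layout are all assumptions made for illustration, and it omits the Transient Journal and the DEPOT entirely.

```python
COMMIT = object()  # sentinel marking a commit record


class WriteAheadLog:
    """Toy WAL: buffer log records, force them to disk at commit."""

    def __init__(self):
        self.buffer = []      # records not yet forced to disk
        self.disk_log = []    # records that survived a force
        self.io_writes = 0    # number of forced writes

    def log_update(self, txn_id, key, value):
        self.buffer.append((txn_id, key, value))

    def commit(self, txn_id):
        # Force the whole buffer in one write. Buffered records from
        # other transactions ride along in the same write (batching).
        self.buffer.append((txn_id, COMMIT, None))
        self.disk_log.extend(self.buffer)
        self.buffer.clear()
        self.io_writes += 1

    def redo(self):
        """Rebuild data from forced records of committed transactions."""
        committed = {t for t, key, _ in self.disk_log if key is COMMIT}
        data = {}
        for txn_id, key, value in self.disk_log:
            if txn_id in committed and key is not COMMIT:
                data[key] = value
        return data


wal = WriteAheadLog()
wal.log_update(1, "row_a", 10)
wal.log_update(2, "row_b", 20)
wal.log_update(1, "row_c", 30)
wal.commit(1)                    # one forced write carries three updates
wal.log_update(2, "row_d", 40)   # still in the buffer, never forced
recovered = wal.redo()           # simulates reconstruction after a failure
```

In this run a single forced write puts three update records and one commit record on disk, and reconstruction replays only the updates of the committed transaction.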
The Ferret and the Filer utilities allow display and modification of the WAL Log and its index.

Ferret includes the following:
• The SCANDISK option includes checking the WAL Log by default.
• The SCOPE option can be set to just the WAL Log.
• The SHOWBLOCKS and SHOWSPACE options display log statistics and space.
  • 184. Chapter 17: Database Management Tools and Utilities

Filer includes the following:
• The WAL and WREC commands display WAL Log records by ordinal record number or by embedded log sequence number respectively, similar to the TABLE and ROW commands.
• The WMI, WCI, and WDB commands display the WAL Log index structures.
• Generic commands, such as BLK, DELETE, DISPLAY, IDENT, NEXT, PREV, and PATCH, deal with log data as well as with normal data.
• The SCANDISK option includes checking the WAL Log by default; it can be directed to check just the WAL Log.
• A WHERE clause allows restricting selects to a subset of the WAL Log records within a range.

Ferret Utility

The Ferret utility is a tool that you can use to set various disk space utilization attributes associated with the Teradata Database while maintaining the integrity of the data managed by the Teradata Database file system. After you have selected the attributes and functions, Ferret dynamically reconfigures the data on the disks to correspond with the selections. Depending on the functions, Ferret can operate at the vproc, table, subtable, disk, or cylinder level.

Priority Scheduler

Teradata's Priority Scheduler is a workload management facility that controls access to resources among the work active on the Teradata platform. Priority Scheduler allows the administrator to define different priorities for different categories of work, and comes with a number of flexible options. The Priority Scheduler is active in all Teradata Database systems. The Teradata Database itself automatically moves internal jobs into different priority levels, especially when a quick boost to one activity is critical to overall throughput. Priority Scheduler has these capabilities:
• Instituting better service for your more important work.
• Controlling resource sharing among different applications.
• Preventing aggressive queries from over-consuming at the expense of other work.
• Automating changes in priority by query or session CPU usage levels.
• Placing a ceiling on CPU usage for some applications.
The Priority Scheduler Administrator available in Teradata Manager provides a graphical interface for configuration, management, and monitoring. For information, see “Teradata Manager” on page 171.
  • 185. Chapter 17: Database Management Tools and Utilities

Teradata MultiTool

Teradata MultiTool is a Teradata Database utility that offers a graphical user interface (GUI) on Windows systems that Teradata administrators and support personnel can use as an interface to command-line-based Teradata and PDE tasks. You can start specific utilities using the options available in the GUI. The following table lists the tools accessible from Teradata MultiTool.

The tool … / Is used to …
Control GDO Editor (CTL): display and modify the fields of the PDE GDO (Globally Distributed Object).
Database Window (DBW): activate the Supervisor window and subwindows.
Database Initialization Program (DIP): execute one or more of the standard Database Initialization Program Structured Query Language (SQL) scripts packaged with the database.
Vproc Manager: perform the following functions:
  • Obtain the status of vprocs.
  • Change vproc states.
  • Initialize and boot a specific vproc.
  • Initialize the vdisk associated with a specific vproc.
  • Force a database restart.

Teradata Active System Management (TASM)

Teradata Active System Management (TASM) is a system management tool architecture that describes how individual monitoring and management tools are coordinated to support business-driven, workload-centric system management goals. The following tools and products are part of TASM:
• Teradata Workload Analyzer
• Teradata Dynamic Workload Manager (DWM)
• Teradata Manager’s Dashboard Workload Monitor and Workload Trend Analysis
TASM architecture helps conceptualize workload management, performance tuning, and performance monitoring under one domain. TASM provides a single view of system performance and enables system management to occur conceptually in a comprehensive way. The tools that are part of TASM architecture:
• Help control resource allocation. Long and short queries coming from the same user can be classified to run at the correct priority before they begin execution.
• Provide automated exception handling. Queries that run in an anomalous manner are automatically detected and dynamically managed.
  • 186. Chapter 17: Database Management Tools and UtilitiesSystem Resource and Workload Management Tools and Protocols • Display real-time performance for each active workload and enable longer-term trend analysis. Teradata Workload Analyzer Teradata Workload Analyzer (WA) helps DBAs identify classes of queries (workloads) and provides recommendations on workload definitions and operating rules to ensure that database performance meets Service Level Goals (SLGs). Through graphical displays such as pie charts showing CPU and I/O utilization by accounts or applications, and histograms showing actual service levels, along with recommended settings, Teradata WA makes it easier for administrators to manage distribution of resources effectively. Teradata Dynamic Workload Manager Teradata Dynamic Workload Manager (DWM) supports detailed creation and management of Workload Definitions (WDs). These are sets of rules that define classes of queries based on business-driven allocations of operating resources. Teradata DWM can: • Filter queries against object-access or query-access rules. • Prevent queries from running at specific times of the day based on object access or query resource rules. • Control the level of concurrency by specific groups of users. • Control concurrency of different load utilities by time of day. Teradata DWM provides three categories of rules to enable dynamic workload management. • Category 1: Filter Rules Filter rules restrict access to the system based on the following: • Object types • SQL types • Estimated rows and processing time • Category 2: Throttle Rules Throttle rules manage incoming work based on the following: • System session concurrency • Query throttling based on various user, account, performance, and group attributes • Load utility concurrency • Category 3: Workload Class Rules Workload Class rules create Workload Definitions (WDs) based on the following: • Various attributes of a PG. 
  • Exception criteria for the query that, when exceeded, cause various actions to occur.
  • System statistics that can be used for analysis in determining long-term trends.

Note: You are required to have the Teradata Manager installed on your system in order to use Teradata DWM.
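The three rule categories above can be pictured as successive checks on an incoming query. The following Python sketch is only an illustration of that ordering, not Teradata DWM's implementation: the `Query` fields, the threshold constants, the restricted object name, and the `WD-Short`/`WD-Long` labels are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical description of an incoming query (not a Teradata structure)."""
    user: str
    sql_type: str              # e.g. "SELECT", "DELETE"
    objects: Tuple[str, ...]   # names of the objects the query touches
    est_rows: int              # estimated rows
    est_seconds: float         # estimated processing time


# Hypothetical rule settings, chosen only for illustration.
BLOCKED_SQL_TYPES = {"DELETE"}
RESTRICTED_OBJECTS = {"payroll_detail"}
MAX_EST_ROWS = 10_000_000
MAX_EST_SECONDS = 3600.0
MAX_ACTIVE_PER_USER = 2
SHORT_QUERY_SECONDS = 5.0


def violates_filter_rule(q: Query) -> bool:
    """Category 1: reject on object, SQL type, or estimated rows/time."""
    return (
        q.sql_type in BLOCKED_SQL_TYPES
        or any(obj in RESTRICTED_OBJECTS for obj in q.objects)
        or q.est_rows > MAX_EST_ROWS
        or q.est_seconds > MAX_EST_SECONDS
    )


def violates_throttle_rule(q: Query, active: Dict[str, int]) -> bool:
    """Category 2: delay work when the user's concurrency limit is reached."""
    return active.get(q.user, 0) >= MAX_ACTIVE_PER_USER


def workload_definition(q: Query) -> str:
    """Category 3: assign the query to a workload class."""
    return "WD-Short" if q.est_seconds <= SHORT_QUERY_SECONDS else "WD-Long"


def admit(q: Query, active: Dict[str, int]) -> str:
    """Apply filter rules, then throttle rules, then classify."""
    if violates_filter_rule(q):
        return "reject"
    if violates_throttle_rule(q, active):
        return "delay"
    return workload_definition(q)
```

For example, `admit(Query("ann", "SELECT", ("sales",), 100, 1.0), {})` returns a workload class, while the same query with `{"ann": 2}` already active is delayed under the assumed concurrency limit.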
For more on Teradata DWM, see Teradata Dynamic Workload Manager User Guide and Database Administration.

Teradata Manager’s Dashboard Workload Monitor and Workload Trend Analysis

Teradata Manager’s Dashboard Workload Monitor provides a view of current and recent historical workload status, as well as options for changing the workload definition assigned for the current session.

Teradata Manager’s Workload Trend Analysis lists workload definitions according to various user-defined criteria and reports workload definition usage trends.

For other information on Teradata Manager, see “Teradata Manager” in Chapter 18, Aspects of System Monitoring.

Teradata SQL Assistant

Teradata SQL Assistant provides information discovery capabilities on Windows-based systems. Teradata SQL Assistant retrieves data from any ODBC-compliant database server and allows you to manipulate and store the data on your desktop PC. You can then use this data to produce consolidated results or perform analyses on the data using tools such as Microsoft Excel.

The following table contains information about key features of Teradata SQL Assistant.

This feature…    Allows you to…

Reports
• Create reports from any database that provides an ODBC interface.
• Use an imported file to create many similar reports (query results or answer sets); for example, display the DDL (SQL) that was used to create a list of tables.

Data manipulation
• Export data from the database to a file on a PC.
• Import data from a PC file directly to the database.
• Create a historical record of the submitted SQL with timings and status information, such as success or failure.
• Use the Database Explorer Tree to easily view database objects.

Queries
• Use SQL syntax examples to help compose your SQL statements.
• Send statements to any ODBC database or the same statement to many different databases.
• Limit data returned to prevent runaway execution of statements.

Teradata stored procedures
• Use a procedure builder that gives you a list of valid statements for building the logic of a stored procedure, using Teradata syntax.

Teradata SQL Assistant electronically records your SQL activities with data source identification, timings, row counts, and notes. Having this historical data allows you to build a script of the SQL that produced the data. The script is useful for data mining.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.

IF you want to learn more about…    THEN see…

Archive utilities: Teradata Archive/Recovery Utility Reference
Load and export utilities:
• Teradata FastExport Reference
• Teradata FastLoad Reference
• Teradata MultiLoad Reference
• Teradata Parallel Data Pump Reference
Teradata Database management utilities: Utilities
Priority Scheduler: Utilities
Teradata MultiTool: Graphical User Interfaces: Database Window and Teradata MultiTool
Teradata Active System Management (TASM): Performance Management
Teradata Dynamic Workload Manager (DWM): Teradata Dynamic Workload Manager User Guide
Teradata SQL Assistant: Teradata SQL Assistant for Microsoft Windows User Guide

CHAPTER 18 Aspects of System Monitoring

This chapter discusses various aspects of monitoring the Teradata Database, including the monitoring tools used to track system and performance issues.

Topics include:
• Teradata Manager
• Teradata Graphical User Interface (GUI)
• Resource usage (ResUsage) monitoring
• Performance monitoring

Teradata Manager

The Teradata Manager is a suite of management tools and applications available in Teradata Tools and Utilities. You can use them to monitor, control, and administer one or more Teradata Database servers.

The suite of performance monitoring applications collects, queries, manipulates, and displays performance and usage data. This information allows you to quickly identify and resolve resource usage abnormalities. Teradata Manager can display dynamic and historical data in graphical and tabular formats.

The client/server feature of Teradata Manager replicates performance data in the Teradata Database server for access by any number of clients. Because data is collected once, workload on the Teradata Database remains constant while the number of client applications varies. You can access information from a desktop, laptop, or the Wireless Palm VII.

The information in the following table summarizes Teradata Manager control applications.

Function    Application/Description

Performance applications

Teradata Performance Monitor (PMON)
Provides seven functional areas for monitoring system activity:
• Configuration summary
• Performance summary
• Resource usage (both physical and virtual)
• Session and lock information
• Session history
• Control functions
• Graphic displays of resource and session data

Teradata Priority Scheduler Administrator
• Provides administrative capabilities for Teradata Priority Scheduler.
• Prevents bottlenecks and speeds responses to queries by automatically balancing the database workload.
• Ensures that queries requiring immediate handling are given priority treatment by letting the jobs cut in line ahead of lower priority work.

Centralized Alerts/Event Management
Facilitates the monitoring of performance characteristics and faults. It can automatically send a page or an e-mail when certain events occur.

Alert Policy Editor
Allows you to define actions and specify when action should be taken based on thresholds that you set for the following:
• Teradata Database performance metrics
• Database space utilization
• Messages in the database Event Log

The Alert Viewer
Allows you to easily view system status for multiple systems.

Trend Analysis
Allows you to study Teradata Database resource utilization trends from summarized reports displayed as charts. You can do the following:
• Detect resource usage abnormalities.
• Determine the onset of a problem.
• Analyze the impact of the problem on the system.

Database management applications

Teradata Administrator
Allows you to perform database administration tasks, such as:
• CREATE, MODIFY and DROP users or databases.
• CREATE tables.
• GRANT or REVOKE access/monitor rights.
• COPY table, view, or macro definitions to another database or to another system.
• DROP or RENAME tables, views, or macros.
• Move space from one database to another.
• Run an SQL query.
• Display information about a database.
• Display information about a table, view, or macro.

Space Usage
Monitors disk space utilization and re-allocates permanent space from one database to another.

System Maintenance
Provides various macros for performing clean-up of system tables.

Operational control

Session Information
Monitors the status of sessions on Teradata. The status information includes:
• Idle
• Active
• Blocked
• Responding
• Parsing
• Aborting
• Details
• Prolonged idles

Remote Console
Allows you to run many of the Teradata console utilities from the Teradata Manager PC.

Error Log Analyzer
Provides an interface to view the error logs for an associated Teradata Database.

LogOnOff Usage
Presents daily, weekly, and monthly logon statistics.

BTEQ Window (BTEQWIN)
Provides a graphical Windows-type interface to BTEQ. Gives Teradata Manager applications a consistent, graphical interface.

Access management
Allows you to manage security access to the Teradata Database using the features of Teradata Administrator and Profile capabilities. Teradata Administrator establishes account and privilege assignments that control access to the Teradata Database. Profile capabilities allow you to create user profiles that define who can access certain Teradata Database and Teradata Manager applications.

Teradata Graphical User Interface (GUI)

The Teradata GUI is the primary vehicle for starting and controlling the operation of the Teradata Database utilities, and runs in a graphical X Window or Microsoft Windows environment. The Teradata GUI communicates with the Teradata Database through the console subsystem (CNS), which is part of the Parallel Database Extensions (PDE) software.

By definition, the Teradata Database is always in one of several states. You can monitor these states from the Teradata GUI. The following table lists and describes the states.

Status    Description

Offline: Either the processor to which the database console is attached or the entire database has been started offline. The database cannot be accessed from a client or used for processing.
Startup: The system is starting up but is not ready to process requests.
Logoff: No new sessions may log on (logons are disabled), but one or more sessions remain logged on.
Logoff/Quiet: No new sessions may log on, and no sessions are currently logged on. The system is quiescent.
Logon: New sessions may log on (logons are enabled) and one or more sessions are currently logged on.
Logon/Quiet: New sessions may log on (logons are enabled), but no sessions are logged on.
Reconfig: The reconfiguration program is running.

Resource Usage (ResUsage) Monitoring

The Teradata Database has facilities that permit you to monitor the use of resources such as:
• CPUs
• AMPs
• Disk activity
• BYNET activity

Resource usage, or ResUsage, is the collection and reporting of statistical information about these resources. You can use resource usage data to:
• Measure system benchmarks.
• Measure component performance.
• Assist with on-site job scheduling.
• Identify potential performance impacts.
• Plan installation, upgrade, and migration.
• Analyze performance degradation and improvement.
• Identify problems such as bottlenecks and parallel inefficiencies.

Resource Usage Tables and Views

Resource usage data is stored in Teradata Database tables and views in the DBC database. Macros installed with Teradata Database generate reports that display the data. You can also
write your own queries or macros on resource usage data. As with other database data, you can access resource usage data using SQL.

You need to decide which kinds of resource usage data you want to collect and the level of detail you want it to cover.

Resource Usage Data Categories

Each row of resource usage data contains two broad categories of information:
• Housekeeping, containing identifying information
• Statistical

Each item of statistical data falls into a defined kind and class. Each kind corresponds to one (or several) different things that may be measured about a resource.

Resource Usage Data Handling

Resource usage data handling is divided into two phases:
1. Various subsystems gather resource usage data and the Resource Sampling Subsystem (RSS) collects the data in collect buffers.
2. The collected data is logged to ResUsage tables periodically (as determined by user-defined logging intervals).

The logged resource usage data is then available for analysis by the various ResUsage macros.

Resource Usage Macros

The facilities for analyzing resource usage data are provided by means of a set of ResUsage macros tailored to retrieving information from a set of system views designed to collect and present resource usage information.

How to Control Collection and Logging of Resource Usage Data

Several mechanisms exist within the Teradata Database for setting the collection and logging rates of resource usage data. The control sets allow users to do any of the following:
• Specify data collection rate.
• Specify data logging rate.
• Enable or disable ResUsage data logging on a table-by-table basis.
• Enable or disable summarization of the data.

Collection rates control the frequency that resource usage data is made available to applications. Logging rates control the frequency that resource usage data is logged to the ResUsage tables.
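
The two-phase handling and the separate collection and logging rates described above can be pictured with a small simulation. This Python sketch is a toy model, not the RSS itself: the class name `ResUsageSampler`, the single `cpu_busy` counter, and the choice to write one summed row per logging period are assumptions made for illustration, as is the requirement that the logging interval be a multiple of the collection interval.

```python
from typing import Dict, List, Optional


class ResUsageSampler:
    """Toy model of collect-then-log handling (hypothetical, not the RSS)."""

    def __init__(self, collect_every: int, log_every: Optional[int] = None) -> None:
        if collect_every <= 0:
            raise ValueError("collect_every must be positive")
        # Assumption for this sketch: logging happens on collection boundaries.
        if log_every is not None and (log_every <= 0 or log_every % collect_every != 0):
            raise ValueError("log_every must be a positive multiple of collect_every")
        self.collect_every = collect_every
        self.log_every = log_every              # None means logging is disabled
        self.collect_buffer: List[Dict[str, int]] = []
        self.latest: Optional[Dict[str, int]] = None   # what a monitor would read
        self.logged_rows: List[Dict[str, int]] = []    # stands in for a ResUsage table

    def tick(self, now: int, cpu_busy: int) -> None:
        # Phase 1: at each collection interval, make a sample available.
        if now % self.collect_every == 0:
            sample = {"time": now, "cpu_busy": cpu_busy}
            self.latest = sample
            if self.log_every is not None:
                self.collect_buffer.append(sample)
        # Phase 2: at each logging interval, write the buffered data as one row.
        if self.log_every is not None and now % self.log_every == 0 and self.collect_buffer:
            self.logged_rows.append({
                "time": now,
                "samples": len(self.collect_buffer),
                "cpu_busy": sum(s["cpu_busy"] for s in self.collect_buffer),
            })
            self.collect_buffer.clear()


def run(sampler: ResUsageSampler, seconds: int) -> ResUsageSampler:
    """Drive the sampler for a number of seconds with a synthetic counter."""
    for now in range(1, seconds + 1):
        sampler.tick(now, cpu_busy=now)
    return sampler
```

Running `run(ResUsageSampler(2, 6), 12)` collects six samples and writes two rows, while `run(ResUsageSampler(2), 12)` keeps the latest sample available without writing any rows.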

You can specify data collection without specifying logging. This capability saves space in system tables while making resource usage data available to applications, such as Teradata Performance Monitor.

You can use the Teradata GUI command SET LOG to establish the logging of resource usage information. The system inserts data into ResUsage tables every logging period for the tables
that have logging enabled. You can use the statistics collected in the ResUsage tables to analyze system bottlenecks, determine excessive swapping, and detect system load imbalances.

Summary Mode

You can activate summarization mode for many ResUsage tables independently. This mode reduces database I/O by summarizing data from multiple vprocs and other objects on each node in one representative row. The summarization reduces detail, but the data is very useful for exploratory analysis of performance problems and general resource usage issues.

When the summarization mode is active, the different classes of data are summarized as follows:
• The cnt and cur fields contain the sum of all the summarized values they represent.
• The max fields contain the maximum of all the summarized values they represent.

Performance Monitoring

Several facilities exist for monitoring and controlling system performance.

Account String Expansion (ASE)

Account String Expansion (ASE) is a mechanism that enables AMP usage and I/O statistics to be collected. ASE supports performance monitoring for an account string.

The system stores the accumulated statistics for a user/account string pair as a row in the DBC.AMPUsage table in the Data Dictionary. Each user/account string pair results in a new set of statistics and an additional row. You can use this information in capacity planning or in chargeback and accounting software.

At the finest granularity, ASE can generate a summary row for each SQL request. You can also direct ASE to generate a row for each user, each session, or for an aggregation of the daily activity for a user.

ASE permits you to use substitution variables to include date and time information in the account id portion of a user logon string.
The system inserts actual values for the variables at Teradata SQL execution time.

The TDPTMON

The Teradata Director Program (TDP) User Transaction Monitor (TDPTMON) is a client routine that enables a system programmer to write code to track TDP elapsed time statistics.

System Management Facility

The System Management Facility (SMF) is available in the Multiple Virtual Storage (MVS) environment only. This facility collects data about Teradata Database performance, accounting, and usage. Data is grouped into the following categories:
• Session information
• Security violations
• PE stops

The Performance Monitor/Application Programming Interface

The Performance Monitor/Application Programming Interface (PM/API) provides hooks into the Performance Monitor and Production Control (PM and PC) functions resident within the Teradata Database. PM and PC data is available through a log-on partition called MONITOR using a specialized PM/API subset of the Call-Level Interface version 2 (CLIv2) routines.

The PM/API uses the Resource Sampling Subsystem (RSS) to collect performance data and to set data sampling and logging rates. Collected data is stored in memory buffers and is available to the PM/API with little or no performance impact.

Using PM/API commands, you can collect performance data on:
• Current system configuration, status, and utilization.
• Resource usage and status of an individual AMP, PE, or node.
• Resource usage and status of individual sessions.
• Problem SQL requests.

PM/API data may be used to show how efficiently the Teradata Database is using its resources, to identify problem sessions and users, and to abort sessions and users having a negative impact on system performance.

For More Information

For more information on the topics presented in this chapter, see the following Teradata Database and Teradata Tools and Utilities books.

IF you want to learn more about…    THEN see…

Account String Expansion: Performance Management
Performance Monitor/Application Programming Interface: PM/API Reference
Resource Usage: Resource Usage Macros and Tables
System Management Facility: Database Administration
TDPTMON: Performance Management
Teradata Graphical User Interface: Graphical User Interfaces: Database Window and Teradata MultiTool
Teradata Manager:
• Teradata Manager User Guide
• Teradata Manager Installation Guide

CHAPTER 19 Teradata Meta Data Services

The Teradata Meta Data Services product provides a means of storing, administering, and navigating metadata in a Teradata Warehouse. It is the only metadata management system optimized for and integrated with the Teradata Database environment.

Topics include:
• What is metadata?
• Types of metadata
• Teradata Meta Data Services

What is Metadata?

Metadata is the term applied to the definitions of the data stored in the Teradata Warehouse. Simply put, metadata is data about data.

In a transaction processing database environment, a Data Dictionary generally satisfies the need for data about data. In the data warehouse environment, the requirements for a more elaborate metadata storage system can exceed the capabilities of the Data Dictionary.

Metadata plays an important role across the Teradata Warehouse architecture. In the operational database environment, that role is very formal. All development should use metadata as a standard part of the design and development process.

As far as the data warehouse is concerned, metadata is used to locate data. Without it, you cannot interact with the data in the data warehouse because you have no means of knowing how the tables are structured, what the precise definitions of the data are, or where the data originated.

Types of Metadata

Metadata has been around for as long as there have been programs and data. However, in the world of data warehouses, metadata takes on a new level of importance. Using metadata, you can make the most effective use of the Teradata Warehouse. Metadata allows the decision support system (DSS) analyst to navigate through the possibilities.

The major component of the DSS environment is archival data, that is, data with a timestamp. Because archival data is timestamped, it makes sense to store metadata with the actual occurrences of data, which are timestamped as well.

The following table describes the types of metadata.

For the…    The following types of metadata are stored…

data model
• Description
• Specification
• The layout of the physical data model tables
• Relation between the data model and the data warehouse

data warehouse
• Data source (system of record)
• Definition of the system of record
• Mapping from system of record to the data warehouse and other places defined in the environment
• Table structures and attributes
• Any relationship or artifacts of relationships
• Transformation of data as it passes into the data warehouse
• History of extracts
• Extract logging
• Common routines for data access

columns
• Columns in a row
• Order in which the columns appear
• Physical structure of the columns
• Any variable-length columns
• Any columns with NULL values
• Unit of measure of any numeric columns
• Any encoding used

database design
• Description of the layouts used
• Structure of data as known to the programmers and analysts

Teradata Meta Data Services

Teradata Meta Data Services (MDS) is software that creates a repository in the Teradata Warehouse in which metadata is stored. MDS also permits the DSS analyst to administer and navigate metadata in the warehouse.

The following table describes the benefits of Teradata Meta Data Services to several user groups.

For this type of user…    Teradata MDS…

application developers
• Provides a persistent store for application metadata so that developers can concentrate on developing application functions.
• Allows the developer to manipulate metadata with the same techniques used to manipulate other data.
• Provides security (MDS controls the read and write access).
• Allows metadata to be shared between applications. This allows integration of tools such as ordered analytical functions and data mining tools.
• Allows application data to be modeled around Teradata Database metadata maintained by MDS. MDS maintains the metadata so that the application is kept current with warehouse database changes.

database administrator
• Provides a common repository for Teradata Warehouse components.
• Provides a single shared copy of metadata, or a single version of the truth. One copy eliminates multiple islands of redundant metadata that can cause confusion and administrative difficulties.
• Provides the capabilities to browse through data in the repository and to drill-down to see successive levels of detail.
• Shows interrelationships between different data definitions.
• Provides impact analysis of proposed changes.

business user
• Provides the foundation for a “warehouse view” of enterprise computing.
• Allows business analysts to quickly determine where their data comes from, how it was changed, when it was last updated, and how the answer was determined. This greatly increases the value of the detail data and implicitly the value of the metadata.
• Supports third-party tools that can be used to import metadata into MDS for viewing.
• Supports a web browser that provides general reporting and search capabilities and shows strategic metadata relationships.

Creating the Teradata Meta Data Repository

The Teradata MDS repository is a set of tables that resides in the Teradata Database.
You must use MDS program software to create these tables before metadata can be added, stored, or accessed.

Connecting to the Teradata Meta Data Repository

Each system running a Teradata MDS application must have the following:
• The appropriate Teradata ODBC driver.
• An ODBC System Data Source Name (DSN) connection to the Teradata Database where the MDS repository resides.
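
As a rough illustration of the DSN requirement above, a client program typically refers to the configured System DSN by name in an ODBC connection string. The Python sketch below only assembles such a string from the standard `DSN`, `UID`, and `PWD` keywords; the function name, the DSN name `MDS_Repository`, and the credentials are hypothetical, and an ODBC bridge library of your choice would be needed to actually open the connection.

```python
def mds_connection_string(dsn: str, user: str, password: str) -> str:
    """Build an ODBC connection string that refers to a System DSN by name.

    Hypothetical helper for illustration; it does not open a connection.
    """
    parts = {"DSN": dsn, "UID": user, "PWD": password}
    for key, value in parts.items():
        # Keep the sketch simple: reject empty values and values containing
        # the ';' separator instead of implementing ODBC brace quoting.
        if not value or ";" in value:
            raise ValueError(f"invalid value for {key}")
    return ";".join(f"{key}={value}" for key, value in parts.items())


# Example with made-up names: the DSN must already be defined on the system
# and point at the Teradata Database where the MDS repository resides.
conn_str = mds_connection_string("MDS_Repository", "mds_user", "secret")
```

Keeping the server details in the DSN rather than in the program means the application only needs to know the DSN name and credentials.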

For More Information

For more information on the topics presented in this chapter, see the following Teradata Meta Data Services and Teradata Tools and Utilities books.

IF you want to learn more about…    THEN see…

Teradata ODBC driver: Teradata ODBC Driver User Guide
Teradata Meta Data Services (MDS):
• Teradata Meta Data Services Installation and Administration Guide
• Teradata Meta Data Services Programmer Guide

Glossary

1NF  First Normal Form
2NF  Second Normal Form
2PC  Two-Phase Commit
3NF  Third Normal Form
4NF  Fourth Normal Form
5NF  Fifth Normal Form
AMP  Access Module Process
ANSI  American National Standards Institute
API  Application Programming Interface
ARC  Teradata Archive/Recovery Utility
ASCII  American Standard Code for Information Interchange
ASE  Account String Expansion
AWS  Administration Workstation
AWT  AMP Worker Task
BCNF  Boyce-Codd Normal Form
BTEQ  Basic Teradata Query Facility
BYNET  Banyan Network (high-speed interconnect)
CICS  Customer Information Control System
CLIv2  Call-Level Interface, Version 2
CNS  Console Subsystem
DB2  DATABASE 2
DBC  Database Computer
DBQAT  Database Query Analysis Tools
DBQL  Database Query Log
DBS  Database System or Database Software
DDE  Dynamic Data Exchange
DDL  Data Definition Language
DIP  Database Initialization Program
DML  Data Manipulation Language
DNS  Domain Name System
DSS  Decision Support System
EBCDIC  Extended Binary Coded Decimal Interchange Code
FIPS  Federal Information Processing Standards
GDO  Globally Distributed Object
HI  Hash Index
IBM  International Business Machines Corporation
ID  Identification
IMS  Information Management System
I/O  Input/Output
ISV  Independent Software Vendor
JBOD  Just a Bunch Of Disks
JI  Join Index
LAN  Local Area Network
LUN  Logical Unit
MDS  Meta Data Services
MIPS  Millions of Instructions Per Second
MOSI  Micro Operating System Interface
MPP  Massively Parallel Processing
MTDP  Micro Teradata Director Program
MVS  Multiple Virtual Storage
NPPI  Non-Partitioned Primary Index
NUPI  Non-Unique Primary Index
NUSI  Non-Unique Secondary Index
ODBC  Open Database Connectivity
OS/VS  Operating System/Virtual Storage
OTB  Open Teradata Backup
PDE  Parallel Database Extensions
PE  Parsing Engine
PI  Primary Index
PL/I  Programming Language 1
PJ/NF  Projection-Join Normal Form
PP2  Preprocessor2
PPI  Partitioned Primary Index
PUT  Parallel Upgrade Tool
QCD  Query Capture Database
QCF  Query Capture Facility
RAID  Redundant Array of Independent Disks
RCT  Resource Check Tools
RI  Referential Integrity
SIA  Shared Information Architecture
SMP  Symmetric Multi-Processing
SNMP  Simple Network Management Protocol
SQL  Structured Query Language
SR  Single Request
SSO  Single Sign On
TCP/IP  Transmission Control Protocol/Internet Protocol
TDGSS  Teradata Generic Security Services
TDP  Teradata Director Program
TDSP  Teradata Stored Procedures
TDWM  Teradata Dynamic Workload Manager
TPA  Trusted Parallel Application
TS/API  Transparency Series/Application Program Interface
UPI  Unique Primary Index
USI  Unique Secondary Index
VM/CMS  Virtual Machine/Conversational Monitor System
VM  Virtual Machine
vproc  Virtual Processor
VS  Virtual Storage
  • 207. IndexNumerics examples of 78 Application development1NF, first normal form 103 embedded SQL applications 832NF, second normal form 103 macros 842PC 115 platforms 842PL, phases of 108 Preprocessor2 843NF, third normal form 103 stored procedures 86 Application development languagesA C 84Access level lock 111 COBOL 84Access Module Processor. See AMP PL/I 84Account String Expansion. See ASE Archive utilitiesAccounting NetBackup 162 ASE 177 NetVault 162 DBC.AMPUsage table 177 Teradata Archive/Recovery 12, 43, 161Active data warehouse 1 Archive/Recovery. See Teradata Archive/RecoveryActive session management ASE, accounting 177 Gateway Global utility 164 ASF2 Tape Reader. See Teradata ASF2 Tape Reader Query Configuration 164 Attachment methods Query Sessions 164 channel 7Active Teradata Warehouse network 7 active access 3 Auditing database access 152 active availability 4 Audits active enterprise integration 3 addressing problems 153 active events 3 security hazards 152 active load 2 Authentication active workload management 3 external 147 operational intelligence 2 user 145 strategic intelligence 2 AuthorizationAdministration Workstation. See AWS user 148Aggregate functions 79 AWSAggregate join indexes 98 function 36AMP platform 36 clusters 31, 41 down AMP journal 42 B down AMP recovery 115 functions 30 Basic Teradata Query. See BTEQ hashing and 101 Battery backup 45 indexes and 94 Boardless BYNET 27 operation 33 BTEQ 12 SELECT statement processing 34 BTEQ Window. See BTEQWIN step processing 34 BTEQWIN 174 vproc migration 39 BYNET vprocs 29 boardless 27Analytical functions. See Ordered analytical functions function 26ANSI mode transactions 108 inter-network communication 26ANSI-compliant data types 77 multiple 44Introduction to Teradata Warehouse 189
  • 208. IndexC stored procedures 81 Customer Information Control System. See CICSC Preprocessor 16C, application development language 84Call Level Interface Version 2. See CLIv2 DChannel-attached systems Dashboard Workload Monitor 169 logons 146 Data attributes, summary of 78 mainframe 7 Data communications multiple connections 45 communications interfaces 49 supported operating systems 50 Microsoft Windows 53 TDP 50 UNIX systems 53Character set support. See International language support Data Control Language. See DCLChild table 104 Data Definition Language. See DDLCICS 12 Data DictionaryCliques content 117 definition 28 DBC.AMPUsage table 177 disk arrays 27 definition 117 hardware fault tolerance 46 SQL statements and 122 vproc migration 46 tables 118CLIv2 12, 13 views 121 channel-attached systems 50 Data distribution definition 49 hashing 101 network-attached systems 51 indexes and 93 PM/API 178 Data integrity 150Clusters Data load and unload utilities AMP 31 Teradata FastExport 163 fault tolerance 31 Teradata FastLoad 163COBOL Preprocessor 16 Teradata MultiLoad 162COBOL, application development language 84 Teradata TPump 163Columns 21 Data management attributes 20 active sessions 164 identity 102 archive utilities 161Command-line logon 146 Open Teradata Backup 162Communications interfaces Data Manipulation Language. See DML CLIv2 49 Data type attributes 78 JDBC 54 Data type phrase 77 MOSI 52 Data types 77 MTDP 52 ANSI-compliant 77 ODBC 54 data type phrase and 77 TDP 50 Teradata 77 WinCLI 53 Data warehouseConcurrency control active data warehouse 1 2PC and 115 definition 1 definition 107 Database access, auditing 152 locks 109 Database Initialization Program. See DIP transactions 107 Database level lock 110Constraints Database Management and Query Analysis Tools 13 normal forms and 102 Database object use count 138 table 20 Database Query Log. 
See DBQLConstructor methods 67 Database recovery 114Cursors 80 Database Security 143 definition 81 Database window Preprocessor2 81 server software 10 SQL statements related to 81 Databases190 Introduction to Teradata Warehouse
  • 209. Index database object use count 138 Exclusive level lock 111 DBQL 137 EXPLAIN statement definition 67 definition 88 space allocation 67 use 88DBQAT Extended language support. See International language database object use count 138 support DBQL 136 External authentication 147 Query Capture Database 136 External Stored Procedures Query Capture Facility 135 creating 62DBQL External stored procedures 62 query information 137 TLE 137 F user information 137 workload management 136 Fallback table 40DBW FastExport. See Teradata FastExport supervisor window 167 FastLoad. See Teradata FastLoad Teradata MultiTool 167 Fault tolerance use 157, 167 clusters 31DCL, examples of 73 hardware 44DDL, examples of 72 software 39Deadlocks Ferret 166 resolution 112 Foreign key 94 transaction rollback 112 referential integrity and 104Derived tables 21 Formats, logon 146DIP Full table scans, strengths and weaknesses 101 Teradata MultiTool 167 Functions use 167 aggregate 79Directory Management of Users definition 79 external directories 151 ordered analytical 80Directory tools 152 scalar 79Disk arrays LUNs 27 G pdisks 27 Gateway Global utility 164 RAID 27 Generator, function 30 RAID1 45 Global temporary tables 20 vdisks 27 Group Read HUT lock 113Dispatcher GUI logon 146 function 30 operation 32DML, examples of 73 HDown AMP Hardware fault tolerance journal 42 battery backup 45 recovery 115 cliques 46Dynamic Workload Manager. See Teradata DWM hot swap 45 multiple BYNETS 44 multiple channel and network connections 45E redundant power supplies and fans 45Embedded SQL applications 83 server isolation 45Encryption Hash indexes 99 data 150 strengths and weaknesses 101 logon 150 Hashing message 150 data distributing 101 network data 150 primary indexes 101Exclusive HUT lock 113 secondary indexes 102<<document title>> 191
  • 210. Index Host Utility Consoles. See HUTCNS system dictionary data 127 Host Utility Lock. See HUT lock system dictionary data, character set 128 Hot standby nodes user data 127 definition 28 user data, default character set 128 function 28 Hot swap J components 45 definition 45 Japanese language support. See International language HUT lock support, Japanese support characteristics of 113 JDBC 13 Exclusive 113 driver 54 Group Read 113 Join indexes Read 113 aggregate 98 Teradata Archive/Recovery and 112 covering 97 Write 113 multi-table 98 HUTCNS 12 multi-table, partially covering 97 partially covering 97 single table 97 I sparse 98 IBM IMS 12 strengths and weaknesses 101 Identity column Joins column attribute 102 SELECT statement and 76 unique row number generator 102 Journals IMS. See IBM IMS down AMP 42 Indexes permanent 43 hash 99 transient 43 join 97 primary 94 K non-unique 94 Key unique 94 foreign 94, 104 secondary 96 parent 104 specification 99 primary 94, 104 SQL statements and 99 strengths and weaknesses 100 types of 93 L uses 93 Load and Unload Utilities 14 Instance methods 67 Locks International character set support. See International access 111 Language Support database level lock 110 International language support deadlocks 112 character data translation 126 exclusive 111 character set overview 125 HUT 112 character sets, internal 127 levels 110 compatible languages 130 read 111 extended support 131 row hash level lock 111 external character sets 126 table level locks 111 internal character sets 126 write 111 Japanese mode 128 Logical Units. See LUNs Japanese support mode 130 Logon language support modes 128 command-line 146 LATIN character set 129 controls 146 session charset 127 formats 146 standard mode 128 GUI 146 standard support mode 129 logon string operands 156
  • 211. Index sessions 156 Network data encryption 150 Logon controls 146 Network-attached systems Logon encryption 150 LAN 7 Logons, channel-attached systems 146 MOSI 52 LUNs multiple connections 45 RAID 27 supported operating systems 51 vprocs 27 Non-partitioned primary index 96 partitioned primary index and 96 M Non-unique primary index. See NUPI Non-unique secondary index 96 Macros Non-unique secondary index. See NUSI definition 63, 84 Normal forms multi-user 63 1NF 103 processing 63 2NF 103 resource usage 176 3NF 103 single-user 63 definition 102 SQL statements and 84, 85 first 103 SQL statements related to 63 second 103 use 85 third 103 Maintenance utilities 157 Normalization Management software normal forms 102 server software 10 purpose 102 Massively Parallel Processing. See MPP NUPI, strengths and weaknesses 100 MDS 182 NUSI, strengths and weaknesses 100 Message encryption 150 Meta Data Services. See MDS Metadata O definition 181 ODBC 12 types of 181 communications interface 54 Methods OLE DB Provider 13 constructor 67 Open Teradata Backup 16 instance 67 NetVault 16 Micro Operating System Interface. See MOSI Symantec VERITAS NetBackup 16 Micro Teradata Director Program. See MTDP Windows clients 16, 162 MOSI Optimizer definition 52 function 30 network-attached systems 52 SQL request implementation 32 MPP Ordered analytical functions 80 architecture 25 hardware platform 25 P workstation connections 36 MTDP Parallel Data Extensions. See PDE definition 52 Parallel Upgrade Tool. See PUT interface 52 Parent key 104 network-attached systems 52 Parent table 104 MultiLoad. See Teradata MultiLoad Parser Multi-table function 30 join indexes, partially covering 97 PE element 30 Multi-table join indexes 98 request processing 31 strengths and weaknesses 101 Parsing Engine. See PE Partitioned primary index 96 non-partitioned primary index and 96 N Password NetBackup 162 controls 147 NetVault 16, 162 format 146
  • 212. Index PDE Priority Scheduler 166 function 34 Priority Scheduler Administrator 166, 172 MPP system enabling 35 Priority Scheduler Administrator server software 10 Priority Scheduler 166 task management with Teradata MultiTool 167 Teradata Manager 166, 172 TPA and non-TPA 35 Processor node 26 vprocs 35 Profiles 149 pdisks 27 definition 155 PE security 149 dispatcher 30 PUT function 29 installation and 10 generator 30 operational modes 10 migration 28 optimizer 30 Q parser 30 request processing 31 QCD 133, 134, 135 SELECT statement processing 34 schema 136 session control 29, 30 Teradata Index Wizard 136 vproc migration 39 Teradata Visual EXPLAIN 133 vprocs 29 use 136 Performance Monitor/Application Programming Interface. QCF 133, 134, 135 See PM/API use 136 Performance monitoring. See System performance Queries monitoring strategic 2 Permanent journals 43 tactical 2 PL/I Teradata SQL Assistant 169 application development language 84 Query Analysis Tools. See Database Management and Query PL/I Preprocessor 16 Analysis Tools PM/API Query Capture Database. See QCD CLIv2 178 Query Capture Facility. See QCF performance monitoring 178 Query Configuration 164 resource usage and 178 Query Director. See Teradata QD third-party software support 90 Query Sessions 164 Preprocessor2 12 Queue tables 59 application development 84 base tables 59 C 16 event processing 60 COBOL 16 cursors 81 R PL/I 16 RAID Primary index LUNs 27 data distribution and 94 RAID1 45 non-partitioned 96 storage technology 27 partitioned 96 vdisks 27 primary key 95 Read HUT lock 113 secondary index and 97 Read level lock 111 Primary indexes Recovery hashing 101 database 114 Primary key 94 definition 113 primary index 95 down AMP 115 Primary keys media 114 first normal form 103 single transaction 114 referential integrity and 104 system 114 second normal form 103 transaction 114 third normal form 103 Recursive query 79
  • 213. Index Referenced table (parent) 104 cursor declaration 81 Referencing table (child) 104 joins and 76 Referential integrity options 76 benefits of 105 processing 34 referenced tables 104 set operators 76 referencing tables 104 Session charset 127 system integrity and 150 Session control terminology 104 function 30 Referential integrity terminology PE 29 child table 104 Session management 156 foreign key 104 Sessions parent key 104 establishing 156 parent table 104 logon 156 primary key 104 Set operators Relational database SELECT statement and 76 definition 19 Set theory relational model and 19 relational databases and 19 set theory and 19 relational model and 19 set theory terminology 20 Set theory terminology Relational model relation 20 relational databases and 19 tuple 20 set theory and 19 Single transaction recovery 114 Request processing 31 Single-table join indexes 97 Resource Usage. See ResUsage strengths and weaknesses 101 Restarts, system 114 SMP ResUsage 175 architecture 25 categories of data 176 boardless BYNET 27 collection rate control 176 hardware platform 25 definition 175 workstation connections 36 macros 176 Software fault tolerance monitoring 175 AMP clusters 41 summary mode 177 fallback tables 40 tables 176 Table Rebuild utility 44 views 176 Teradata Archive/Recovery 43 Roles vproc migration 39 definition 155 Space allocation 67 security 149 databases 67 Row hash level lock 111 Sparse join indexes 98 Rows 21 strengths and weaknesses 101 row hash lock 111 SQL tuples 20 aggregate function 79 cursors 80 S data control language statements 73 data definition language statements 72 Scalar functions 79 Data Dictionary statements 122 Secondary index 96 data manipulation language statements 73 non-unique 96 data types 77 primary index and 97 embedded 83 subtables 96 EXPLAIN 88 unique 96 indexes 99 Secondary indexes macros 84, 85 hashing 102 ordered analytical function 80 Security. See Teradata security relational databases and 71 SELECT statement 76 scalar functions 79
  • 214. Index SELECT statement 76 Dashboard Workload Monitor, Teradata Manager 169 SELECT statement processing 34 Ferret utility 166 statement execution 75 Priority Scheduler 166 statement punctuation 75 TASM 167 statement syntax 74 Teradata DWM 168 statements related to macros 63 Teradata MultiTool 167 statements, types of 71 Teradata Workload Analyzer 168 stored procedure statements 87 Workload Trend Analysis, Teradata Manager 169 stored procedures 86 System restarts 114 transaction statements 108, 109 System status Standard language support. See International language configuration 175 support, standard support states 175 Storage management utilities 16 Stored procedures T benefits 61 definition 61 Table function 66 elements 62 Table level lock 111 SQL statements and 86, 87 Table Rebuild utility 44 use 61 Tables Stored Procedures, External. See External Stored Procedures child 104 Strategic queries 2 columns 21 Subtables, secondary index and 96 constraints 20 Supported operating systems DBC.AMPUsage 177 channel-attached systems 50 derived 21 network-attached systems 51 fallback 40 Symantec VERITAS NetBackup 16 global temporary 20 Symmetric Multi-Processing. See SMP locks 111 System administration parent 104 maintenance 157 permanent 20 performance monitoring 177 queue tables 59 profiles 155 referenced table (parent) 104, 150 roles 155 referencing (child) 104, 150 session management 156 relations 20 space allocation 67 resource usage 176 System console rows 21 function 36 system integrity and 150 platform 36 temporary 20 System Emulation Tool. See Teradata SET volatile temporary 21 System integrity Tactical queries 2 referential integrity and 150 Target Level Emulation. See TLE tables and 150 TASM 167 System Management Facility 178 Dashboard Workload Monitor, Teradata Manager 167 System monitoring Teradata DWM 167 resource usage 175 Teradata Workload Analyzer 167 system status 175 Workload Trend Analysis, Teradata Manager 167 Teradata Graphical User Interface 175 TDP 12 System performance monitoring channel-attached systems 50 performance monitoring 177 definition 50 PM/API 178 functions 50 system management facility 178 TDPTMON 177 TDPTMON 177 Temporary tables Teradata Manager 171 global 20 Teradata Performance Monitor 172 volatile 21 System resource management 165 Teradata Active System Management. See TASM
  • 215. Index Teradata Administrator 13 Teradata Dynamic Workload Manager. See Teradata DWM database administration 173 Teradata FastExport 15, 163 Teradata Analyst Pack Teradata FastLoad 15, 163 Teradata Index Wizard 133 Teradata file system Teradata SET 133 Cylinder Read 35 Teradata Statistics Wizard 133 function 35 Teradata Visual EXPLAIN 133 Teradata Gateway Teradata architecture server software 10 BYNET 26 Teradata Graphical User Interface 36 cliques 27 Teradata Index Wizard 13 disk arrays 27 demographics 134 hot standby nodes 28 QCD and 136 MPP 25 Teradata Visual EXPLAIN and 134 processor node 26 use 134 SMP 25 Teradata Manager 14 TPA 9 Alert Policy Editor 172 vprocs 29 Alert Viewer 172 workstations 36 alerts/events management 172 Teradata Archive/Recovery 12, 16 BTEQ Window 174 HUT locks 112 error log analyzer 174 software fault tolerance 43 LogOnOff usage 174 use 161 Priority Scheduler Administrator 172 Teradata ASF2 Tape Reader 16 remote control 174 Teradata C Preprocessor 16 session information 174 Teradata Call-Level Interface, Version 2. See CLIv2 space usage 173 Teradata Data Connector 13 system maintenance 173 Teradata data types 77 system monitoring 171 examples of 77 Teradata Administrator 13, 173 Teradata Database Teradata Data Connector 13 ANSI SQL 7 Teradata Performance Monitor 172 ANSI transaction semantics 108 trend analysis 172 architecture 25 Teradata Meta Data Services 15 as single data store 9 Teradata mode transactions 109 capabilities 8 Teradata MultiLoad 15, 162 CLIv2 50 Teradata MultiTool 13 communications interfaces 49 DIP 167 database window 10 PDE tasks 167 management software 10 use 167 methods of attachment 7, 49 vproc manager 167 PDE 10 Teradata Parallel Transporter 15 PUT installation software 10 Teradata Performance Monitor 14 referential integrity 103 functions 172 status 175 system performance monitoring 172 Teradata Gateway 10 Teradata QD 16 Teradata mode transactions 109 Teradata Query Director. See Teradata QD Teradata SQL 7 Teradata Query Manager 14 Teradata transaction semantics 108 Teradata recursive query 79 third-party software 90 Teradata Replication Solutions 46 Teradata Database Query Analysis Tools. See DBQAT Teradata security 149 Teradata Director Program User Transaction Monitor. See accountability 152 TDPTMON audits 152 Teradata Director Program. See TDP external authentication 147 Teradata DWM 13, 167 features 143 workload management 168 logon encryption 150
  • 216. Index logons 146 Teradata SQL Assistant 13 mechanisms 144 Teradata Statistics Wizard 14 policy, defining 153 Teradata Tools and Utilities Access Modules 15 policy, publishing 153 Teradata TPump 15 roles 149 Teradata Utility Pack 12 security library 143 Teradata Visual Explain 14 user authentication 145 Teradata Workload Analyzer 14 user authorization 148 Tivoli Storage Manager for Teradata 16 Teradata SET 14 TS/API 12 client support for 134 Windows Call-Level Interface Developer’s Kit 13 TLE and 134 Teradata Tools and Utilities Access Modules 15 use 134 Teradata TPump 15, 163 Teradata SQL Teradata Transparency Series/Application Programming non-ANSI compliant development 8 Interface. See TS/API Teradata SQL Assistant 13, 169 Teradata Utility Pack 12 Teradata Statistics Wizard 14 Teradata Visual EXPLAIN 14, 133 statistics collection 135 QCD 133 Teradata stored procedures Teradata Index Wizard and 134 cursors 81 Teradata Workload Analyzer 14, 167 Teradata System Emulation Tool. See Teradata SET Third-party software Teradata Tools and Utilities PM/API 90 BTEQ 12 Teradata Database, compatible with 90 C Preprocessor 16 TS/API products 90 CICS 12 UDF 66 CLIV2 13 Tivoli Storage Manager for Teradata 16 CLIv2 12 TLE COBOL Preprocessor 16 Teradata SET and 137 Database Management and Query Analysis Tools 13 use 137 Host Utility Console 12 TPA, services 35 IBM IMS 12 TPump. See Teradata TPump JDBC 13 Transaction recovery, definition 113 Load and Unload Utilities 14 Transactions ODBC 12 2PL 108 OLE DB Provider 13 ANSI mode 108 Open Teradata Backup 16 ANSI mode, rollback 109 PL/I Preprocessor 16 control using 2PL 108 Preprocessor2 12 deadlock resolution 112 storage management utilities 16 definition 107 TDP 12 recovery 114 Teradata Archive/Recovery 12, 16, 43 semantics 108 Teradata ASF2 Tape Reader 16 serializability 108 Teradata Dynamic Workload Manager 13 SQL statements and 108, 109 Teradata FastExport 15 Teradata mode 109 Teradata FastLoad 15 Teradata mode, rollback 109 Teradata Index Wizard 13 Transient journals 43 Teradata Manager 14 Triggers Teradata Meta Data Services 15 definition 64 Teradata MultiLoad 15 firing 64 Teradata MultiTool 13 types of 64 Teradata Parallel Transporter 15 use 65 Teradata Performance Monitor 14 Trusted Parallel Application. See TPA Teradata Query Director 16 TS/API 12 Teradata Query Manager 14 third-party product 90 Teradata SET 14 Two-Phase Locking. See 2PL
  • 217. Index U hardware fault tolerance 46 software fault tolerance 39 UDF Vprocs creation 66 AMP 29 definition 65 definition 29 table function 66 function 29 third-party 66 LUNs 27 UDM maximum per system 29 constructor methods 67 PDE 35 instance methods 67 PE 29 UDT types of 29 examples of 78 vproc manager in Teradata MultiTool 167 system-supplied data type and 67 vproc migration 39 types 66 Unique primary index. See UPI Unique secondary index 96 W Unique secondary index. See USI WAL 165 Unload Utilities. See Load and Unload Utilities WinCLI 53 UPI, strengths and weaknesses 100 Windows Call-Level Interface Developer’s Kit 13 User authentication 145 Workload management 165 User authorization 148 DBQL 136 User-Defined Function. See UDF Priority Scheduler 166 User-Defined Method. See UDM TASM 167 User-Defined Type. See UDT Teradata DWM 168 Users Teradata Workload Analyzer 168 definition 67 Write Ahead Logging 165 space allocation 67 Workload Trend Analysis 169 USI, strengths and weaknesses 100 Workstations Utilities AWS 36 Ferret 166 platform specific 36 Gateway Global 164 system console 36 Open Teradata Backup 162 Write Ahead Logging. See WAL Priority Scheduler 166 Write HUT lock 113 Teradata Archive/Recovery 161 Write level lock 111 Teradata FastExport 163 Teradata FastLoad 163 Teradata MultiLoad 162 Teradata MultiTool 167 Teradata TPump 163 V vdisks 27 Views base tables 60 benefits 60 definition 60 resource usage 176 restrictions 61 users 121 Virtual processors. See Vprocs Visual Explain. See Teradata Visual EXPLAIN Volatile temporary tables 21 Vproc migration cliques 46