INTRODUCTION TO
TERADATA DATA WAREHOUSE
SYSTEM ARCHITECTURE
PREPARED BY: MOHAMED TAHOON
1. What’s a Teradata DWH System?
2. SMP vs MPP
3. Shared-Everything vs Shared-Nothing architecture
4. Hardware architecture
4.1 Cliques 4.2 Hot standby Nodes
5. Node architecture
5.1 PDE 5.2 Virtual Processors
5.3 Parsing Engine 5.4 Access Module Processor
5.5 Disk Arrays
6. Request Processing
1 What is a Teradata DWH system?
• RDBMS designed to run the world’s largest databases
• Latest Intel technology nodes
• Standard access language (SQL)
• Massively Parallel Processing (MPP) system
• “Shared-Nothing” architecture
• Parallel-aware optimizer that allows concurrent complex queries
• Linear scalability
2. SMP VS MPP
Symmetric Multi-Processing (SMP)
- Multiple CPUs serve separate processes simultaneously
- Shared-everything
- All CPUs share the same memory
- Mostly hosted on a shared SAN
Massively Parallel Processing (MPP)
- Multiple CPUs run in parallel to serve a single process
- Shared-nothing
- Each CPU has its own memory and disk space
- High-speed connection between nodes (BYNET)
SHARED-EVERYTHING
- Disk controllers and bandwidth are shared
- Synchronization required across nodes
- Scalability issues at large scale
- Best for many small statements
SHARED-NOTHING
- Controllers dedicated to nodes
- No cache synchronization necessary
- Linear scalability
- Best for heavy statements
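The shared-nothing idea above can be illustrated with a minimal Python sketch. It is not Teradata code: the round-robin partitioning, the function names, and the sample `sales` rows are assumptions made for illustration. Each unit of parallelism works only on its own partition, and the partial results are merged at the end.

```python
from concurrent.futures import ThreadPoolExecutor

def partition_rows(rows, n_units):
    """Split rows into n_units disjoint partitions (round-robin, for simplicity)."""
    partitions = [[] for _ in range(n_units)]
    for i, row in enumerate(rows):
        partitions[i % n_units].append(row)
    return partitions

def local_sum(partition):
    """Work done by one unit on its own data only; no shared state is touched."""
    return sum(amount for _, amount in partition)

def parallel_total(rows, n_units=4):
    """Run the same aggregation on every partition, then merge the partial results."""
    partitions = partition_rows(rows, n_units)
    with ThreadPoolExecutor(max_workers=n_units) as pool:
        partials = list(pool.map(local_sum, partitions))
    return sum(partials)

# Hypothetical (id, amount) rows.
sales = [(i, i * 10) for i in range(1, 101)]
print(parallel_total(sales))
```

Because no partition is visible to more than one worker, nothing has to be locked or synchronized between workers. That is the property the slide credits for linear scalability.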
4 Teradata Hardware Architecture
• SMP Nodes
> Latest Intel SMP CPUs
> Configured in 2+1 node cliques
> Linux or Windows
• BYNET Interconnect
> Fully scalable bandwidth
> 1 to 1024 nodes
• Storage
> Independent I/O per Node
> Scales per node
• Server Management
> One console for the entire system
[Figure: four SMP nodes (Node1 to Node4), each running PE and AMP vprocs, connected by the BYNET Interconnect and managed from a single Server Management console]
4.1 CLIQUES
• A clique groups nodes together through multiported access to common disk array units.
• Inter-node disk array connections are made using Fibre Channel (FC) buses.
• FC paths provide redundancy, so the loss of a processor node or disk controller does not limit data availability.
• A clique is the mechanism that supports migration of VPROCs under PDE following a node failure.
• If a node in a clique fails, its VPROCs migrate to other nodes in the clique and continue to operate while recovery occurs on their home node.
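The migration behaviour described above can be sketched as follows. This is a toy model, not PDE logic: the round-robin spreading of VPROCs over surviving nodes, the function name `migrate_vprocs`, and the node and AMP names are all assumptions.

```python
def migrate_vprocs(clique, failed_node):
    """Return a new node -> vprocs layout after failed_node goes down.

    The failed node's vprocs are spread round-robin over the surviving
    nodes of the same clique (an assumed policy, for illustration only).
    """
    survivors = sorted(n for n in clique if n != failed_node)
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    new_layout = {n: list(clique[n]) for n in survivors}
    for i, vproc in enumerate(clique[failed_node]):
        new_layout[survivors[i % len(survivors)]].append(vproc)
    return new_layout

# Hypothetical three-node clique with two AMP vprocs per node.
clique = {
    "node1": ["AMP0", "AMP1"],
    "node2": ["AMP2", "AMP3"],
    "node3": ["AMP4", "AMP5"],
}
after = migrate_vprocs(clique, "node2")
print(after)
```

In this model the surviving nodes each carry more VPROCs than before, which matches the degraded service that the hot standby slide mentions.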
4.2 HOT STANDBY NODES
Hot standby nodes improve availability and maintain performance levels in the event of a node failure.
What is a hot standby node?
• Is a member of a clique in the system.
• Does not normally participate in Teradata Database operations.
• Is used to compensate for the loss of a node in the clique.
Using a hot standby node eliminates:
• Restarts that are required to bring a failed node back into service.
• Degraded service when VPROCs have migrated to other nodes in a clique.
How hot standby node failover works:
1. At node failure, all AMPs and LAN-attached PEs on the failed node migrate to the hot standby node.
2. The hot standby node becomes the production node.
3. When the failed node returns to service, it becomes the new hot standby node.
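The three failover steps can be modelled with a small state machine. This is a sketch under assumptions: the `Clique` class, its method names, and the node names are hypothetical and do not correspond to any Teradata API.

```python
class Clique:
    """Toy model of a clique with production nodes and one hot standby node."""

    def __init__(self, production, standby):
        self.production = {node: list(vprocs) for node, vprocs in production.items()}
        self.standby = standby

    def fail(self, node):
        """Move every vproc of the failed node to the hot standby node."""
        if self.standby is None:
            raise RuntimeError("no hot standby node available")
        vprocs = self.production.pop(node)
        self.production[self.standby] = vprocs  # standby becomes a production node
        self.standby = None

    def recover(self, node):
        """A repaired node rejoins the clique as the new hot standby node."""
        self.standby = node

c = Clique({"node1": ["AMP0", "AMP1"], "node2": ["AMP2", "AMP3"]}, standby="node3")
c.fail("node1")
c.recover("node1")
print(c.production, c.standby)
```

The vproc count per production node does not change at any point, which is why performance levels are maintained after the failure.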
5 Node Architecture (‘Shared Nothing’)
Each Teradata node is made up of hardware and software:
• Each node runs a copy of the OS, the database software, and virtual processes
• Each node has CPUs, a system disk, memory, and adapters
[Figure: one node running two PE vprocs and eight AMP vprocs, each AMP with its own Vdisk, on top of PDE and the UNIX operating system]
5.1 PARALLEL DATABASE EXTENSIONS - PDE
PDE is a software interface layer between the operating system and Teradata Database that enables the database to:
• Run in a parallel environment
• Execute Vprocs
• Apply a flexible priority scheduler to Teradata Database sessions
• Consistently manage memory, I/O, and messaging system interfaces across multiple OS platforms
PDE provides a series of parallel operating system services, which include:
• Facilities to manage parallel execution of database operations on multiple nodes.
• Dynamic distribution of database tasks.
• Coordination of task execution within and between nodes.
5.2 VIRTUAL PROCESSORS – WHAT IS IT
What is it:
• A set of software processes that run on a node under Teradata Parallel Database Extensions (PDE).
• Eliminates dependency on specialized physical processors.
VPROC characteristics:
• Multiple VPROCs can run on an SMP platform or a node.
• VPROCs and the tasks running under them communicate using unique-address
messaging, as if they were physically isolated from one another.
• This message communication is done using the BYNET hardware and BYNET Driver.
• Maximum number of VPROCs: 16,384 per system and 128 per node.
5.2 VIRTUAL PROCESSORS – VPROC TYPES
GTW
• Gateway VPROCs provide a socket interface to Teradata Database.
PE
• Parsing Engines perform session control, query parsing, security validation, and query optimization.
AMP
• Access Module Processors perform database functions, such as executing database queries.
• Database storage is distributed across the AMPs.
TVS
• Manages Teradata Database storage.
• AMPs acquire their portions of database storage through the TVS vproc.
NODE
• The node vproc handles PDE and operating system functions not directly related to AMP and PE work.
• Node vprocs cannot be externally manipulated and do not appear in the output of the Vproc Manager utility.
RSG
• Relay Services Gateway provides a socket interface for the replication agent.
• Relays dictionary changes to the Teradata Meta Data Services utility.
5.3 PARSING ENGINE ‘PE’
• Communicates with the client system on one side and with the AMPs (via the BYNET) on the other side.
• Each PE executes the database software that manages sessions, decomposes SQL statements into
steps, possibly in parallel, and returns the answer rows to the requesting client.
Parsing Engine Elements
• Parser: Decomposes SQL into relational data management processing steps.
• Optimizer: Determines the most efficient path to access data.
• Generator: Generates and packages steps.
• Dispatcher: Receives processing steps from the parser and sends them to the appropriate AMPs via BYNET. Monitors the completion of steps and handles errors encountered during processing.
• Session Control: Manages session activities, such as logon, password validation, and logoff. Recovers sessions following client or server failures.
5.4 ACCESS MODULE PROCESSOR ‘AMP’
• The AMP VPROC manages Teradata Database interactions with the disk subsystem.
• Each AMP manages a share of the disk storage.
Database management tasks
• Accounting
• Journaling
• Locking tables, rows, and databases
• Output data conversion
During query processing:
• Sorting
• Joining data rows
• Aggregation
File System Management
Disk Space management.
5.4 THE AMPS – REQUEST PROCESSING
• The BYNET transmits messages to and from the AMPs and PEs.
• An AMP step can be sent to one of the following:
  • One AMP
  • Multicast (a selected set of AMPs)
  • All AMPs in the system
PE communication with AMPs during request processing:
1. PE 1: access is through a primary index and the request is for a single row
- the PE transmits the steps to a single AMP
2. PE 2: the request is for many rows (an all-AMP request)
- the PE makes the BYNET broadcast the steps to all AMPs
** To minimize system overhead, the PE can send a step to a subset of AMPs when appropriate.
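The three delivery modes can be sketched as a small routing function. Everything here is an assumption for illustration: `N_AMPS`, the step dictionaries, and the use of an MD5 digest as a stand-in for the real hashing of a primary index value are not taken from Teradata.

```python
import hashlib

N_AMPS = 8  # hypothetical system size

def amp_for_key(pi_value, n_amps=N_AMPS):
    """Map a primary index value to one AMP (MD5 is a stand-in hash, not Teradata's)."""
    digest = hashlib.md5(str(pi_value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_amps

def target_amps(step, n_amps=N_AMPS):
    """Return the AMP numbers that should receive the step."""
    if step["kind"] == "single":      # primary index access, single row
        return [amp_for_key(step["pi_value"], n_amps)]
    if step["kind"] == "multicast":   # a selected set of AMPs
        return sorted(set(step["amps"]))
    if step["kind"] == "all":         # all-AMP request, broadcast
        return list(range(n_amps))
    raise ValueError(f"unknown step kind: {step['kind']!r}")

print(target_amps({"kind": "single", "pi_value": 42}))
print(target_amps({"kind": "multicast", "amps": [3, 1, 3]}))
print(target_amps({"kind": "all"}))
```

Because the same primary index value always hashes to the same AMP, a single-row request only has to involve one AMP, while a request without that shortcut has to go to all of them.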
5.5 DISK ARRAYS
Logical Units (LUNs)
• The RAID Manager uses drive groups (DGs).
• A DG is a set of drives that have been configured into one or more LUNs.
• The OS recognizes a LUN as a disk and is not aware that it is writing to multiple disk drives.
Vdisk
• The group of cylinders currently assigned to an AMP.
• The actual physical storage may derive from several different storage devices.
6 REQUEST PROCESSING – “LIFETIME OF A QUERY”
1. The Parser
• Checks the Request cache to determine if the request is already there.
2. The Syntaxer
• Checks the syntax of an incoming request.
3. The Resolver
• Adds information from the Data Dictionary to convert database, table, view, stored procedure, and macro names to internal identifiers.
4. The Security module
• Checks privileges in the Data Dictionary.
5. The Optimizer
• Determines the most effective way to implement the SQL request.
6. The Optimizer
• Scans the request to determine where to place locks, then passes the optimized parse tree to the Generator.
7. The Generator
• Transforms the optimized parse tree into plastic steps, caches the steps if appropriate, and passes them to gncApply.
8. gncApply
• Takes the plastic steps produced by the Generator, binds in parameterized data if it exists, and transforms them into concrete steps.
9. The Dispatcher
P.45 REQUEST PROCESSING – “1. THE PARSER”
1. The Parser
• Checks whether the request is in the Request cache:
  • IF the request is new (not in the cache) => go to step (2), the Syntaxer.
  • IF the request is in the cache:
    • The Parser reuses the plastic steps found in the cache and passes them to gncApply.
    • Go to checking privileges (step 4).
    • Then go to gncApply (step 8).
Plastic steps are directives to the
database management system
that do not contain data values
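The cache check in step 1 can be sketched as a function that records which phases run. This is only an illustration of the control flow: the phase names come from the slide, while `process_request`, the trace list, and the placeholder step text are assumptions.

```python
FULL_PARSE = ["syntaxer", "resolver", "security", "optimizer", "generator"]

def process_request(sql, request_cache, trace):
    """Return plastic steps for sql, recording in trace which phases ran."""
    if sql in request_cache:
        # Cache hit: reuse the plastic steps, but privileges are still checked (step 4).
        plastic_steps = request_cache[sql]
        trace.extend(["cache hit", "security"])
    else:
        # New request: run the full pipeline, starting at the Syntaxer (step 2).
        trace.extend(FULL_PARSE)
        plastic_steps = [f"PLASTIC STEP for: {sql}"]  # placeholder, not a real step format
        request_cache[sql] = plastic_steps            # cached for later reuse
    trace.append("gncApply")                          # step 8 runs in both cases
    return plastic_steps

cache, first, second = {}, [], []
sql = "SELECT * FROM orders WHERE order_id = ?"
process_request(sql, cache, first)
process_request(sql, cache, second)
print(first)
print(second)
```

Caching works because plastic steps carry no data values, so the same steps can serve repeated requests that differ only in their parameters.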
P.45 REQUEST PROCESSING – “2. SYNTAXER, 3. RESOLVER”
2. The Syntaxer
• Checks the syntax of a new request:
  • IF wrong => passes an error message back to the requestor and stops
  • IF correct => converts the request to a parse tree and passes it to the Resolver (3)
3. The Resolver
• Adds information from the Data Dictionary (or a cached copy of the information) to convert database, table, view, stored procedure, and macro names to internal identifiers.
P.45 REQUEST PROCESSING – “4. SECURITY MODULE, 5. - 6. OPTIMIZER”
4. The Security module
• Checks the requestor’s privileges on the accessed objects:
  • Mismatch => returns a privilege error message
  • Privileged => passes the request to the Optimizer
5. The Optimizer
• Determines the most effective way to implement the request (execution plan)
6. The Optimizer
• Determines what type of object locks to place and where to place them
P.45 REQUEST PROCESSING – “7. GENERATOR, 8. GNCAPPLY”
7. The Generator
• Transforms the optimized parse tree into plastic steps
• Caches the steps if appropriate
• Passes them to gncApply
8. gncApply
• Binds in parameterized data if it exists and transforms the plastic steps into concrete steps
• Passes the concrete steps to the Dispatcher
9. The Dispatcher
Concrete steps are directives to the AMPs that contain needed user- or session-specific values and needed data parcels
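The difference between plastic and concrete steps can be sketched as a binding pass. The dictionary layout of a step, the `:name` placeholder convention, and the function names `bind` and `gnc_apply` are hypothetical; the real internal step format is not described in these slides.

```python
def bind(value, params):
    """Replace a ':name' placeholder with its parameter value; leave other values alone."""
    if isinstance(value, str) and value.startswith(":"):
        return params[value[1:]]
    return value

def gnc_apply(plastic_steps, params):
    """Produce concrete steps by binding parameter values into copies of the plastic steps."""
    return [{key: bind(val, params) for key, val in step.items()} for step in plastic_steps]

# A plastic step holds no data values, only a placeholder (assumed format).
plastic = [{"op": "retrieve", "table_id": 1001, "column": "cust_id", "value": ":cust_id"}]
concrete = gnc_apply(plastic, {"cust_id": 42})
print(concrete)
```

The plastic steps are left unchanged by the binding, so they can stay in the cache and be bound again with different values.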
P.45 DISPATCHER – REQUEST PROCESSING
• Controls the sequence in which steps are executed.
• Passes the steps to the BYNET to be distributed to the AMPs:
1. The Dispatcher receives concrete steps from gncApply.
2. The Dispatcher places the first step on the BYNET,
   - tells the BYNET whether the step is for one AMP, several AMPs, or all AMPs, and
   - waits for a completion response.
   Whenever possible, Teradata Database performs steps in parallel to enhance performance.
3. The Dispatcher receives a completion response from all expected AMPs and places the next step on the BYNET.
• It continues to do this until all the AMP steps associated with a request are done.
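The Dispatcher loop can be sketched as follows. This is a simplified, strictly sequential model: it does not attempt the parallel step execution mentioned above, and `dispatch`, `fake_bynet`, and the step dictionaries are assumptions rather than Teradata interfaces.

```python
def dispatch(concrete_steps, bynet_send):
    """Send steps one at a time; advance only when all expected AMPs have responded.

    bynet_send(step) stands in for the BYNET and returns the AMPs that completed the step.
    """
    done = []
    for step in concrete_steps:
        expected = set(step["amps"])
        responded = set(bynet_send(step))
        missing = expected - responded
        if missing:
            raise RuntimeError(f"step {step['name']!r}: no response from AMPs {sorted(missing)}")
        done.append(step["name"])
    return done

def fake_bynet(step):
    """Pretend every targeted AMP completes the step successfully."""
    return list(step["amps"])

steps = [
    {"name": "retrieve", "amps": [0, 1, 2, 3]},  # all-AMP step in a four-AMP system
    {"name": "merge", "amps": [2]},              # single-AMP step
]
order = dispatch(steps, fake_bynet)
print(order)
```

Waiting for every expected AMP before sending the next step is what lets the Dispatcher control the sequence of execution and detect errors per step.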
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)PHP Frameworks: I want to break free (IPC Berlin 2024)
PHP Frameworks: I want to break free (IPC Berlin 2024)
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
GenAISummit 2024 May 28 Sri Ambati Keynote: AGI Belongs to The Community in O...
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024By Design, not by Accident - Agile Venture Bolzano 2024
By Design, not by Accident - Agile Venture Bolzano 2024
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdfSmart TV Buyer Insights Survey 2024 by 91mobiles.pdf
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf
 
Quantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIsQuantum Computing: Current Landscape and the Future Role of APIs
Quantum Computing: Current Landscape and the Future Role of APIs
 

Teradata introduction - A basic introduction to Teradata system architecture

  • 1. INTRODUCTION TO TERADATA DATA WAREHOUSE SYSTEM ARCHITECTURE PREPARED BY: MOHAMED TAHOON
  • 2. Outline
      1. What’s a Teradata DWH System?
      2. SMP vs MPP
      3. Shared-Everything vs Shared-Nothing architecture
      4. Hardware architecture
         4.1 Cliques
         4.2 Hot standby Nodes
      5. Node architecture
         5.1 PDE
         5.2 Virtual Processors
         5.3 Parsing Engine
         5.4 Access Module Processor
         5.5 Disk Arrays
      6. Request Processing
  • 3. 1 What is a Teradata DWH system?
      • RDBMS designed to run the world’s largest databases
      • Latest Intel technology nodes
      • Standard access language (SQL)
      • Massively Parallel Processing (MPP) system
      • A “Shared-Nothing” architecture
      • Parallel-aware optimizer, allowing concurrent complex queries
      • Linear scalability
  • 4. 2. SMP VS MPP
      Symmetric Multi Processing (SMP)
      - Multiple CPUs serving separate processes simultaneously
      - Shared Everything
      - All CPUs share the same memory
      - Mostly hosted on a shared SAN
      Massively Parallel Processing (MPP)
      - Multiple CPUs run in parallel, serving a single process
      - Shared Nothing
      - Each CPU has its own memory and disk space
      - High-speed node interconnect [BYNET]
  • 6. SHARED-EVERYTHING VS SHARED-NOTHING
      Shared-Everything
      - Disk controllers and bandwidth are shared
      - Synchronization required across nodes
      - Scalability issues at large scale
      - Best for many small statements
      Shared-Nothing
      - Controllers dedicated to nodes
      - No cache synchronization necessary
      - Linear scalability
      - Best for heavy statements
  • 8. 4 Teradata Hardware Architecture
      • SMP Nodes
        > Latest Intel SMP CPUs
        > Configured in 2+1 node cliques
        > Linux or Windows
      • BYNET Interconnect
        > Fully scalable bandwidth
        > 1 to 1024 nodes
      • Storage
        > Independent I/O per node
        > Scales per node
      • Server Management
        > One console for the entire system
      [Diagram: a Server Management console and four SMP nodes (Node1 to Node4), each running PE and AMP vprocs, connected by the BYNET Interconnect]
  • 9. 4.1 CLIQUES
      • A clique groups nodes together through multiported access to common disk array units.
      • Inter-node disk array connections are made using Fibre Channel (FC) buses.
      • FC paths provide redundancy, so the loss of a processor node or disk controller does not limit data availability.
      • A clique is the mechanism that supports migration of VPROCs under PDE following a node failure.
      • If a node in a clique fails, its VPROCs migrate to other nodes in the clique and continue to operate while recovery occurs on their home node.
  • 10. 4.2 HOT STANDBY NODES
      Hot standby nodes improve availability and maintain performance levels in the event of a node failure.
      A hot standby node:
      • Is a member of a clique in the system.
      • Does not normally participate in Teradata Database operations.
      • Is used to compensate for the loss of a node in the clique.
      Using a hot standby node eliminates:
      • Restarts that are required to bring a failed node back into service.
      • Degraded service when VPROCs have migrated to other nodes in a clique.
      How hot standby failover works:
      1. At node failure, all AMPs and LAN-attached PEs on the failed node migrate to the hot standby node.
      2. The hot standby node becomes a production node.
      3. When the failed node returns to service, it becomes the new hot standby node.
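
The failover behaviour described for cliques and hot standby nodes can be sketched as a toy model. This is only an illustration, not Teradata code: the `Clique` class, the node names, and the vproc labels are hypothetical, and the round-robin spread of vprocs when no standby exists is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Clique:
    """Toy model of a clique: production nodes plus an optional hot standby."""
    nodes: dict = field(default_factory=dict)  # node name -> list of vproc labels
    standby: Optional[str] = None              # name of the idle hot standby node

    def fail(self, failed: str) -> None:
        vprocs = self.nodes.pop(failed)
        if self.standby is not None:
            # Hot standby present: all vprocs of the failed node move to it,
            # and it becomes a production node. Other nodes keep their load.
            self.nodes[self.standby] = vprocs
            self.standby = None
        else:
            # No standby: vprocs are spread over the surviving nodes
            # (round-robin is an assumption), so those nodes carry extra work.
            survivors = sorted(self.nodes)
            for i, vproc in enumerate(vprocs):
                self.nodes[survivors[i % len(survivors)]].append(vproc)

    def repair(self, node: str) -> None:
        # A repaired node rejoins the clique as the new hot standby.
        self.standby = node


clique = Clique(
    nodes={"node1": ["AMP0", "AMP1", "PE0"], "node2": ["AMP2", "AMP3", "PE1"]},
    standby="node3",
)
clique.fail("node1")
clique.repair("node1")
print(clique.nodes)
print(clique.standby)
```

In the sketch, node3 takes over the vprocs of node1 unchanged, and node1 ends up as the standby once repaired, which mirrors the three failover steps listed on the slide.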
  • 12. 5 Node Architecture (‘Shared Nothing’)
      Each Teradata node is made up of hardware and software:
      • Each node runs a copy of the OS, the database software, and virtual processors (vprocs)
      • Each node has CPUs, system disk, memory, and adapters
      [Diagram: one node running PE vprocs and AMP vprocs, each AMP vproc with its own Vdisk, on top of PDE and UNIX]
  • 13. 5.1 PARALLEL DATABASE EXTENSIONS - PDE
      PDE is a software interface layer between the operating system and Teradata Database that enables the database to:
      • Run in a parallel environment
      • Execute vprocs
      • Apply a flexible priority scheduler to Teradata Database sessions
      • Consistently manage memory, I/O, and messaging system interfaces across multiple OS platforms
      PDE provides a series of parallel operating system services, which include:
      • Facilities to manage parallel execution of database operations on multiple nodes
      • Dynamic distribution of database tasks
      • Coordination of task execution within and between nodes
  • 14. 5.2 VIRTUAL PROCESSORS – WHAT ARE THEY
      What they are:
      • A set of software processes that run on a node under Teradata PDE.
      • They eliminate dependency on specialized physical processors.
      VPROC characteristics:
      • Multiple VPROCs can run on an SMP platform or a node.
      • VPROCs and the tasks running under them communicate using unique-address messaging, as if they were physically isolated from one another.
      • This message communication is done using the BYNET hardware and the BYNET driver.
      • Maximum number of VPROCs: 16,384 per system, 128 per node.
  • 15. 5.2 VIRTUAL PROCESSORS – VPROC TYPES
      GTW
      • Gateway VPROCs provide a socket interface to Teradata Database.
      PE
      • Parsing Engines perform session control, query parsing, security validation, and query optimization.
      AMP
      • Access Module Processors perform database functions, such as executing database queries.
      • Database storage is distributed across the AMPs.
      TVS
      • Manages Teradata Database storage.
      • AMPs acquire their portions of database storage through the TVS vproc.
      NODE
      • The node vproc handles PDE and operating system functions not directly related to AMP and PE work.
      • Node vprocs cannot be externally manipulated and do not appear in the output of the Vproc Manager utility.
      RSG
      • Relay Services Gateway provides a socket interface for the replication agent.
      • Relays dictionary changes to the Teradata Meta Data Services utility.
  • 16. 5.3 PARSING ENGINE ‘PE’
      • Communicates with the client system on one side and with the AMPs (via the BYNET) on the other side.
      • Each PE executes the database software that manages sessions, decomposes SQL statements into steps, possibly in parallel, and returns the answer rows to the requesting client.
      Parsing Engine elements:
      • Parser: decomposes SQL into relational data management processing steps.
      • Optimizer: determines the most efficient path to access data.
      • Generator: generates and packages steps.
      • Dispatcher: receives processing steps from the parser and sends them to the appropriate AMPs via the BYNET; monitors the completion of steps and handles errors encountered during processing.
      • Session Control: manages session activities, such as logon, password validation, and logoff; recovers sessions following client or server failures.
  • 17. 5.4 ACCESS MODULE PROCESSOR ‘AMP’
      • The AMP VPROC manages Teradata Database interactions with the disk subsystem.
      • Each AMP manages a share of the disk storage.
      Database management tasks:
      • Accounting
      • Journaling
      • Locking tables, rows, and databases
      • Output data conversion
      During query processing:
      • Sorting
      • Joining data rows
      • Aggregation
      File system management:
      • Disk space management
  • 18. 5.4 THE AMPS – REQUEST PROCESSING
      • The BYNET transmits messages to and from the AMPs and PEs.
      • An AMP step can be sent to one of the following:
        • One AMP
        • Multicast (a selected set of AMPs)
        • All AMPs in the system
      PE communication with AMPs during request processing:
      1. PE 1: access is through a primary index and the request is for a single row, so the PE transmits the steps to a single AMP.
      2. PE 2: the request is for many rows (an all-AMP request), so the PE has the BYNET broadcast the steps to all AMPs.
      ** To minimize system overhead, the PE can send a step to a subset of AMPs when appropriate.
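
The three delivery modes on this slide (one AMP, a selected set of AMPs, all AMPs) can be sketched as a small routing function. This is an illustration under stated assumptions: `amp_for_key`, `route_step`, `NUM_AMPS`, and the use of `zlib.crc32` modulo the AMP count are hypothetical stand-ins. The slides only say that single-row access through a primary index goes to one AMP; they do not describe how that AMP is chosen.

```python
import zlib

NUM_AMPS = 8  # hypothetical system size


def amp_for_key(primary_index_value: str, num_amps: int = NUM_AMPS) -> int:
    """Map a primary index value to one AMP (crc32 is a stand-in hash)."""
    return zlib.crc32(primary_index_value.encode("utf-8")) % num_amps


def route_step(primary_index_values=None, num_amps: int = NUM_AMPS):
    """Return (delivery mode, target AMP numbers) for one AMP step."""
    if primary_index_values is None:
        # No primary index value to narrow the request: broadcast to all AMPs.
        return "all-AMP", list(range(num_amps))
    # One or more primary index values: only the owning AMPs need the step.
    targets = sorted({amp_for_key(v, num_amps) for v in primary_index_values})
    mode = "single-AMP" if len(targets) == 1 else "multicast"
    return mode, targets


print(route_step(["cust-1001"]))               # one row by primary index
print(route_step(["cust-1001", "cust-2002"]))  # a few rows by primary index
print(route_step())                            # many rows, all-AMP request
```

Sending a step only to the AMPs that can hold matching rows is what keeps the overhead low; the broadcast case is reserved for requests that cannot be narrowed.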
  • 19. 5.5 DISK ARRAYS
      Logical Units (LUNs)
      • The RAID Manager uses drive groups (DGs).
      • A DG is a set of drives that have been configured into one or more LUNs.
      • The OS recognizes a LUN as a disk and is not aware that it is writing to multiple disk drives.
      Vdisk
      • The group of cylinders currently assigned to an AMP.
      • The actual physical storage may derive from several different storage devices.
  • 21. 6 REQUEST PROCESSING – “LIFETIME OF A QUERY”
      1. The Parser: checks the Request cache to determine whether the request is already there.
      2. The Syntaxer: checks the syntax of an incoming request.
      3. The Resolver: adds information from the Data Dictionary to convert database, table, view, stored procedure, and macro names to internal identifiers.
      4. The Security module: checks privileges in the Data Dictionary.
      5. The Optimizer: determines the most effective way to implement the SQL request.
      6. The Optimizer: scans the request to determine where to place locks, then passes the optimized parse tree to the Generator.
      7. The Generator: transforms the optimized parse tree into plastic steps, caches the steps if appropriate, and passes them to gncApply.
      8. gncApply: takes the plastic steps produced by the Generator, binds in parameterized data if it exists, and transforms them into concrete steps.
      9. The Dispatcher
  • 22. P.45 REQUEST PROCESSING – “1. THE PARSER”
      1. The Parser: checks whether the request is in the Request cache:
         • If it is in the cache: the Parser reuses the plastic steps found in the cache. After privileges are checked (step 4), the steps are passed to gncApply (step 8).
         • If it is a new request: go to step 2, the Syntaxer.
      2. The Syntaxer: checks the syntax.
      3. The Resolver: converts object names to internal identifiers.
      4. The Security module: checks privileges.
      5. The Optimizer: determines the most effective way to implement the SQL request.
      6. The Optimizer: scans the request to determine where to place locks.
      7. The Generator: transforms the parse tree into plastic steps.
      8. gncApply: binds parameterized data if it exists and transforms the plastic steps into concrete steps.
      9. The Dispatcher
      Plastic steps are directives to the database management system that do not contain data values.
  • 23. P.45 REQUEST PROCESSING – “2. SYNTAXER, 3. RESOLVER”
      1. The Parser: checks the Request cache to determine whether the request is already there.
      2. The Syntaxer: checks the syntax of a new request:
         • If wrong: passes an error message back to the requestor and stops.
         • If correct: converts the request to a parse tree and passes it to the Resolver (step 3).
      3. The Resolver: adds information from the Data Dictionary (or a cached copy of the information) to convert database, table, view, stored procedure, and macro names to internal identifiers.
      4. The Security module: checks privileges.
      5. The Optimizer: determines the most effective way to implement the SQL request.
      6. The Optimizer: scans the request to determine where to place locks.
      7. The Generator: transforms the parse tree into plastic steps.
      8. gncApply: binds parameterized data if it exists and transforms the plastic steps into concrete steps.
      9. The Dispatcher
  • 24. P.45 REQUEST PROCESSING – “4. SECURITY MODULE, 5. - 6. OPTIMIZER”
      1. The Parser: checks the Request cache to determine whether the request is already there.
      2. The Syntaxer: checks the syntax.
      3. The Resolver: converts object names to internal identifiers.
      4. The Security module: checks the requestor’s privileges on the accessed objects:
         • Mismatch: returns a privilege error message.
         • Privileged: passes the request to the Optimizer.
      5. The Optimizer: determines the most effective way to implement the request (the execution plan).
      6. The Optimizer: determines which types of object locks to place and where to place them.
      7. The Generator: transforms the parse tree into plastic steps.
      8. gncApply: binds parameterized data if it exists and transforms the plastic steps into concrete steps.
      9. The Dispatcher
  • 25. P.45 REQUEST PROCESSING – “7. GENERATOR, 8. GNCAPPLY”
      1. The Parser: checks the Request cache to determine whether the request is already there.
      2. The Syntaxer: checks the syntax.
      3. The Resolver: converts object names to internal identifiers.
      4. The Security module: checks privileges.
      5. The Optimizer: determines the most effective way to implement the SQL request.
      6. The Optimizer: scans the request to determine where to place locks.
      7. The Generator:
         • Transforms the optimized parse tree into plastic steps.
         • Caches the steps if appropriate.
         • Passes them to gncApply.
      8. gncApply:
         • Binds in parameterized data if it exists and transforms the plastic steps into concrete steps.
         • Passes the concrete steps to the Dispatcher.
      9. The Dispatcher
      Concrete steps are directives to the AMPs that contain the needed user- or session-specific values and the needed data parcels.
  • 26. P.45 DISPATCHER – REQUEST PROCESSING
      The Dispatcher controls the sequence in which steps are executed and passes the steps to the BYNET to be distributed to the AMPs:
      1. The Dispatcher receives concrete steps from gncApply.
      2. The Dispatcher places the first step on the BYNET, tells the BYNET whether the step is for one AMP, several AMPs, or all AMPs, and waits for a completion response. Whenever possible, Teradata Database performs steps in parallel to enhance performance.
      3. The Dispatcher receives a completion response from all expected AMPs and places the next step on the BYNET. It continues to do this until all the AMP steps associated with a request are done.
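
The Dispatcher loop described on this slide can be sketched as follows. This is a simplified, strictly serial model: `dispatch`, `fake_bynet`, and the step dictionaries are hypothetical names, the transport is faked, and the parallel execution of steps mentioned on the slide is deliberately not modelled.

```python
def dispatch(concrete_steps, send_to_bynet):
    """Send steps in order; each step must be completed by all target AMPs."""
    log = []
    for step in concrete_steps:
        # send_to_bynet delivers the step and returns the AMPs that completed it.
        responded = send_to_bynet(step["text"], step["amps"])
        missing = set(step["amps"]) - set(responded)
        if missing:
            # The next step is never sent until every expected AMP has answered.
            raise RuntimeError(
                f"step {step['text']!r}: no response from AMPs {sorted(missing)}"
            )
        log.append((step["text"], sorted(responded)))
    return log


def fake_bynet(text, amps):
    # Hypothetical transport: every targeted AMP completes immediately.
    return list(amps)


steps = [
    {"text": "lock table", "amps": [0, 1, 2, 3]},  # all-AMP step
    {"text": "read row", "amps": [2]},             # single-AMP step
]
print(dispatch(steps, fake_bynet))
```

The key property the sketch keeps is ordering: a step goes onto the interconnect only after all expected AMPs have reported completion of the previous one.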

Editor's Notes

  1. Plastic steps are directives to the database management system that do not contain data values.