SlideShare a Scribd company logo
1 of 21
Download to read offline
IBM Netezza
IBM PUREDATA SYSTEM FOR ANALYTICS
INTRODUCES BY: AHMED SHOUMAN
Architectural principles
 Processing close to the data source
The Netezza architecture is based on a fundamental computer science
principle: when operating on large data sets, do not move data unless
absolutely necessary.
 Balanced, massively parallel architecture
The Netezza architecture combines the best elements of Symmetric
Multiprocessing (SMP) and Massively Parallel Processing (MPP) to create an
appliance purpose-built for analyzing petabytes of data quickly.
Architectural principles
 Appliance simplicity
By automating and streamlining day-to-day operations, the Netezza
architecture shields users from the underlying complexity of the platform.
 Accelerated innovation and performance improvements
One of the key goals of the Netezza architecture is to deliver price-
performance improvements and innovative functionality faster than
competing technologies over the long run.
Architectural principles
 Flexible configurations and extreme scalability
PureData System for Analytics scales modularly from a few hundred
gigabytes to petabytes of queryable user data. The system architecture
serves the needs of different segments of the data warehouse and
analytics market.
System building blocks
System building blocks
A major part of the PureData System for Analytics performance
advantage comes from its unique AMPP architecture, which combines
an SMP front end with a shared nothing MPP back end for query
processing.
 PureData System for Analytics hosts:
 The SMP hosts are high-performance Linux servers set up in an active-
passive configuration for high availability. It compiles SQL queries into
executable code segments called snippets, creates optimized query
plans, and distributes the snippets to the MPP nodes for execution.
System building blocks
 Snippet Blades (S-Blades):
 S-Blade is an independent server containing powerful multi-core CPUs,
multi-engine FPGAs, and gigabytes of RAM, all balanced and working
concurrently to deliver peak performance.
 Disk enclosures:
 The disk enclosures’ high-density, high-performance disks are RAID
protected. Each disk contains a slice of every database table's data. A
high-speed network connects disk enclosures to S-Blades, allowing all
the disks in the system to simultaneously stream data to the S-Blades at
the maximum rate possible.
System building blocks
 Network fabric:
 A high-speed network fabric connects all system components. The
PureData System for Analytics runs a customized IP-based protocol that
fully utilizes the total cross-sectional bandwidth of the fabric and
eliminates congestion even under sustained, bursty network traffic. The
network is optimized to scale to more than a thousand nodes, while
allowing each node to initiate large data transfers to every other node
simultaneously.
Where extreme performance happens:
Inside an S-Blade
FPGA Secret Sauce
FPGA Core CPU Core
Uncompress
Project Restrict,
Visibility
Complex ∑
Group by, …
select DISTRICT,
PRODUCTGRP,
sum(NRX)
from MTHLY_RX_TERR_DATA
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
Slice of table
MTHLY_RX_TERR_DATA
(compressed)
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY =
'GASTRO'
sum(NRX)select DISTRICT,
PRODUCTGRP,
sum(NRX)
Using FPGA reduces a tremendous
among of unnecessary data movement
The power of Netezza technologies FAST
engines
FAST engines include:
 The Compress engine:
 The engine uncompresses data at wire speed, instantly
transforming each block on disk into 4 to 8 blocks in
memory. The result is a significant speedup of the slowest
component in any data warehouse, the disk.
 The Project and Restrict engines:
 which further increase performance by filtering out
columns and rows respectively, based on the parameters
in the SELECT and WHERE clauses in a SQL query.
FAST engines include:
 The Visibility engine:
 which plays a critical role in maintaining Atomicity, Consistency,
Isolation, and Durability (ACID) compliance at streaming speeds in the
system. It filters out rows that should not be seen by a query
Orchestrating queries on PureData System
for Analytics
Orchestrating queries on PureData System
for Analytics
 Make an optimized query plan
The host compiles the query and creates a query
execution plan optimized for the AMPP
architecture. The intelligence of the optimizer is one
of the system’s greatest strengths.
Orchestrating queries on PureData System
for Analytics
 Convert it to snippets
 The compiler converts the query plan into
executable code segments, called snippets,
which are query segments executed by Snippet
Processors in parallel across all the data streams in
the appliance.
 Each snippet has two elements:
1. Compiled code executed by individual CPU cores
2. A set of FPGA parameters to customize the FAST
engines’ filtering for that particular snippet
Orchestrating queries on PureData System
for Analytics
 Schedule them to run at just the right moment:
 Netezza scheduler balances execution across
complex workloads to meet the objectives of different
users, while maintaining maximum utilization and
throughput.
 The scheduler uses the appliance arch to gather up-to-
date and accurate metrics about resource availability
from each component of the system. Using
sophisticated algorithms, the scheduler maximizes
system throughput by using close to 100% of the disk
bandwidth and ensuring that memory and network
resources are not overloaded.
Orchestrating queries on PureData System
for Analytics
Orchestrating queries on PureData System
for Analytics
 Execute them in parallel
 Each Snippet Processor on every S-Blade now has the
instructions it needs to execute its portion of the snippet. In
addition to the host scheduler, the Snippet Processors have
their own smart preemptive scheduler that allows snippets
from multiple queries to execute simultaneously.
Orchestrating queries on PureData System
for Analytics
 Return the results:
 All Snippet Processors now have snippet results
that must be assembled. The SnippetProcessors
use the intelligent network fabric to communicate
flexibly with the host and with each other to
perform intermediate calculations and
aggregations.
Questions?
Thank you

More Related Content

What's hot

Storage networks
Storage networksStorage networks
Storage networksAhmed Nour
 
NoSql Data Management
NoSql Data ManagementNoSql Data Management
NoSql Data Managementsameerfaizan
 
IBM Spectrum Scale for File and Object Storage
IBM Spectrum Scale for File and Object StorageIBM Spectrum Scale for File and Object Storage
IBM Spectrum Scale for File and Object StorageTony Pearson
 
Direct Attached Storage CONCEPTS
Direct Attached Storage CONCEPTSDirect Attached Storage CONCEPTS
Direct Attached Storage CONCEPTSRamkaliyaperumal
 
Performance evolution of raid
Performance evolution of raidPerformance evolution of raid
Performance evolution of raidZubair Sami
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and UsesSuvradeep Rudra
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Kaushik Rajan
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architectureAjeet Singh
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olapSalah Amean
 
Storage Primer
Storage PrimerStorage Primer
Storage Primersriramr
 
Database fundamentals(database)
Database fundamentals(database)Database fundamentals(database)
Database fundamentals(database)welcometofacebook
 
cassandra
cassandracassandra
cassandraAkash R
 
Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)shahab3
 

What's hot (20)

Database System Architectures
Database System ArchitecturesDatabase System Architectures
Database System Architectures
 
Storage networks
Storage networksStorage networks
Storage networks
 
NoSql Data Management
NoSql Data ManagementNoSql Data Management
NoSql Data Management
 
IBM Spectrum Scale for File and Object Storage
IBM Spectrum Scale for File and Object StorageIBM Spectrum Scale for File and Object Storage
IBM Spectrum Scale for File and Object Storage
 
Chapter 2 - Retail Sales
Chapter 2 - Retail Sales Chapter 2 - Retail Sales
Chapter 2 - Retail Sales
 
Cassandra
CassandraCassandra
Cassandra
 
Direct Attached Storage CONCEPTS
Direct Attached Storage CONCEPTSDirect Attached Storage CONCEPTS
Direct Attached Storage CONCEPTS
 
Performance evolution of raid
Performance evolution of raidPerformance evolution of raid
Performance evolution of raid
 
NOSQL Databases types and Uses
NOSQL Databases types and UsesNOSQL Databases types and Uses
NOSQL Databases types and Uses
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
RAID-Presentation
RAID-PresentationRAID-Presentation
RAID-Presentation
 
Storage system architecture
Storage system architectureStorage system architecture
Storage system architecture
 
Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)Analysis of SOFTWARE DEFINED STORAGE (SDS)
Analysis of SOFTWARE DEFINED STORAGE (SDS)
 
Ms sql server architecture
Ms sql server architectureMs sql server architecture
Ms sql server architecture
 
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
Data Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olapData Mining:  Concepts and Techniques (3rd ed.)— Chapter _04 olap
Data Mining: Concepts and Techniques (3rd ed.) — Chapter _04 olap
 
Storage Primer
Storage PrimerStorage Primer
Storage Primer
 
Database fundamentals(database)
Database fundamentals(database)Database fundamentals(database)
Database fundamentals(database)
 
cassandra
cassandracassandra
cassandra
 
Deep Dive on Amazon Redshift
Deep Dive on Amazon RedshiftDeep Dive on Amazon Redshift
Deep Dive on Amazon Redshift
 
Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)Database systems - Chapter 2 (Remaining)
Database systems - Chapter 2 (Remaining)
 

Viewers also liked

Cache allocation
Cache allocationCache allocation
Cache allocationhdkid82
 
Netezza technicaloverviewportugues
Netezza technicaloverviewportugues Netezza technicaloverviewportugues
Netezza technicaloverviewportugues hdkid82
 
A Hybrid Technology Platform for Increasing the Speed of Operational Analytics
A Hybrid Technology Platform for Increasing the Speed of Operational AnalyticsA Hybrid Technology Platform for Increasing the Speed of Operational Analytics
A Hybrid Technology Platform for Increasing the Speed of Operational AnalyticsIBMGovernmentCA
 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep DivesRush Shah
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension Sunita Sahu
 
The IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse applianceThe IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse applianceIBM Danmark
 

Viewers also liked (9)

DR_PRESENT 1
DR_PRESENT 1DR_PRESENT 1
DR_PRESENT 1
 
Cache allocation
Cache allocationCache allocation
Cache allocation
 
Netezza technicaloverviewportugues
Netezza technicaloverviewportugues Netezza technicaloverviewportugues
Netezza technicaloverviewportugues
 
A Hybrid Technology Platform for Increasing the Speed of Operational Analytics
A Hybrid Technology Platform for Increasing the Speed of Operational AnalyticsA Hybrid Technology Platform for Increasing the Speed of Operational Analytics
A Hybrid Technology Platform for Increasing the Speed of Operational Analytics
 
Netezza Deep Dives
Netezza Deep DivesNetezza Deep Dives
Netezza Deep Dives
 
Netezza pure data
Netezza pure dataNetezza pure data
Netezza pure data
 
Slowly changing dimension
Slowly changing dimension Slowly changing dimension
Slowly changing dimension
 
The IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse applianceThe IBM Netezza datawarehouse appliance
The IBM Netezza datawarehouse appliance
 
Slowly changing dimensions informatica
Slowly changing dimensions informatica Slowly changing dimensions informatica
Slowly changing dimensions informatica
 

Similar to IBM Netezza Architecture Breakdown for Analytics

Netezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture AdministrationNetezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture AdministrationBraja Krishna Das
 
Netezza Architecture and Administration
Netezza Architecture and AdministrationNetezza Architecture and Administration
Netezza Architecture and AdministrationBraja Krishna Das
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200xIBM Sverige
 
Network Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTMNetwork Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTMSlide_N
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersRyousei Takano
 
ETHERNET PACKET PROCESSOR FOR SOC APPLICATION
ETHERNET PACKET PROCESSOR FOR SOC APPLICATIONETHERNET PACKET PROCESSOR FOR SOC APPLICATION
ETHERNET PACKET PROCESSOR FOR SOC APPLICATIONcscpconf
 
Implementation of RISC-Based Architecture for Low power applications
Implementation of RISC-Based Architecture for Low power applicationsImplementation of RISC-Based Architecture for Low power applications
Implementation of RISC-Based Architecture for Low power applicationsIOSR Journals
 
Exploration of Radars and Software Defined Radios using VisualSim
Exploration of  Radars and Software Defined Radios using VisualSimExploration of  Radars and Software Defined Radios using VisualSim
Exploration of Radars and Software Defined Radios using VisualSimDeepak Shankar
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryDeepak Shankar
 
Juniper Networks Router Architecture
Juniper Networks Router ArchitectureJuniper Networks Router Architecture
Juniper Networks Router Architecturelawuah
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit SupercomputerVigneshwarRamaswamy
 
3PAR and VMWare
3PAR and VMWare3PAR and VMWare
3PAR and VMWarevmug
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesJun Liu
 
Synergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architectureSynergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architectureMichael Gschwind
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning Dr. Swaminathan Kathirvel
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technologyNikhil Venugopal
 

Similar to IBM Netezza Architecture Breakdown for Analytics (20)

Netezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture AdministrationNetezza TwinFin12 Architecture Administration
Netezza TwinFin12 Architecture Administration
 
Netezza Architecture and Administration
Netezza Architecture and AdministrationNetezza Architecture and Administration
Netezza Architecture and Administration
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200x
 
Network Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTMNetwork Processing on an SPE Core in Cell Broadband EngineTM
Network Processing on an SPE Core in Cell Broadband EngineTM
 
From Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computersFrom Rack scale computers to Warehouse scale computers
From Rack scale computers to Warehouse scale computers
 
eet_NPU_file.PDF
eet_NPU_file.PDFeet_NPU_file.PDF
eet_NPU_file.PDF
 
ETHERNET PACKET PROCESSOR FOR SOC APPLICATION
ETHERNET PACKET PROCESSOR FOR SOC APPLICATIONETHERNET PACKET PROCESSOR FOR SOC APPLICATION
ETHERNET PACKET PROCESSOR FOR SOC APPLICATION
 
Implementation of RISC-Based Architecture for Low power applications
Implementation of RISC-Based Architecture for Low power applicationsImplementation of RISC-Based Architecture for Low power applications
Implementation of RISC-Based Architecture for Low power applications
 
Exploration of Radars and Software Defined Radios using VisualSim
Exploration of  Radars and Software Defined Radios using VisualSimExploration of  Radars and Software Defined Radios using VisualSim
Exploration of Radars and Software Defined Radios using VisualSim
 
Mirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP LibraryMirabilis_Design AMD Versal System-Level IP Library
Mirabilis_Design AMD Versal System-Level IP Library
 
Juniper Networks Router Architecture
Juniper Networks Router ArchitectureJuniper Networks Router Architecture
Juniper Networks Router Architecture
 
Hardware architecture of Summit Supercomputer
 Hardware architecture of Summit Supercomputer Hardware architecture of Summit Supercomputer
Hardware architecture of Summit Supercomputer
 
3PAR and VMWare
3PAR and VMWare3PAR and VMWare
3PAR and VMWare
 
Strata + Hadoop 2015 Slides
Strata + Hadoop 2015 SlidesStrata + Hadoop 2015 Slides
Strata + Hadoop 2015 Slides
 
Exadata database machine_x5-2
Exadata database machine_x5-2Exadata database machine_x5-2
Exadata database machine_x5-2
 
Sx 6-single-node
Sx 6-single-nodeSx 6-single-node
Sx 6-single-node
 
Actian Vector Whitepaper
 Actian Vector Whitepaper Actian Vector Whitepaper
Actian Vector Whitepaper
 
Synergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architectureSynergistic processing in cell's multicore architecture
Synergistic processing in cell's multicore architecture
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technology
 

More from Ahmed Salman

Faas__Food_as_a_Service__project
Faas__Food_as_a_Service__projectFaas__Food_as_a_Service__project
Faas__Food_as_a_Service__projectAhmed Salman
 
Project_Overview_-_final
Project_Overview_-_finalProject_Overview_-_final
Project_Overview_-_finalAhmed Salman
 
TECRM 20 Presentation
TECRM 20 PresentationTECRM 20 Presentation
TECRM 20 PresentationAhmed Salman
 
TCRM10 Pesentation
TCRM10 PesentationTCRM10 Pesentation
TCRM10 PesentationAhmed Salman
 
Big Data Course - BigData HUB
Big Data Course - BigData HUBBig Data Course - BigData HUB
Big Data Course - BigData HUBAhmed Salman
 
Introduction to Dig Data& Hadoop
Introduction to Dig Data& HadoopIntroduction to Dig Data& Hadoop
Introduction to Dig Data& HadoopAhmed Salman
 
BigData HUB Workshop
BigData HUB WorkshopBigData HUB Workshop
BigData HUB WorkshopAhmed Salman
 
Hadoop Installation
Hadoop InstallationHadoop Installation
Hadoop InstallationAhmed Salman
 

More from Ahmed Salman (10)

Faas__Food_as_a_Service__project
Faas__Food_as_a_Service__projectFaas__Food_as_a_Service__project
Faas__Food_as_a_Service__project
 
Project_Overview_-_final
Project_Overview_-_finalProject_Overview_-_final
Project_Overview_-_final
 
Cloudera
ClouderaCloudera
Cloudera
 
TECRM 20 Presentation
TECRM 20 PresentationTECRM 20 Presentation
TECRM 20 Presentation
 
TCRM10 Pesentation
TCRM10 PesentationTCRM10 Pesentation
TCRM10 Pesentation
 
Big Data Concepts
Big Data ConceptsBig Data Concepts
Big Data Concepts
 
Big Data Course - BigData HUB
Big Data Course - BigData HUBBig Data Course - BigData HUB
Big Data Course - BigData HUB
 
Introduction to Dig Data& Hadoop
Introduction to Dig Data& HadoopIntroduction to Dig Data& Hadoop
Introduction to Dig Data& Hadoop
 
BigData HUB Workshop
BigData HUB WorkshopBigData HUB Workshop
BigData HUB Workshop
 
Hadoop Installation
Hadoop InstallationHadoop Installation
Hadoop Installation
 

IBM Netezza Architecture Breakdown for Analytics

  • 1. IBM Netezza IBM PUREDATA SYSTEM FOR ANALYTICS INTRODUCES BY: AHMED SHOUMAN
  • 2. Architectural principles  Processing close to the data source The Netezza architecture is based on a fundamental computer science principle: when operating on large data sets, do not move data unless absolutely necessary.  Balanced, massively parallel architecture The Netezza architecture combines the best elements of Symmetric Multiprocessing (SMP) and Massively Parallel Processing (MPP) to create an appliance purpose-built for analyzing petabytes of data quickly.
  • 3. Architectural principles  Appliance simplicity By automating and streamlining day-to-day operations, the Netezza architecture shields users from the underlying complexity of the platform.  Accelerated innovation and performance improvements One of the key goals of the Netezza architecture is to deliver price- performance improvements and innovative functionality faster than competing technologies over the long run.
  • 4. Architectural principles  Flexible configurations and extreme scalability PureData System for Analytics scales modularly from a few hundred gigabytes to petabytes of queryable user data. The system architecture serves the needs of different segments of the data warehouse and analytics market.
  • 6. System building blocks A major part of the PureData System for Analytics performance advantage comes from its unique AMPP architecture, which combines an SMP front end with a shared nothing MPP back end for query processing.  PureData System for Analytics hosts:  The SMP hosts are high-performance Linux servers set up in an active- passive configuration for high availability. It compiles SQL queries into executable code segments called snippets, creates optimized query plans, and distributes the snippets to the MPP nodes for execution.
  • 7. System building blocks  Snippet Blades (S-Blades):  S-Blade is an independent server containing powerful multi-core CPUs, multi-engine FPGAs, and gigabytes of RAM, all balanced and working concurrently to deliver peak performance.  Disk enclosures:  The disk enclosures’ high-density, high-performance disks are RAID protected. Each disk contains a slice of every database table's data. A high-speed network connects disk enclosures to S-Blades, allowing all the disks in the system to simultaneously stream data to the S-Blades at the maximum rate possible.
  • 8. System building blocks  Network fabric:  A high-speed network fabric connects all system components. The PureData System for Analytics runs a customized IP-based protocol that fully utilizes the total cross-sectional bandwidth of the fabric and eliminates congestion even under sustained, bursty network traffic. The network is optimized to scale to more than a thousand nodes, while allowing each node to initiate large data transfers to every other node simultaneously.
  • 9. Where extreme performance happens: Inside an S-Blade
  • 10. FPGA Secret Sauce FPGA Core CPU Core Uncompress Project Restrict, Visibility Complex ∑ Group by, … select DISTRICT, PRODUCTGRP, sum(NRX) from MTHLY_RX_TERR_DATA where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' Slice of table MTHLY_RX_TERR_DATA (compressed) where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO' sum(NRX)select DISTRICT, PRODUCTGRP, sum(NRX) Using FPGA reduces a tremendous among of unnecessary data movement
  • 11. The power of Netezza technologies FAST engines
  • 12. FAST engines include:  The Compress engine:  The engine uncompresses data at wire speed, instantly transforming each block on disk into 4 to 8 blocks in memory. The result is a significant speedup of the slowest component in any data warehouse, the disk.  The Project and Restrict engines:  which further increase performance by filtering out columns and rows respectively, based on the parameters in the SELECT and WHERE clauses in a SQL query.
  • 13. FAST engines include:  The Visibility engine:  which plays a critical role in maintaining Atomicity, Consistency, Isolation, and Durability (ACID) compliance at streaming speeds in the system. It filters out rows that should not be seen by a query
  • 14. Orchestrating queries on PureData System for Analytics
  • 15. Orchestrating queries on PureData System for Analytics  Make an optimized query plan The host compiles the query and creates a query execution plan optimized for the AMPP architecture. The intelligence of the optimizer is one of the system’s greatest strengths.
  • 16. Orchestrating queries on PureData System for Analytics  Convert it to snippets  The compiler converts the query plan into executable code segments, called snippets, which are query segments executed by Snippet Processors in parallel across all the data streams in the appliance.  Each snippet has two elements: 1. Compiled code executed by individual CPU cores 2. A set of FPGA parameters to customize the FAST engines’ filtering for that particular snippet
  • 17. Orchestrating queries on PureData System for Analytics  Schedule them to run at just the right moment:  Netezza scheduler balances execution across complex workloads to meet the objectives of different users, while maintaining maximum utilization and throughput.  The scheduler uses the appliance arch to gather up-to- date and accurate metrics about resource availability from each component of the system. Using sophisticated algorithms, the scheduler maximizes system throughput by using close to 100% of the disk bandwidth and ensuring that memory and network resources are not overloaded.
  • 18. Orchestrating queries on PureData System for Analytics
  • 19. Orchestrating queries on PureData System for Analytics  Execute them in parallel  Each Snippet Processor on every S-Blade now has the instructions it needs to execute its portion of the snippet. In addition to the host scheduler, the Snippet Processors have their own smart preemptive scheduler that allows snippets from multiple queries to execute simultaneously.
  • 20. Orchestrating queries on PureData System for Analytics  Return the results:  All Snippet Processors now have snippet results that must be assembled. The SnippetProcessors use the intelligent network fabric to communicate flexibly with the host and with each other to perform intermediate calculations and aggregations.