NETEZZA
What is Big Data Netezza?
An introduction to Netezza Vijaya Chandrika J
Netezza Architecture
• Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP).
• AMPP is based on Massively Parallel Processing (MPP), a design in which nothing (CPU, memory, storage) is shared between processing nodes.
• MPP is achieved through an array of S-Blades. Each S-Blade is a server in its own right, running its own operating system and connected to disks.
• The Netezza architecture has one unique hardware component, the Database Accelerator card, which is attached to the S-Blades.
Hardware components of the Netezza appliance
05/25/15 An introduction to Netezza Vijaya Chandrika J
The diagram on this slide provides a high-level logical schematic of the various components in the Netezza appliance.
The appliance uses a Linux OS.
Each S-Blade has 8 processor cores and 16 GB of RAM.
Each processor in the S-Blade is connected to disks in a disk array through a Database Accelerator card, which uses FPGA technology.
What are S-Blades?
• S-Blades are also called Snippet Blades; together they form the Snippet Processing Array (SPA).
• The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Netezza Database Accelerator card.
• The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
Figure labels: 1 - S-Blade, 2 - Database Accelerator card
How it works: an example
• Assumptions: Consider a data warehouse for a large retail firm in which one table stores the details of all 10 million customers. Assume the table has 25 columns and each row is 250 bytes long.
• Query: A user queries the application for the Customer Id, Name and State of customers who joined the organization in a particular period, sorted by state and name.
High-level steps
• In Netezza the 10 million customer records are stored in compressed form, spread fairly evenly across all the disks in the disk arrays connected to the snippet processors in the S-Blades.
• The Database Accelerator card in the snippet processor uncompresses the data, which at this point includes all the columns in the table. It then removes the unwanted columns, in this case 22 of the 25 columns, i.e. about 220 of the 250 bytes per row, applies the WHERE clause to remove the unwanted rows, and passes the small amount of remaining data to the CPU in the snippet processor. In traditional databases all of these steps are performed in the CPU.
• The CPU in the snippet processor performs tasks such as aggregation, sum and sort on the data from the Database Accelerator card and passes the result to the host over the network.
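The steps above can be sketched in a few lines of Python. This is only an illustration of the division of work, not Netezza code: the `Customer` fields, the sample rows and the function names `accelerator_stage` and `cpu_stage` are assumptions made for the example, and the byte arithmetic uses the slide's simplification that all 25 columns are equally wide.

```python
from dataclasses import dataclass
from datetime import date

TOTAL_COLUMNS = 25
ROW_BYTES = 250
NEEDED_COLUMNS = 3  # customer id, name, state

# Average column width under the equal-width simplification.
avg_col_bytes = ROW_BYTES // TOTAL_COLUMNS
dropped_bytes = (TOTAL_COLUMNS - NEEDED_COLUMNS) * avg_col_bytes
kept_bytes = NEEDED_COLUMNS * avg_col_bytes


@dataclass
class Customer:
    # Hypothetical subset of the 25 columns.
    customer_id: int
    name: str
    state: str
    joined: date  # used only by the filter, not returned


def accelerator_stage(rows, start, end):
    """Project and restrict: keep 3 columns of rows that joined in [start, end]."""
    return [
        (r.customer_id, r.name, r.state)
        for r in rows
        if start <= r.joined <= end
    ]


def cpu_stage(projected):
    """Sort by state, then name, as the example query asks."""
    return sorted(projected, key=lambda t: (t[2], t[1]))


rows = [
    Customer(1, "Meera", "TX", date(2014, 3, 1)),
    Customer(2, "Arun", "CA", date(2014, 6, 15)),
    Customer(3, "Lena", "CA", date(2012, 1, 9)),
    Customer(4, "Zoe", "CA", date(2014, 9, 30)),
]

result = cpu_stage(accelerator_stage(rows, date(2014, 1, 1), date(2014, 12, 31)))
print(dropped_bytes, kept_bytes)  # 220 30
print(result)
```

Only the projected, filtered tuples reach the sort step, which is the point of pushing projection and restriction down to the accelerator card.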
The key takeaways
• Netezza can process large volumes of data in parallel; the key is to distribute the data appropriately so that the massively parallel processing is fully used.
• Design so that most of the processing happens in the snippet processors, and minimize both communication between snippet processors and data transfer to the host.
Netezza Tools
• NzAdmin: a GUI-based administration tool.
• The tool has a system view, which provides a visual snapshot of the state of the appliance, including issues with any hardware components. The second view is the database view, which lists all the databases and the objects in them, the users and groups currently defined, active sessions, query history and any backup history. The database view also provides options for database administration tasks such as creating and managing databases, database objects, users and groups.
NZSQL
• “nzsql” is the second most commonly used tool.
• The “nzsql” command invokes the SQL command interpreter, through which all Netezza-supported SQL statements can be executed.
• nzsql -d testdb -u testuser -pw password
• This command connects and creates an “nzsql” session with the database “testdb” as the user “testuser”, after which the user can execute SQL statements against the database. As with all Netezza commands, “nzsql” has a “-h” help option that displays details about the usage of the command.
System Objects
• The appliance comes preconfigured with the following three user IDs, which cannot be modified or deleted. They are used to perform all administration tasks and should therefore be restricted to a small number of users.
• root: the superuser for the host system on the appliance, with the same access as a superuser on any Linux system.
• nz: the Netezza system administrator Linux account, used to run the host software on Linux.
• admin: the default Netezza SQL database administrator user, which can perform all database-related tasks against all the databases in the appliance.
Create Table
create table employee (
    emp_id integer not null,
    first_name varchar(25) not null,
    last_name varchar(25) not null,
    sex char(1),
    dept_id integer not null,
    -- audit columns
    created_dt timestamp not null,
    created_by char(8) not null,
    updated_dt timestamp not null,
    updated_by char(8) not null,
    -- constraints are declared for the optimizer; Netezza does not enforce them
    constraint pk_employee primary key (emp_id),
    constraint fk_employee foreign key (dept_id) references department (dept_id)
        on update restrict on delete restrict
) distribute on random;
The statement will look familiar except for the “distribute on” clause. There are also no storage-related details, such as the tablespace in which to create the table or any buffer pool settings; these are handled by the Netezza appliance.
Netezza vs traditional databases
• For performance reasons, Netezza does not enforce constraints such as primary keys or foreign keys when data is inserted or loaded into tables. It is up to the application to make sure the data being loaded satisfies these constraints. Even though Netezza does not enforce them, defining the constraints gives the query optimizer additional hints for generating efficient snippet execution code, which in turn helps performance.
• The length of a column can be modified only for columns defined as varchar.
• If a table is renamed, the views that reference it stop working.
• If a table is referenced by a stored procedure, adding or dropping a column is not permitted. The stored procedure must be dropped first, the column added or dropped, and the stored procedure then recreated.
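Because the appliance does not enforce primary or foreign keys, the loading application has to check them itself. The following is a minimal sketch of such a pre-load check, assuming the batch fits in memory as a list of dictionaries; the helper names `find_duplicate_keys` and `find_orphans` and the sample data are hypothetical and not part of any Netezza tool.

```python
from collections import Counter


def find_duplicate_keys(rows, key):
    """Return key values that appear more than once (primary key violations)."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


def find_orphans(rows, fk, parent_keys):
    """Return rows whose foreign key has no matching parent key."""
    return [row for row in rows if row[fk] not in parent_keys]


# Hypothetical batch destined for the employee table.
departments = {10, 20}
employees = [
    {"emp_id": 1, "dept_id": 10},
    {"emp_id": 2, "dept_id": 20},
    {"emp_id": 2, "dept_id": 10},   # duplicate emp_id
    {"emp_id": 3, "dept_id": 99},   # unknown department
]

duplicates = find_duplicate_keys(employees, "emp_id")
orphans = find_orphans(employees, "dept_id", departments)
print(duplicates)  # [2]
print(orphans)     # [{'emp_id': 3, 'dept_id': 99}]
```

A real load would typically run an equivalent check in the ETL tool or in staging tables before inserting into the target table.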
Netezza vs traditional databases - Materialized views (MV)
• Only one table can be specified in the FROM clause of the create statement for an MV.
• The select statement used to create an MV cannot contain a WHERE clause.
• The columns in the projection list must be columns from the base table; expressions are not allowed.
• External, temporary, system or clustered base tables cannot be used as the base table for a materialized view.
Netezza vs traditional databases - Sequence
• The following sample statement creates a sequence that can be used to populate the id column in the employee table.
create sequence seq_emp_id as integer start with 1 increment by 1
minvalue 1 no maxvalue no cycle;
• Since no maximum value is specified, the sequence can grow up to the largest value of the sequence type, which for the integer type is 2,147,483,647.
• The system is forced to flush cached sequence values in situations such as a system stop, a system or SPU crash, or certain alter sequence statements. Each flush creates gaps in the numbers generated by the sequence.
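Why a flush leaves gaps can be seen with a toy model of a cached sequence. This is a sketch under stated assumptions, not how the appliance implements sequences: the class `CachedSequence`, its block size of 5 and its method names are made up for the illustration.

```python
INT_MAX = 2**31 - 1  # largest value of a 4-byte signed integer


class CachedSequence:
    """Toy sequence that reserves values in blocks, like a cache."""

    def __init__(self, start=1, increment=1, cache_size=5, max_value=INT_MAX):
        self.increment = increment
        self.cache_size = cache_size      # hypothetical block size
        self.max_value = max_value
        self._next_block_start = start    # first value not yet reserved
        self._cache = []

    def _refill(self):
        block = [self._next_block_start + i * self.increment
                 for i in range(self.cache_size)]
        self._next_block_start = block[-1] + self.increment
        self._cache = block

    def next_value(self):
        if not self._cache:
            self._refill()
        value = self._cache.pop(0)
        if value > self.max_value:
            raise OverflowError("sequence exhausted (no cycle)")
        return value

    def flush(self):
        """Discard unused cached values, e.g. on a restart."""
        self._cache = []


seq = CachedSequence()
before = [seq.next_value() for _ in range(3)]   # 1, 2, 3
seq.flush()                                     # 4 and 5 are lost
after = [seq.next_value() for _ in range(2)]    # 6, 7
print(before, after)
```

The values 4 and 5 were reserved but never handed out, so the generated ids are unique but not contiguous.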
Netezza Storage
• Each disk in the appliance is divided into primary, mirror and temp (swap) partitions. The primary partition stores user data such as database tables. The mirror partition stores a copy of the primary partition of another disk, so that it can be used if that disk fails. The temp/swap partition holds data temporarily, for example when the appliance redistributes data while processing queries. The logical representation of the data saved in the primary partition of each disk is called a data slice. When users create database tables and load data into them, the data is distributed across the available data slices. The logical representation of data slices is called the data partition.
Netezza Storage - Diagram
Data Organization
• When users create tables and store data in them, the data is stored in disk extents, the minimum unit of storage allocated on disk. Netezza distributes the data in extents across all the available data slices, based on the distribution key specified when the table is created. A user can specify up to four columns as the distribution key, can ask for the data to be distributed randomly, or can specify no distribution at all.
• When the user selects random distribution, the appliance uses a round-robin algorithm to distribute the data uniformly across all the available data slices.
• The key is to make sure that the data for a table is uniformly distributed across all the data slices so that there is no data skew. With the data spread across the data slices, all the SPUs (snippet processing units) in the system can be used to process any query, which in turn improves performance.
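The effect of the distribution key on skew can be illustrated with a small simulation. It is a sketch only: the appliance's actual hash function is internal, so SHA-256 is used here purely as a stand-in, and `NUM_SLICES`, the sample rows and the function names are assumptions made for the example.

```python
import hashlib
from collections import Counter

NUM_SLICES = 8  # hypothetical number of data slices


def hash_slice(key, num_slices=NUM_SLICES):
    """Map a distribution-key value to a data slice (SHA-256 as a stand-in hash)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


def round_robin_slices(num_rows, num_slices=NUM_SLICES):
    """Slice assignment for 'distribute on random'."""
    return [i % num_slices for i in range(num_rows)]


def skew(assignments, num_slices=NUM_SLICES):
    """Ratio of the fullest slice to the ideal even share (1.0 means no skew)."""
    counts = Counter(assignments)
    ideal = len(assignments) / num_slices
    return max(counts.values()) / ideal


# 8000 rows; nine out of ten customers live in one state.
rows = [{"customer_id": i, "state": "CA" if i % 10 else "TX"} for i in range(8000)]

by_id = [hash_slice(r["customer_id"]) for r in rows]   # high-cardinality key
by_state = [hash_slice(r["state"]) for r in rows]      # low-cardinality key
rr = round_robin_slices(len(rows))

print(skew(rr), skew(by_id), skew(by_state))
```

Round robin gives a skew of exactly 1.0, the customer id stays close to 1.0, and the two-valued state column leaves most slices empty, so one or two SPUs would end up doing nearly all the work.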
Netezza Transactions
• By default, Netezza SQL statements are executed in auto-commit mode, i.e. the changes made by a SQL statement take effect immediately after the statement completes, as if it were a complete transaction.
• If several related SQL statements must all fail when any one of them fails, the user can use the BEGIN, COMMIT and ROLLBACK transaction control statements to group them into one transaction. All SQL statements between a BEGIN statement and the matching COMMIT or ROLLBACK statement are treated as part of a single transaction.
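The difference between auto-commit and an explicit transaction can be tried out without an appliance. The sketch below uses Python's built-in sqlite3 module as a stand-in database, so it shows the general BEGIN/COMMIT/ROLLBACK pattern rather than Netezza behaviour; the `account` table and the `transfer` function are hypothetical.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so
# transactions are controlled only by the explicit BEGIN/COMMIT/ROLLBACK below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("create table account (id integer primary key, balance integer not null)")
conn.execute("insert into account values (1, 100), (2, 50)")  # auto-committed


def transfer(conn, src, dst, amount):
    """Move funds between accounts as one all-or-nothing transaction."""
    conn.execute("begin")
    try:
        conn.execute("update account set balance = balance - ? where id = ?", (amount, src))
        (balance,) = conn.execute("select balance from account where id = ?", (src,)).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute("update account set balance = balance + ? where id = ?", (amount, dst))
        conn.execute("commit")
        return True
    except ValueError:
        # Undo the first update so neither statement takes effect.
        conn.execute("rollback")
        return False


ok = transfer(conn, 1, 2, 30)       # both updates are committed
failed = transfer(conn, 1, 2, 500)  # first update is rolled back
balances = conn.execute("select id, balance from account order by id").fetchall()
print(ok, failed, balances)  # True False [(1, 70), (2, 80)]
```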
Alternative to redo logs in Netezza
• Netezza does not use redo logs; all changes are made directly on the storage where the user data resides, which also helps performance.
• Netezza maintains three additional hidden columns per table row (createxid, deletexid and rowid): the ID of the transaction that created the row, the ID of the transaction that deleted the row, and a unique row ID assigned to the row by the system.
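How two transaction ids per row can stand in for a log is easier to see in a toy model. The sketch below is a deliberately simplified assumption: it models an update as marking the old row version deleted and appending a new one, and it treats every transaction id up to the reader's as committed. The `Table` class and its methods are illustrative only.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Row:
    rowid: int
    createxid: int
    deletexid: int  # 0 means "not deleted"
    data: dict


class Table:
    """Toy append-only table: rows are never overwritten in place."""

    def __init__(self):
        self.rows = []
        self._rowids = count(1)

    def insert(self, xid, data):
        self.rows.append(Row(next(self._rowids), xid, 0, data))

    def delete(self, xid, predicate):
        for row in self.rows:
            if row.deletexid == 0 and predicate(row.data):
                row.deletexid = xid  # mark only; the row stays in storage

    def update(self, xid, predicate, changes):
        """Model an update as delete of the old version plus insert of a new one."""
        targets = [r for r in self.rows if r.deletexid == 0 and predicate(r.data)]
        for row in targets:
            row.deletexid = xid
            self.insert(xid, {**row.data, **changes})

    def visible(self, as_of_xid):
        """Rows seen by a reader whose snapshot includes transactions <= as_of_xid."""
        return [
            r.data for r in self.rows
            if r.createxid <= as_of_xid and (r.deletexid == 0 or r.deletexid > as_of_xid)
        ]


t = Table()
t.insert(xid=1, data={"emp_id": 1, "dept_id": 10})
t.update(xid=2, predicate=lambda d: d["emp_id"] == 1, changes={"dept_id": 20})

print(t.visible(as_of_xid=1))  # [{'emp_id': 1, 'dept_id': 10}]
print(t.visible(as_of_xid=2))  # [{'emp_id': 1, 'dept_id': 20}]
print(len(t.rows))             # 2
```

Both versions of the row remain in storage, and the transaction ids alone decide which one a reader sees.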
Best Practices
• Define all constraints and relationships between objects. Even though Netezza enforces none of them other than the NOT NULL constraint, the query optimizer still uses these details to come up with an efficient query execution plan.
• If the data in a column is known to have a fixed length, use char(x) instead of varchar(x). Varchar(x) uses additional storage, which becomes significant when dealing with terabytes of data, and it also slows query processing because more data has to be read from disk.
• Use NOT NULL wherever the data permits. This improves performance because the appliance does not have to check for nulls, and it reduces storage usage.
Best Practices (continued)
• Distribute on columns that have high cardinality and are often used in joins. It is best to distribute the fact and dimension tables on the same column; this reduces data redistribution during queries and improves performance.
• Create materialized views on a small set of columns from a large table that user queries use often.
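The benefit of distributing a fact table and a dimension table on the same column can be shown with a small placement simulation. This is a sketch under assumptions: SHA-256 stands in for the appliance's internal hash, and `NUM_SLICES`, the `customers` and `orders` sample tables and the function names are hypothetical.

```python
import hashlib

NUM_SLICES = 4  # hypothetical number of data slices


def slice_for(value, num_slices=NUM_SLICES):
    """Stand-in hash placement of a distribution-key value."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_slices


customers = [{"customer_id": i, "state": f"S{i % 5}"} for i in range(100)]
orders = [{"order_id": n, "customer_id": n % 100} for n in range(1000)]

# Dimension table distributed on the join column.
customer_slice = {c["customer_id"]: slice_for(c["customer_id"]) for c in customers}


def rows_to_move(orders, order_key):
    """Count fact rows that sit on a different slice than their matching customer."""
    return sum(
        1 for o in orders
        if slice_for(o[order_key]) != customer_slice[o["customer_id"]]
    )


same_key = rows_to_move(orders, "customer_id")   # both tables on customer_id
other_key = rows_to_move(orders, "order_id")     # fact table on order_id
print(same_key, other_key)
```

With both tables distributed on customer_id every order already sits next to its customer, so the join needs no redistribution; distributing the fact table on order_id leaves most rows on the wrong slice for this join.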
Questions?
Thank you
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female serviceCALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
CALL ON ➥8923113531 🔝Call Girls Badshah Nagar Lucknow best Female service
 
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AISyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
SyndBuddy AI 2k Review 2024: Revolutionizing Content Syndication with AI
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 

An introduction to the architecture and components of Netezza data warehouse appliances

  • 1. NETEZZA: What is Big Data Netezza? An introduction to Netezza, Vijaya Chandrika J
  • 2. Netezza Architecture
      - Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP).
      - AMPP builds on the concept of Massively Parallel Processing (MPP), in which nothing (CPU, memory, storage) is shared.
      - MPP is achieved through an array of S-Blades, each of which is a server in its own right, running its own operating system and connected to disks.
      - The Netezza architecture has one unique hardware component, the Database Accelerator card, which is attached to the S-Blades.
  • 3. Hardware components of the Netezza appliance
      - The slide's diagram gives a high-level logical schematic of the various components in the Netezza appliance.
      - The appliance uses a Linux OS.
      - Each S-Blade has 8 processor cores and 16 GB of RAM.
      - Each processor in the S-Blade is connected to disks in a disk array through a Database Accelerator card, which uses FPGA technology.
  • 4. What are S-Blades?
      - S-Blades are Snippet Blades; together they make up the Snippet Processing Array (SPA).
      - The S-Blade is a specialized processing board that combines the CPU processing power of a blade server with the query analysis intelligence of the Database Accelerator card.
      - The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
      - Figure labels: 1 = S-Blade, 2 = accelerator card.
  • 5. How it works: an example
      - Assumptions: Consider a data warehouse for a large retail firm in which one table stores the details of all 10 million customers. Assume the table has 25 columns and that each row is 250 bytes long.
      - Query: A user queries the application for the Customer Id, Name and State of customers who joined the organization in a particular period, sorted by state and name.
  • 6. High level steps
      - In Netezza, the 10 million customer records are stored in compressed form, spread fairly evenly across all the disks in the disk arrays connected to the snippet processors in the S-Blades.
      - The Database Accelerator card in the snippet processor uncompresses the data, which at this point includes all the columns in the table. It then removes the unwanted columns (in this case 22 of the 25 columns, roughly 220 of the 250 bytes per row), applies the WHERE clause to remove the unwanted rows, and passes the small amount of remaining data to the CPU in the snippet processor. In traditional databases all of these steps are performed by the CPU.
      - The CPU in the snippet processor performs tasks such as aggregation, sum and sort on the data from the Database Accelerator card and passes the result to the host over the network.
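The data reduction described in slide 6 can be estimated with a few lines of arithmetic. The sketch below uses the figures from the example (10 million rows, 250 bytes per row, 25 columns, 3 columns requested) and assumes, purely for illustration, that all columns have the same width; the function name `projected_bytes` is hypothetical. It covers only column pruning, so the WHERE clause on the joining period would shrink the result further.

```python
# Figures taken from the example on slides 5 and 6.
ROWS = 10_000_000
ROW_BYTES = 250
TOTAL_COLUMNS = 25
NEEDED_COLUMNS = 3  # Customer Id, Name, State


def projected_bytes(rows, row_bytes, total_columns, needed_columns):
    """Return (full_scan_bytes, bytes_after_projection).

    Assumption: every column has the same width (row_bytes / total_columns).
    """
    bytes_per_column = row_bytes // total_columns
    full = rows * row_bytes
    projected = rows * needed_columns * bytes_per_column
    return full, projected


full, projected = projected_bytes(ROWS, ROW_BYTES, TOTAL_COLUMNS, NEEDED_COLUMNS)
print(f"uncompressed table : {full / 1e9:.1f} GB")
print(f"after projection   : {projected / 1e9:.1f} GB")
print(f"share passed on    : {projected / full:.0%}")
```

Under these assumptions only 30 of every 250 bytes reach the snippet processor's CPU, before any rows are filtered out.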
  • 7. The key takeaways
      - Netezza can process large volumes of data in parallel; the key is to make sure the data is distributed appropriately so that the massively parallel processing can be leveraged.
      - Design so that most of the processing happens in the snippet processors: minimize communication between snippet processors and keep data communication to the host to a minimum.
  • 8. Netezza Tools
      - NzAdmin: a GUI-based administration tool.
      - The tool has a system view, which provides a visual snapshot of the state of the appliance, including issues with any hardware components.
      - The second view is the database view, which lists all the databases and the objects in them, the users and groups currently defined, active sessions, query history and any backup history. The database view also provides options for database administration tasks such as creating and managing databases, database objects, users and groups.
  • 11. NZSQL
      - “nzsql” is the second most commonly used tool.
      - The “nzsql” command invokes the SQL command interpreter, through which all Netezza-supported SQL statements can be executed.
      - nzsql -d testdb -u testuser -pw password
      - This command connects to the database “testdb” as the user “testuser” and creates an “nzsql” session, after which the user can execute SQL statements against the database. As with all Netezza commands, “nzsql” has the “-h” help option, which displays details about the usage of the command.
  • 12. System Objects
      - The appliance comes preconfigured with the following 3 user ids, which can’t be modified or deleted from the system. They are used to perform all the administration tasks and hence should be used by a restricted number of users.
      - root: The super user for the host system on the appliance, with the same access as a super user on any Linux system.
      - nz: The Netezza system administrator Linux account, used to run the host software on Linux.
      - admin: The default Netezza SQL database administrator user, which can perform all database-related tasks against all the databases in the appliance.
  • 13. Create Table
        create table employee (
            emp_id integer not null,
            first_name varchar(25) not null,
            last_name varchar(25) not null,
            sex char(1),
            dept_id integer not null,
            created_dt timestamp not null,
            created_by char(8) not null,
            updated_dt timestamp not null,
            updated_by char(8) not null,
            constraint pk_employee primary key (emp_id),
            constraint fk_employee foreign key (dept_id) references department (dept_id)
                on update restrict on delete restrict
        ) distribute on random;
      The statement will look familiar except for the “distribute on” clause. There are also no storage-related details, such as the tablespace in which the table is to be created or any bufferpool settings; these are handled by the Netezza appliance.
  • 14. Netezza vs traditional dbs
      - For performance reasons, Netezza doesn’t enforce constraints such as primary keys or foreign keys when inserting or loading data into tables. It is up to the application to make sure the data being loaded satisfies these constraints. Even though Netezza does not enforce them, defining them gives the query optimizer additional hints for generating efficient snippet execution code, which in turn helps performance.
      - Column length can be modified only for columns defined as varchar.
      - If a table is renamed, the views attached to the table stop working.
      - If a table is referenced by a stored procedure, adding or dropping a column is not permitted. The stored procedure must be dropped first, then the column added or dropped, and then the stored procedure recreated.
  • 15. Netezza vs traditional dbs - MV
      - Only one table can be specified in the FROM clause of the create statement for a materialized view (MV).
      - The SELECT in the create statement for an MV cannot have a WHERE clause.
      - The columns in the projection list must be columns from the base table; expressions are not allowed.
      - External, temporary, system or clustered base tables can’t be used as the base table for a materialized view.
  • 16. Netezza vs traditional dbs - Sequence
      - The following sample sequence creation statement can be used to populate the id column in the employee table.
        create sequence seq_emp_id as integer
            start with 1 increment by 1
            minvalue 1 no maxvalue no cycle;
      - Since no maximum value is specified, the sequence can grow up to the largest value of the sequence type, which for the 32-bit integer type is 2,147,483,647.
      - The system is forced to flush cached sequence values in situations such as a system stop, a system or SPU crash, or some alter sequence statements. This creates gaps in the numbers generated by a sequence.
  • 17. Netezza Storage
      - Each disk in the appliance is partitioned into primary, mirror and temp (swap) partitions.
      - The primary partition of each disk stores user data such as database tables. The mirror partition stores a copy of the primary partition of another disk, so that it can be used in the event of a disk failure. The temp/swap partition stores data temporarily, for example when the appliance redistributes data while processing queries.
      - The logical representation of the data saved in the primary partition of each disk is called the data slice. When users create database tables and load data into them, the data is distributed across the available data slices.
      - The logical representation of data slices is called the data partition.
  • 18. Netezza Storage - Diagram
  • 19. Data Organization
      - When users create tables and store data in them, the data is stored in disk extents, the minimum unit of storage allocated on disk. Netezza distributes the data in extents across all the available data slices based on the distribution key specified when the table is created. A user can specify up to four columns for data distribution, ask for the data to be distributed randomly, or specify nothing at all.
      - When the user selects random distribution, the appliance uses a round-robin algorithm to distribute the data uniformly across all the available data slices.
      - The key is to make sure the data for a table is uniformly distributed across all the data slices so that there is no data skew. When data is spread across the data slices, all the SPUs in the system can be used to process any query, which in turn improves performance.
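The effect of the distribution choice in slide 19 on data skew can be illustrated with a small simulation. This is a sketch under stated assumptions, not Netezza's implementation: the slice count of 8 is hypothetical, CRC32 stands in for whatever hash function the appliance actually uses, and the helper names are made up for the example.

```python
from collections import Counter
from itertools import cycle
import zlib

NUM_SLICES = 8  # hypothetical number of data slices


def distribute_random(rows, num_slices=NUM_SLICES):
    """Round-robin placement, as described for 'distribute on random'."""
    counts = Counter()
    for _, slice_id in zip(rows, cycle(range(num_slices))):
        counts[slice_id] += 1
    return counts


def distribute_on_key(rows, key, num_slices=NUM_SLICES):
    """Hash placement on a distribution key (CRC32 is a stand-in hash)."""
    counts = Counter()
    for row in rows:
        slice_id = zlib.crc32(str(row[key]).encode()) % num_slices
        counts[slice_id] += 1
    return counts


def skew(counts, num_slices=NUM_SLICES):
    """Ratio of the fullest slice to an ideal even share (1.0 means no skew)."""
    total = sum(counts.values())
    return max(counts.values()) / (total / num_slices)


# Hypothetical employee rows: emp_id is unique, sex has only two values.
rows = [{"emp_id": i, "sex": "MF"[i % 2]} for i in range(10_000)]

random_skew = skew(distribute_random(rows))
emp_id_skew = skew(distribute_on_key(rows, "emp_id"))
sex_skew = skew(distribute_on_key(rows, "sex"))

print("random :", random_skew)
print("emp_id :", round(emp_id_skew, 2))
print("sex    :", sex_skew)
```

A two-valued column such as `sex` can occupy at most two slices, so most of the slices (and the SPUs that own them) sit idle, while a unique column or round-robin placement keeps every slice busy.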
  • 20. Netezza Transactions
      - By default, Netezza SQL statements are executed in auto-commit mode: the changes made by a statement take effect immediately after the statement completes, as if the transaction were complete.
      - If there are multiple related SQL statements that must all fail if any one of them fails, the user can use the BEGIN, COMMIT and ROLLBACK transaction control statements to group them. All SQL statements between a BEGIN statement and a COMMIT or ROLLBACK statement are treated as part of a single transaction.
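The all-or-nothing behaviour of BEGIN, COMMIT and ROLLBACK in slide 20 can be tried out without an appliance. The sketch below uses Python's built-in `sqlite3` module as a stand-in database, so it shows the general transaction pattern rather than Netezza-specific behaviour; the `department` table and the `insert_all_or_nothing` helper are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in auto-commit mode, so
# transactions only exist where BEGIN/COMMIT/ROLLBACK are issued explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "create table department (dept_id integer primary key, name text not null)"
)


def insert_all_or_nothing(conn, rows):
    """Insert every row in one transaction; undo all of them if any fails."""
    conn.execute("BEGIN")
    try:
        for dept_id, name in rows:
            conn.execute("insert into department values (?, ?)", (dept_id, name))
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        return False


ok = insert_all_or_nothing(conn, [(1, "sales"), (2, "finance")])
# The second row violates NOT NULL, so the first row of this batch is undone too.
failed = insert_all_or_nothing(conn, [(3, "hr"), (4, None)])
count = conn.execute("select count(*) from department").fetchone()[0]

print("first batch committed :", ok)
print("second batch committed:", failed)
print("rows in table         :", count)
```

Without the explicit transaction, each insert would be committed on its own and the row for department 3 would remain in the table after the failure.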
  • 21. Alternative to redo logs in Netezza
      - Netezza doesn’t use logs; all changes are made directly on the storage where user data is stored, which also helps performance.
      - Netezza maintains three additional hidden columns per table row (createxid, deletexid and row id). They store the id of the transaction that created the row, the id of the transaction that deleted the row, and a unique row id assigned to the row by the system.
  • 22. Best Practices
      - Define all constraints and relationships between objects. Even though Netezza doesn’t enforce them (other than the not null constraint), the query optimizer still uses these details to come up with an efficient query execution plan.
      - If the data for a column is known to have a fixed length, use char(x) instead of varchar(x). Varchar(x) uses additional storage, which becomes significant when dealing with terabytes of data, and it also affects query processing because more data has to be pulled in from disk.
      - Use NOT NULL wherever the data permits. This improves performance, because the appliance does not have to check for nulls, and reduces storage usage.
  • 23. Best Practices
      - Distribute on columns that have high cardinality and are often used in joins. It is best to distribute fact and dimension tables on the same column; this reduces data redistribution during queries and improves performance.
      - Create a materialized view on a small set of columns from a large table when those columns are often used by user queries.
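Why distributing fact and dimension tables on the same column reduces redistribution (slide 23) can be shown by counting the rows that would have to move before a join. This is an illustrative sketch only: CRC32 again stands in for the appliance's hash function, the slice count is hypothetical, and the `sales` table and helper names are invented for the example.

```python
import zlib

NUM_SLICES = 8  # hypothetical number of data slices


def slice_for(value, num_slices=NUM_SLICES):
    """Stand-in hash placement; CRC32 is an assumption, not Netezza's hash."""
    return zlib.crc32(str(value).encode()) % num_slices


def rows_to_move(fact_rows, fact_dist_col, join_col):
    """Count fact rows stored on a different slice than their join key hashes to."""
    return sum(
        1
        for row in fact_rows
        if slice_for(row[fact_dist_col]) != slice_for(row[join_col])
    )


# Hypothetical fact table: 1,000 sales rows referencing 50 customers.
# The customer dimension is assumed to be distributed on customer_id.
sales = [{"sale_id": i, "customer_id": i % 50} for i in range(1_000)]

same_key = rows_to_move(sales, "customer_id", "customer_id")
other_key = rows_to_move(sales, "sale_id", "customer_id")

print("fact distributed on customer_id, rows to move:", same_key)
print("fact distributed on sale_id, rows to move    :", other_key)
```

When both tables hash the join column to pick a slice, matching rows already sit together and the join can run locally on each SPU; with a different distribution column, most fact rows have to be shipped to another slice first.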
  • 24. Questions?