Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP). The AMPP architecture distributes data and query processing across multiple processing blades called S-Blades. Each S-Blade contains processors and memory, and is connected to disk arrays through a Database Accelerator card. This architecture allows Netezza to process large volumes of data in parallel across the S-Blades for high performance. Netezza also uses some tools and concepts that differ from traditional databases, such as not enforcing constraints (to improve load performance) and using hidden columns to track transaction details instead of redo logs.
2. Netezza Architecture
Netezza uses a proprietary architecture called Asymmetric Massively Parallel Processing (AMPP).
AMPP is based on the concept of Massively Parallel Processing (MPP), in which nothing (CPU, memory, storage) is shared.
The MPP is achieved through an array of S-Blades, which are servers in their own right, each running its own operating system and connected to its own disks.
The Netezza architecture has one unique hardware component, the Database Accelerator card, which is attached to the S-Blades.
An introduction to Netezza Vijaya Chandrika J
3. Hardware components of the Netezza
The following diagram provides a high-level logical schematic of the various components in the Netezza appliance.
The appliance uses a Linux OS.
Each S-Blade has 8 processor cores and 16 GB of RAM.
Each processor in the S-Blade is connected to disks in a disk array through a Database Accelerator card, which uses FPGA technology.
4. What are S-Blades
S-Blades are also called Snippet Blades; an array of them forms a Snippet Processing Array (SPA).
The S-Blade is a specialized processing board which combines the CPU processing power of a blade server with the query analysis intelligence of the Database Accelerator card.
The Netezza Database Accelerator card contains the FPGA query engines, memory, and I/O for processing the data from the disks where user data is stored.
(Diagram labels: 1 - S-Blade, 2 - Database Accelerator card)
5. How it works? An example
Assumptions: Assume an example data warehouse for a large retail firm in which one of the tables stores the details of all of its 10 million customers. Also assume that there are 25 columns in the table and the total length of each table row is 250 bytes.
Query: A user asks the application for, say, the Customer ID, Name and State of customers who joined the organization in a particular period, sorted by state and name.
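In SQL, the example query might look like this (table and column names are illustrative assumptions, not taken from the deck):

```sql
-- illustrative sketch: table and column names are assumed
select customer_id, name, state
from customer
where join_date between '2015-01-01' and '2015-03-31'
order by state, name;
```

Only 3 of the 25 columns, and only the rows matching the where clause, survive the projection and restriction; as the next slide explains, this is exactly the work Netezza pushes down close to the disks.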
6. High level steps
In Netezza the 10 million customer records will be stored fairly equally, in compressed form, across all the disks available in the disk arrays connected to the snippet processors in the S-Blades.
The Database Accelerator card in the snippet processor un-compresses the data, which includes all the columns in the table. It then removes the unwanted columns from the data, in this case 22 of the 25 columns, i.e. 220 of the 250 bytes in each row, applies the where clause to remove the unwanted rows, and passes the small amount of remaining data (just 30 bytes per qualifying row) to the CPU in the snippet processor. In traditional databases all these steps are performed in the CPU.
The CPU in the snippet processor performs tasks like aggregation, sum and sort on the data from the Database Accelerator card and passes the result to the host through the network.
7. The key takeaways
Netezza has the ability to process large volumes of data in parallel, and the key is to make sure that the data is distributed appropriately to leverage the massively parallel processing.
Implement designs in a way that most of the processing happens in the snippet processors; minimize communication between snippet processors, and keep data communication to the host minimal.
8. Netezza Tools
NzAdmin: a GUI-based administration tool.
The tool has a system view which provides a visual snapshot of the state of the appliance, including issues with any hardware components. The second view the tool provides is the database view, which lists all the databases, including the objects in them, the users and groups currently defined, active sessions, query history and any backup history. The database view also provides options to perform database administration tasks like the creation and management of databases and database objects, users and groups.
11. NZSQL
"nzsql" is the second tool that is most commonly used.
The "nzsql" command invokes the SQL command interpreter through which all Netezza-supported SQL statements can be executed.
nzsql -d testdb -u testuser -p password
This command will connect and create an "nzsql" session with the database "testdb" as the user "testuser", after which the user can execute SQL statements against the database. Also, as with all the Netezza commands, "nzsql" has a "-h" help option which displays details about the usage of the command.
12. System Objects
The appliance comes preconfigured with the following 3 user ids, which can't be modified or deleted from the system. They are used to perform all the administration tasks and hence should be used by a restricted number of users.
root: the superuser for the host system on the appliance, with all the access of a superuser on any Linux system.
nz: the Netezza system administrator Linux account that is used to run host software on Linux.
admin: the default Netezza SQL database administrator user, which has access to perform all database-related tasks against all the databases in the appliance.
13. Create Table
create table employee (
  emp_id integer not null,
  first_name varchar(25) not null,
  last_name varchar(25) not null,
  sex char(1),
  dept_id integer not null,
  created_dt timestamp not null,
  created_by char(8) not null,
  updated_dt timestamp not null,
  updated_by char(8) not null,
  constraint pk_employee primary key (emp_id),
  constraint fk_employee foreign key (dept_id) references department(dept_id)
    on update restrict on delete restrict
) distribute on random;
The statement will look familiar except for the "distribute on" clause. Also, there are no storage-related details, like the tablespace on which the table needs to be created or any bufferpool details; these are handled by the Netezza appliance.
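The example above distributes rows on random (round-robin). The alternative is to hash-distribute on one or more named columns; a minimal sketch, assuming the same employee table design:

```sql
-- illustrative sketch: hash-distribute on the primary key column
create table employee (
  emp_id integer not null,
  first_name varchar(25) not null,
  dept_id integer not null
) distribute on (emp_id);
```

Rows with the same emp_id hash value land on the same data slice, so choosing a distribution column that is frequently used in joins keeps the work co-located in the snippet processors and minimizes redistribution, in line with the takeaways on slide 7.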
14. Netezza vs traditional dbs
Netezza doesn’t enforce any of the constraints like the primary key or foreign
key when inserting or loading data into the tables for performance reasons. It
is up to the application to make sure that these constraints are satisfied by
the data being loaded into the tables. Even though the constraints are not
enforced by Netezza defining them will provide additional hints to the query
optimizer to generate efficient snippet execution code which in turn helps
performance.
Modifying the column length is only applicable to columns defined as varchar.
If a table gets renamed the views attached to the table will stop working
If a table is referenced by a stored procedure adding or dropping a column is
not permitted. The stored procedure needs to be dropped first before adding
or dropping a column and then the stored procedure needs to be recreated.
15. Netezza vs traditional dbs - MV
* Only one table can be specified in the FROM clause of the create statement
for a MV.
* There can be no WHERE clause in the SELECT of the create statement for a
MV.
* The columns in the projection list must be columns from the base table;
expressions are not allowed.
* External, temporary, system, or clustered base tables can’t be used as the
base table for a materialized view.
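A materialized view that satisfies these restrictions might look like this
(it reuses the employee table from the earlier example):

create materialized view mv_employee as
select emp_id, last_name, dept_id
from employee;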
16. Netezza vs traditional dbs - Sequence
The following is a sample sequence creation statement which can be used to
populate the id column in the employee table.
create sequence seq_emp_id as integer start with 1 increment by 1
minvalue 1 no maxvalue no cycle;
Since no max value is specified, the sequence can grow up to the largest
value of the sequence’s data type, which in this case is 2,147,483,647 for
the integer type.
The system is forced to flush cached sequence values in situations such as a
system stop, a system or SPU crash, or certain alter sequence statements;
this also creates gaps in the numbers generated by a sequence.
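Using the sequence to populate the id column of the earlier employee table
might look like this (the literal values are illustrative):

insert into employee (emp_id, first_name, last_name, sex, dept_id,
  created_dt, created_by, updated_dt, updated_by)
values (next value for seq_emp_id, 'John', 'Doe', 'M', 10,
  current_timestamp, 'loader', current_timestamp, 'loader');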
17. Netezza Storage
Each disk in the appliance is partitioned into primary, mirror, and temp/swap
partitions. The primary partition on each disk stores user data such as
database tables; the mirror stores a copy of the primary partition of
another disk so that it can be used in the event of a disk failure; and the
temp/swap partition stores data temporarily, for example when the
appliance redistributes data while processing queries. The logical
representation of the data saved in the primary partition of each disk is
called a data slice. When users create database tables and load data into
them, the rows are distributed across the available data slices. A logical
representation of data slices is called a data partition.
18. Netezza Storage - Diagram
19. Data Organization
When users create tables in databases and store data in them, the data is
stored in disk extents, the minimum unit of storage allocated on disk for
data. Netezza distributes the data extents across all the available
data slices based on the distribution key specified during table creation.
A user can specify up to four columns for data distribution, specify that
the data be distributed randomly, or specify no distribution clause at all
when creating the table.
When the user selects random as the option for data distribution, the
appliance uses a round-robin algorithm to distribute the data uniformly
across all the available data slices.
The key is to make sure that the data for a table is uniformly distributed
across all the data slices so that there is no data skew. By distributing data
across the data slices, all the SPUs in the system can be used to process
any query, which in turn improves performance.
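The distribution options described above correspond to the following create
table variants (table and column names here are illustrative):

create table sales (sale_id integer, customer_id integer, amount numeric(10,2))
  distribute on (customer_id);   -- hash distribution on up to four columns
create table audit_log (log_ts timestamp, msg varchar(200))
  distribute on random;          -- round-robin distribution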
20. Netezza Transactions
By default, Netezza SQL statements are executed in auto-commit mode, i.e. the
changes made by a SQL statement take effect immediately after the statement
completes, as if the transaction were committed.
If there are multiple related SQL statements that must all fail if any one of
them fails, the user can use the BEGIN, COMMIT, and ROLLBACK transaction
control statements to run them as a single transaction. All SQL statements
between a BEGIN statement and a COMMIT or ROLLBACK statement are treated as
part of one transaction.
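For example, moving an employee between departments could be made atomic like
this (the employee table is from the earlier example; the department emp_count
column is hypothetical):

begin;
update employee set dept_id = 20 where emp_id = 1;
update department set emp_count = emp_count + 1 where dept_id = 20;
update department set emp_count = emp_count - 1 where dept_id = 10;
commit;   -- or rollback; to undo all three statements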
21. Alternate for redo logs in Netezza
Netezza doesn’t use redo logs; all changes are made in place on the storage
where user data is stored, which also helps performance.
Instead, Netezza maintains three additional hidden columns per table row
(createxid, deletexid, and rowid), which store the id of the transaction that
created the row, the id of the transaction that deleted the row, and a unique
row id assigned to the row by the system.
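These hidden columns are not returned by select *, but they can be selected
explicitly by name, which is useful when investigating transaction behavior:

select rowid, createxid, deletexid, emp_id
from employee;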
22. Best Practices
Define all constraints and relationships between objects. Even though
Netezza doesn’t enforce them (other than the not null constraint), the query
optimizer still uses these details to come up with an efficient query
execution plan.
If a column is known to hold fixed-length values, use char(x)
instead of varchar(x). Varchar(x) uses additional storage, which becomes
significant when dealing with terabytes of data, and also slows query
processing, since additional data must be pulled in from disk.
Use NOT NULL wherever the data permits. This improves performance, since the
appliance does not have to check for the null condition, and reduces
storage usage.
23. Best Practices
Distribute on columns of high cardinality that are often used in joins. It
is best to distribute fact and dimension tables on the same column; this
reduces data redistribution during queries and improves performance.
Create materialized views on a small set of columns from a large table that
are often used by user queries.
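A sketch of distributing a fact and a dimension table on the shared join
column (table and column names here are illustrative):

create table dim_customer (
  customer_id integer not null,
  customer_name varchar(100)
) distribute on (customer_id);

create table fact_sales (
  sale_id bigint not null,
  customer_id integer not null,
  amount numeric(12,2)
) distribute on (customer_id);

-- joins on customer_id can now be resolved locally on each SPU,
-- avoiding data redistribution at query time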