Bigdata netezza-ppt-apr2013-bhawani nandan prasad
 


Big Data Netezza by Bhawani nandan Prasad (MBA, B.E. IT)

    Presentation Transcript

    • © 2013 AgreeYa Solutions. All rights reserved. www.agreeya.com
        • Netezza Overview
        • Netezza Architecture
        • Netezza Performance Tuning
        • Netezza Admin
        • April 10, 2013 – Bhawani Nandan Prasad – BI Practice Head
        • SMP – IIM Calcutta, MBA – Stratford University USA, B.E. (IT)
    • Agenda
        • Netezza Architecture
        • Netezza Connectivity
        • NZSQL
        • Data Types in Netezza
        • Metadata Tables
        • Types of Joins in Netezza
        • Data Loading and Unloading in Netezza
        • Data Distribution in Netezza
        • Transactions in Netezza
        • GROOM/Reclaim Process in Netezza
        • Zone Maps in Netezza
        • GENERATE STATISTICS in Netezza
    • Netezza Architecture
    • Netezza Architecture
    • Data Stream Processing in Netezza
    • Netezza Connectivity
    • NZSQL
        • Utility to interact with a Netezza database
        • Useful for writing multi-line queries and executing them for analysis or reporting purposes
        • Setting up the environment is a prerequisite before starting nzsql
        • Logging into nzsql opens the pg.log file and starts capturing all activities performed by the user on the corresponding DB
    • Data Types in Netezza
        DATATYPE    DESCRIPTION                                                         SIZE (bytes)
        BOOL        boolean, true/false                                                 1
        BPCHAR      char(length), blank-padded string, fixed storage length             VAR
        CHAR        single character                                                    1
        DATE        ANSI SQL date                                                       4
        FLOAT4      single-precision floating point number, 4-byte storage              4
        FLOAT8      double-precision floating point number, 8-byte storage              8
        INT1        -128 to 127, 1-byte storage                                         1
        INT2        -32 thousand to 32 thousand, 2-byte storage                         2
        INT4        -2 billion to 2 billion integer, 4-byte storage                     4
        INT8        ~18 digit integer, 8-byte storage                                   8
        INTERVAL    @ <number> <units>, time interval                                   12
        NCHAR       nchar                                                               VAR
        NUMERIC     numeric(precision, decimal), arbitrary precision number             19
        NVARCHAR    nvarchar                                                            VAR
        TIME        hh:mm:ss, ANSI SQL time                                             8
        TIMESTAMP   date and time                                                       8
        TIMETZ      hh:mm:ss, ANSI SQL time                                             12
        VARCHAR     varchar(length), non-blank-padded string, variable storage length   VAR
    • Metadata Tables in Netezza
        • Like any other database, Netezza provides metadata tables and views that give information about objects
        • Some of the frequently required metadata tables are:
        System Table Name     Usage
        _V_OBJECTS            Displays information about objects such as tables, views, external tables, synonyms and more
        _V_TABLES             Displays information about the tables present in Netezza
        _V_VIEW               Displays information about the views present in Netezza
        _V_RELATION_COLUMN    Displays information about the columns present in Netezza tables
    • Types of Joins in Netezza
        • Netezza internally processes joins in the following order:
            – Hash Join (in memory)
            – Hash Join (on disk)
            – Sort Merge Join
            – Nested Loop Join
            – Cross Join
        • Netezza has three main types of joins available:
            – Co-located Join
            – Re-distribution of data
            – Broadcasting of data
    • Data Loading and Unloading in Netezza
        • NZLOAD (only loading)
        • EXTERNAL TABLES (both loading and unloading)
        • CTAS (CREATE TABLE AS) (both loading and unloading)
        • nzsql with the -o option (only unloading)
    • Data Distribution in Netezza
        • Key factor in improving performance to a great extent
        • Backbone of the MPP architecture
        • Can be leveraged using the DISTRIBUTE ON clause after the CREATE TABLE statement
        • Of three types:
            – DISTRIBUTE ON (column name);
            – DISTRIBUTE ON RANDOM;
            – No DISTRIBUTE specification
        • Very useful while loading data into tables and fetching data from tables
    • Selecting a distribution key
        • Columns with many distinct values
        • Column or columns based on selection set
        • As few columns as possible
        • Data distributed on same key
        • DO NOT use Boolean keys
        • Checking distribution of data in table
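The advice to prefer columns with many distinct values and to avoid Boolean keys can be illustrated with a small simulation. This is a minimal sketch, not Netezza's actual hashing: the function names `distribute` and `skew`, the slice count of 8, the use of CRC32 as the hash, and the skew metric itself are all assumptions made for illustration.

```python
import zlib
from collections import Counter


def distribute(keys, n_slices=8):
    """Assign each key to a data slice using a stand-in hash (CRC32 is an assumption)."""
    counts = Counter(zlib.crc32(str(k).encode()) % n_slices for k in keys)
    return [counts.get(s, 0) for s in range(n_slices)]


def skew(per_slice):
    """Hypothetical skew metric: size of the largest slice relative to the average slice."""
    avg = sum(per_slice) / len(per_slice)
    return max(per_slice) / avg


# A key with many distinct values versus a Boolean key with only two.
customer_ids = list(range(100_000))
is_active = [i % 2 == 0 for i in range(100_000)]

by_id = distribute(customer_ids)
by_flag = distribute(is_active)

print("customer_id rows per slice:", by_id, "skew:", round(skew(by_id), 2))
print("is_active rows per slice:  ", by_flag, "skew:", round(skew(by_flag), 2))
```

With a Boolean key, at most two of the simulated slices can ever receive rows, so the remaining slices sit idle no matter how large the table grows.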
    • Collocated Join
    • Single Redistribute
    • Double Redistribute
    • Broadcast
    • Transactions in Netezza
        • Three basic columns are used to carry out transactions in Netezza:
            – Createxid
            – Deletexid
            – Rowid
        • Values in these columns keep changing with every transaction
        • These are hidden columns present in every table in Netezza
        • Also used to track deleted records in many cases
    • Transactions in Netezza contd..
    • Aborted Transaction in Netezza
    • Locking, Concurrency and Isolation
        • Netezza implements serializable transaction isolation for the highest level of consistency
        • Multi-versioning and serialization dependency checking
        • A user cannot explicitly lock a table in Netezza
        • The UPDATE clause works differently in Netezza
    • GROOM/Reclaim in Netezza
        • Logically deleted records reside in memory in Netezza in the following cases:
            – INSERT
            – UPDATE
            – Failed INSERT or aborted nzload operation
            – Failed UPDATE operation
        • Logically deleted records in Netezza:
            – Occupy extra disk space
            – Require extra time for a full table scan
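How the hidden createxid and deletexid columns lead to logically deleted records, and what a reclaim step then recovers, can be sketched with a toy in-memory model. This is a simplified illustration under stated assumptions, not Netezza's storage engine: the `Table` and `Row` classes are hypothetical, rowid is left out, a deletexid of 0 is used to mean "not deleted", and UPDATE is modelled as a logical delete followed by an insert.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Row:
    createxid: int   # transaction that created this row version
    deletexid: int   # 0 means "not deleted" in this simplified model
    data: dict


class Table:
    def __init__(self):
        self.rows = []            # every stored row version, live or logically deleted
        self._xid = count(1)      # toy transaction id generator

    def visible(self):
        """Row versions a query would see: those not logically deleted."""
        return [r for r in self.rows if r.deletexid == 0]

    def insert(self, data):
        self.rows.append(Row(next(self._xid), 0, dict(data)))

    def delete(self, pred):
        xid = next(self._xid)
        for r in self.visible():
            if pred(r.data):
                r.deletexid = xid  # logical delete: the version stays in storage

    def update(self, pred, changes):
        xid = next(self._xid)
        for r in self.visible():
            if pred(r.data):
                r.deletexid = xid  # old version is logically deleted...
                self.rows.append(Row(xid, 0, {**r.data, **changes}))  # ...new one inserted

    def groom(self):
        """Physically drop logically deleted versions; return how many were reclaimed."""
        before = len(self.rows)
        self.rows = self.visible()
        return before - len(self.rows)


t = Table()
t.insert({"id": 1, "qty": 5})
t.insert({"id": 2, "qty": 7})
t.update(lambda d: d["id"] == 1, {"qty": 6})
t.delete(lambda d: d["id"] == 2)

stored_before = len(t.rows)
visible_before = len(t.visible())
reclaimed = t.groom()
print(stored_before, visible_before, reclaimed, len(t.rows))
```

In this model a full scan before the reclaim step has to read every stored version even though only some are visible, which mirrors the two costs listed on the slide.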
    • GROOM/Reclaim contd..
        • The GROOM/RECLAIM process recovers this unused disk space in Netezza
        • The GROOM command supports operations for:
            – Single table
            – All tables in one database
            – All tables in all databases
        • Benefits of GROOM:
            – Permits shared access to the target table
            – Can be interrupted without leaving the target table locked
            – Refreshes materialized views created on the base table
        • Syntax:
    • Zone Maps in Netezza
        • Zone Maps are similar to indexes in any other DB
        • Created on integer, date and timestamp fields
        • Created and refreshed automatically on:
            – GENERATE STATISTICS
            – NZLOAD
            – INSERT or UPDATE
            – GROOM operation
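The idea behind a zone map can be sketched as a per-extent minimum and maximum that lets a range query skip extents that cannot contain matching rows. The code below is a minimal illustration under that assumption; the function names `build_zone_map` and `range_scan`, the extent size of 100 values, and the sample data are hypothetical and do not reflect Netezza's real extent layout.

```python
def build_zone_map(values, extent_size):
    """Split a column into fixed-size extents and record (min, max) for each one."""
    extents = [values[i:i + extent_size] for i in range(0, len(values), extent_size)]
    zone_map = [(min(e), max(e)) for e in extents]
    return extents, zone_map


def range_scan(extents, zone_map, lo, hi):
    """Return values in [lo, hi] and the number of extents that had to be read."""
    hits, scanned = [], 0
    for extent, (zmin, zmax) in zip(extents, zone_map):
        if zmax < lo or zmin > hi:
            continue              # the extent cannot contain a match, so skip it
        scanned += 1
        hits.extend(v for v in extent if lo <= v <= hi)
    return hits, scanned


# An integer column loaded in ascending order, 100 values per extent.
order_ids = list(range(1000))
extents, zone_map = build_zone_map(order_ids, extent_size=100)

hits, scanned = range_scan(extents, zone_map, lo=250, hi=260)
print(f"matched {len(hits)} rows, read {scanned} of {len(extents)} extents")
```

The benefit depends on data order: if the same values were loaded in random order, most extents would span nearly the full value range and few could be skipped.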
    • GENERATE STATISTICS in Netezza
        • The Netezza optimizer relies on GENERATE STATISTICS to gather statistics about tables
        • GENERATE STATISTICS collects statistics about each table's columns:
            – Minimum and maximum values on character data
            – Maximum and average length on varchar
            – NULL counts
            – Updates the system catalog
        • GENERATE STATISTICS can be collected at three levels:
            – Database level
            – Table level
            – Column level
        • Can also be collected using the Nzadmin tool
    • GENERATE STATISTICS contd..
        • The Netezza system generates two basic statistics, table row count and min-max values for character columns, while doing:
            – INSERT
            – UPDATE
            – CTAS (GENERATE STATISTICS runs automatically if the row count is >= 10k)
            – Nzload
            – GROOM
            – TRUNCATE TABLE
        • It is important to generate statistics for:
    • SPU Failover Activity
        • Disk timing: shows which SPU is performing slowly
        • Step 1) Pause the system
            – nzsql>> nzsystem pause
        • Step 2) Confirm that the system is paused
            – nzsql>> nzstate
        • Step 3) Fail over the SPU
            – nzsql>> nzspu failover -id <SPU ID>
        • Step 4) Resume the system
            – nzsql>> nzsystem resume
    • Genstats Command
        • Generates statistics on any database table(s) for which the statistics are not currently 100% "up-to-date".
        • The optimizer uses statistics to guide its decisions on how best to execute a query. The more reliable and up-to-date the statistics are, the more accurate the optimizer's decisions are likely to be.
    • Backup & Restore
        • Types of backup:
            – Full backup
            – Differential backup
            – Incremental differential backup
            – Cumulative differential backup
        • Elaborative example
    • Backup Command
        • The backup command/scripts are used for backing up tables from NPS.
        • The backup command / nz_backup script must be run locally (on the NPS host being backed up).
        • These commands/scripts process a single table, multiple tables, or an entire database.
        • The data format that is used can be one of:
            – ascii: very portable.
            – binary: Netezza's compressed/internal format, which is much faster and results in significantly smaller backup sets.
            – gzip: ascii, which is gzipped on the NPS host.
        • The data is written to (or read from) disk files or named pipes. If pipes are used, another application is used to produce the data.
        • These scripts concern themselves only with the DATA itself. When backing up a table, the DDL is not included.
    • Backup Command Examples
        • Full backup:
            – /nz/kit/bin/nzbackup -db CIDB_PRD -dir /back_folder
            – nohup nzbackup -db CIDB_PRD -u admin -dir /back_folder
        • Differential backup:
            – /nz/kit/bin/nzbackup -db CIDB_PRD -u admin -dir /back_folder -differential -v
        • Schema-only backup:
            – nohup nzbackup -db CIDB_PRD -u admin -dir /back_folder -schema-only
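The example commands above can be assembled programmatically, for instance from a scheduling script. The sketch below only builds the argument list and prints it; it does not run anything. The helper name `nzbackup_args` is hypothetical, and it uses only the flags that appear on the slide (`-db`, `-dir`, `-u`, `-differential`, `-v`). The database name and directory are the slide's own sample values.

```python
import shlex


def nzbackup_args(db, directory, user=None, differential=False, verbose=False):
    """Build an nzbackup argument list using only flags shown in the slide examples."""
    args = ["nzbackup", "-db", db, "-dir", directory]
    if user:
        args += ["-u", user]
    if differential:
        args.append("-differential")
    if verbose:
        args.append("-v")
    return args


full = nzbackup_args("CIDB_PRD", "/back_folder", user="admin")
diff = nzbackup_args("CIDB_PRD", "/back_folder", user="admin",
                     differential=True, verbose=True)

# Print the commands for review instead of executing them.
print(shlex.join(full))
print(shlex.join(diff))
```

Keeping the arguments as a list rather than a single string avoids shell-quoting problems if a directory name ever contains spaces; the list can be passed to `subprocess.run` on the NPS host once it has been reviewed.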
    • Restore Command
        • The restore command/scripts are used to restore tables to NPS.
        • The restore command / nz_restore script must be run locally (on the NPS host being restored).
        • These commands/scripts process a single table, multiple tables, or an entire database.
        • The data format that is used can be one of:
            – ascii: very portable.
            – binary: Netezza's compressed/internal format, which is much faster and results in significantly smaller backup sets.
            – gzip: ascii, which is gzipped on the NPS host.
        • The data is written to (or read from) disk files or named pipes. If pipes are used, another application is used to produce the data.
        • These scripts concern themselves only with the DATA itself. When backing up a table, the DDL is not included.
    • Restore Command
        • Syntax: nzrestore [-db database] [-dir directory] [-connector name] [-connectorArgs] [-schema-only] [-users] [-v] [-rev] [-h] [-increment] [-mode] [-backupset ID] [-lockdb]
        • Here:
            – -dir specifies the backup root directory when using the file system connector
            – -connector specifies the connector type: File System, Veritas or Tivoli
            – NOTE: if -connector is omitted, it defaults to the File System connector
            – -connectorArgs specifies DATASTORE_SERVER and DATASTORE_POLICY when using the Veritas connector, or the TSM password when using the Tivoli connector; these may optionally be specified as environment variables
            – If -increment is omitted, defaults to full backup
            – -mode specifies REST/NEXT mode
            – -lockdb specifies locking of the database during restore [TRUE/FALSE]
    • Tape Backup Command
        • Similar to the backup command; the syntax is also similar
        • The command differs only in the destination, which is "Tape" instead of a file location as in a normal backup.
        • Example: nzbackup -v -db EDW_STANDBY -dir /migration/TF12_EDW_STANDBY_Tape_Backup/tape1 /migration/TF12_EDW_STANDBY_Tape_Backup/tape2 /migration/TF12_EDW_STANDBY_Tape_Backup/tape3 /migration/TF12_EDW_STANDBY_Tape_Backup/tape4 -streams 4
    • Netezza Performance Server
    • Defaults in Netezza
        • Default users
            – nz (Linux OS user)
            – admin (NPS database super-user with full access to all system functions and objects)
            – root (Linux root user)
        • System defaults
            – system database
            – public group
            – 5480 – ODBC port
        • Note:
            – By default, a newly created user is added to the public group
            – A user can't be deleted from the public group
            – Groups, users and databases share a common namespace, so group names, user names and database names must be unique.
    • Managing Users
        • By default, a user has access only to system views, allowing them to retrieve a list of used database objects.
        • SQL for creating a user: CREATE USER user_name WITH PASSWORD 'string' [options]
        • SQL for altering user credentials/privileges: ALTER USER user_name WITH [options]
        • SQL for deleting a user: DROP USER user_name
        • Note: options can be row limit, group name, validity, session timeout, query timeout, default priority, maximum priority, resource group
        • nzsql command to list users
            – SYSTEM(ADMIN) => \du
        • nzsql command to list a user's permissions
            – SYSTEM(ADMIN) => \dpu
    • Managing Groups
        • By default, the group created is the public group.
        • By default, a user is added to the public group.
        • SQL for creating a group: CREATE GROUP group_name WITH [options]
        • SQL for altering group credentials/privileges: ALTER GROUP group_name [ADD|OWNER|RENAME|WITH]
        • SQL for deleting a group: DROP GROUP group_name
        • Note: options can be row limit, session timeout, query timeout, default priority, maximum priority, resource limit, usernames
        • nzsql command to list groups
            – SYSTEM(ADMIN) => \dg
        • nzsql command to list a group's permissions
            – SYSTEM(ADMIN) => \dpg
    • Permissions
        • Types of permission:
            – Object permissions [11 nos.]: List, Select, Insert, Delete, Update, Alter, Drop, Truncate, Lock, Abort, Load, Genstat
            – Admin permissions [13 nos.]: Database, Temporary Table, External Table, System Table, View, User, Group, Create, Backup, Restore, Reclaim, Hardware, System
        • Scope of permission: applicable only to object permissions, in two classes:
            – Local scope: applicable when logged into a particular database
            – Global scope: applicable when logged into the system database
        • By default, admin permissions are global in nature
    • Object Permissions
        • Object permissions granted in the system database are inherited by all other databases, i.e. they have global scope
        • Object permissions granted within a database are local to that database, i.e. they have local scope
        • Object permissions are additive in nature, i.e. the effective permissions on an object = User Permissions + Group Permissions + Public Permissions
        • SQL for granting an object permission: GRANT object_permission ON object TO {PUBLIC | GROUP group_name | user_name} [WITH GRANT OPTION]
        • SQL for revoking an object permission: REVOKE object_permission ON object FROM {PUBLIC | GROUP group_name | user_name}
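The additive rule stated above (user permissions plus group permissions plus public permissions) amounts to a set union. The following is a minimal sketch of that rule only; the function name `effective_permissions`, the dictionary layout, and the sample users, groups and grants are hypothetical and are not read from any Netezza catalog.

```python
def effective_permissions(user, user_grants, group_grants, memberships, public_grants):
    """Union of the user's own grants, the grants of each of their groups, and PUBLIC grants."""
    perms = set(public_grants)
    perms |= user_grants.get(user, set())
    for group in memberships.get(user, ()):
        perms |= group_grants.get(group, set())
    return perms


# Hypothetical grants on a single table.
public_grants = {"List"}
user_grants = {"alice": {"Select"}}
group_grants = {"etl": {"Insert", "Update", "Load"}}
memberships = {"alice": ["etl"], "bob": []}

alice = effective_permissions("alice", user_grants, group_grants, memberships, public_grants)
bob = effective_permissions("bob", user_grants, group_grants, memberships, public_grants)

print(sorted(alice))
print(sorted(bob))
```

Because the rule is a union, revoking a permission from a user has no visible effect while one of the user's groups, or PUBLIC, still holds the same permission.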
    • Admin Permissions
        • Admin permissions are global in scope
        • SQL for granting an admin permission: GRANT admin_permission TO {PUBLIC | GROUP group_name | user_name} [WITH GRANT OPTION]
        • SQL for revoking an admin permission: REVOKE admin_permission FROM {PUBLIC | GROUP group_name | user_name}
        • nzsql command to list a user's permissions
            – SYSTEM(ADMIN) => \dpu
        • nzsql command to list a group's permissions
            – SYSTEM(ADMIN) => \dpg
    • Listing All Permissions to User/Group
    • Viewing the Distribution & Skew
        • In the CLI, at the Linux prompt
            – $ nz_skew utility
        • At the nzsql prompt
            – nzsql => SELECT datasliceid, COUNT(datasliceid) AS "ROWS" FROM MB_STU_PRE GROUP BY datasliceid ORDER BY "ROWS";
        • In the GUI
            – In nzAdmin -> Tools -> Table Skews
        • NOTE: to change the distribution key
            – CREATE TABLE table_name AS (select clause) is used with a distribution key
            – If a distribution clause is not specified in the CTAS, the parent table's distribution key column is used by default.
        • The default threshold to display the skew of a table is 100 MB.
    • Log Files in Netezza
        • All the log files in Netezza are in the directory /nz/kit/log/
        • The various logs created are:
            – Alcapp, alcloader, waitForAlcapp
            – backupsvr, bnrmgr, restoresvr
            – bootsvr, dbos
            – Clientmgr, eventmgr, sessionmgr, sysmgr
            – fcommrtx
            – gencErrors, hostStatsGen
            – Loadmgr, nzloadTmpLogs
            – Plans, planshist, postgres
            – sendMail
            – ssgdba
            – startupsvr, statsSvr
    • Priority
        • Priorities are:
            – Job priority
            – Session priority
        • Priority values are defined for a user, a group, or as the system default
        • The system determines which priority value to use when the user connects to the host and executes SQL commands
        • There are two more, SYSTEM CRITICAL (highest) and SYSTEM BACKGROUND (lowest), which are not visible to the user.
        • The possible priorities are critical, high, normal, low, or none.
        • The default priority for groups and for the system is none.
        • If priorities are not set, user sessions run at normal priority.
    • Priority (Contd…)
        • The syntax to set the system priority is:
            – SET SYSTEM DEFAULT [SESSIONTIMEOUT | ROWSETLIMIT | QUERYTIMEOUT] TO [number | UNLIMITED]
            – SET SYSTEM DEFAULT [DEFPRIORITY | MAXPRIORITY] TO [CRITICAL | HIGH | NORMAL | LOW | NONE]
        • The syntax to show the system default priority is:
            – SHOW SYSTEM DEFAULT MAXPRIORITY;
            – SHOW SYSTEM DEFAULT DEFPRIORITY;
        • The syntax to create a group and set its default priority is:
            – CREATE GROUP group_name WITH DEFPRIORITY TO HIGH;
        • The syntax to create a user and set its default priority is:
            – CREATE USER user_name WITH DEFPRIORITY TO CRITICAL;
    • Priority (Contd…)
        • The syntax to change the priority of a session:
            – ALTER SESSION [<session_id>] SET PRIORITY TO <priority>;
            – Example: nzsql=> ALTER SESSION 21664 SET PRIORITY TO HIGH;
        • The syntax to change the priority of a session using nzsession:
            – nzsession priority -high -u nz -pw password -id 21664;
    • Migration
        • Based on the activities involved, migration is classified as:
            – Data migration
            – Environment migration
            – Code migration
    • Data Migration
        • Data migration is done in two different ways:
        • nz_migrate utility
            – Syntax: nz_migrate -shost <name/IP> -thost <name/IP> -sdb <dbname> -tdb <dbname> [optional args]
            – This script must be invoked from the source machine.
            – Optionally, this script can automatically create the target database and objects via the options -CreateTargetTable and -CreateTargetDatabase
        • By means of an external table
            – The database/table is converted to an external table on the source
            – The external table is converted back to a database/table on the target
    • Environment Migration
        • Objects created by a developer in the personal database EDW_UT are moved to the development database EDW_SIT
        • All the objects present in the development database EDW_SIT are moved to the production database EDW_PROD
        • This is done through the customized script "export/home/nz/psm/scripts/dlc/object.in.work.prod.bash"
        • NOTE: there is also a script "export/home/nz/psm/scripts/dlc/object.in.work.create.bash" which promotes objects from the personal database EDW_UT to EDW_SIT for testing purposes
    • Code Migration
        • This is a manual method in which the DDL of objects such as tables and views is obtained using nz_ddl_table and nz_ddl_view, and the objects are created from this DDL.
        • The privileges (ACL) of an object are obtained through the nz_get_acl utility, which is used to reproduce the ACL on the newly created object.
        • This is done through a customized script.
    • Events and Alerts
        • By default there are a total of 40 events for which an alert is raised. They are:
            1. CPUcoresOK_em_NzCS
            2. CPUcoresReduced_em_NzCS
            3. HostNoLongerOnline
            4. HostNotOnline
            5. MemFlt_rc_NzCS
            6. NzDAC_QDR_fault_em_NzCS
            7. Regen_em_NzCS
            8. RunAwayQuery_TF12
            9. RunAway_rc_NzCS
            10. RunAway_rc_monitor
            11. SystemOnline
            12. coreRequest_em_NzCS
            13. dFPGA_em_NzCS
            14. dFPGA_em_NzCS_r
            15. diskFull_8x_em_NzCS
            16. diskFull_90_em_NzCS
            17. diskFull_95_em_NzCS
            18. histCapture_em_NzCS
            19. histLoad_em_NzCS
            20. hwFlt_FanOrPwr_em_NzCS
    • Monitoring & Gathering Scripts
        • Monitoring of the system is done through customized/system scripts which execute daily in the following ways:
            – Hourly performance data for each server is generated and mailed
            – Consolidated reports on various daily DBA activities (such as backup, restore, genstat, reclaim etc.) for all the servers are generated and mailed
            – A complete health report is generated by nz_health and mailed
            – A report of all the log activities is generated and mailed
            – SPU performance is checked by the disk_timing script every 8 hours.
    • Thank You
        • Bhawani Nandan Prasad
        • BI & Analytics Practice Head
        • Bhawani.prasad@agreeya.net
        • +91 9717570222