Download File


Published on

  • Be the first to comment

  • Be the first to like this

No Downloads
Total Views
On Slideshare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Download File

  1. 1. BID209: Best Practices for Implementing with Sybase IQ Mark D. Mumy Principal Systems Consultant [email_address] July 19, 2010
  2. 2. About This Presentation <ul><li>Setup and Administration </li></ul><ul><ul><li>Hardware Configuration </li></ul></ul><ul><ul><li>Creating the Database </li></ul></ul><ul><ul><li>IQ Server Settings </li></ul></ul><ul><ul><li>Connectivity </li></ul></ul><ul><ul><li>Maintenance Tasks </li></ul></ul><ul><ul><li>Memory Usage </li></ul></ul><ul><li>Development and Design </li></ul><ul><ul><li>Data Model Recommendations </li></ul></ul><ul><ul><li>Database Programming </li></ul></ul><ul><ul><li>Data Manipulation </li></ul></ul>
  3. 3. Target Audience <ul><li>Database Administrators </li></ul><ul><ul><li>All Levels </li></ul></ul><ul><li>Query Developers </li></ul><ul><ul><li>All Levels </li></ul></ul>
  4. 4. For This Presentation <ul><li>Let’s Keep It Interactive </li></ul><ul><ul><li>I will entertain questions on a subject </li></ul></ul><ul><ul><li>Would be happy to speak to you offline, if desired </li></ul></ul><ul><li>No Question is too Basic </li></ul><ul><ul><li>Chances are others may have the same question </li></ul></ul><ul><li>Open Question Forum at the end </li></ul><ul><ul><li>Time Permitting </li></ul></ul>
  5. 5. Setup and Administration <ul><li>Hardware Configuration </li></ul><ul><li>Creating the Database </li></ul><ul><li>IQ Server Settings </li></ul><ul><li>Connectivity </li></ul><ul><li>Maintenance Tasks </li></ul><ul><li>Memory Usage </li></ul>
  6. 6. Hardware Configuration –Drive Arrays <ul><li>Volume Manager </li></ul><ul><ul><li>Not a necessary or recommended component for IQ </li></ul></ul><ul><ul><li>Can use MPXIO on Sun Platform to provide controller failover, dynamic multi-pathing, and world wide naming convention </li></ul></ul><ul><ul><li>Additional overhead and software that has no value add to an IQ installation </li></ul></ul><ul><ul><li>Acceptable for an IQ single node operation where hardware does not have the ability to apply RAID to the disk devices </li></ul></ul>
  7. 7. Hardware Configuration – Drive Arrays <ul><li>RAID Level </li></ul><ul><ul><li>Recommend using RAID 5 as a blend between performance, protection, and cost </li></ul></ul><ul><li>Recommend using raw devices (a must for multi-node IQ) </li></ul><ul><li>For details on drive array specifications and configurations see the Sun Reference Architecture Whitepapers and EMC Whitepapers on array configurations with IQ </li></ul>
  8. 8. Hardware Configuration – Memory <ul><li>How Much RAM for IQ </li></ul><ul><ul><li>As much as possible! </li></ul></ul><ul><ul><li>Most systems should have at least 2 GB </li></ul></ul><ul><ul><li>Rough rule of thumb is 2 GB per CPU </li></ul></ul><ul><ul><li>Don’t forget additional RAM for each load being performed </li></ul></ul><ul><ul><li>Shared memory is only needed on 32-bit systems and some 64-bit systems to wire memory into RAM </li></ul></ul><ul><ul><ul><li>HP-UX 64-bit should not use –iqsmem </li></ul></ul></ul>
  9. 9. Hardware Configuration – Processors and Disk Controllers <ul><li>Optimal CPU configuration </li></ul><ul><ul><li>Queries: 1 CPU per active query – more if queries are complex and can be run in parallel </li></ul></ul><ul><ul><li>Loads: 1 CPU per HG index plus 1 CPU per 2-5 columns being loaded </li></ul></ul><ul><li>Disk Controllers </li></ul><ul><ul><li>1 fiber controller per 5 write CPU’s </li></ul></ul><ul><ul><li>1 fiber controller per 10 read CPU’s </li></ul></ul><ul><ul><li>In mixed mode operation use the 5:1 ratio </li></ul></ul><ul><ul><li>Recommend a minimum of 2 controllers </li></ul></ul>
  10. 10. Creating The Database – Creation Options <ul><li>Creation Options </li></ul><ul><ul><li>Block Size and Page Size </li></ul></ul><ul><ul><ul><li>Minimum should be 128K page size (8K block) </li></ul></ul></ul><ul><ul><ul><li>Use 256K page for larger databases (>200 GB or so) </li></ul></ul></ul><ul><ul><ul><li>Larger the page size the more RAM that will be required </li></ul></ul></ul><ul><ul><li>CASE IGNORE vs. CASE RESPECT </li></ul></ul><ul><ul><ul><li>Use CASE RESPECT whenever possible </li></ul></ul></ul><ul><ul><ul><li>Removes case comparison steps and improves performance </li></ul></ul></ul><ul><ul><li>Java in the Database </li></ul></ul><ul><ul><ul><li>Install if Java will be needed in the database </li></ul></ul></ul><ul><ul><ul><li>No impact if installed and not used </li></ul></ul></ul>
  11. 11. Creating The Database – File Placement <ul><li>Use relative links – makes it easier to relocate files </li></ul><ul><li>Use symbolic links for IQ MAIN and IQ TEMP </li></ul><ul><ul><li>/IQM_devs/IQ_MAIN_00 </li></ul></ul><ul><ul><li>/IQM_devs/IQ_TEMP_00 </li></ul></ul><ul><li>Place the transaction log and transaction log mirror on significantly large file system </li></ul><ul><ul><li>Recommend 5 GB of filesystem space per IQ node for database related storage </li></ul></ul><ul><ul><ul><li>Transaction logs, IQ MSG, scripts, catalog file, etc </li></ul></ul></ul>
  12. 12. Creating The Database – Files Locations <ul><li>Filename rules </li></ul><ul><ul><li>Catalog: DATABASE_NAME.db </li></ul></ul><ul><ul><li>Transaction log: DATABASE_NAME.tlog </li></ul></ul><ul><ul><li>IQ MSG: DATABASE_NAME.iqmsg </li></ul></ul><ul><ul><li>IQ Main Store: DATABASE_NAME_iqmain_000 </li></ul></ul><ul><ul><ul><li>000 would be an incrementing number </li></ul></ul></ul><ul><ul><li>IQ Temp Store: DATABASE_NAME_iqtmp_000 </li></ul></ul><ul><ul><ul><li>000 would be an incrementing number </li></ul></ul></ul><ul><ul><li>IQ Configuration File: params.cfg (default) </li></ul></ul>
  13. 13. Creating The Database – Top Level Directory <ul><li>Use for writer and readers (multi-node configuration) to communicate </li></ul><ul><li>IQ 12.4.3 and earlier </li></ul><ul><ul><li>Must be shared across all hosts </li></ul></ul><ul><ul><li>Generally done via NFS, but some customers use cluster based filesystems (Veritas Cluster Manager) </li></ul></ul><ul><li>IQ 12.5 </li></ul><ul><ul><li>No longer a shared filesystem! </li></ul></ul><ul><ul><li>Should be local filesystem on each host that is protected accordingly </li></ul></ul><ul><ul><li>The name and path can be unique for each node </li></ul></ul><ul><ul><li>No need for NFS or cluster based filesystems </li></ul></ul>
  14. 14. IQ Server Settings – Query Performance <ul><li>FORCE_NO_SCROLL_CURSORS </li></ul><ul><ul><li>Should always be set to ON </li></ul></ul><ul><ul><li>Very few applications require this to be OFF </li></ul></ul><ul><ul><li>Can improve query performance </li></ul></ul>
  15. 15. IQ Server Settings – Query Plans <ul><li>Query Plan Settings to Provide Optimal Query Information to DBA’s and Sybase Engineering </li></ul><ul><ul><li>set temporary option query_plan='on'; </li></ul></ul><ul><ul><li>set temporary option query_plan_after_run='on'; </li></ul></ul><ul><ul><li>set temporary option query_timing='on'; </li></ul></ul><ul><ul><li>set temporary option query_detail='on'; </li></ul></ul><ul><ul><li>set temporary option dml_options10='on'; </li></ul></ul><ul><ul><li>set temporary option query_plan_as_html='on'; </li></ul></ul><ul><ul><li>set temporary option Query_Name = ‘Query Name‘ </li></ul></ul><ul><li>Should not be set globally as the IQ MSG file will grow rapidly </li></ul>
  16. 16. IQ Server Settings – Storage <ul><li>Append_Load </li></ul><ul><ul><li>Can be used to improve load performance </li></ul></ul><ul><ul><li>Will not reuse Row ID’s or the space occupied by those Row ID’s </li></ul></ul><ul><ul><li>Great for systems where large, contiguous chunks of data are deleted </li></ul></ul><ul><ul><li>May not be good if random rows are deleted as it can lead to fragmentation and allocated, but unused space </li></ul></ul>
  17. 17. IQ Server Settings – Storage Continued <ul><li>Disk_Striping </li></ul><ul><ul><li>If ON, IQ will stripe writes to all available devices </li></ul></ul><ul><ul><li>If OFF, the first device must be full before the next is used </li></ul></ul><ul><li>Disk_Striping_Packed </li></ul><ul><ul><li>If ON, it forces better space usage and less fragmentation </li></ul></ul><ul><ul><li>Fragmentation is indicated when out of space messages are returned but the main dbspace used is less than 100% full </li></ul></ul><ul><ul><li>Allows Adaptive Server IQ to better utilize small pieces of unused space that remain after compression </li></ul></ul>
  18. 18. IQ Server Settings – Data Loads <ul><li>LOAD_MEMORY_MB </li></ul><ul><ul><li>Set to 0 (default) on systems with enough RAM </li></ul></ul><ul><ul><li>Set to <= 500 on systems where RAM is tight or there are simultaneous loads taking place </li></ul></ul><ul><ul><li>Rough rule of thumb for load memory requirements </li></ul></ul><ul><ul><ul><li>Run: select tablewidth( ‘tablename’ ) </li></ul></ul></ul><ul><ul><ul><li>Divide that value by 2 </li></ul></ul></ul><ul><ul><ul><li>New number is rough calculation of MB needed to load that table </li></ul></ul></ul><ul><ul><li>Remember that each load requires separate memory </li></ul></ul>
  19. 19. Network Connectivity – TCP/IP Ports <ul><li>When configuring a TCP/IP port for IQ to listen on, use something other than the default of 2638 </li></ul><ul><li>ASA and IQ use port 2638 as a broadcast listener </li></ul><ul><li>Multiple servers on the same host share the default port for broadcast listening, but not for client network traffic </li></ul><ul><li>Broadcast listening takes precedence over client network traffic as it is the first port started </li></ul><ul><li>Easier and safer to assume that 2638 is in use already by ASA/IQ and to use another port for client network traffic </li></ul><ul><li>Many things must be considered if an IQ server is to use port 2638 for client network traffic on a host with more than one IQ server (server start order, multiplex synchronization, etc) </li></ul><ul><li>See for details </li></ul>
  20. 20. Client Access and Network Connectivity <ul><li>Different internal environments and settings for ODBC, JDBC, and Open Client connections </li></ul><ul><li>JDBC and ODBC are recommended </li></ul><ul><ul><li>JConnect is Sybase’s implementation of JDBC </li></ul></ul><ul><ul><li>Use the JDBC 2 driver (JConnect 5.2/5.5) </li></ul></ul><ul><ul><ul><ul><li>com.sybase.jdbc2.jdbc.SybDriver </li></ul></ul></ul></ul><ul><li>Open Client </li></ul><ul><ul><li>Use with caution </li></ul></ul><ul><ul><li>Most applications written using Open Client expect an ASE server </li></ul></ul>
  21. 21. Client Access and Network Connectivity – Open Client vs. ODBC <ul><li>See Chapter 32 of the ASA Users Manual as well as the ASA Reference Section “Transact-SQL and SQL/92 compatibility options” for complete list of differences </li></ul><ul><li>If writing stored procedures or embedded application code, make sure to explicitly make settings for compatibility as these options will get set to different values for Open Client vs. ODBC connections </li></ul><ul><li>ALLOW_NULLS_BY_DEFAULT </li></ul><ul><li>QUOTED_IDENTIFIER </li></ul><ul><li>STRING_RTRUNCATION </li></ul><ul><li>ANSI_BLANKS </li></ul><ul><li>ANSINULL </li></ul><ul><li>CHAINDED </li></ul><ul><li>FLOAT_AS_DOUBLE </li></ul>
  22. 22. Client Access and Network Connectivity – Open Client vs. ODBC <ul><li>ODBC and JDBC connectivity support all IQ datatypes </li></ul><ul><li>Open Client does not support all IQ datatypes </li></ul><ul><ul><li>Unsigned integer </li></ul></ul><ul><ul><li>Depending on version of Open Client, bigint and unsigned bigint should be OK (OC 12.5 and later) </li></ul></ul>
  23. 23. Client Access and Network Connectivity <ul><li>AutoPreCommit Within ODBC </li></ul><ul><ul><li>Set registry setting AutoPreCommit to Y </li></ul></ul><ul><ul><li>Forces applications to issue a COMMIT before each query </li></ul></ul><ul><ul><li>Go to the registry and update the corresponding Sybase Data Source Name (DSN) created, by adding a new value 'AutoPreCommit’ with a value of ‘Y' </li></ul></ul><ul><ul><ul><li>HKEY_LOCAL_MACHINE/SOFTWARE/ODBC/ODBC.INI/{DSN} </li></ul></ul></ul><ul><li>Packet Sizes </li></ul><ul><ul><li>Larger packet sizes will help with large data retrieval </li></ul></ul><ul><ul><li>Use –p option in IQ configuration file to increase size </li></ul></ul><ul><ul><li>Use CommBufferSize parameter in ODBC connection string </li></ul></ul>
  24. 24. Client Access and Network Connectivity <ul><li>Network Speed </li></ul><ul><ul><li>Data retrieval will be as fast as the network </li></ul></ul><ul><ul><li>100 MB of data will take 80 seconds on a 10 Mbit LAN </li></ul></ul><ul><ul><li>100 MB of data will take 8 seconds on a 100 Mbit LAN </li></ul></ul><ul><ul><li>100 MB of data will take 0.8 seconds on a Gigabit LAN </li></ul></ul><ul><ul><li>LAN speed may be the performance bottleneck for queries that return large amounts of data </li></ul></ul><ul><ul><li>The faster the network cards and LAN the better off concurrency will be: more available bandwidth per user </li></ul></ul><ul><li>Network Interface Cards </li></ul><ul><ul><li>Adding multiple network cards to the IQ node will help with network </li></ul></ul>
  25. 25. Client Access and Network Connectivity <ul><li>ASE Component Integration Services (CIS) </li></ul><ul><ul><li>Can have ASE CIS reference IQ tables (proxy tables) </li></ul></ul><ul><ul><li>Use ASE 12.5 CIS class of ASIQ </li></ul></ul><ul><ul><li>Prior to ASE 12.5 ASAnywhere or ASEnterprise classes had to be used </li></ul></ul><ul><ul><li>Not a viable option if joining non-IQ tables with an IQ proxy table </li></ul></ul><ul><ul><li>Can map multiple ASE logins to a single IQ login </li></ul></ul><ul><ul><li>Data modifications should not be performed on proxy tables </li></ul></ul><ul><ul><li>Can map ASE proxy tables to stored procedure calls in IQ </li></ul></ul>
  26. 26. Client Access and Network Connectivity <ul><li>IQ Component Integration Services (CIS) </li></ul><ul><ul><li>Can only be used on the Solaris and WinNT versions of IQ (12.4.3) </li></ul></ul><ul><ul><li>All platforms support CIS with IQ 12.5 </li></ul></ul><ul><ul><li>Can map to tables in Oracle (ODBC), ASE (ODBC & JDBC), ASA (ODBC & JDBC), DB2 (ODBC), MS SQL Server (ODBC), and any ODBC source </li></ul></ul><ul><ul><li>Not a viable option if joining IQ tables with non-IQ proxy tables </li></ul></ul><ul><ul><li>Data modifications may be performed on proxy tables </li></ul></ul>
  27. 27. Maintenance Tasks – Database Consistency Checks <ul><li>IQ 12.4.3 </li></ul><ul><ul><li>The routine to check the database for potential corruptions is SP_IQCHECKDB </li></ul></ul><ul><ul><li>There are no run-time options for this command, however the DBCC options control the behavior of the SP_IQCHECKDB command </li></ul></ul><ul><ul><li>Recommendations </li></ul></ul><ul><ul><ul><li>Run Level 1 (DBCC_OPTION=1) DBCC’s every 1 to 4 weeks (will process 1-2 GB per second) </li></ul></ul></ul><ul><ul><ul><li>Run Level 0 (DBCC_OPTION=0) DBCC’s every 1 to 3 months (will process 50-100 MB per second) </li></ul></ul></ul><ul><ul><li>Use DBCC_TEAM_SIZE option to control parallel execution </li></ul></ul>
  28. 28. Maintenance Tasks – Database Consistency Checks <ul><li>IQ 12.5 </li></ul><ul><ul><li>New syntax for IQ 12.5 – See the Reference Manual </li></ul></ul><ul><ul><li>Recommendations </li></ul></ul><ul><ul><ul><li>Perform CHECK every 1 to 4 weeks </li></ul></ul></ul><ul><ul><ul><li>Perform VERIFY every 1 to 3 months </li></ul></ul></ul><ul><ul><ul><li>Perform REPAIR if errors are reported from VERIFY </li></ul></ul></ul><ul><ul><li>Use RESOURCE_PERCENT parameter to control CPU usage during DBCC. </li></ul></ul>
  29. 29. Maintenance Tasks – Parallel Create index <ul><li>Create Multiple Indexes in a batch </li></ul><ul><ul><li>Syntax: </li></ul></ul><ul><ul><ul><li>BEGIN PARALLEL IQ </li></ul></ul></ul><ul><ul><ul><li>Create HG Index … ; </li></ul></ul></ul><ul><ul><ul><li>Create LF Index … ; </li></ul></ul></ul><ul><ul><ul><li>END PARALLEL IQ ; </li></ul></ul></ul><ul><li>Recommendation </li></ul><ul><ul><li>Create no more indexes in a batch than you have CPU's </li></ul></ul><ul><ul><li>Note: Two indexes on same column will be serial </li></ul></ul>
  30. 30. Maintenance Tasks – Setting the IQ Caches <ul><li>Start by allocating memory to caches </li></ul><ul><ul><li>40% IQ Main Cache </li></ul></ul><ul><ul><li>60% IQ Temporary Cache </li></ul></ul><ul><li>Adjust allocations from here based on monitoring </li></ul><ul><li>Number of HG indexes in a table may alter this plan for loading data </li></ul><ul><li>Make sure not to starve other operations that need RAM (OS, LOAD_MEMORY_MB, other applications, etc) </li></ul>
  31. 31. Maintenance Tasks – IQ Monitoring <ul><li>IQ Monitor is a diagnostic tool for DBA’s </li></ul><ul><li>It collects and reports internal counters from the IQ Buffer Caches </li></ul><ul><ul><li>Main Cache interaction with the IQ Store </li></ul></ul><ul><ul><li>Temporary Cache interaction with the IQ Temp Store </li></ul></ul><ul><li>IQ Monitor offers a series of “views” of the counters to showing differing aspects of the server and buffer cache workload </li></ul>
  32. 32. Maintenance Tasks – IQ Monitoring <ul><li>Provides different views of buffer activity </li></ul><ul><ul><li>summary report of both caches </li></ul></ul><ul><ul><li>detailed report of one cache </li></ul></ul><ul><ul><li>i/o activity of a cache </li></ul></ul><ul><ul><li>a debug report of all buffer cache activity </li></ul></ul><ul><li>You must specify an ‘option’ when you start the monitor to specify what view to monitor </li></ul><ul><li>One Monitor may be running for a cache </li></ul><ul><ul><li>May have one for each (Main & Temp) </li></ul></ul>
  33. 33. Managing Memory – How Much Can IQ Address? <ul><li>Solaris and AIX (64-bit) – limited to RAM + swap space with no kernel changes necessary </li></ul><ul><li>AIX (32-bit) – limited to 2.5 GB using –iqsmem to achieve this </li></ul><ul><li>HP-UX – limited to kernel value of maxdsiz_64bit </li></ul><ul><ul><li>maxdsiz_64bit should never be set to exceed RAM + swap space </li></ul></ul><ul><li>Windows NT/2000/XP – Must use a 4GT supported version </li></ul><ul><ul><li>2 GB for main and temp buffer caches </li></ul></ul><ul><ul><li>3 GB total for the process </li></ul></ul><ul><ul><li>Load memory counts against 3 GB total, not 2 GB limit on buffer caches </li></ul></ul>
  34. 34. Managing Memory – How Much Can IQ Address? <ul><li>Keep in mind that more memory may mean tuning the O/S </li></ul><ul><li>On Solaris, a system with 288 GB spent a tremendous amount of time managing memory with standard configuration </li></ul><ul><ul><li>Changed following kernel parameters to help (/etc/system): </li></ul></ul><ul><ul><ul><li>set autoup=900 </li></ul></ul></ul><ul><ul><ul><li>set tune_t_fsflushr=1 </li></ul></ul></ul><ul><ul><ul><li>set lotsfree=0x10000 </li></ul></ul></ul><ul><ul><ul><li>set bufhwm=8000 </li></ul></ul></ul><ul><ul><ul><li>set maxphys=1048576 </li></ul></ul></ul><ul><ul><ul><li>set maxusers=2048 </li></ul></ul></ul><ul><ul><ul><li>set rlim_fd_max=2048 </li></ul></ul></ul><ul><ul><li>Keep in mind that above options were for a limited test and were the result of site specific analysis performed by Sun and should be used as reference only </li></ul></ul><ul><li>Refer to hardware and O/S specific tuning guides for help </li></ul>
  35. 35. Managing Memory – Inventory Your System Memory <ul><li>Assess what’s running and how much memory it uses (exclusive of IQ) </li></ul><ul><li>Typically these include </li></ul><ul><ul><li>Operating System </li></ul></ul><ul><ul><li>OLAP Servers </li></ul></ul><ul><ul><li>Middleware </li></ul></ul><ul><ul><li>Other applications </li></ul></ul>
  36. 36. Managing Memory <ul><li>Deduct that total from RAM </li></ul><ul><li>Allow 20% for File System Cache </li></ul><ul><ul><li>If using file systems </li></ul></ul><ul><li>Deduct another 10% (just in case) </li></ul><ul><li>The rest is for IQ and IQ Caches </li></ul><ul><li>A picture tells this story </li></ul>
  37. 37. Memory – The Big Picture <ul><li>Start with the O/S and other applications </li></ul><ul><li>Determine what the IQ server needs (normal load) </li></ul><ul><li>IQ Overhead (Heap for loading) </li></ul><ul><li>Allocate IQ Caches </li></ul><ul><ul><li>40% IQ Main Cache </li></ul></ul><ul><ul><li>60% IQ Temp Cache </li></ul></ul>Operating System + All Other Apps IQ Server IQ Overhead IQ Main Cache IQ Temp cache
  38. 38. Development and Design <ul><li>Data Model Recommendations </li></ul><ul><li>Database Programming </li></ul><ul><li>Data Manipulation </li></ul>
  39. 39. Data Model Recommendations – Proper Datatype Sizing <ul><li>Use the smallest datatypes possible for data </li></ul><ul><li>Be aware of all datatypes in IQ – there may be more than you know </li></ul><ul><li>If hour, minute and second information is not necessary, use DATE instead of DATETIME </li></ul><ul><li>If the data will fit within a TINYINT or SMALLINT datatype use that rather than INTEGER or BIGINT </li></ul><ul><li>Allows the engine to store data in smaller units (1-byte TINYINT or 2-byte SMALLINT versus 4-byte INTEGER or 8-byte BIGINT </li></ul><ul><li>Don’t over allocate storage when defining NUMERIC() or DECIMAL() as it can be costly for data that doesn’t need all that space </li></ul>
  40. 40. Data Model Recommendations – IQ UNIQUE <ul><li>Use whenever possible to help minimize storage usage </li></ul><ul><li>IQ 12.5 </li></ul><ul><ul><li>Used solely for FP index type (storage) </li></ul></ul><ul><li>IQ 12.4.3 and earlier </li></ul><ul><ul><li>Imperative to use on all columns even if cardinality exceeds 64K </li></ul></ul><ul><ul><li>Optimizer may make use of IQ UNIQUE value at query runtime </li></ul></ul><ul><ul><li>Does not need to be exact, but should be close to cardinality </li></ul></ul>
  41. 41. Data Model Recommendations – IQ UNIQUE and the FP index <ul><li>Consider using Minimize_Storage option in Sybase IQ 12.5. This will place an IQ UNIQUE(255) on every column for every table created and removes the need to use IQ UNIQUE </li></ul><ul><li>If the value is <= 255 then IQ will place a 1-byte FP index on the column – 1 byte of storage per row </li></ul><ul><li>If the value is > 255 but <= 65536 then IQ will place a 2-byte FP index on the column – 2 bytes of storage per row </li></ul><ul><li>May slightly hinder data loads, but improve query speeds </li></ul><ul><li>May incur onetime slight load slowdown while 2-byte FP is converted to flat FP, but this usually happens during the first load </li></ul>
  42. 42. Data Model Recommendations – NULL Values <ul><li>Always specify NULL or NOT NULL </li></ul><ul><ul><li>Open Client and ODBC connections have different default behavior when table is created </li></ul></ul><ul><li>Allows the optimizer a better guess at join criteria </li></ul><ul><li>NULL data does not save space on the database page as it would in ASE </li></ul><ul><li>Will be compressed out when stored on disk </li></ul>
  43. 43. Data Model Recommendations – Unsigned Datatypes <ul><li>Use unsigned datatypes where possible </li></ul><ul><li>Use for surrogate keys and join columns </li></ul><ul><li>Unsigned data comparisons are quicker </li></ul><ul><li>The caveat to this is that Open Client may misinterpret the value if it is too large as it does not understand large unsigned data </li></ul><ul><ul><li>Can convert to signed integer, numeric, or decimal if returning data to an Open Client application </li></ul></ul><ul><ul><li>This caveat applies to moving data between IQ servers with INSERT FROM LOCATION </li></ul></ul>
  44. 44. Data Model Recommendations – Long Varchar and Long Varbinary <ul><li>Can be used to store moderate amounts of text or binary data </li></ul><ul><li>VARCHAR() or VARBINARY() datatypes </li></ul><ul><ul><li>Maximum width is 32K (64K ascii hex for VARBINARY()) </li></ul></ul><ul><li>LONG BINARY added in 12.5 </li></ul><ul><ul><li>Maximum width is 64K binary (128K ascii hex) </li></ul></ul><ul><li>The WORD index is the only index allowed on VARCHAR() data wider than 255 bytes </li></ul><ul><li>Storage will be allocated in 256 byte chunks </li></ul><ul><ul><li>A 257 byte string will require 512 bytes of storage </li></ul></ul><ul><ul><li>Much less than the 2K requirement in ASE TEXT and IMAGE columns </li></ul></ul>
  45. 45. Data Model Recommendations – Large Object Storage <ul><li>Can be used to store binary or text based objects </li></ul><ul><li>Extends the long binary datatype from a maximum size of 64K to an unlimited size!!! </li></ul><ul><li>FP index is the only viable index </li></ul><ul><li>Cannot be searched, only stored and retrieved </li></ul><ul><li>Special function to return the size of an object (byte_length64) </li></ul><ul><li>Special function to return portions of the object, not the entire contents (byte_substr64) </li></ul>
  46. 46. Data Model Recommendations – Large Object Storage Continued <ul><li>Create table example: </li></ul><ul><ul><li>create table lob_test ( seq_no int, my_lob long binary ) </li></ul></ul><ul><li>No change to load table statement, so how is data loaded in the LOB? </li></ul><ul><li>Through a “load table” source data file: </li></ul><ul><ul><li>1|/tmp/lob_1.txt| </li></ul></ul><ul><ul><li>2|/tmp/lob_2.txt| </li></ul></ul><ul><ul><li>3|/tmp/lob_3.txt| </li></ul></ul><ul><li>Load engine will place the contents of the file specified in the source data file (/tmp/lob_1.txt) into the long binary column for that row </li></ul><ul><li>Can also extract contents of a binary object to individual files with the BFILE() function </li></ul>
  47. 47. Data Model Recommendations – Varchar vs. Char <ul><li>Use CHAR() whenever possible </li></ul><ul><li>All storage in IQ is fixed width </li></ul><ul><li>VARCHAR() types only add overhead </li></ul><ul><ul><li>A VARCHAR(100) columns will require 101 bytes of storage </li></ul></ul><ul><ul><ul><li>100 bytes for data </li></ul></ul></ul><ul><ul><ul><li>1 byte for the size of data </li></ul></ul></ul><ul><li>CHAR() data is blank padded, VARCHAR() is not </li></ul>
  48. 48. Data Model Recommendations – When and Where to Use Indexes <ul><li>Always use indexes on: </li></ul><ul><ul><li>Join columns (HG index regardless of cardinality) </li></ul></ul><ul><ul><li>Searchable columns (HG or LF index) </li></ul></ul><ul><ul><li>Aggregation columns (HNG) </li></ul></ul><ul><ul><ul><li>Only if a single column is used in an aggregation </li></ul></ul></ul><ul><ul><ul><li>SUM( A * B ) will not use an HNG, but rather an FP </li></ul></ul></ul><ul><ul><li>Date, Time, and Datetime columns (DATE, TIME, DTTM) </li></ul></ul><ul><li>If uncertain, place an LF or HG index on the column </li></ul><ul><li>Use Primary Key, Unique Constraint, or UNIQUE HG indexes where appropriate </li></ul>
  49. 49. Data Model Recommendations – When and Where to Use Indexes <ul><li>A column with an HNG, CMP, or WD index should have a corresponding LF or HG index (very rare circumstances will negate this) </li></ul><ul><li>Indexes are not needed on columns whose data is ONLY returned to the client (projected) </li></ul><ul><li>Remove HNG indexes on date/time/datetime columns. Replace them with DATE, TIME, or DTTM indexes </li></ul><ul><li>Investigate the use of HNG indexes. In certain circumstances having HNG indexes on columns can hurt performance (mostly in aggregations). </li></ul>
  50. 50. Data Model Recommendations – Simple Index Selection Criteria <ul><li>Ask yourself these simple questions about each column to determine which indexes belong </li></ul><ul><ul><li>Is the cardinality greater than 1500-2000? </li></ul></ul><ul><ul><ul><li>Yes: place an HG index on this column </li></ul></ul></ul><ul><ul><ul><li>No: place an LF index on this column </li></ul></ul></ul><ul><ul><li>Does this column contain date, time, datetime, or timestamp data? </li></ul></ul><ul><ul><ul><li>Yes: place a DATE, TIME, or DTTM index on this column (these should be used in lieu of the HNG index recommended in IQM 12.4.3 and earlier) </li></ul></ul></ul><ul><ul><li>Will this column be used in range searches or aggregations? </li></ul></ul><ul><ul><ul><li>Yes: place an HNG index on this column (if the aggregation contains more than just the column, an HNG may not be appropriate) </li></ul></ul></ul><ul><ul><li>Will this column be used for word searching? </li></ul></ul><ul><ul><ul><li>Yes: place a WD index on this column </li></ul></ul></ul><ul><ul><li>Will this column be compared to another column in the same table? </li></ul></ul><ul><ul><ul><li>Yes: place a CMP index on the two columns being compared </li></ul></ul></ul>
  51. 51. Data Model Recommendations – Multi-column Indexes <ul><li>Currently, only HG, UNIQUE HG, UNIQUE CONSTRAINT, and PRIMARY KEY indexes support multiple columns </li></ul><ul><li>HG inserts are the most expensive in IQ </li></ul><ul><li>Try to guarantee that inserts will happen at the end of the index </li></ul><ul><ul><li>Place generally incrementing data at the beginning of the index list (sequential data) </li></ul></ul><ul><ul><li>For instance, a transaction date or batch number </li></ul></ul><ul><ul><li>Something that will try to guarantee a sequential key </li></ul></ul>
  52. 52. Data Model Recommendations – Join Column <ul><li>Prefer joining integer datatypes (unsigned if possible) </li></ul><ul><li>Integer comparisons are quicker than character comparisons </li></ul><ul><li>Keep the datatypes as narrow as possible to improve join performance by reducing disk I/O and memory requirements </li></ul><ul><li>Prefer using HG index on join columns rather than cardinality appropriate index (LF or HG) </li></ul>
  53. 53. Data Model Recommendations – Primary Keys <ul><li>Multi-column primary keys should have an additional LF or HG index placed on each individual column </li></ul><ul><ul><li>Must be done manually </li></ul></ul><ul><li>UNIQUE CONSTRAINT, UNIQUE HG, and Primary Key are identical structures: HG index with no G-Array </li></ul><ul><li>Use when possible </li></ul><ul><ul><li>Helps optimizer make more informed query path decisions even if index is not used in joins or searches </li></ul></ul><ul><ul><li>You get an HG Index created automatically </li></ul></ul><ul><ul><li>This HG index has no G-Array (uses less space) </li></ul></ul>
  54. 54. Data Model Recommendations – Foreign Keys <ul><li>Should be used with IQ 12.5 and later to aid in query join performance </li></ul><ul><li>Use when possible </li></ul><ul><ul><li>Helps optimizer make more informed query path decisions even if index is not used in joins or searches </li></ul></ul><ul><ul><li>You get an HG Index created automatically on the foreign key column in the current table </li></ul></ul><ul><ul><li>Requires that a primary key exist on referenced table </li></ul></ul>
  55. 55. Data Model Recommendations – Temporary Tables <ul><li>3 Types of Temporary Tables </li></ul><ul><ul><li>“#” tables </li></ul></ul><ul><ul><ul><li>create table #temp table ( col1 int ) </li></ul></ul></ul><ul><ul><li>Local Temporary Tables </li></ul></ul><ul><ul><ul><li>declare local temporary table temp table ( col1 int ) </li></ul></ul></ul><ul><ul><ul><li>Behave just like “#” tables </li></ul></ul></ul><ul><ul><li>Global Temporary Tables </li></ul></ul><ul><ul><ul><li>create global temporary table temp table ( col1 int ) </li></ul></ul></ul><ul><ul><ul><li>Tables structure is static across connections and reboots </li></ul></ul></ul>
  56. 56. Data Model Recommendations – Temporary Tables <ul><li>On Commit Preserve Rows </li></ul><ul><ul><li>Use this option so that rows in temporary tables remain after the transaction has been committed </li></ul></ul><ul><li>Temporary tables are available at the current level (parent) and all of its children </li></ul><ul><li>A parent cannot see a child's temporary table(s) </li></ul>
  57. 57. Data Model Recommendations – Cursors <ul><li>Avoid using cursors </li></ul><ul><ul><li>Generally means row based processing </li></ul></ul><ul><ul><li>IQ was designed for set based processing </li></ul></ul><ul><ul><li>Sometimes they cannot be avoided </li></ul></ul><ul><ul><li>If used, make sure to use NO SCROLL cursors </li></ul></ul><ul><li>Open With Hold </li></ul><ul><ul><li>Allows the cursor to remain open across transactions </li></ul></ul><ul><ul><li>If not used, the cursor may be closed when a commit is issued (depends on connectivity type) </li></ul></ul>
  58. 58. Database Programming Language – Watcom SQL vs. T-SQL <ul><li>IQ (ASA) is not 100% T-SQL Compatible, but very close </li></ul><ul><li>Recommend using Watcom SQL </li></ul><ul><ul><li>All system procedures written with it </li></ul></ul><ul><ul><li>Many more code examples and more IQ people versed in it </li></ul></ul><ul><li>Watcom SQL has some extensions that T-SQL does not: </li></ul><ul><ul><li>Dynamic SQL </li></ul></ul><ul><ul><li>Better Loop control </li></ul></ul><ul><ul><li>Full cursor movement rather than just read next </li></ul></ul><ul><li>Batches and procedures must be written in the same dialect </li></ul><ul><ul><li>Cannot mix T-SQL with Watcom SQL </li></ul></ul>
  59. 59. Database Programming Language – Watcom SQL vs. T-SQL <ul><li>Behavior differences include: </li></ul><ul><li>Global variables </li></ul><ul><li>Variable Names </li></ul><ul><li>CALL </li></ul><ul><li>FOR </li></ul><ul><li>ASA requires variables to be declared immediately after a BEGIN </li></ul><ul><li>DECLARE CURSOR </li></ul><ul><li>GOTO </li></ul><ul><li>IF </li></ul><ul><li>PRINT </li></ul><ul><li>RAISERROR </li></ul><ul><li>SET </li></ul><ul><li>WHILE (T-SQL) vs. LOOP </li></ul>
  60. 60. Database Programming Language – Commit and Rollback <ul><li>Use transaction control around logical units of work, even read only queries </li></ul><ul><li>Should commit before a read/write batch is started to ensure latest version of data is available </li></ul><ul><li>Should issue commit and rollback after batch completion to release all query resources </li></ul><ul><li>Rollback will free memory resources in use by previous operations </li></ul><ul><li>For systems with high number of connected users, freeing memory resources can aid in query performance </li></ul>
  61. 61. Database Programming Language – Custom Functions <ul><li>Custom functions can be written in either SQL or Java </li></ul><ul><li>Great way to encapsulate business logic for transforming data </li></ul><ul><li>Can have a significant performance impact on queries </li></ul><ul><li>Functions are executed in the catalog portion of the engine </li></ul><ul><ul><li>All result rows may need to be moved to ASA </li></ul></ul><ul><ul><li>Can be time consuming for large result sets </li></ul></ul><ul><li>Turn on query plans to see what impact the functions have on effective query plans </li></ul>
  62. 62. Database Programming Language – Outer Joins <ul><li>T-SQL Outer Join: *= or =* </li></ul><ul><li>ANSI Standard/Watcom SQL Outer Join: </li></ul><ul><li>[left|right|full] outer join </li></ul><ul><li>Be careful of outer join syntax </li></ul><ul><li>The T-SQL syntax can be very ambiguous or non-deterministic for IQ to translate </li></ul><ul><ul><li>All T-SQL outer joins must be converted to ANSI outer joins and then processed </li></ul></ul><ul><li>Use the ANSI standard instead as they are not ambiguous and are always clear in their meaning </li></ul><ul><li>Visit for more details </li></ul>
  63. 63. Data Manipulation – Load Table <ul><li>Parallel Load Table </li></ul><ul><ul><li>Make sure to put column delimiter after last column </li></ul></ul><ul><ul><li>Must use ROW DELIMITED BY and DELIMITED BY options in load table command </li></ul></ul><ul><ul><li>Column and row delimiters must be a single character </li></ul></ul><ul><ul><li>1 to 4 byte delimiters allowed for serial loads </li></ul></ul><ul><ul><li>Parallel load can only have a single character delimiter </li></ul></ul>
  64. 64. Data Manipulation – Load Table <ul><li>If possible, perform load table from binary datafiles </li></ul><ul><ul><li>Can be 3 to 10 times faster than ASCII loads </li></ul></ul><ul><li>Can use FILLER() clause with a delimiter or byte count </li></ul><ul><li>Better performance achieved by casting the date or datetime formats rather than letting IQ guess </li></ul><ul><li>If possible, issue a single load table with multiple files rather than 1 load table per file to be loaded into a table </li></ul><ul><ul><li>Only important for tables with HG, Unique HG, or Primary Key indexes </li></ul></ul>
  65. 65. Data Manipulation – Insert From Location <ul><li>Great way to move data from any Open Client source </li></ul><ul><li>Syntax: </li></ul><ul><ul><li>insert into TABLE() </li></ul></ul><ul><ul><li>location ‘SERVERNAME.DBNAME’ </li></ul></ul><ul><ul><li>{ select statement }; </li></ul></ul><ul><li>IQ username and login must match on remote system </li></ul><ul><li>Interfaces entry must match the SERVERNAME </li></ul><ul><li>Can also be used to move data quickly from an ASA table to an IQ in the same server </li></ul><ul><li>See section on Open Client datatype limitations </li></ul>
  66. 66. Data Manipulation – Single Row Operations <ul><li>Avoid at all costs for large data manipulation operations </li></ul><ul><li>Different from single statement operations that modify many rows </li></ul><ul><li>Individual INSERT … VALUES() will be slower than bulk load operations </li></ul><ul><ul><li>Expect no more than 5,000 to 20,000 operations per hour </li></ul></ul>
  67. 67. Data Manipulation – Single Row Operations <ul><li>If multiple single row inserts are needed there is a method to improve performance </li></ul><ul><li>Consider this code (1 insert per row): </li></ul><ul><ul><ul><li>insert into #my_table values ( 1, 1 ) </li></ul></ul></ul><ul><ul><ul><li>insert into #my_table values ( 2, 1 ) </li></ul></ul></ul><ul><ul><ul><li>insert into #my_table values ( 3, 1 ) </li></ul></ul></ul><ul><li>Can be converted to this (always 2 inserts): </li></ul><ul><ul><ul><li>create table #iq_dummy ( a1 int ) </li></ul></ul></ul><ul><ul><ul><li>insert into #iq_dummy values ( 1 ) </li></ul></ul></ul><ul><ul><ul><li>insert into #my_table select 1, 1 from #iq_dummy </li></ul></ul></ul><ul><ul><ul><li>union all select 2, 1 from #iq_dummy </li></ul></ul></ul><ul><ul><ul><li>union all select 3, 1 from #iq_dummy </li></ul></ul></ul>
  68. 68. Data Manipulation – Named Pipes and Flat Files <ul><li>Named pipes can be faster – no disk I/O </li></ul><ul><li>How to make a named pipe on most flavors of UNIX: </li></ul><ul><ul><ul><ul><li>mknod PIPE_NAME p </li></ul></ul></ul></ul><ul><li>Can use BCP, GZIP, UNCOMPRESS, or applications to push IQ formatted data into named pipe </li></ul><ul><li>LOAD TABLE command can read from named pipe </li></ul><ul><li>Can also fast extract data to named pipes so that they can be read by another application or even compressed and stored </li></ul>
  69. 69. Data Manipulation – Partial Width Inserts <ul><li>Can induce fragmentation and overly large space consumption if not watched </li></ul><ul><li>Row ID’s from deleted data are not reused during a partial width insert operation </li></ul><ul><li>Space from delete data is not reused (because Row ID’s are not reused) </li></ul><ul><li>Partial width inserts are analogous to APPEND_LOAD=‘on’ in terms of Row ID and space behavior </li></ul><ul><li>Only becomes a problem if partial width inserts are a way of life for a table </li></ul>
  70. 70. Data Loading – The Deep Fact Table <ul><li>Many databases have a “deep” table </li></ul><ul><ul><li>A growing table with tens of millions of rows </li></ul></ul><ul><ul><li>Rows typically ‘rolled out’ and replaced over time </li></ul></ul><ul><li>HG indexes can be slow to load/delete </li></ul><ul><ul><li>Loading time may increase as data added, but it depends </li></ul></ul><ul><li>Solution (with a caveat) </li></ul><ul><ul><li>Partition the table (Example – time: day, week, month) </li></ul></ul><ul><ul><li>Build a view that is a Union of all the tables </li></ul></ul>
  71. 71. Data Loading – Partitioned Fact Table Big Fact Table Create View bigtable as Select * from t1 Union All Select * from t2 Union All Select * from t3 Accessed by View t1 t2 t3 Partitioned
  72. 72. Data Loading – Why Partitioning? <ul><li>Loads are faster and predictable </li></ul><ul><ul><li>x million rows will load consistently </li></ul></ul><ul><li>IQ may process the Union All in parallel </li></ul><ul><ul><li>As long as CPU resources are available </li></ul></ul><ul><li>To roll data out, truncate one table </li></ul><ul><ul><li>Truncate table is much faster than Delete </li></ul></ul><ul><ul><li>No changes to DDL required </li></ul></ul><ul><ul><li>Load new data into the empty (truncated) table </li></ul></ul>
  73. 73. Data Extraction <ul><li>For fastest data unloads use the TEMP_EXTRACT options </li></ul><ul><li>Can unload data in ASCII or BINARY format </li></ul><ul><ul><li>Recommend BINARY format as it is faster to reload into IQ </li></ul></ul><ul><li>Great way to archive portions of the database </li></ul><ul><li>Can unload to one or more files, serially </li></ul><ul><li>Avoid using ISQL or DBISQL redirection to a file </li></ul><ul><ul><li>Much slower than fast unloads </li></ul></ul>
  74. 74. Best Practices for Implementing with Adaptive Server IQ <ul><li>Th </li></ul>Questions?
  75. 75. BID209: Best Practices for Implementing with Sybase IQ Mark D. Mumy Principal Systems Consultant [email_address] July 19, 2010
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.