A27 Vectorwise Performance Considerations_implementation_best_practices

Presentation Transcript

  • Vectorwise: Implementation best practices
    Mark Van de Wiel, Director Product Management, Vectorwise
    Thursday, November 01, 2012
    Confidential © 2012 Actian Corporation
  • Agenda
    • Hardware
    • Operating system
    • Database configuration
    • Database design
    • Data loading
    • High availability
    • Monitoring
  • 100x (+) Performance Difference – 2003
    Custom C versus Relational Database
    [Bar chart: TPC-H 1 GB, query 1, runtime in seconds, for MySQL, DBMS X, a C program, and Vectorwise. Data labels in the extracted text: 28.1, 26.2, 0.2, 0.6.]
  • Some Numbers
    • Traditional RDBMS: <200 MB/s per core
      • Even these use MPP to address I/O challenges
    • Vectorwise (lab environment): >1.5 GB/s per core
    • Maximum throughput requirement is extremely high
    • Realistically (cost-effectively), only RAM can serve data quickly enough
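To make the per-core figures on this slide concrete, here is a rough sketch of the aggregate data rate a server would need to keep every core busy. The 16-core machine is a hypothetical example, and the per-core constants are just the bounds quoted on the slide, not measurements.

```python
# Back-of-the-envelope scan bandwidth, using the per-core rates quoted on the slide.
# The 16-core server below is a hypothetical example, not a figure from the deck.

TRADITIONAL_GB_PER_S_PER_CORE = 0.2  # slide: "<200 MB/s per core" (upper bound)
VECTORWISE_GB_PER_S_PER_CORE = 1.5   # slide: ">1.5 GB/s per core" (lab lower bound)


def aggregate_bandwidth_gb_per_s(cores: int, gb_per_s_per_core: float) -> float:
    """Data rate (GB/s) needed to feed `cores` cores at the given per-core rate."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return cores * gb_per_s_per_core


if __name__ == "__main__":
    cores = 16  # hypothetical server
    needed = aggregate_bandwidth_gb_per_s(cores, VECTORWISE_GB_PER_S_PER_CORE)
    print(f"{cores} cores need at least {needed} GB/s")  # 16 * 1.5 = 24.0 GB/s
```

A requirement in the tens of GB/s is far beyond what a handful of disks can deliver, which is the slide's argument for serving data from RAM.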
  • What Hardware to Use
    • CPU
    • Memory
    • Storage (I/O and capacity)
    • Requirements
    • Budget
  • Hardware Considerations – MEMORY
    • Ideally, frequently accessed data should fit in memory
      • May be all data
      • May be a small portion of the data
      • Note: data is compressed in the memory buffer
        • 3x – 5x compression ratios are common
    • Query execution should all take place in memory
      • Operations against larger data sets require more memory
      • Consider query concurrency
      • “Spill to disk” is supported but should be a last resort
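The "fit in memory" guidance can be turned into a quick estimate. The sketch below assumes that the compression ratio applies uniformly to the frequently accessed data, which real data will not do exactly. The 600 GB hot set and 128 GB buffer are hypothetical numbers; only the 3x to 5x range comes from the slide.

```python
# Rough check of whether frequently accessed data fits in the memory buffer.
# Assumes one uniform compression ratio; all sizes below are hypothetical.


def compressed_size_gb(raw_gb: float, compression_ratio: float) -> float:
    """Estimated in-memory size of `raw_gb` of data at the given compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return raw_gb / compression_ratio


def fits_in_buffer(raw_gb: float, compression_ratio: float, buffer_gb: float) -> bool:
    """True if the compressed data is no larger than the buffer."""
    return compressed_size_gb(raw_gb, compression_ratio) <= buffer_gb


if __name__ == "__main__":
    hot_data_gb = 600.0  # hypothetical frequently accessed data, uncompressed
    buffer_gb = 128.0    # hypothetical memory buffer size
    for ratio in (3.0, 5.0):  # slide: "3x – 5x compression ratios are common"
        size = compressed_size_gb(hot_data_gb, ratio)
        print(f"{ratio}x -> {size} GB, fits: {fits_in_buffer(hot_data_gb, ratio, buffer_gb)}")
```

Running the estimate at both ends of the range shows how sensitive the answer is to the compression actually achieved.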
  • Hardware Recommendation
    • CPUs
      • Use CPUs with higher clock rate for better raw throughput
      • Use more cores for higher throughput
      • Higher power CPUs are faster
    • Memory
      • At least 8 GB per core (more is always better)
    • Storage
      • Use as many drives as possible
      • Ensure sufficient capacity
      • Use the fastest drives available
        • SAS over SATA, ideally 15k RPM
        • SSDs are often not cost-effective relative to more memory
  • Examples
    • Small configuration (1 TB)
      • Dell R620
      • Lenovo RD430
    • Medium configuration (single digit TBs)
      • Dell R720
      • HP DL380
      • IBM x3650
      • Lenovo RD630
    • High-end configuration
      • Dell R910
      • HP DL580 or DL980
      • IBM x3750
  • Operating System Considerations
    • 64-bit
    • Linux
      • Redhat
      • SuSE
      • Ubuntu
      • File systems: xfs, ext3, ext4
    • Windows
      • Windows 7 (or higher)
      • Windows 2008 (or higher)
  • Database Configuration
    • Installation defaults are generally good
      • May want to adjust column buffer size (default 25% of RAM)
      • May want to adjust processing memory (default 50% of RAM)
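As a worked example of these defaults, the sketch below combines them with the "at least 8 GB per core" recommendation from the hardware slide. The function and class names are illustrative only and are not Vectorwise configuration parameters. The 16-core server is a hypothetical example.

```python
# Worked example of the memory defaults from the slides.
# Names here are illustrative; they are not Vectorwise configuration parameters.
from dataclasses import dataclass

GB_PER_CORE_MINIMUM = 8            # slide: "At least 8 GB per core"
COLUMN_BUFFER_FRACTION = 0.25      # slide: column buffer default, 25% of RAM
PROCESSING_MEMORY_FRACTION = 0.50  # slide: processing memory default, 50% of RAM


@dataclass
class MemoryPlan:
    total_ram_gb: float
    column_buffer_gb: float
    processing_memory_gb: float


def minimum_ram_gb(cores: int) -> int:
    """Smallest RAM size that satisfies the 8 GB per core recommendation."""
    return cores * GB_PER_CORE_MINIMUM


def default_memory_plan(total_ram_gb: float) -> MemoryPlan:
    """Split RAM according to the installation defaults quoted on the slide."""
    return MemoryPlan(
        total_ram_gb=total_ram_gb,
        column_buffer_gb=total_ram_gb * COLUMN_BUFFER_FRACTION,
        processing_memory_gb=total_ram_gb * PROCESSING_MEMORY_FRACTION,
    )


if __name__ == "__main__":
    ram = minimum_ram_gb(16)         # hypothetical 16-core server -> 128 GB
    print(default_memory_plan(ram))  # 32 GB column buffer, 64 GB processing memory
```

If the compressed working set is larger than the default column buffer, or concurrent queries need more processing memory, that is the situation in which the slide suggests adjusting the defaults.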
  • Database Design
    • Schema – no particular preference
      • Single denormalized table, star schema, snowflake schema, 3rd normal form
    • Constraints
      • Only on empty tables today… (to be addressed in Vectorwise 3.0)
      • Consider data loading order and impact
    • Indexes
      • Note: clustered index only today (“index-organized table”)
      • One per table
      • Consider incremental load
  • Data Loading: Initial load
    • File-based bulk load through vwload or copy
    • Conversion into UTF8
    • Use tools
      • Pentaho
      • Informatica
      • Talend
      • HVR
      • Attunity
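The UTF8 conversion step can be done before the bulk load with a small script. This is a minimal sketch, assuming the source file is Latin-1 encoded. The encoding and file names are assumptions for illustration, and the slide does not say which tool the presenter used for conversion.

```python
# Re-encode a text file to UTF-8 ahead of a bulk load.
# Assumption: the source encoding is known (Latin-1 here); file names are hypothetical.


def convert_to_utf8(src_path: str, dst_path: str,
                    src_encoding: str = "latin-1",
                    chunk_chars: int = 1 << 20) -> None:
    """Stream src_path to dst_path, decoding with src_encoding and writing UTF-8."""
    # newline="" disables newline translation so line endings pass through unchanged.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_chars)  # read in chunks to bound memory use
            if not chunk:
                break
            dst.write(chunk)


if __name__ == "__main__":
    convert_to_utf8("orders_latin1.csv", "orders_utf8.csv")  # hypothetical files
```

Reading in chunks keeps memory use flat for large load files. A wrong `src_encoding` will either raise a decode error or silently produce wrong characters, so the source encoding should be confirmed first.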
  • Data Loading: Incremental load
    • INSERT, UPDATE and/or DELETE
      • Append if possible
      • Batch if possible
      • Use COMBINE
    • Positional Delta Trees
      • Memory considerations
      • Propagation to disk
    • Use tools
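"Batch if possible" can be sketched as grouping incoming rows into fixed-size batches so that each batch is applied in one round trip instead of row by row. The helper below is generic Python and is not part of any Vectorwise tool. The batch size is an arbitrary illustration.

```python
# Group an arbitrary stream of rows into fixed-size batches.
# Generic helper for illustration; not part of any Vectorwise tooling.
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield successive lists of up to batch_size rows; the last may be shorter."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    for batch in batched(range(5), 2):
        print(batch)  # [0, 1] then [2, 3] then [4]
```

With a Python DB-API driver, each batch could be passed to `cursor.executemany(...)`. Which driver and statement to use is left open here, since the slide does not specify them.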
  • Moving Window of Data: Considerations
    • COMBINE on a large table can be expensive
      • Mostly relevant for updates and deletes
    • Alternative: manual partitioning
      • One table per period
      • Single view across all tables
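The manual partitioning alternative can be sketched as a small generator for the view definition. The table and view names are hypothetical, and the SQL is generic `CREATE VIEW ... UNION ALL` that has not been checked against a specific Vectorwise release. The sketch assumes all period tables share the same column layout.

```python
# Sketch of "one table per period, single view across all tables".
# Table and view names are hypothetical; the SQL is generic, not Vectorwise-verified.
from typing import List, Sequence, Tuple


def union_all_view_sql(view_name: str, tables: Sequence[str]) -> str:
    """Build a CREATE VIEW statement that unions all period tables."""
    if not tables:
        raise ValueError("at least one table is required")
    selects = "\nUNION ALL\n".join(f"SELECT * FROM {t}" for t in tables)
    return f"CREATE VIEW {view_name} AS\n{selects}"


def slide_window(tables: Sequence[str], new_table: str,
                 max_periods: int) -> Tuple[List[str], List[str]]:
    """Add the newest period; return (tables to keep, tables to drop)."""
    if max_periods < 1:
        raise ValueError("max_periods must be at least 1")
    window = list(tables) + [new_table]
    dropped = window[:-max_periods] if len(window) > max_periods else []
    kept = window[-max_periods:]
    return kept, dropped


if __name__ == "__main__":
    kept, dropped = slide_window(
        ["sales_2012_08", "sales_2012_09", "sales_2012_10"], "sales_2012_11", 3)
    print("drop:", dropped)
    print(union_all_view_sql("sales", kept))
```

Ageing out a period then means dropping one whole table and recreating the view, rather than deleting rows from one large table, which is the cost the slide is trying to avoid.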
  • High Availability
    • Hardware and OS best practices
      • UPS, RAID
    • Vectorwise backup
      • Only read-only, full backup
      • Consider periodic full backup and file incremental loads
    • Disaster recovery
      • Dual load
      • Active/active possibility
  • Monitoring
    • OS monitoring
      • CPU, memory utilization, I/O statistics
    • vwinfo data
    • Actian Director
    • DBA tools
  • Agenda
    • Hardware
    • Operating system
    • Database configuration
    • Database design
    • Data loading
    • High availability
    • Monitoring
    More information in the Vectorwise Developer Guide: http://www.actian.com/images/white_papers/vw_developers_v2.5.pdf