Quantifying the cost of compression - Presented by Thomas Kejser at SQL Bits
A DBA running SQL Server 2008 or above will often need to understand whether it is worth trading CPU cycles for I/O by enabling row or page compression. The benefits can be significant, but does the cost in core licensing offset the storage capacity saved? Often, comparing the workload before and after compression isn't an option. How, then, can you make an educated guess about the cost of compressing tables and indexes?
 
In this session, we will use Grade of the Steel-style workloads and CPU profiling to quantify the CPU cost of enabling the different types of compression.


Presentation Transcript

    • Quantifying the Cost of Compression (Grade of the Steel)
      Thomas Kejser | thomas@kejser.org | http://blog.kejser.org | @thomaskejser
    • Why Compress Data?
      - Reduce cost of storage
      - Reduce the number of IOPS (is this still relevant?)
      - Squeeze more data into DRAM
      - But at what cost to CPU?
    • SQL Server Compression Overview
      - Row:    1-2x
      - Page:   3-4x
      - Column: 5-10x
      - Backup: 5-10x
    • How Much Will My Data Compress?
      - It depends on WHAT your data contains
      - There is NO WAY to tell until you try
      - Anyone who tells you otherwise is lying!
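The "try it" advice is easy to follow outside the engine too: running a general-purpose compressor over a representative sample gives a quick first impression of compressibility. A minimal sketch using Python's zlib (the sample data here is invented for illustration):

```python
import os
import zlib

def compression_ratio(data: bytes) -> float:
    """Ratio of original size to zlib-compressed size (higher = more compressible)."""
    return len(data) / len(zlib.compress(data, 6))

# Highly repetitive data compresses extremely well...
repetitive = b"AAAABBBB" * 10_000
# ...while (pseudo-)random data barely compresses at all.
random_ish = os.urandom(80_000)

print(f"repetitive: {compression_ratio(repetitive):.1f}x")
print(f"random:     {compression_ratio(random_ish):.2f}x")
```

The exact ratios differ from SQL Server's row/page algorithms, but the spread between the two inputs makes the point: the data, not the feature, decides the outcome.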
    • What Does it Depend On?
      - The information entropy of the data
      - The block size of the compression:
        Row: 1 column in one row
        Page: 8K
        Backup: up to 4MB
        Column store: 1M rows
      - The algorithm you use (= the time you have to compress)
      - How well the algorithm fits the data
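The entropy bullet can be made concrete: Shannon entropy is a lower bound on the average number of bits per byte any lossless compressor can achieve. A small illustrative sketch (the helper name is my own):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: H = -sum(p_i * log2(p_i))."""
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())

# One repeated byte carries no information: 0 bits/byte, maximally compressible.
print(shannon_entropy(b"\x00" * 8192))          # 0.0
# Uniformly distributed byte values: 8 bits/byte, incompressible.
print(shannon_entropy(bytes(range(256)) * 32))  # 8.0
```

The block size matters because entropy is measured over a scope: a value that repeats once per 8K page is catchable by page compression, while a repeat that only shows up once per million rows needs a column-store-sized block to exploit.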
    • Nothing About Column Stores
      - A VERY interesting subject
      - A lot can be done and said about this
      - But it would be another full-length presentation
      - See my PASS Nordic presentation
    • How Does Row Compression Work?
      - Variable-length encoding (new row format)
      - Special handling of NULLs and 0s
      - Example: the 4-byte integer 1 is stored as 0 0 0 1 (4 bytes) in
        SQL Server 2005, but as a single byte in SQL Server 2008+ with
        compression enabled
    • Consider These Two Examples
      Example 1 – foo and bar in the same range (result: 11MB):

        CREATE TABLE Numbers (foo BIGINT NOT NULL, bar BIGINT NOT NULL)

        INSERT INTO Numbers WITH (TABLOCK) (foo, bar)
        SELECT (n2.n - 1) * 1000 + (n1.n - 1)
             , (n2.n - 1) * 1000 + (n1.n - 1)
        FROM fn_nums(1000) n1
        CROSS JOIN fn_nums(1000) n2

        CREATE UNIQUE CLUSTERED INDEX CIX ON Numbers (foo)
        WITH (DATA_COMPRESSION = ROW)

        EXEC sp_spaceused Numbers

      Example 2 – identical, except bar is computed as
      (n2.n - 1) * 1000000000 + (n1.n - 1) (result: 13MB)
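Variable-length encoding explains the 11MB vs 13MB difference: an integer only needs as many bytes as its magnitude requires, and bar in the second table is roughly a million times larger. A simplified model in Python (real SQL Server row compression also handles sign, NULLs, and per-column metadata, so this is illustrative only):

```python
def varlen_bytes(n: int) -> int:
    """Bytes needed for a non-negative BIGINT after trimming leading
    zero bytes -- a simplified model of row compression's
    variable-length integer storage."""
    return (n.bit_length() + 7) // 8 if n else 0  # 0 needs no data bytes

# Example 1: foo and bar both stay below 1,000,000 -> at most 3 bytes each
print(varlen_bytes(999 * 1000 + 999))        # 3
# Example 2: bar grows to ~10^12 -> 5 bytes, hence the larger table
print(varlen_bytes(999 * 1000000000 + 999))  # 5
```

Same row count, same declared types, different sizes: the data range, not the schema, drives the result.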
    • How Does Page Compression Work?
      - A combination of:
        Prefix compression – common prefix inside a column
        Dictionary – common values across all columns
      - Pages are kept compressed in the buffer pool
      - So: it should be more expensive to access them even there?
    • Page Compression – Column Prefix
      [Slide diagram: a page of rows (e.g. "Lambert", 5000000, NULL, 45, 2);
      the common leading bytes of each column are factored out into an
      anchor record, and each row stores only the difference]
    • Page Compression – Dictionary
      [Slide diagram: values repeated across all columns are replaced by
      references into a per-page dictionary; e.g. decimal 6000000 is hex
      0x5B8D80 and is stored only once]
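The two passes in the diagrams (column prefix, then page dictionary) can be mimicked in a few lines. This is a toy sketch over string values, purely to show the mechanics; SQL Server's actual on-page format is binary and considerably more involved:

```python
from os.path import commonprefix

def compress_column(values: list[str]):
    """Toy page compression for one column: factor out the common
    prefix (anchor record), then replace repeated remainders with
    small dictionary symbol ids."""
    prefix = commonprefix(values)
    remainders = [v[len(prefix):] for v in values]
    dictionary = {}  # remainder -> symbol id, for values seen more than once
    for r in remainders:
        if remainders.count(r) > 1 and r not in dictionary:
            dictionary[r] = len(dictionary)
    encoded = [dictionary.get(r, r) for r in remainders]
    return prefix, encoded, dictionary

prefix, encoded, symbols = compress_column(
    ["Lambert", "Lamborghini", "Lamborghini", "Lampoon"])
print(prefix)   # 'Lam' is stored once in the anchor record
print(encoded)  # ['bert', 0, 0, 'poon'] -- the repeat became a symbol reference
```

Note what this implies for reads: every value access now needs a prefix expansion and possibly a dictionary lookup, which is exactly the CPU tax the following tests measure.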
    • How fast is it?
    • Our Reasonably Priced Server
      - 2-socket Xeon E3645, 2 x 6 cores, 2.4GHz
      - NUMA enabled, HT off
      - 12 GB RAM
      - 1 ioDrive2 Duo: 2.4TB flash, 4K formatted, 64K AUS, 1 stripe
      - Power save off
      - Win 2008R2, SQL 2012
      (Image source: DeviantArt)
    • Test: Table Scan with Page Compression
      - Use TPC-H, scale factor 10 (10GB total dataset)
      - Apply PAGE compression

      Compression                  | Size   | Build Time | CPU Load
      None                         | 6.5 GB | 18 sec     | 100%
      ROW                          | 4.9 GB | 16 sec     | 100%
      PAGE                         | 3.9 GB | 38 sec     | 100%
      Gzip -5                      | 2.5 GB | NA         | NA
      NTFS compress (best: 4K AUS) | 3.7 GB | NA         | NA
      Windows zipping              | 2.3 GB | NA         | NA
    • Compressed Tables Are Faster, Right?
      - Fast scan through the table, table resident in memory first
      - Scan both NONE and PAGE

      Compression   | Size   | Scan Time | CPU Load
      None – memory | 6.5 GB | 4 sec     | 100%
      PAGE – memory | 3.9 GB | 8 sec     | 100%
    • Ahh.. But Thomas: You Didn't Do I/O
      Compression   | Size   | Scan Time | CPU Load
      None – memory | 6.5 GB | 4 sec     | 100%
      PAGE – memory | 3.9 GB | 8 sec     | 100%
      None – I/O    | 6.5 GB | 6 sec     | 100%
      PAGE – I/O    | 3.9 GB | 9 sec     | 100%
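The takeaway from this table is that with fast flash, the PAGE scan is CPU-bound on decompression rather than I/O-bound. The effect is easy to reproduce with any compressor; here is a hypothetical micro-benchmark using zlib (which is far heavier than SQL Server's page decompression, so the ratio is exaggerated):

```python
import time
import zlib

raw = b"lineitem-row-payload-" * 300_000  # a few MB of repetitive "table" data
packed = zlib.compress(raw, 6)

def scan(buf: bytes) -> int:
    """Stand-in for a table scan: sample one byte per 8K 'page'."""
    return sum(buf[::8192])

t0 = time.perf_counter()
checksum_raw = scan(raw)                  # scan the uncompressed copy directly
raw_time = time.perf_counter() - t0

t0 = time.perf_counter()
checksum_packed = scan(zlib.decompress(packed))  # pay the decompression tax first
packed_time = time.perf_counter() - t0

assert checksum_raw == checksum_packed    # same logical data either way
print(f"decompress+scan vs plain scan: {packed_time / raw_time:.0f}x slower")
```

The principle carries over: once storage stops being the bottleneck, the compressed path is pure extra CPU on every access.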
    • Where Does the Time Go?
      - Our old friend xperf: xperf -on base -stackwalk profile

      Function                                       | NONE  | PAGE
      MinMaxStep                                     | 15.9% | 9.2%
      IndexDataSetSession::GetNextRowValuesInternal  | 14.6% | 17.0%
      CEsExec::GeneralEval                           | 11.8% | 6.7%
      CValXVarTable::GetDataX                        | 8.8%  | 5.2%
      CXVariant::CopyDeep                            | 7.8%  | 4.6%
      memcpy                                         | 6.5%  | 0.4%
      CValXVarTableRow::SetDataX                     | 6.0%  | 3.3%
      GetDataFromXvar8                               | 5.2%  | 2.9%
      RowsetNewSS::FetchNextRow                      | 2.7%  | 1.6%
      ps_dl_sqlhilo                                  | 2.7%  | 1.5%
      GetData                                        | 2.1%  | 1.2%
      GetDataFromXvar                                | 1.8%  | 1.0%
      CTEsCompare<122;122>::BlCompareXcArgArg        | 1.5%  | 0.9%
      CTEsCompare<58;58>::BlCompareXcArgArg          | 1.2%  | 0.6%
      CTEsCompare<167;167>::BlCompareXcArgArg        | 1.1%  | 0.7%
      CTEsCompare<56;56>::BlCompareXcArgArg          | 1.1%  | 0.6%
      SetMultData                                    | 1.1%  | 0.6%
      CQScanTableScanNew::GetRow                     | 1.0%  | 0.6%
      CQScanStreamAggregateNew::GetRowHelper         | 1.0%  | 0.6%
      CTEsCompare<52;52>::BlCompareXcArgArg          | 0.8%  | 0.7%
      ScalarCompression::AddPadding                  |       | 0.7%
      PageComprMgr::DecompressColumn                 |       | 4.7%
      DataAccessWrapper::DecompressColumnValue       |       | 9.9%
      CDRecord::LocateColumnInternal                 |       | 14.4%
      AnchorRecordCache::LocateColumn                |       | 2.3%
      DataAccessWrapper::StoreColumnValue            |       | 4.3%
      Additional runtime of GetNextRowValuesInternal |       | 2.4%
      Total (compression-only functions)             |       | 38.7%
    • Test: Singleton Row Fetch
      - Sample from LINEITEM
      - Force loop join with index seeks
      - Do 1.4M seeks
    • Singleton Seeks – Cost of Compression
      Compression   | Seek (1.4M seeks) | CPU Load
      None – memory | 13 sec            | 100% one core
      PAGE – memory | 24 sec            | 100% one core
      None – I/O    | 21 sec            | 100% one core
      PAGE – I/O    | 32 sec            | 100% one core

      xperf -on base -stackwalk profile:
      Function                                       | % Weight
      CDRecord::LocateColumnInternal                 | 0.82%
      DataAccessWrapper::DecompressColumnValue       | 0.47%
      SearchInfo::CompareCompressedColumn            | 0.28%
      PageComprMgr::DecompressColumn                 | 0.24%
      AnchorRecordCache::LocateColumn                | 0.18%
      ScalarCompression::AddPadding                  | 0.04%
      ScalarCompression::Compare                     | 0.11%
      Additional runtime of GetNextRowValuesInternal | 0.14%
      Total compression                              | 2.28%
      Total CPU (single core)                        | 8.33%
      Compression %                                  | 27.00%
    • Test: Updates of Pages
      - L_QUANTITY is NOT NULL, i.e. an in-place UPDATE

      Compression   | Update 1.4M | CPU Load
      None – memory | 13 sec      | 100% one core
      PAGE – memory | 54 sec      | 100% one core
      None – I/O    | 17 sec      | 100% one core
      PAGE – I/O    | 59 sec      | 100% one core
    • UPDATE Compression Burners
      Function                                    | CPU %
      qsort                                       | 0.86
      CDRecord::Resize                            | 0.84
      CDRecord::LocateColumnInternal              | 0.36
      perror                                      | 0.36
      Page::CompactPage                           | 0.36
      ObjectMetadata::`scalar deleting destructor'| 0.27
      SearchInfo::CompareCompressedColumn         | 0.24
      CDRecord::InitVariable                      | 0.19
      CDRecord::LocateColumnWithCookie            | 0.18
      memcmp                                      | 0.16
      PageDictionary::ValueToSymbol               | 0.16
      Record::DecompressRec                       | 0.14
      PageComprMgr::DecompressColumn              | 0.14
      CDRecord::InitFixedFromOld                  | 0.10
      SOS_MemoryManager::GetAddressInfo64         | 0.08
      AnchorRecordCache::LocateColumn             | 0.08
      CDRecord::GetDataForAllColumns              | 0.08
      ScalarCompression::Compare                  | 0.07
      PageComprMgr::CompressColumn                | 0.07
      Record::CreatePageCompressedRecNoCheck      | 0.06
      memset                                      | 0.05
      PageComprMgr::ExpandPrefix                  | 0.04
      PageRef::ModifyColumnsInternal              | 0.04
      Page::ModifyColumns                         | 0.03
      DataAccessWrapper::ProcessAndCompressBuffer | 0.03
      SingleColAccessor::LocateColumn             | 0.03
      CDRecord::BuildLongRegionBulk               | 0.02
      ChecksumSectors                             | 0.02
      Page::MCILinearRegress                      | 0.02
      DataAccessWrapper::DecompressColumnValue    | 0.02
      SOS_MemoryManager::GetAddressInfo           | 0.02
      CDRecord::FindDiff                          | 0.02
      AnchorRecordCache::Init                     | 0.02
      PageComprMgr::CombinePrefix                 | 0.01
      Total                                       | 5.17
      Out of 8.55 total … approx. 60%
    • And Now, for Something Unusual!
      - I am going to do a demo!
      - Let's see xperf in action
    • Benchmark Test: HammerDB
      - A benchmark simulating TPC-type workloads
      - It is NOT TPC, but it helps you set up something similar
      - Below is a "TPC-C like" workload

      Compression | TPM    | CPU Load
      NONE        | 16.7 M | 70% all cores
      PAGE        | 14.5 M | 70% all cores
    • What About Locks and Contention?
      - XEvent trace of lock acquire/release
    • How Long Are Locks Held?
      [Chart: lock held cycle count (avg and std dev, in CPU kcycles, scale
      0-600) for PAGE vs NONE]