Optimizing Oracle databases with SSD - April 2014
Presentation on using Solid State Disk (SSD) with Oracle databases, including the 11gR2 DB flash cache and using flash in Exadata. Last given at Collaborate 2014 #clv14.

Speaker notes:
  • Insanely popular – literally millions of users.
  • Final test system (Dell R720): 2 sockets of 8-core Genuine Intel CPUs @ 2.70 GHz (20480 KB cache), 64 GB memory. DATA: 16 x 15K RPM HDD in RAID 10, giving 8 effective spindles. Two PCIe SSDs installed, but only one (/dev/rssdb1) is used for the PCIDATA tablespace.
  • The higher logical read rate overwhelms the HDD.
Transcript of "Optimizing Oracle databases with SSD - April 2014"

1. 206: Using Flash SSD to Optimize Oracle Database Performance. Guy Harrison, Executive Director, R&D, Information Management Group, Dell Software.
2. Agenda
   • Brief history of magnetic disk
   • Solid State Disk (SSD) technologies
   • SSD internals
   • Oracle DB flash cache architecture
   • Performance comparisons
   • Exadata flash
   • Recommendations and suggestions
3. Introductions
   • Web: guyharrison.net
   • Email: guy.harrison@software.dell.com
   • Twitter: @guyharrison
   • Google Plus: https://www.google.com/+GuyHarrison1
4.-8. [Image-only slides]
9. A brief history of disk
10. Magnetic disk architecture
11. 5MB HDD, circa 1956
12. 28MB HDD, 1961
   • 1,800 RPM
   • 100,000 times smaller than a cheap 3 TB drive
   • BUT spinning 10 times slower than that drive
13. The more that things change....
14. Moore's law
15. Moore's law
   • Transistor density doubles every 18 months
   • Exponential growth is observed in most electronic components:
     – CPU clock speeds
     – RAM
     – Hard disk drive storage density
   • But not in mechanical components:
     – Service time (seek latency), limited by actuator arm speed and disk circumference
     – Throughput (rotational latency), limited by speed of rotation, circumference and data density
16. Disk trends 2001-2009
   [Chart: percentage change, 2001-2009: IO rate +260, disk capacity +1,635, IO/capacity -630, CPU +1,013, IO/CPU -390]
17. Solid State Disk to the rescue?
18. Seek times
   [Chart: seek time (µs): magnetic disk 4,000; SATA flash SSD 80; PCI flash SSD 25; DDR-RAM SSD 15]
19. Economics of SSD
   [Chart: dollars/GB vs dollars/IOP for capacity HDD, performance HDD, SATA SSD, MLC PCI SSD and SLC PCI SSD]
20. Tiered storage management
   [Diagram: storage tiers: main memory, DDR SSD, flash SSD, fast disk (SAS, RAID 0+1), slow disk (SATA, RAID 5), tape/flat files/Hadoop, with $/IOP and $/GB trending in opposite directions]
21. SSD technology and internals
22. Flavours of Solid State Disk
   • DDR RAM drive
   • SATA flash drive
   • PCI flash drive
   • SSD storage server
23. PCI SSD vs SATA SSD
   • SATA was designed for traditional disk drives with high latencies
   • PCI is designed for high-speed devices
   • PCI SSD latency is roughly one third of SATA SSD latency
24. Dell Express Flash
   • PCI flash performance can normally only be achieved by attaching a PCI card directly to the server motherboard
   • Dell Express Flash exposes the PCI bus interface through front-loading drive slots, allowing hot-swap installation of PCI flash
25. Flash SSD is the most cost-effective SSD technology
26. Flash SSD internals
   Storage hierarchy:
   • Cell: one (SLC), two (MLC) or three (TLC) bits
   • Page: typically 4-8K
   • Block: typically 128K-1M
   Writes:
   • Reads and first writes require only a single page IO
   • Overwriting a page requires an erase and rewrite of the whole block
   Write endurance:
   • 100,000 erase cycles for SLC before failure
   • 5,000-15,000 erase cycles for MLC
27. Flash SSD performance
   [Chart: microseconds: read (4K page seek) 25; first insert (4K page write) 250; update (256K block erase) 2,000]
28. Flash disk write degradation
   • All blocks empty: write time = 250 µs
   • 25% full: write time = (¾ × 250 µs + ¼ × 2,000 µs) = 687 µs
   • 75% full: write time = (¼ × 250 µs + ¾ × 2,000 µs) = 1,562 µs
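   The slide models expected write time as a weighted average of a plain page write (250 µs) and a block erase plus write (2,000 µs), weighted by the fraction of blocks already in use. The arithmetic can be reproduced directly (a sketch of the slide's model only, not a measurement):

      -- Expected flash write latency as the drive fills:
      -- write_us = (1 - fill) * 250 + fill * 2000
      SELECT fill_pct,
             TRUNC((1 - fill_pct / 100) * 250 + (fill_pct / 100) * 2000) AS write_us
      FROM  (SELECT 0 AS fill_pct FROM dual UNION ALL
             SELECT 25 FROM dual UNION ALL
             SELECT 75 FROM dual);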
29. [Diagram: insert on an SSD: the controller writes incoming data to empty pages in a block drawn from the free block pool; pages are marked valid, empty or invalid, and filled blocks move to the used block pool]
30. [Diagram: update on an SSD: the controller writes the new version to an empty page and marks the old page invalid, rather than erasing the block in place]
31. [Diagram: garbage collection: the controller erases blocks containing invalid pages and returns them to the free block pool]
32. [Image slide]
33. Oracle Database Flash Cache
34. Oracle DB flash cache
   • Introduced in 11gR2, for OEL and Solaris only
   • Secondary cache maintained by the DBWR, but only when idle cycles permit
   • Architecture is tolerant of poor flash write performance
35. Buffer cache and free buffer waits
   [Diagram: the Oracle process reads from disk and from the buffer cache; the DBWR writes dirty blocks from the buffer cache to the database files. Free buffer waits often occur when reads are much faster than writes.]
36. Flash cache architecture
   [Diagram: as above, but the DBWR also writes clean blocks to the flash cache, time permitting, and the Oracle process can read from the flash cache. The DB flash cache architecture is designed to accelerate buffered reads.]
37. Configuration
   • Create a filesystem on the flash device
   • Set DB_FLASH_CACHE_FILE and DB_FLASH_CACHE_SIZE
   • Consider FILESYSTEMIO_OPTIONS=SETALL
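   For example (a minimal sketch; /flash/oraflash.dat is a hypothetical file on the flash filesystem, and 32G an arbitrary size):

      -- Point the flash cache at a file on the flash device and size it;
      -- both parameters take effect after a restart when set in the SPFILE.
      ALTER SYSTEM SET db_flash_cache_file = '/flash/oraflash.dat' SCOPE = SPFILE;
      ALTER SYSTEM SET db_flash_cache_size = 32G SCOPE = SPFILE;
      -- Enable asynchronous and direct IO for filesystem-based datafiles.
      ALTER SYSTEM SET filesystemio_options = SETALL SCOPE = SPFILE;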
38. Flash KEEP pool
   • You can prioritise blocks for important objects using the FLASH_CACHE clause, as in the sketch below:
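   For example (a sketch; SALES is a hypothetical table, the slide's own screenshot is not in the transcript):

      -- Preferentially retain this table's blocks in the DB flash cache;
      -- NONE would exclude them, DEFAULT is the normal behaviour.
      ALTER TABLE sales STORAGE (FLASH_CACHE KEEP);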
39. Oracle DB flash cache statistics
   http://guyharrison.squarespace.com/storage/flash_insert_stats.sql
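   A simpler starting point than the full script (a sketch, not the linked script itself) is to read the flash cache counters straight from V$SYSSTAT:

      -- Instance-wide flash cache activity counters.
      SELECT name, value
      FROM   v$sysstat
      WHERE  name LIKE 'flash cache%'
      ORDER  BY value DESC;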
40. Flash cache efficiency
   http://guyharrison.squarespace.com/storage/flash_time_savings.sql
41. Flash cache contents
   http://guyharrison.squarespace.com/storage/flashContents.sql
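   Flash-resident buffers appear in V$BH; a query along these lines (a sketch, assuming the 11gR2 convention that their buffer status begins with 'flash') shows which objects are occupying the cache:

      -- Flash cache buffers per object.
      SELECT o.owner, o.object_name, COUNT(*) AS flash_buffers
      FROM   v$bh b
             JOIN dba_objects o ON o.data_object_id = b.objd
      WHERE  b.status LIKE 'flash%'
      GROUP  BY o.owner, o.object_name
      ORDER  BY flash_buffers DESC;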
42. Performance tests
43. Test systems
   • First system: Dell Optiplex, dual core, 4 GB RAM; 2 x Seagate 7,500 RPM Barracuda SATA HDD; Intel X-25E SLC SATA SSD
   • Second system: Dell R510, 2 x quad core, 32 GB RAM; 4 x 300 GB 15K RPM 6 Gbps Dell SAS HDD; 1 x FusionIO ioDrive SLC PCI SSD
   • Third system: Oracle Exadata X-2 ¼ rack; 36 x 600 GB 15K RPM SAS HDD; 12 x 96 GB Sun F20 SLC PCI flash cards
   • Final system: Dell R720, 2 x 8-core 2.7 GHz processors, 64 GB RAM; 16 x 15K HDD in RAID 10; 1 x Dell Express Flash SLC PCIe
44. Performance: indexed reads (X-25)
   [Chart: elapsed time (s): no flash 529.7; flash cache 143.27; flash tablespace 48.17; broken into CPU, db file IO, flash cache IO and other]
45. Performance: read/write (X-25)
   [Chart: elapsed time (s): no flash 3,289; flash cache 1,693; flash tablespace 200; broken into CPU, db file IO, write complete/free buffer waits, flash cache IO and other]
46. Random reads (FusionIO)
   [Chart: elapsed time (s): SAS disk, no flash cache 2,211; SAS disk, flash cache 583; table on SSD 121]
47. Updates (FusionIO)
   [Chart: elapsed time (s): SAS disk, no flash cache 6,219; SAS disk, flash cache 1,934; table on SSD 529; broken into DB CPU, db file IO, log file IO, flash cache, free buffer waits and other]
48. Buffer cache bottlenecks
   • The flash cache architecture avoids free buffer waits caused by flash IO, but write complete waits can still occur on hot blocks
   • Free buffer waits are still possible against the database files, because the flash cache accelerates reads but not writes
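   To see whether these waits matter on a given system, the standard wait interface is enough (a sketch using V$SYSTEM_EVENT; TIME_WAITED is in centiseconds):

      -- Cumulative buffer-related waits since instance startup.
      SELECT event, total_waits, time_waited
      FROM   v$system_event
      WHERE  event IN ('free buffer waits', 'write complete waits');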
49. Full table scans
   [Chart: elapsed time (s): SAS disk, no flash cache 418; SAS disk, flash cache 398; table on SSD 72]
   The flash cache doesn't accelerate full table scans because scans use direct path reads, and the flash cache only accelerates buffered reads
50. Sorting: what we expect
   [Chart: elapsed time vs PGA memory available (MB), split into table/index IO, CPU time and temp segment IO across the memory sort, single-pass disk sort and multi-pass disk sort regions]
51. Disk sorts: temp tablespace SSD vs HDD
   [Chart: elapsed time (s) vs sort area size (0-300 MB) for SAS-based and SSD-based temp tablespaces, showing the single-pass and multi-pass disk sort regions]
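   Tests like this need the sort area pinned rather than left to PGA auto-tuning (a sketch using standard session parameters; 50 MB is an arbitrary illustrative size):

      -- Fix the sort area so elapsed time can be charted against its size.
      ALTER SESSION SET workarea_size_policy = MANUAL;
      ALTER SESSION SET sort_area_size = 52428800;  -- 50 MB in bytes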
52. SSD for redo?
53. Redo performance (FusionIO)
   [Chart: elapsed time (s): SAS-based redo log 292.39; flash-based redo log 291.93; broken into CPU and log IO]
54. Concurrent redo workload (x10)
   [Chart: elapsed time (s) for SAS-based vs flash-based redo logs under ten concurrent sessions, broken into CPU, other and log file IO]
55. Redo logs: redo size
   • Marcelle Kratochvil has reported significant improvements for SSD redo when applying LOB updates
   • Performance for SSD writing small OLTP-style transactions may differ significantly from large LOB updates:
     – Small transactions will hit the same block repeatedly, resulting in block erase overheads for most writes
     – When the redo size exceeds the SSD page size, this overhead is avoided and redo performance on SSD may exceed HDD
     – On the other hand, "in foreground garbage collection a larger write will require more pages to be erased, so actually will suffer from even more performance issues" (flashdba)
56. Redo performance: Express Flash
   [Chart: elapsed time vs redo size (MB, in millions) for HDD and SSD, marking the regions where a block erase is and is not required]
57. Conclusions for redo
   • SSD is not a good match for redo
     – Sustained sequential writes lead to heavy garbage collection overhead
     – Magnetic disk is very good at sequential writes because seek time is minimized
   • A very good SSD might provide (very roughly) a 20-30% reduction in redo log sync waits
     – At least, that is the best I have seen
     – It might provide no benefit at all on a busy system
     – It might provide higher benefits on a lightly burdened system
   • Very eager to compare data with anyone who has different results
58. Device-level SSD caches
59. Flash caching technologies (Dell FluidCache, FusionIO DirectCache, etc.)
   [Diagram: a caching block device inserts a caching driver such as FluidCache between the filesystem/raw devices/ASM and the LUN's device driver. Suited to read-intensive, potentially massive tablespaces, temp tablespaces, hot segments and hot partitions; the DB flash cache, by contrast, is limited to the size of the SSD]
60. FusionIO DirectCache: table scans
   [Chart: elapsed time (s): no cache, 1st scan 147; no cache, 2nd scan 147; direct cache on, 1st scan 147; direct cache on, 2nd scan 36]
61. Exadata
62. Exadata X-4
63. Exadata flash storage
   • 4 x 96 GB PCI flash drives on each storage server (4x increase in X3)
   • Flash can be configured as:
     – Exadata Smart Flash Cache (ESFC)
     – Solid state disk available to ASM disk groups
   • ESFC is not the same as the DB flash cache:
     – Maintained by cellsrv, not the DBWR
     – Supports smart scans, and full scans when CELL_FLASH_CACHE=KEEP
     – Statistics are accessed via the cellcli program
   • Considerations for cache vs. SSD are similar
64. Exadata Smart Flash Cache architecture
   [Diagram: Oracle processes and buffer caches on the database nodes; cellsrv on the storage node mediates between the flash cache and the grid disks]
65. CELL_FLASH_CACHE_KEEP
   • CELL_FLASH_CACHE applies at the segment (table, index, partition) level
   • The default setting caches index lookup results; smart scans and (non-smart) full table scans are only cached when the KEEP option is applied

   CELL_FLASH_CACHE | Index lookups | Smart scans | Full table scans (not smart)
   -----------------+---------------+-------------+-----------------------------
   NONE             | Not cached    | Not cached  | Not cached
   DEFAULT          | Cached        | Not cached  | Not cached
   KEEP             | Cached        | Cached      | Cached
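   Setting it is a one-line storage clause (a sketch; SALES is a hypothetical table):

      -- Ask the storage cells to keep this segment in the Smart Flash Cache,
      -- which also makes smart scans and non-smart full scans cacheable.
      ALTER TABLE sales STORAGE (CELL_FLASH_CACHE KEEP);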
66. Using Exadata flash as grid disk
   • By default, Exadata uses all flash disks as flash cache
   • You can modify this configuration and assign flash disks as grid disks, making them available to ASM disk groups
   [Diagram: cell disks (SAS and flash) carved into grid disks; in the modified configuration part of the flash backs an ASM disk group instead of the flash cache]
67. Index reads
   [Chart: time (s): SSD tablespace, no cache 9.39; HDD tablespace, default cache 21.64; HDD tablespace, no cache 31.74; broken into CPU time and IO time]
68. Full table scans
   [Chart: time (s) for 1st and 2nd scans of an SSD table with default cache, an HDD table with keep cache and an HDD table with default cache; values 2.94, 4.75, 11.27, 3.36, 33.14 and 12.45]
   Beware of CELL_FLASH_CACHE=KEEP
69. Exadata: SSD for redo
70. [Image slide]
71. Note: Exadata X-2 performance; X-3 and X-4 are probably much faster
72. Exadata Smart Flash Log
73. Smart Flash Log
   • Designed to reduce "outlier" redo log sync waits
   • Redo is written simultaneously to disk and flash
   • The first write to complete wins
   • Introduced in Exadata storage software 11.2.2.4
   [Diagram: LGWR flushes the log buffer through cellsrv to the flash and the grid disks in parallel]
74. All redo log writes (16M log writes)

   Flash Log | Min | Median | Mean | 99%   | Max
   ----------+-----+--------+------+-------+---------
   ON        | 1.0 | 650    | 723  | 1,656 | 75,740
   OFF       | 1.0 | 627    | 878  | 4,662 | 291,800

   (Values in microseconds, matching the ela= figures in the trace on the next slide.)
75. Redo log outliers
   WAIT #47124064145648: nam='log file sync' ela= 710 buffer#=129938 sync scn=1266588258 p3=0 obj#=-1 tim=1347583167579790
   WAIT #47124064145648: nam='log file sync' ela= 733 buffer#=130039 sync scn=1266588297 p3=0 obj#=-1 tim=1347583167580808
   WAIT #47124064145648: nam='log file sync' ela= 621 buffer#=130124 sync scn=1266588332 p3=0 obj#=-1 tim=1347583167581695
   WAIT #47124064145648: nam='log file sync' ela= 507 buffer#=130231 sync scn=1266588371 p3=0 obj#=-1 tim=1347583167582486
   WAIT #47124064145648: nam='log file sync' ela= 683 buffer#=101549 sync scn=1266588404 p3=0 obj#=-1 tim=1347583167583398
   WAIT #47124064145648: nam='log file sync' ela= 2084 buffer#=130410 sync scn=1266588442 p3=0 obj#=-1 tim=1347583167585748
   WAIT #47124064145648: nam='log file sync' ela= 798 buffer#=130535 sync scn=1266588488 p3=0 obj#=-1 tim=1347583167586864
   WAIT #47124064145648: nam='log file sync' ela= 1043 buffer#=101808 sync scn=1266588527 p3=0 obj#=-1 tim=1347583167588250
   WAIT #47124064145648: nam='log file sync' ela= 2394 buffer#=130714 sync scn=1266588560 p3=0 obj#=-1 tim=1347583167590888
   WAIT #47124064145648: nam='log file sync' ela= 932 buffer#=101989 sync scn=1266588598 p3=0 obj#=-1 tim=1347583167592057
   WAIT #47124064145648: nam='log file sync' ela= 291780 buffer#=102074 sync scn=1266588637 p3=0 obj#=-1 tim=1347583167884090
   WAIT #47124064145648: nam='log file sync' ela= 671 buffer#=102196 sync scn=1266588697 p3=0 obj#=-1 tim=1347583167885294
   WAIT #47124064145648: nam='log file sync' ela= 957 buffer#=102294 sync scn=1266588730 p3=0 obj#=-1 tim=1347583167886575
   WAIT #47124064145648: nam='log file sync' ela= 852 buffer#=120 sync scn=1266588778 p3=0 obj#=-1 tim=1347583167887763
   WAIT #47124064145648: nam='log file sync' ela= 639 buffer#=214 sync scn=1266588826 p3=0 obj#=-1 tim=1347583167888778
   WAIT #47124064145648: nam='log file sync' ela= 699 buffer#=300 sync scn=1266588853 p3=0 obj#=-1 tim=1347583167889767
   WAIT #47124064145648: nam='log file sync' ela= 819 buffer#=102647 sync scn=1266588886 p3=0 obj#=-1 tim=1347583167890829
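   Rather than mining a 10046 trace, the same outlier picture can be had from the instance-wide latency histogram (a sketch using the standard V$EVENT_HISTOGRAM view):

      -- Distribution of 'log file sync' waits by latency bucket
      -- (wait_time_milli is the bucket's upper bound in milliseconds);
      -- the long tail shows the outliers.
      SELECT wait_time_milli, wait_count
      FROM   v$event_histogram
      WHERE  event = 'log file sync'
      ORDER  BY wait_time_milli;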
76. Top 10,000 waits
77. Exadata 12c Smart Flash Cache write-back
   • Database writes go to the flash cache
     – LRU aging to HDD
     – Reads serviced by flash prior to age-out
     – Similar restrictions to the flash cache (smart scans, etc.)
     – Will be most effective when "buffer waits" exist; random IO writes are less problematic for flash than sequential writes
78. Performance tests
   [Chart: elapsed seconds by flash cache mode: write-back 1,917.34; write-through 7,693.62; broken into CPU time, other wait time, free buffer waits and buffer busy waits]
79. Summary
80. Recommendations
   • Don't wait for SSD to become as cheap as HDD
     – Magnetic HDD will always be cheaper per GB, SSD cheaper per IO
   • Consider a mixed or tiered storage strategy
     – Use the DB flash cache, or selective SSD tablespaces or partitions
     – Use SSD where your IO bottleneck is greatest and the SSD advantage is significant
   • The DB flash cache offers an easy way to leverage SSD for OLTP workloads, but has few advantages for OLAP or data warehousing
81. How to use SSD
   • Database flash cache
     – If your bottleneck is single-block (indexed) reads and you are on OEL or Solaris with 11gR2
   • Flash tablespace
     – To optimize reads/writes against "hot" segments or partitions
   • Flash temp tablespace
     – If multi-pass disk sorts or hash joins are your bottleneck
   • Device cache (Dell FluidCache, FusionIO DirectCache)
     – If you want to optimize both scans and index reads, or you are not on OEL/Solaris with 11gR2
   • Exadata uses flash effectively for read AND write optimization
     – Consider allocating some Exadata flash as grid disks in an ASM disk group for hot tables and segments
82. Visit the Dell Software booth. Enter for a chance to win a Dell Venue Pro 11 tablet; the draw is at 2:45pm Thursday.
83. Please complete the session evaluation on the mobile app. We appreciate your feedback and insight. guy.harrison@software.dell.com | @guyharrison | guyharrison.net
