11g r2 flashcache_Tips


Published in: Technology, News & Politics

  1. 1. Oracle Database Smart Flash Cache

     Executive Summary

     The Oracle Database Smart Flash Cache is a quasi-extension of the Oracle database’s block buffer cache onto flash memory devices. The Flash Cache is a read-only cache of clean database blocks on tier-1 storage (such as flash memory devices on the PCI Express bus). It is intended to improve the performance of Oracle databases by reducing the number of I/O requests that must be serviced by tier-2 storage (e.g., conventional hard disks or older generation solid state devices). As blocks age out of the database’s buffer cache they can be moved to the Flash Cache, and quickly recalled into the database’s buffer cache if needed by future database operations. Without the Flash Cache, accessing a block that has aged out of memory requires relatively slow I/O to fetch the block from conventional storage back into memory.

     Database Smart Flash Cache is loosely referred to as Flash Cache. It is literally a cache on flash. There are several key differences between the database’s buffer cache and the Flash Cache, so it would not be correct to say the Flash Cache extends the database buffer cache. First, the Flash Cache can only store clean blocks. Second, the Flash Cache is read-only: to modify a block that is in the Flash Cache, the block must first be read into the buffer cache. After the block has been modified, the dirty image is flushed to storage and marked clean in the buffer cache. Eventually, the clean block will age back to the Flash Cache. Thus at no point does the Flash Cache contain dirty block images, and users cannot modify blocks within the Flash Cache. The third primary difference is that the buffer cache must reside in RAM, while the Flash Cache cannot.

     To understand the reason Oracle created the Flash Cache, consider how the database buffer cache works. All blocks eventually age out of the database buffer cache. Subsequent requests for those blocks require physical I/O. Physical I/O can be an expensive operation. A recommended solution is to move the database files from conventional storage devices onto locally installed PCIe based flash memory storage devices, such as Fusion-io devices, which have higher bandwidth and lower latency. Not all database configurations support local storage. A second solution for any such customer is to significantly increase the size of the Oracle SGA, by as much as 10x, but the cost of DRAM makes this cost prohibitive. Also, the amount of DRAM that can be addressed by a user process (e.g., Oracle) is limited by many factors including the BIOS, operating system kernel, and server hardware designs. Thus, a third solution is required. Enter the Oracle Database Smart Flash Cache. The Flash Cache supports locally installed flash memory devices, and provides a cost effective means of increasing Oracle’s buffer space.

     The Flash Cache is a free feature of Oracle Database Server 11gR2 Enterprise Edition. It is not available with any other edition of Oracle. It can only be used on an Oracle operating system. For example, it can be used on OEL but not RHEL.

     Editor’s Notes

     Flash Cache might be the best performance enhancing feature in Oracle 11gR2, and its benefits are proportional to the Flash Cache’s underlying storage. Using a single Fusion-io card to hold the Flash Cache can boost overall database performance many times over. Not all customers are able to use the Flash Cache due to licensing and technical restrictions.
  2. 2. What Is Flash Cache

     The official product name is “Database Smart Flash Cache”. It is referred to in this paper as Flash Cache. There is an unrelated Exadata Flash Cache product, not discussed in this paper.

     The Database Smart Flash Cache is a secondary buffer cache that sits behind the Oracle SGA’s database block buffer cache. Without Flash Cache, clean blocks that age out of memory require expensive I/O for subsequent access. With Flash Cache, clean blocks age out of the buffer cache to the Flash Cache, from where they can be quickly retrieved back into the buffer cache upon request. An illustration is provided later.

     Once the Flash Cache is configured all tables and indexes can use it. By default all tables in Oracle and later are configured to use the Flash Cache, and do so automatically once the Flash Cache is enabled. However, there are some restrictions and limitations listed later in this document to be aware of.

     Flash Cache is automated and internalized. It cannot be used directly. It is part of Oracle’s memory management sub-system, and therefore it can only be used by the database engine. For example: you cannot tell Oracle to load the SGA onto the Flash Cache, nor can you store tables or indexes on a flash cache device. What you can do is tell Oracle how big to make the Flash Cache, and which tables to cache in the Flash Cache. Oracle will automatically determine when to write blocks into the Flash Cache and when to erase blocks from the Flash Cache.

     What Flash Cache Is Not

     The Flash Cache cannot be used to store database files. The Flash Cache is not a general purpose cache. It cannot be used to buffer reads from specific storage devices or file systems. It cannot buffer writes at all. It cannot be used to cache query results; Oracle has another feature called the Server Results Cache that can be used to cache query results.

     Why Use Flash Cache

     Many customers have a sizable imbalance between the amount of data in their database and the amount of memory on their database server. They can only cache a very small percentage of their data in memory. When the Oracle database buffer cache is too small, blocks get evicted from memory and must be re-fetched from disk into memory over and over again. Such a problem could be eliminated by adding significantly more memory (DRAM) to the host, but that solution is cost prohibitive. Servers may impose physical limits on DRAM as well.

     The Oracle Database Smart Flash Cache, or simply “Flash Cache”, allows customers to expand Oracle’s data caching capabilities onto high-speed high-capacity storage devices, such as Fusion-io. Flash Cache allows customers to cache a seemingly unlimited amount of data on devices that perform many times faster than conventional storage and at a price point well below DRAM.

     The first indication that Flash Cache might be needed often comes from a memory advisor within Oracle Enterprise Manager (OEM). If OEM recommends increasing the size of the block buffer cache by at least 2x, then Flash Cache should be investigated.
  3. 3. How Flash Cache Works

     [Figure from the Oracle 11g R2 Concepts Guide: database buffer cache, Flash Cache, and magnetic disk]

     The elements of the picture can be described as follows:

     • The object shown as “magnetic disk” in the illustration is the database’s permanent storage for database files. Permanent storage is often the slowest server component: many Oracle customers buy storage for a certain capacity-for-price point rather than for a performance-for-price point (i.e., they buy hard drives or SAN appliances).
     • The database buffer cache sits in main memory, which is typically very fast and very expensive DRAM, and its purpose is to buffer database blocks retrieved from permanent storage. Oracle tries to perform all queries, inserts, updates, and deletes in the buffer cache, but if the data is not in the cache it is fetched from storage by a “Server Process”.
     • The Flash Cache sits in locally attached storage (not shared storage). The Flash Cache is actually a single file, and should be stored on flash memory devices for maximum performance.

     After a Server Process has fetched a block into the buffer cache, the block can stay in the buffer cache “forever”. As long as the block is touched by a Server Process often enough it remains on the “hot” list and is kept in the buffer cache. If it is not touched for some threshold amount of time, then it is considered a “cold” block and becomes eligible for migration to the Flash Cache or permanent storage.

     Flash Cache does not buffer all data. Flash Cache only supports the DEFAULT pool of the database buffer cache. It does not buffer any blocks from the nK buffer caches, the KEEP pool, or the RECYCLE pool. Also, by default the Flash Cache will not buffer any blocks that were read into memory as a result of a scan operation. This is configurable as described elsewhere in this document.

     Loss of the Flash Cache cannot lead to data loss. The Flash Cache only holds blocks that are fully persistent on disk. When blocks in the database buffer cache are dirtied, they are written out to disk as usual. Oracle always performs the write-out before the blocks are moved into the Flash Cache. Thus, all blocks in the Flash Cache are “clean”.
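The aging behavior described above can be sketched as a toy model. This is only an illustration under simplifying assumptions, not Oracle's implementation: the class name `TwoTierCache`, the slot counts, and the strict LRU policy are all hypothetical, and dirty blocks are written out at eviction time here rather than by a background writer.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model: an LRU buffer cache whose evicted blocks spill into a flash tier."""

    def __init__(self, buffer_slots, flash_slots):
        self.buffer = OrderedDict()  # block_id -> (data, dirty flag), coldest first
        self.flash = OrderedDict()   # block_id -> data, clean images only
        self.disk = {}               # stands in for permanent storage
        self.buffer_slots = buffer_slots
        self.flash_slots = flash_slots

    def _evict_coldest(self):
        block_id, (data, dirty) = self.buffer.popitem(last=False)
        if dirty:
            # The write-out happens before the block may enter the flash tier.
            self.disk[block_id] = data
        if self.flash_slots > 0:
            self.flash[block_id] = data
            self.flash.move_to_end(block_id)
            if len(self.flash) > self.flash_slots:
                self.flash.popitem(last=False)  # the flash tier ages out too

    def _load(self, block_id, data, dirty):
        self.buffer[block_id] = (data, dirty)
        self.buffer.move_to_end(block_id)
        while len(self.buffer) > self.buffer_slots:
            self._evict_coldest()

    def read(self, block_id):
        """Return (data, tier the block was found in)."""
        if block_id in self.buffer:
            self.buffer.move_to_end(block_id)
            return self.buffer[block_id][0], "buffer cache"
        if block_id in self.flash:
            data = self.flash.pop(block_id)
            self._load(block_id, data, dirty=False)
            return data, "flash cache"
        data = self.disk[block_id]
        self._load(block_id, data, dirty=False)
        return data, "disk"

    def write(self, block_id, data):
        """Blocks are only ever modified in the buffer cache."""
        self.flash.pop(block_id, None)  # drop any stale flash image
        self._load(block_id, data, dirty=True)
```

In this sketch every image in `flash` matches what is on `disk`, which is the property that makes losing the flash tier harmless.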
  4. 4. Here is a description of the workflow that happens when a user requests a block of data:

     • The user’s Server Process first scans the database’s in-memory buffer cache.
        • IF the requested block is not found in the database buffer cache, then the Server Process scans the Flash Cache.
           • If the block is found in the Flash Cache, then the Server Process moves the block into the database buffer cache using a type of physical I/O called “optimized physical read”.
           • If the block is not found in the Flash Cache, then the Server Process sends a request to the host file system.
              • The host will read the block from the file system into the file system buffer cache (physical I/O) and then signal the Server Process to take over.
              • The Server Process will read the block from the file system buffer cache into the database buffer cache (logical I/O).
              • Finally, the Server Process reads the block from within the database buffer cache for processing. Notice that three I/Os occur for each block read, although Oracle’s metrics will only reflect one physical and one logical I/O since it does not count reads by file system processes.
        • ELSIF the block is found in the database buffer cache, or if a Server Process has fetched the block into the buffer cache from storage or Flash Cache, then the Server Process can operate on the block.
           • The Server Process performs logical I/O on the block within the in-memory buffer cache.
           • If the operation is read-only, then the block is not dirtied and does not need to be written to permanent storage. Over time the block can be aged out of the buffer cache. It will go to the Flash Cache if enabled, or to permanent storage.
           • If the operation dirties the block, then the Database Block Writer process (DBWn) is responsible for writing it to permanent storage and then marking the block clean in the buffer cache.
              • If using direct I/O, then DBWn writes the dirty buffers to storage. Otherwise, DBWn only writes to the file system buffer cache, which will eventually write to storage.
              • Once the block is marked clean it is subject to aging rules. It will go to the Flash Cache if enabled, or to permanent storage.

     The above workflow is general. The user may specify a unique operation that will behave differently.

     How Does The Flash Cache Work With Full Table Scans

     If the user performs a full table scan, then Oracle will count the number of blocks already in the buffer cache for that table.
  5. 5. • If the number is high, then Oracle will read the blocks from disk into the buffer cache. In this case blocks are eligible for the Flash Cache, but the blocks can only age out to the Flash Cache if the table’s storage property is set to FLASH_CACHE KEEP.
     • If the number is low, then Oracle will perform a direct-path read and will not cache the blocks. In this case the blocks are not eligible for the Flash Cache regardless of the table’s storage property.

     Some types of scans will never read data into memory. The SELECT COUNT(*) statement will not read blocks of data into memory, and you will not see any use of the Flash Cache.

     See this document’s section on “Configuration” for more information about configuring tables to work with the Flash Cache. The discussion includes full table scans.
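The eligibility rules above can be condensed into a small decision function. This is a sketch of the rules as this paper states them, not Oracle's internal logic; the function name and the `high_water` threshold are hypothetical, since the paper does not say how many cached blocks count as "high".

```python
def flash_cache_eligible(setting, is_full_scan, cached_blocks=0, high_water=1):
    """Decide whether a block read this way may later age into the Flash Cache.

    setting       -- the segment's FLASH_CACHE storage property
    is_full_scan  -- True for a full table scan, False for single-block reads
    cached_blocks -- blocks of the table already in the buffer cache
    high_water    -- hypothetical threshold separating "high" from "low"
    """
    if setting not in ("DEFAULT", "KEEP", "NONE"):
        raise ValueError(f"unknown FLASH_CACHE setting: {setting}")
    if setting == "NONE":
        return False  # the segment is excluded from the Flash Cache
    if not is_full_scan:
        return True   # single-block reads qualify under DEFAULT and KEEP
    if cached_blocks < high_water:
        return False  # direct-path read: blocks never enter the buffer cache
    return setting == "KEEP"  # buffered scan blocks need FLASH_CACHE KEEP
```

For example, `flash_cache_eligible("DEFAULT", is_full_scan=True, cached_blocks=500, high_water=100)` is false, while the same call with `"KEEP"` is true.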
  6. 6. Requirements

     When the product was initially released it required an Oracle Exadata V2 machine and was only supported on Oracle Enterprise Linux (OEL). Support for Exadata V2 with Solaris SPARC was added later. Eventually, Oracle announced support for using Flash Cache on OEL, and for Flash Cache on Solaris without Exadata hardware.

     Database Smart Flash Cache has the following requirements (at the time of this writing, September 2011):

     • It requires Oracle Database Server or higher. If you really must run it on, you can obtain Linux patch 8974084 and PSU from Oracle.
     • It requires Enterprise Edition. No other editions are supported; Flash Cache is not supported on Standard or Standard One Edition.
     • It requires one of the following operating systems: Oracle Solaris SPARC 64-bit, Oracle Solaris X86_64, OEL 32-bit, or OEL x86_64. See the Oracle Database Licensing Information documentation on-line. Other operating systems like Windows, AIX, RHEL, and SuSE are not supported.
     • To use Flash Cache on Solaris you must have Solaris 10U6 or higher with the following patches: 125555-03, 140796-01, 140899-01, 141016-01, 139555-08, 141414-10, 141736-05.
     • For every block stored in the Flash Cache, Oracle consumes 100 bytes of storage in the buffer cache for metadata (pointers, etc.). If you are using RAC, the number is 200 bytes per block. A 64 GB flash cache with an 8K block size can hold up to 8388608 blocks, so you lose 800 MB of buffer cache in non-RAC systems, or 1.6 GB of buffer cache in RAC systems. The solution is to increase parameter db_cache_size by the same amount being taken away by Flash Cache.
     • The storage device must be at least 101 MB larger than DB_FLASH_CACHE_SIZE. If the Flash Cache encroaches on this reserved space, then the database will not start.
     • If you are using RAC, then the Database Smart Flash Cache must be configured on either all nodes or none of them. You cannot use it on “some” of the nodes. Parameter DB_FLASH_CACHE_FILE must be set identically on all nodes. (Note: I have heard from at least one customer who is using Flash Cache on just one node of a 3 node RAC.)
     • The initial release of Flash Cache required Oracle’s own Sun Flash PCI cards loaded with high speed SLC NAND flash memory. The current release of Flash Cache allows you to use Fusion-io PCIe devices with SLC and MLC NAND flash memory.
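The metadata overhead rule above is simple arithmetic, sketched here so it can be rerun for other cache sizes. The function name is hypothetical, and the 100 and 200 bytes-per-block figures are taken from this paper rather than verified independently.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def flash_cache_metadata(flash_cache_bytes, block_size=8192, rac=False):
    """Return (blocks held, buffer cache bytes consumed by their metadata)."""
    blocks = flash_cache_bytes // block_size
    bytes_per_block = 200 if rac else 100  # per-block figures quoted in this paper
    return blocks, blocks * bytes_per_block


blocks, overhead = flash_cache_metadata(64 * GIB)
print(blocks, overhead // MIB)  # block count, then overhead in MiB
```

Whatever the function returns as overhead is the amount by which db_cache_size would need to grow to keep the usable buffer cache unchanged.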
  7. 7. Configuration

     General Information About Configuring Flash Cache

     By default an Oracle database has no Flash Cache, but all tables and indexes are automatically configured to use it anyway. Once the DBA has created the Flash Cache, the tables and indexes automatically start using it. If this is not desirable, the DBA can alter each table or index’s Flash Cache properties.

     Start by installing one or more flash memory cards in the database server.

     Next, format the cards, or import them into an ASM diskgroup for better performance. ASM is not required, but recommended. My recommendation is to leave the flash cards unformatted, create a partition offset by 1 MB, and feed the partitions to ASM; all flash devices should be managed as disks within a redundant ASM diskgroup.

     RAC Tip: each instance requires its own Flash Cache device, and when using ASM each instance requires its own ASM diskgroup.

     The next step is to set the database initialization parameters DB_FLASH_CACHE_FILE and DB_FLASH_CACHE_SIZE.

     Bounce the instance so Oracle can initialize the Flash Cache.

     The next step is optional: you can re-configure each table or index’s Flash Cache properties. By default all tables and indexes are set to STORAGE(FLASH_CACHE DEFAULT). This means all blocks fetched into the buffer cache by a “db file sequential read” operation can use the Flash Cache. If you would also like to include blocks fetched by scan operations, alter the property to KEEP; to prevent the object from using the Flash Cache, set the property to NONE. The STORAGE clause looks like this:

       STORAGE
       ({ INITIAL size_clause
        | NEXT size_clause
        | MINEXTENTS integer
        | MAXEXTENTS { integer | UNLIMITED }
        | maxsize_clause
        | PCTINCREASE integer
        | FREELISTS integer
        | FREELIST GROUPS integer
        | OPTIMAL [ size_clause | NULL ]
        | BUFFER_POOL { KEEP | RECYCLE | DEFAULT }
        | FLASH_CACHE { KEEP | NONE | DEFAULT }
        | ENCRYPT
        } ...
       )

     Notice the FLASH_CACHE clause has three settings, which are described below:

     • DEFAULT is the default setting. It tells Oracle you want blocks to be written to the flash cache when they are aged out of the database buffer cache, and they can be aged out of the flash cache according to Oracle’s LRU algorithm. Because DEFAULT is the default, an object created without a FLASH_CACHE clause behaves the same way.
  8. 8. • KEEP tells Oracle to cache the object’s blocks in Flash as long as space permits.
     • NONE tells Oracle you do not want blocks for this table to use the flash cache.

     Example 1: you do not want table EMP to use Flash Cache.

       ALTER TABLE EMP STORAGE (FLASH_CACHE NONE);

     Example 2: you want table EMP to use the Flash Cache regardless of how the blocks were fetched into memory.

       ALTER TABLE EMP STORAGE (FLASH_CACHE KEEP);

     Example 3: you want to return table EMP to the default use of Flash Cache.

       ALTER TABLE EMP STORAGE (FLASH_CACHE DEFAULT);

     Understanding The Flash Cache Initialization Parameters

     There are only two initialization parameters related to Flash Cache. Each is detailed below.

     DB_FLASH_CACHE_FILE

     This parameter is used to specify the ASM disk group or the fully qualified name of a file that represents your Database Smart Flash Cache. This parameter should only be set by customers who are using the Database Smart Flash Cache feature. The Flash Cache should be stored on the fastest possible flash memory storage device, like a Fusion-io ioDrive. The storage device should be dedicated to the Flash Cache.

     The parameter can be set using an ALTER SYSTEM statement. However, you must first install hardware (flash storage) and set parameter DB_FLASH_CACHE_SIZE, then set this parameter in the SPFILE, and then bounce the database.

     The parameter can be used with ASM diskgroups, file systems, and unformatted block devices. To use a raw device you must create a symbolic link that points to the raw device, and give the name of the link to Oracle. Here are a few examples:

       ALTER SYSTEM SET DB_FLASH_CACHE_FILE='/dev/fioa1' SCOPE=SPFILE SID='*';
       ALTER SYSTEM SET db_flash_cache_file='/dev/sdd1' SCOPE=spfile SID='*';
       ALTER SYSTEM SET DB_FLASH_CACHE_FILE='+FLASH/MYDBA/FLASHFILE/fc.ora' SCOPE=SPFILE;

     Please observe the following notes:

     • The SID clause is optional.
     • If the file does not exist Oracle will create it.
     • When using raw devices there are no files, so specify the device partition (such as sdd1).
     • The device must be partitioned, and the flash cache must be on the partition to avoid Oracle clobbering the disk’s volume label.
     • The oracle user must be granted r/w permissions (i.e., chmod 660 /dev/sdd1).
     • When using ASM to store the Flash Cache you must specify a file name and not just a diskgroup name. See the third example above.
  9. 9. • The parameter can be set to a symbolic link that points to the real flash cache file.

     If you are using RAC please note the Flash Cache file cannot be shared by multiple instances. Every instance must point to a separate file. However, you must set this parameter to the same value on all nodes. This means if you are using ASM then you must use a separate diskgroup for each instance.

     DB_FLASH_CACHE_SIZE

     This parameter allows you to specify the size of the Flash Cache, whose file is named by the parameter DB_FLASH_CACHE_FILE. The default is 0, which disables the Flash Cache feature. The minimum suggested size is 2 * db_cache_size. The maximum suggested size is 10 * db_cache_size. These are not strictly enforced. However, the larger the Flash Cache the more buffers are consumed in the database buffer cache, so a sufficiently large Flash Cache may prevent Oracle from starting unless you also increase the parameter db_cache_size.

     The parameter can be set using an ALTER SYSTEM command like this:

       ALTER SYSTEM SET DB_FLASH_CACHE_SIZE=2400G SCOPE=SPFILE SID='*';

     This parameter may only be specified at instance startup. After instance startup you cannot change the size of the Flash Cache, but you can disable and re-enable it. That is, you cannot change the size from 500G to 501G at run time, but you can set the parameter to 0 using an ALTER SYSTEM command, which effectively disables the Flash Cache while the database is running. You can re-enable the Flash Cache by setting this parameter back to the value in use when the database was started, but you cannot set it to a different value.

     The size of the Flash Cache must be at least 100 MB smaller than the flash storage device. For example, if the storage device is 320 GB then the maximum db_flash_cache_size is roughly 319900M.

     RAC Tips

     According to the Oracle documentation and Oracle Support web site, the Flash Cache file cannot be shared by multiple instances, every instance must point to a separate file, and you must set this parameter to the same value on all nodes. These three requirements mean that if you are using ASM then you must use a separate diskgroup for each RAC instance, and also that you must use local storage like Fusion-io, not shared storage.

     I have talked to customers who have implemented the Flash Cache on “some” instances. For example, one customer had an 8-node RAC and only implemented Flash Cache on 4 nodes.
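The sizing rules for DB_FLASH_CACHE_SIZE given above (the 100 MB of headroom on the device, and the suggested 2x to 10x of db_cache_size) can be checked before the instance is bounced. The following is a minimal sketch with hypothetical function names; all sizes are plain megabyte counts and the rules are the ones stated in this paper.

```python
RESERVE_MB = 100  # headroom this paper says must remain free on the device


def max_flash_cache_mb(device_mb):
    """Largest DB_FLASH_CACHE_SIZE, in MB, that leaves the reserve free."""
    return device_mb - RESERVE_MB


def sizing_warnings(flash_cache_mb, device_mb, db_cache_mb):
    """Return human-readable warnings; an empty list means no rule is broken."""
    warnings = []
    if flash_cache_mb > max_flash_cache_mb(device_mb):
        warnings.append("flash cache leaves less than 100 MB free on the device")
    if flash_cache_mb < 2 * db_cache_mb:
        warnings.append("flash cache is below the suggested 2x db_cache_size")
    if flash_cache_mb > 10 * db_cache_mb:
        warnings.append("flash cache is above the suggested 10x db_cache_size")
    return warnings


print(max_flash_cache_mb(320_000))                 # a 320 GB device, in MB
print(sizing_warnings(319_900, 320_000, 40_000))   # within every rule
```

The 2x and 10x bounds are only suggestions, so the function reports them as warnings rather than raising an error.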
  10. 10. Monitoring Flash Cache

     Activity shows up in the AWR as Optimized Physical Reads.

     Metrics can be obtained easily from the view V$SYSSTAT like this:

       select * from v$sysstat where name like 'flash cache%';

     To see which segments and blocks are in the Flash Cache, use the view V$BH like this:

       SELECT owner || '.' || object_name object,
              SUM (CASE WHEN b.status LIKE 'flash%' THEN 1 END) flash_blocks,
              SUM (CASE WHEN b.status LIKE 'flash%' THEN 0 else 1 END) cache_blocks,
              count(*) total_blocks
         FROM v$bh b
         JOIN dba_objects ON (objd = object_id)
        GROUP BY owner, object_name
        order by 4 desc;

     The above SQL statement was copied from Guy Harrison’s web site.
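To make the grouping in the V$BH query easier to follow, here is the same aggregation sketched over plain tuples. The function name and the sample rows are hypothetical, and the status strings are illustrative only; one deliberate difference is that an object with no flash blocks reports 0 here, where the SQL above would return NULL.

```python
from collections import defaultdict


def summarize_buffers(rows):
    """Aggregate (owner, object_name, status) rows the way the V$BH query does."""
    totals = defaultdict(lambda: {"flash": 0, "cache": 0})
    for owner, object_name, status in rows:
        tier = "flash" if status.startswith("flash") else "cache"
        totals[f"{owner}.{object_name}"][tier] += 1
    report = [
        (name, t["flash"], t["cache"], t["flash"] + t["cache"])
        for name, t in totals.items()
    ]
    report.sort(key=lambda r: r[3], reverse=True)  # ORDER BY total_blocks DESC
    return report


# Hypothetical sample rows, not real output from any database.
sample = [
    ("SCOTT", "EMP", "flashcur"),
    ("SCOTT", "EMP", "xcur"),
    ("SCOTT", "EMP", "flashcur"),
    ("SCOTT", "DEPT", "xcur"),
]
for row in summarize_buffers(sample):
    print(row)
```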
  11. 11. Troubleshooting, Issues, Bugs & Patches

     The Flash Cache is populated by the Database Block Writer (DBWn) processes. This is a low priority task for DBWn, compared to writing dirty blocks to permanent storage. Thus, when DBWn is saturated the Flash Cache will not appear to be used. The solution is to increase the database initialization parameter DB_WRITER_PROCESSES. The default value for this parameter is 4, which is good for most customers, but some customers will need a higher value.

     The initial release of 11gR2 did not support Flash Cache. Oracle released a patch to make Flash Cache work on that release, but it had many bugs. I recommend against using Flash Cache with that release.

     The second release of 11gR2 also had many bugs related to Flash Cache. Some of the bugs were limited to RAC, but other bugs affected all customers.

     The third release of 11gR2 is considered to be stable.

     Below is a list of major bugs in Flash Cache:

     • Bugs 8444791 and 10216012: NOT ABLE TO SPECIFY THE DISKGROUP NAME IN DB_FLASH_CACHE_FILE. When using ASM you cannot set db_flash_cache_file to the name of a diskgroup. At last check this bug was not fixed. Do not worry. In this document I describe the correct way to set parameter db_flash_cache_file using a full file specification.
     • Bugs 12730844 and 12673694: ORA-600 [KJBRASR:PKEY], [62839680], "Lock conflicting with the NEW request is on the remastering queue". Affects versions and higher, fixed in The published workaround is “do not use flash cache”.
     • Bug 9199151: DATABASE FLASH CACHE FILE RE-USE SEMANTICS ARE FAULTY. In both RAC and non-RAC environments, if you point two instances to the same flash cache file, then the 1st instance will own the file only until the 2nd instance starts, at which time the 2nd instance will take ownership of the file so that the 1st instance now has no flash cache. Any blocks belonging to the 1st instance will be trapped in the flash cache until the next shutdown/restart of whichever instance currently owns the flash cache file (the 2nd instance in this case). Oracle says this bug is not feasible to fix, so everyone should be made aware of how to avoid it. The workaround is to manually ensure every init.ora file uses a distinct value for db_flash_cache_file, such as using the instance name as part of the file name. If you are using RAC, then you might set parameter sid.db_flash_cache_file rather than *.db_flash_cache_file. NOTE: this workaround has not been tested.

     Some customers have reported Oracle will not start if db_flash_cache_size is set to a value that is more than ten times the value of db_cache_size. The Oracle documentation states that the Flash Cache “should” be 2x - 10x of db_cache_size, but Oracle does not enforce a minimum or maximum size. The Oracle documentation also states the maximum value of parameter db_flash_cache_size is operating system dependent. There are two things that can limit it: available memory and the file system limits. In other words, any time you create a file the maximum size is limited by the file system on which you create the file. If you are using Linux file systems ext2 or ext3, then the max file size is based on the block size: if the block size is 512
  12. 12. bytes or 1 KB then the max file size is 16 GB; 2K = 256 GB; and 4K+ = 2 TB. If you are creating the file on ext4 then the maximum size of the file is 4 GB * disk block size (not database block size). In summary, flash cache on Linux is limited to the file sizes shown in this chart:

       Block Size      ext2 / ext3 Max File Size    ext4 Max File Size
       512 bytes       1 GB                         2 TB
       1K              16 GB                        4 TB
       2K              256 GB                       8 TB
       4K or higher    2 TB                         16 TB
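The ext4 column of the chart follows directly from the "4 GB * disk block size" rule, which the short sketch below reproduces. The function names are hypothetical, binary units are assumed, and the ext2/ext3 limits are not derived here because they come from a lookup table rather than a formula.

```python
TIB = 1024 ** 4
EXT4_MAX_BLOCKS = 2 ** 32  # the "4 G" multiplier from the rule above


def ext4_max_file_bytes(fs_block_size):
    """Maximum ext4 file size for a given file system (not database) block size."""
    return EXT4_MAX_BLOCKS * fs_block_size


def flash_cache_fits_ext4(flash_cache_bytes, fs_block_size):
    """True if a flash cache file of this size stays within the ext4 limit."""
    return flash_cache_bytes <= ext4_max_file_bytes(fs_block_size)


for block_size in (512, 1024, 2048, 4096):
    print(block_size, ext4_max_file_bytes(block_size) // TIB, "TiB")
```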