Recently while developing a caching tool capableof providing access to millions of records in sub-seconds, we analysed some of the existing bestpractices. (1) One choice is MySql native memory cache for avoiding GC-pauses.MySQL InnoDB storage engines dictionary cache provides blazing fast data acceess.On demand, data blocks are read from disk and stored in memory using LRU chainsor other advanced mechanisms(like MySQLs Midpoint Insertion Strategy) .When should we use MySQL query cache?Here is the summary from - http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html • Identical queries are issued by the same or multiple clients on a repetitive basis. • The underlying data being accessed is static or semi-static in nature. • Queries have the potential to be resource-intensive and/or build brief, but complexly computed result sets. Data warehouse/business intelligence situations, web-based applications, and traditional OLTP systems all qualify on the surface. The query cache looks for identical queries (spacing, upper/lower case, all come into play), ergo the first point above.For internals of InnoDB Cache, refer to the MySql tutorial.In general, MySQl memory cache provides best performance if the majority of yourdatabase queries aresimply SELECT statements.Here the speed gain is definitely due to negligible GC pauses and direct diskaccess. This type of caching strategy is popular for eCommerce and bidding platformwhere data query is more frequent than data updates.(2) An alternative solution is Memcached which lives in a different process andmanages its own GC (optimized for caching). We know that in order to lookup thevalue for a key, Memcached cache clents issue multiple parallel requests tomemcached servers where keys are sharded across those servers.The value-items resolved by Memcached are usable objects while those returnedfrom the MySQL cache must then be converted into objects.As queries get more complex, Memcached provides more efficient performance.
If an application (using mysql cache) is load-balanced over multiple servers, theneach web process will maintain a separate duplicate cache unless otherwisecustomized and optimized specifically.But the advantage of using Memcached is that it creates a single global cacheavailable to all web processes . Individual web processes and threads do not maintaintheir own caches in this case, but rather use the single global cache. When one webprocess puts data in the cache, it becomes available for all others to access. Itleverages fast Network connections.The detail architecture of Memcahed is out of the scope of this article.The memcached server and clients work together to implement one global cacheacross as many machines as you have. In fact, its recommended you run both webnodes (which are typically memory-lite and CPU-hungry)and memcached processes (which are memory-hungry and CPU-lite) on the samemachines.For a frequently changing data set , Memcahed is surely better than DB Cache asbulk of the data are read from memory rather than disk . But point to note - it stillsuffers from GC-pauses however minimal it could be !(3) Now lets see how Ehcache (leveraging Terracotta BigMemory) has reallygot the basics right !Just by simply adopting a very well known Disk I/O pattern of accessing nativememory directly from java, it has simply eliminated the GC factor and proved to bethe fastest caching option !Now the choice is ours - (a) whether to adopt a caching strategy that requires "build abigger computer with huge RAMs" (b) or use the faster cache in a machine withsmaller RAM.The bottom line is - the cache must be outside GCs control. There is no GCmechanism available as of now (unless we leverage JDK 7 Garbage First ) - whichprovides a relief from Stop-of the World stuation.The trick is to implement a smart serialization technique by storing key, value pairsinto Direct Byte Buffer.Here are some insights - http://www.ibm.com/developerworks/java/library/j-nativememory-linux/ Direct ByteBuffers (allocated using the java.nio.ByteBuffer.allocateDirect() method) that are backed by native memory rather than Java heap. Direct ByteBuffers can be passed directly to native OS library functions for
performing I/O — making them significantly faster in some scenarios because they can avoid copying data between Java heap and native heap. Its easy to become confused about where direct ByteBuffer data is being stored. The application still uses an object on the Java heap to orchestrate I/ O operations, but the buffer that holds the data is held in native memory — the Java heap object only contains a reference to the native heap buffer.A non-direct ByteBuffer holds its data in a byte array on the Java heap.A small prototype using ButeBuffer really yielded high performance.Here is a sample code for creating direct-buffer-based cache (of course its just forlearning, no where close to actual ehcahe implementation).http://tusharkhairnar.blogspot.com/2010/09/bigmemory-memory-with-no-garbage.htmlThis is how Terracotta Bigmemory got the basics of caching right by implementingByteBuffer and bypassing GC pauses !http://blog.terracottatech.com/2010/09/bigmemory_explained_a_bit.html“ ConcurrentMarkSweep collectors can run for hours w/ no full pauses and then suddenly stop the world for 30 minutes trying to catch up to the mess they have made. .... if (k1, v1) is in cache in Ehcache BigMemory, it has no references to (k2, v2). They cannot refer to each other. Which means I dont have to walk the cache looking for cross-references in the data like a GC does. No mark-and-sweep. I merely wait for the dev to call cache.remove( element ) or for the evictor to evict stale data. The evictor and the remove() API are signaling my MemoryManager explicitly just like new() and delete() or malloc() and free(). There is no notion of garbage or collection. Only memory re-use and fragmentation over time. ”If we want to enjoy the fruits of fastest cache we have to perform some simple steps :http://ehcache.org/documentation/offheap_store.html(i) Update the Java classpath to include the new ehcache jar(ii) Modify the App Jvm settings to allocate enough direct byte buffer-XX:MaxDirectMemorySize affect the memory available with this call java.nio.ByteBuffer.allocateDirect(int).(iii) Update the ehcache.xml to set the size of theBigMemory off-heap store.<ehcache><cache name="mycache" maxElementsInMemory="10000"overflowToOffHeap="true" maxMemoryOffHeap="4G"/>
</ehcache>Here in this link - http://code.google.com/p/lucene-bytebuffer/ , we can see howLucene Buffers bypasses memory intensive slow java garbage collector. The objectswhose life-cycle is known beforehand can be effectively cleaned up through simplemanual process instead of relying on GC.The basic advantages of a direct-buffer-based cache (like BigMemory) is :1. Bigmemory takes away all cache data from gc-managed memory to its ownbytebuffer-managed world !2. No need to tune GC3. Low hardware cost and operational simplicity.User can have one 16 GB RAM and and make use of all that memory with only 1JVM. This saves the user from the huge effort of maintaining multiple JVMs to createa distributed cache. User can still go fot it and utilize every byte of the RAM withoutwaiting for GC pause.... “On a different note, though any one can implement a fast cache (as discussed above)using nio package in jdk 1.6; but if the app involes heavy file handling (index files orweb resources or jars) then for faster experience we should use jdk 1.7 nio2.Here are reasons :(http://download.oracle.com/javase/tutorial/essential/io/legacy.html#mapping ) Prior to the JDK7 release, the java.io.File class was the mechanism used for file I/O, but it had several drawbacks. • Many methods didnt throw exceptions when they failed, so it was impossible to obtain a useful error message. For example, if a file deletion failed, the program would receive a "delete fail" but wouldnt know if it was because the file didnt exist, the user didnt have permissions, or there was some other problem. • The rename method didnt work consistently across platforms. • There was no real support for symbolic links. • More support for metadata was desired, such as file permissions, file owner, and other security attributes. • Accessing file metadata was inefficient. • Many of the File methods didnt scale. Requesting a large directory listing over a server could result in a hang. Large directories could also cause memory resource problems, resulting in a denial of service.
• It was not possible to write reliable code that could recursively walk a file tree and respond appropriately if there were circular symbolic links.Jdk 1.7 has great many changes to improve developers productvity and appsperformance - (http://tech.puredanger.com/java7/ )The last but not the least, since we are discussing handling of direct byte buffers, itsworth keeping in mind the recommendations (IO bound cache) posted at -http://nadeausoftware.com/articles/2008/02/java_tip_how_read_files_quickly (a) Minimize I/O operations by reading an array at a time, not a byte at a time. An 8Kbyte array is a good size. (b) Minimize method calls by getting data an array at a time, not a byte at a time. Use array indexing to get at bytes in the array. (c) Minimize thread synchronization locks if you dont need thread safety. Either make fewer method calls to a thread-safe class, or use a non- thread-safe class like FileChanneland MappedByteBuffer. (d) Minimize data copying between the JVM/OS, internal buffers, and application arrays. Use FileChannel with memory mapping, or a direct or wrapped array ByteBuffer.References :http://blog.terracottatech.com/2010/09/bigmemory_explained_a_bit.htmlhttp://ehcache.org/documentation/offheap_store.htmlhttp://www.ibm.com/developerworks/java/library/j-nativememory-linux/http://developers.sun.com/learning/javaoneonline/2008/pdf/TS-5419.pdfhttp://tusharkhairnar.blogspot.com/2010/09/bigmemory-memory-with-no-garbage.htmlhttp://download.oracle.com/javase/tutorial/essential/io/legacy.html#mappinghttp://java.sun.com/developer/technicalArticles/javase/nio/http://tech.puredanger.com/java7/