Hashing Directory Scheme for NAND Flash File System



Seung-Ho Lim, Chul Lee and Kyu-Ho Park
Computer Engineering Research Laboratory
Department of Electrical Engineering and Computer Science
Korea Advanced Institute of Science and Technology, Daejeon, Korea
{shlim, chullee}@core.kaist.ac.kr, kpark@ee.kaist.ac.kr

ISBN 978-89-5519-131-8 93560, Feb. 12-14, 2007, ICACT2007

Abstract: Flash memory, especially NAND flash memory, has become a major method for data storage in mobile systems because of its small size, shock resistance, and low power consumption. Conventional file systems cannot be used directly on bare flash memory because of its physical characteristics. Instead, a few flash file systems have been developed that use the direct interfaces of flash memory. However, their data recording method has problems as flash capacity increases, because they use a log-based data recording method: they encapsulate the metadata information into the data node for all files when data are written. This method works well for small-capacity flash memory, but it is not adequate for large capacities because mount time, memory footprint and operation time all grow with capacity. In this paper, we present an efficient metadata management scheme for a flash file system on giga-scale flash memory. Specifically, we design a hashing directory structure for directory management and a two-level index structure for file index management. These management schemes can reduce the file system mount time and the memory footprint during runtime.

Keywords: Flash File System, NAND Flash Memory, Hashing Directory, Indexing

1. INTRODUCTION

Flash memory has become an increasingly important component as a nonvolatile storage medium because of its small size, shock resistance, and low power consumption[1]. Among nonvolatile memories, NOR flash memory provides fast random access, but at high cost and low density compared with NAND flash memory. NAND flash has the advantages of large storage capacity and relatively high performance for large reads and writes in contrast to NOR flash. Therefore, NAND flash is widely used as data storage in mobile embedded systems.

NAND flash memory chips are arranged into blocks, and each block has a fixed number of pages; the page is the unit of read and write. A page is further divided into a data region for storing data and a spare region for storing the status of the data region. Because flash memory is a form of Electrically Erasable Read Only Memory (EEPROM), no in-place update is allowed. This means that when data is modified, the new data must be written to an available free page at another position, and this page is considered a live page. The page which contains the old data is considered a dead page. As time passes, a large portion of flash memory becomes composed of dead pages, and the system must reclaim free pages for write operations. The erase operation makes free pages available. However, the unit of an erase operation is a block, which is much larger than the write unit, and this mismatch requires the live pages of a block to be copied somewhere else before the block is erased. This process is called garbage collection.

To address these problems, a flash translation layer was introduced between the existing file system and flash memory[3][4], and later a few flash file systems were developed with no extra translation layer in between, for more efficient use of flash characteristics and file system characteristics[5][6][7]. Currently designed flash file systems use a log-based data and metadata management scheme when data are written to flash memory. That is, they write data using a logging technique that encapsulates the metadata information into the data node. This metadata management works well for small-capacity flash memory because it is simple and suits flash memory characteristics. However, the method has problems as capacity increases. As flash memory capacity increases, file system mount time, runtime memory footprint and operation time all increase, since the entire logged region must be read at file system mount to identify the metadata contents and build the file system structure in memory. These memory-based directory and file structures must persist throughout system runtime.

In this paper, we present an efficient metadata management scheme for a flash file system on giga-scale flash memory. Specifically, we design a hashing directory structure for directory management, and a two-level index structure for file index management. These management schemes can reduce the file system mount time and the memory footprint during runtime. Also, the hashing directory scheme can reduce file lookup and creation latency when the directory hierarchy is complex.
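The out-of-place update and garbage collection behavior described in the introduction can be illustrated with a toy model. This is only a sketch: the class name `Flash`, its methods, and the tiny geometry (4 blocks of 4 pages) are assumptions made for illustration and do not come from the paper.

```python
FREE, LIVE, DEAD = "free", "live", "dead"


class Flash:
    """Toy NAND model (hypothetical): pages are written out of place, blocks are erased whole."""

    def __init__(self, blocks=4, pages_per_block=4):
        self.ppb = pages_per_block
        self.state = [FREE] * (blocks * pages_per_block)
        self.data = [None] * len(self.state)   # payload of each physical page
        self.where = {}                         # logical page -> physical page

    def write(self, logical, payload):
        """Write the new copy to a free page and mark the old copy dead."""
        phys = self.state.index(FREE)           # raises ValueError if no free page is left
        self.state[phys], self.data[phys] = LIVE, payload
        old = self.where.get(logical)
        if old is not None:
            self.state[old] = DEAD              # no in-place update: the old page just dies
        self.where[logical] = phys

    def garbage_collect(self, block):
        """Copy the live pages of `block` elsewhere, erase the block, return the copy count."""
        start = block * self.ppb
        pages = range(start, start + self.ppb)
        copied = 0
        for phys in pages:
            if self.state[phys] != LIVE:
                continue
            logical = next(l for l, p in self.where.items() if p == phys)
            # Destination must be a free page outside the block being erased.
            dest = next(i for i, s in enumerate(self.state)
                        if s == FREE and i not in pages)
            self.state[dest], self.data[dest] = LIVE, self.data[phys]
            self.where[logical] = dest
            copied += 1
        for phys in pages:                      # the erase unit is the whole block
            self.state[phys], self.data[phys] = FREE, None
        return copied
```

Writing three logical pages and then updating one of them leaves one dead and three live pages in the first block, so reclaiming that block costs three extra page copies. This is the copy overhead that the erase/write unit mismatch introduces.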
2. RELATED WORK

In this section, we describe some features of previously designed flash file systems. The first flash file system design is the Journalling Flash File System, versions 1 and 2. JFFS2[5] is a log-structured file system which stores nodes containing data and metadata sequentially in every free region of the flash chip. JFFS2 creates a new node containing both the inode and the data for the file whenever a write operation is performed, and the corresponding inode's version is increased by 1 for every node write. Therefore, JFFS2 must scan the entire flash medium at mount time in order to find the inode with the latest version. Also, a large in-memory footprint is required to maintain all the metadata information in memory. Another approach is YAFFS[6], which is designed for specific NAND flash memory chips. In YAFFS, each page is marked with a file id and a chunk number. The file id denotes the file inode number, and the chunk number is determined by dividing the file position by the page size. These are stored in the spare region of the NAND flash memory. Therefore, the boot-time scan that builds the file structures only needs to read the spare regions, which is relatively faster than the JFFS2 scan. However, a scan of the entire flash medium at mount is still required, and the memory footprint is also proportional to the size of the flash medium. CFFS[7] stores all the data index entries in a dedicated flash region for each inode to reduce the flash scan time. For this, each inode occupies an entire flash page to preserve sufficient indexing space. For inode index allocation, CFFS provides two inode classes: direct indexing and indirect indexing. In addition, it allocates separate flash blocks for the metadata and data regions, which leads to a pseudo hot-cold separation because metadata are hotter than data. By doing this, CFFS can reduce flash scan time and garbage collection overhead. However, its memory footprint still increases linearly with file system usage, such as the number of files and directories, and the file size.

[Fig. 1. Directory Hash Table Management. An absolute path name (e.g., /home/shlim/a.out) is hashed (e.g., to 0x20ca01); the modulated value (000 to 111) selects an entry of the limited hash table, whose hash bucket holds hash value and physical offset pairs (e.g., 0x20ca01 and 0x12800) together with a conflict list.]

3. FLASH FILE SYSTEM ARCHITECTURE

In file systems, metadata is the key information for constructing the file system architecture, such as the directory structure and the file data structure. Most flash-based file systems put metadata and data together in the same node when data is written to flash memory. Metadata management incurs additional overhead if the metadata are not embedded in the real data but stored apart from it. This is because a metadata update incurs additional page consumption in flash memory due to the out-of-place update problem. However, the separation of data and metadata in a flash file system can fundamentally improve the mount time and memory footprint requirements, as disk-based file systems that adopted this separation have shown. For example, if a tree-based directory structure is stored in the flash medium as a type of metadata, the flash file system can mount by finding only the root directory. Also, the whole directory tree does not need to reside in main memory during runtime. The motivation of our flash file system design is to reduce the additional page consumption overhead when a file is created or written to flash memory, while metadata are recorded apart from the real data.

In our design, flash blocks are classified into three types: hashing directory blocks, inode blocks, and data blocks. Hashing directory blocks contain the hashing table of directory files, inode blocks contain file inodes, and data blocks contain the real data of files. Among the three types, hashing directory blocks and inode blocks represent the metadata of the file system. For each inode, whether it belongs to a directory or a regular file, we allocate one flash page to store sufficient metadata information. As for the superblock, the first block of flash memory is dedicated to the superblock, which represents the map of blocks. Our superblock management is similar to that of other flash file system designs, so we do not explain it here. The detailed design of metadata management is described below.

3.1. Directory Structure

For directory management, let us first consider the conventional approach to directory entries in an indexing file system. For each file creation, the file name and its inode number are added to the parent's directory entries. The directory entries contain the pairs of file name and inode number for the files included in this directory. Since the directory entries contain all the file names in the directory, the entry data can be large and may have to be reached through indirect or higher-degree index pointers. In that circumstance, writing directory entries, that is, file names and inode numbers, results in a lot of additional flash page consumption, because all data must be written to a new flash page, which in turn requires an update of the inode's index information. Here, additional page consumption means the pages consumed due to the update of metadata.
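To make the cost described above concrete, the following sketch counts the metadata pages that one data-page write drags along. It assumes that every index page on the path from the inode to the data page, and the inode page itself, must be rewritten out of place; this assumption and the function name `additional_pages` are ours and are not stated in this form in the paper.

```python
def additional_pages(index_level: int) -> int:
    """Metadata pages rewritten when one data page is written.

    index_level is the number of index pages between the inode and the data
    page: 0 for direct, 1 for indirect, 2 for double indirect indexing.
    """
    if index_level < 0:
        raise ValueError("index_level must be non-negative")
    inode_page = 1                  # the inode is always rewritten
    return index_level + inode_page  # plus one page per index level


if __name__ == "__main__":
    for level, name in enumerate(["direct", "indirect", "double indirect"]):
        print(f"{name}: {additional_pages(level)} additional page(s) per write")
```

Under this assumption the overhead grows by one page per indexing level, which is why keeping directory entries reachable through direct indexing matters.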
We note that the number of additional flash pages consumed due to the update of inode index information is proportional to the degree of the indexing level. This situation occurs not only for directories but also for file inodes. In this subsection we consider directories, and in the next subsection we focus on files. To avoid higher-degree indexing levels in the directory index pointers, we store only the inode numbers of the files included in a directory in its directory entries. For this, we allocate one flash page for each inode, and the directory entries are stored directly in the directory's inode. The file name is stored in the file's own inode page.

Our directory is managed by a hash structure. The hashing directory blocks contain a list of pairs, each consisting of a hashed directory inode number and its physical position in flash memory. The size of one pair in the list is 8 bytes. Since every updated inode must be written to another clean page in flash memory, a fixed inode bitmap allocation method has little meaning in a flash file system, so we use a dynamic inode allocation method. The inode number assigned to each file is generated from the absolute path name by a hashing function, as shown in Fig. 1. For a directory file, the hashed inode value is stored in the hashing directory blocks. For a regular file, the hashed inode value is stored in its parent's directory entries. We manage a conflict list in the hashing table, as shown in Fig. 1, for conflict cases.

The file lookup sequence is as follows. If the file is a directory, we can directly find its inode number and physical position by hashing the absolute path name, and the directory lookup is complete. For a regular file, the absolute path name up to the parent directory is first hashed by the hashing function. In the case of Fig. 1, '/home/shlim/' is first hashed and found in the hash table. Then the absolute path name including the file's own name, '/home/shlim/a.out', is hashed and found by searching the directory entries of the parent directory. Finally, we can read the inode of the file. The overall flow is shown in Fig. 2. With this hashing directory scheme, we can reduce file creation and lookup overhead.

[Fig. 2. Dentry Structure and File create/lookup method. '/home/shlim/' hashes to 0x1b8071 and is found in the hash bucket with physical offset 0x7c000; the dentry of '/home/shlim/' holds {inode #, physical offset} pairs, among them the entry for '/home/shlim/a.out' (hash 0x20ca01).]

3.2. File Data Indexing

The inode blocks contain file inodes. Since we use a whole flash page for each inode, as described in the previous section, we can allocate many index entries to point to its data regions. For example, with a 2KB page size, 448 four-byte index entries fit in the page besides the file's attributes. Using these index entries, we classify inodes into two classes: i-class1 uses direct indexing for all index entries except the last one, and i-class2 uses indirect indexing for all index entries except the last one, as shown in Fig. 3. The last index entry is used for indirect indexing in i-class1 and for double indirect indexing in i-class2.

[Fig. 3. The different Inode Index Mode between i-class 1 and i-class 2. An i-class 1 inode (attributes, file name, parent inode) holds direct index pointers to data pages; an i-class 2 inode holds indirect index pointers to index pages, which point to data pages.]

The reason we classify inodes into two types is the relationship between file size and usage patterns. Recent studies[10] confirm the often repeated observation that most files are small and most write accesses are to small files, while most storage is consumed by large files which are read most of the time. In a conventional indexing file system, a fixed inode index strategy is applied to all files, so that a large portion of the index entries is dedicated to indirect or higher-level indexing and few index entries are dedicated to direct indexing. For instance, Ext2 allocates only 12 entries for direct indexing, which represents 2.4% of the space when the file size is 1MB, taking 1MB as the general threshold between small and large files. Therefore, the probability that a file must use indirect or higher-level index entries is high, even if the file is small. In that case, writing data results in a lot of additional flash page consumption, as directory entries do. Again, the number of additional flash pages consumed is proportional to the degree of the indexing level. When a small file is accessed, the additional flash page consumption is only the inode page itself, because most small files are in i-class1 and use only direct index entries.
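The two inode classes can be explored with a short calculation. The sketch below takes the 2KB page, the four-byte pointers and the 448 index entries from the text; it additionally assumes that an index page is filled entirely with four-byte pointers (512 per page), which the paper does not state explicitly. The function names are hypothetical.

```python
PAGE_SIZE = 2048            # 2KB page, as in the paper's example
POINTER_SIZE = 4            # four-byte index entries
INDEX_ENTRIES = 448         # index entries per inode page, from the paper
POINTERS_PER_INDEX_PAGE = PAGE_SIZE // POINTER_SIZE   # assumption: index pages hold only pointers


def class_limits(i_class):
    """Return (data pages reachable via the first 447 entries, via the last entry)."""
    regular = INDEX_ENTRIES - 1
    if i_class == 1:        # direct entries, then one indirect entry
        return regular, POINTERS_PER_INDEX_PAGE
    if i_class == 2:        # indirect entries, then one double indirect entry
        return regular * POINTERS_PER_INDEX_PAGE, POINTERS_PER_INDEX_PAGE ** 2
    raise ValueError("i_class must be 1 or 2")


def max_file_bytes(i_class):
    """Largest file an inode of this class can address under the assumptions above."""
    return sum(class_limits(i_class)) * PAGE_SIZE


def index_level(i_class, data_page_no):
    """Index pages between the inode and data page `data_page_no` (0-based)."""
    first, last = class_limits(i_class)
    if not 0 <= data_page_no < first + last:
        raise ValueError("page number beyond the capacity of this inode class")
    base = i_class - 1      # i-class1 starts at direct (0), i-class2 at indirect (1)
    return base if data_page_no < first else base + 1


if __name__ == "__main__":
    for c in (1, 2):
        print(f"i-class{c}: up to {max_file_bytes(c)} bytes")
```

Under these assumptions an i-class1 inode addresses a little under 2MB, with the first 447 pages (about 894KB) reachable by direct indexing, while an i-class2 inode addresses roughly 1GB.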
However, each write to a large file consumes two additional pages, because large files are in i-class2. For i-class2, writing through any of the indirect index entries causes additional page consumption due to the update of the index page, which contains the real pointers to the data. In our indexing scheme, most files are in i-class1 and most write accesses are concentrated on i-class1 files. Most operations on large files are reads, and inode updates are rarely performed, so the overhead of indirect indexing for i-class2 files is insignificant.

4. ANALYSIS AND PERFORMANCE EVALUATION

First, we analyze the file system mount time and memory footprint. At file system mount, we find out which blocks are hashing directory blocks by reading the superblock. Then we read the hashing directory blocks and build the hashing table and the tree directory structure in main memory using the hash values in the hashing directory blocks. Since the number of hashing directory blocks is small in comparison with the huge number of data blocks, the mount time can be significantly reduced compared to other flash file systems. Also, the memory footprint is reduced, since we manage only the hash values of directory files. For example, with 2KB-page flash memory, one page holds 256 directories and one block holds 16K directories. The number of allocated hashing directory blocks can be fewer than ten for a giga-scale file system.

For the directory update, when we use a 2KB page for the inode, we allocate 1504 bytes for the direct entries region, seven entries for direct indexing, and one entry for indirect indexing, since we allocate one page per inode. A directory can then contain 188 files through the direct entries, 1792 files through the direct indexing entries, and almost 100K files through the indirect indexing entry with four-byte index pointers. Hundreds of directory entries are enough for one directory, so in most cases we only need to update the directory inode page when a file is created. This reduces the additional page consumption caused by updates of index pages. For file indexing, we can reduce the page consumption overhead by reducing the number of updates related to indirect index pages. As a result, flash pages are consumed more slowly overall, and garbage collection overhead can also be reduced.

5. CONCLUSION

NAND flash memory has become a major method for data storage in mobile systems because of its small size, shock resistance, and low power consumption. When flash memory is used in a system, conventional file systems cannot be used on bare flash memory because of its physical characteristics; instead, a dedicated flash file system should be used. All currently developed flash file systems use log-based data recording that encapsulates the metadata information into the data node for all files when data are written. This metadata management works well for small-capacity flash memory, but it is not adequate for large capacities because mount time, memory footprint and operation time all grow with capacity. In this paper, we presented an efficient metadata management scheme for a flash file system on giga-scale flash memory. Specifically, we designed a hashing directory structure for directory management and a two-level index structure for file index management. These management schemes can reduce the file system mount time and the memory footprint during runtime. They can also reduce file lookup and creation latency when the directory hierarchy is complex.

REFERENCES

[1] F. Douglis, R. Caceres, F. Kaashoek, K. Li, B. Marsh, and J. A. Tauber, "Storage alternatives for mobile computers", Proc. of the 1st Symposium on Operating Systems Design and Implementation (OSDI), 1994, pp. 25-37.
[2] Samsung Electronics Co., "NAND Flash Memory & SmartMedia Data Book", 2002. http://www.samsung.com/
[3] "Memory Technology Device (MTD) subsystem for Linux", http://www.linux-mtd.infradead.org.
[4] Intel Corporation, "Understanding the flash translation layer (FTL) specification", http://developer.intel.com/.
[5] D. Woodhouse, "JFFS: The Journalling Flash File System", Ottawa Linux Symposium, 2001.
[6] Aleph One Ltd, Embedded Debian, "Yaffs: A NAND-Flash Filesystem", http://www.aleph1.co.uk/yaffs/.
[7] Seung-Ho Lim and Kyu-Ho Park, "An Efficient NAND Flash File System for Flash Memory Storage", IEEE Transactions on Computers, Vol. 55, No. 7, pp. 906-912, July 2006.
[8] A. Kawaguchi, S. Nishioka, and H. Motoda, "A Flash-Memory Based File System", Usenix Technical Conference, 1995.
[9] Mendel Rosenblum and John K. Ousterhout, "The Design and Implementation of a Log-Structured File System", ACM Transactions on Computer Systems, 10(1), 1992.
[10] D. Roselli, J. R. Lorch, and T. E. Anderson, "A Comparison of File System Workloads", Proceedings of the 2000 USENIX Annual Technical Conference, June 2000.
[11] An-I A. Wang, et al., "Conquest: Better Performance Through A Disk/Persistent-RAM Hybrid File System", Proceedings of the 2002 USENIX Annual Technical Conference, June 2002.