FlexFS: A Flexible Flash File System for MLC NAND Flash Memory

Sungjin Lee†, Keonsoo Ha†, Kangwon Zhang†, Jihong Kim†, and Junghwan Kim∗

†Seoul National University, Korea
{chamdoo, air21c, kwzhang, jihong}@davinci.snu.ac.kr

∗Samsung Electronics, Korea
junghwani.kim@samsung.com

Abstract

The multi-level cell (MLC) NAND flash memory technology enables multiple bits of information to be stored on a single cell, thus making it possible to increase the density of the memory without increasing the die size. For most MLC flash memories, each cell can be programmed as a single-level cell or a multi-level cell during runtime. Therefore, it has the potential to achieve both the high performance of SLC flash memory and the high capacity of MLC flash memory.

In this paper, we present a flexible flash file system, called FlexFS, which takes advantage of the dynamic reconfiguration facility of MLC flash memory. FlexFS divides the flash memory medium into SLC and MLC regions, and dynamically changes the size of each region to meet the changing requirements of applications. We exploit patterns of storage usage to minimize the overhead of reorganizing the two different regions. We also propose a novel wear management scheme which mitigates the effect of the extra writes required by FlexFS on the lifetime of flash memory. Our implementation of FlexFS in the Linux 2.6 kernel shows that it can achieve a performance comparable to SLC flash memory while keeping the capacity of MLC flash memory, for both simulated and real mobile workloads.

1 Introduction

As flash memory technologies quickly improve, NAND flash memory is becoming an attractive storage solution for various IT applications, from mobile consumer electronics to high-end server systems. This rapid growth is largely driven by the desirable characteristics of NAND flash memory, which include high performance and low power consumption.

There are two types of NAND flash memory on the market: single-level cell (SLC) and multi-level cell (MLC) flash memory. They are distinctive in terms of capacity, performance, and endurance. The capacity of MLC flash memory is larger than that of SLC flash memory: by storing two (or more) bits in a single memory cell, MLC flash memory achieves significant density increases while lowering the cost per bit compared with SLC flash memory, which can only store a single bit per cell. However, SLC flash memory has higher performance and longer cell endurance than MLC flash memory. In particular, the write performance of SLC flash memory is much higher than that of MLC flash memory.

As the demand for high-capacity storage systems increases rapidly, MLC flash memory is being widely adopted in many mobile embedded devices, such as smart phones, digital cameras, and PDAs. However, because of the poor performance characteristics of MLC flash memory, it is becoming harder to satisfy users' requirements for a high-performance storage system while providing increased storage capacity.

To overcome this poor performance, in this paper we propose exploiting the flexible programming feature of MLC flash memory [1]. Flexible programming is a writing method which enables each cell to be programmed as a single-level cell (SLC programming) or a multi-level cell (MLC programming). If SLC programming is used to write data into a particular cell, the effective properties of that cell become similar to those of an SLC flash memory cell. Conversely, MLC programming allows us to make use of the high capacity associated with MLC flash memory.

The most attractive aspect of flexible programming is that it allows fine-grained storage optimizations, in terms of both performance and capacity, to meet the requirements of applications. For instance, if the current capacity of flash memory is insufficient for some application, MLC flash memory can change its organization and increase the number of multi-level cells to meet the space requirement. However, to exploit flexible cell programming effectively, several issues need to be considered. First, heterogeneous memory cells should be managed in a way that is transparent to the application layer, because flexible programming allows two different types of cell to exist in the same flash chip simultaneously.
Second, dynamic cell reconfigurations between SLC and MLC must be handled properly. For example, if too many flash cells are used as single-level cells, the capacity of the flash memory might be critically impaired, even though the overall I/O performance is improved. Therefore, it is important to determine the number of SLC cells and MLC cells so that both performance and capacity are optimally supported.

Third, the cost of dynamic cell reconfigurations should be kept as low as possible. Changing the type of a cell requires expensive erase operations. Since an erase operation resets cells to their initial bit value (e.g., 1), the data stored in the cells must first be moved elsewhere. The performance overhead of this data migration impairs the overall I/O performance.

Finally, the write and erase operations required to change the type of a cell reduce the endurance of each cell, resulting in a decrease in the lifetime of the flash memory. This problem also needs to be addressed properly.

In this paper, we propose a flexible flash file system, called FlexFS, for MLC flash memory that addresses the above requirements effectively. FlexFS provides applications with a homogeneous view of storage, while internally managing two heterogeneous memory regions, an SLC region and an MLC region. FlexFS guarantees the maximum capacity of MLC flash memory to users, while it tries to write as much data as possible to the SLC region so as to achieve the highest I/O performance. FlexFS uses a data migration policy to compensate for the reduced capacity caused by overuse of the SLC region. In order to prolong the lifespan of flash memory, a new wear management scheme is also proposed.

In order to evaluate the effectiveness of FlexFS, we implemented FlexFS in the Linux 2.6.15 kernel on a development board. Evaluations were performed using synthetic and real workloads. Experimental results show that FlexFS achieves 90% of the read and 96% of the write performance of SLC flash memory, respectively, while offering the capacity of MLC flash memory.

The rest of this paper is organized as follows. In Section 2, we present a brief review of NAND flash memory and explain MLC flash memory in detail. In Section 3, we give an overview of FlexFS and introduce the problems that occur with a naive approach to exploiting flexible cell programming. In Section 4, we describe the SLC and MLC management techniques. In Section 5, we present experimental results. Section 6 describes related work on heterogeneous storage systems. Finally, in Section 7, we conclude with a summary and future work.

2 Background

2.1 NAND Flash Memory

NAND flash memory consists of multiple blocks, each of which is composed of several pages. In many NAND flash memories, the size of a page is between 512 B and 4 KB, and one block consists of between 4 and 128 pages. NAND flash memory does not support an overwrite operation because of its write-once nature. Therefore, before new data can be written into a block, the previous data must be erased. Furthermore, the total number of erasures allowed for each block is typically limited to between 10,000 and 100,000 cycles.

[Figure 1: Threshold voltage distributions for SLC (1 bit/cell) and MLC (2 bits/cell)]

Like SRAM and DRAM, flash memory stores bits in a memory cell, which consists of a transistor with a floating gate that can store electrons. The number of electrons stored on the floating gate determines the threshold voltage, Vt, and this threshold voltage represents the state of the cell. In the case of single-level cell (SLC) flash memory, each cell has two states, and therefore only a single bit can be stored in that cell. Figure 1(a) shows how the value of a bit is determined by the threshold voltage. If the threshold voltage is greater than a reference voltage, it is interpreted as a logical '1'; otherwise, it is regarded as a logical '0'. In general, the write operation moves the state of a cell from '1' to '0', while the erase operation changes '0' back to '1'.

If flash memory is composed of memory cells which have more than two states, it is called multi-level cell (MLC) flash memory, and two or more bits of information can be stored in each cell, as shown in Figure 1(b). Even though the density of MLC flash memory is higher than that of SLC flash memory, it requires more precise charge placement and charge sensing (because of the narrower voltage range for each cell state), which in turn reduces the performance and endurance of MLC flash memory in comparison with SLC flash memory.

2.2 MLC NAND Flash Memory Array

In MLC flash memory, it is possible to use SLC programming, allowing a multi-level cell to be used as a single-level cell. To understand the implications of SLC programming, it is necessary to know the overall architecture of a flash memory array.
[Figure 2: An organization of an MLC flash memory array (2 bits/cell)]

Figure 2 illustrates the array of flash memory cells which forms a flash memory block. We assume that each cell is capable of holding two bits. For descriptive purposes, this figure does not show all the elements, such as the source and drain select gates, which are required in a memory array. (For a more detailed description, see references [2, 3].)

As shown in Figure 2, the memory cells are arranged in an array of rows and columns. The cells in each row are connected to a word line (e.g., WL(0)), while the cells in each column are coupled to a bit line (e.g., BL(0)). These word and bit lines are used for read and write operations. During a write operation, the data to be written ('1' or '0') is provided at the bit line while the word line is asserted. During a read operation, the word line is again asserted, and the threshold voltage of each cell can then be acquired from the bit line.

Figure 2 also shows the conceptual structure of a flash block corresponding to a flash memory array. The size of a page is determined by the number of bit lines in the memory array, while the number of pages in each flash block is twice the number of word lines, because two different pages share the memory cells that belong to the same word line. These two pages are respectively called the least significant bit (LSB) page and the most significant bit (MSB) page. As these names imply, each page uses only its own bit position of the bit pattern stored in a cell. (This is possible because each memory cell stores two bits, for example, one bit for the LSB page and the other for the MSB page.) Thus, if a block has 128 pages, there are 64 LSB and 64 MSB pages.

Because multiple pages are mapped to the same word line, read and write operations must distinguish the destination page of each operation. For example, if a cell is in an erased state (i.e., a logical '11') and a logical '0' is programmed to the MSB position of the cell, the cell will then have a bit pattern of '01', which is interpreted as a logical '0' for the MSB page. If the LSB position is then programmed as '0', the bit pattern will change to '00'.

2.3 SLC Programming in MLC

Since MLC flash memory stores multiple pages in the same word line, it is possible for it to act as SLC flash memory by using only the LSB pages (or the MSB pages, depending on the manufacturer's specification). Thus, SLC programming is achieved by writing data only to the LSB pages in a block. In this case, since only two states of a cell, '11' and '10', are used, as shown in Figure 1(b), the characteristics of a multi-level cell become very similar to those of a single-level cell. The logical offsets of the LSB and MSB pages in a block are determined by the flash memory specification, and therefore SLC programming can be managed at the file system level. Naturally, SLC programming reduces the capacity of a block by half, because only the LSB pages can be used.
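To make LSB-only page addressing concrete, the sketch below shows how a file system layer might map logical page indices to physical page offsets under the two programming modes. This is an illustration rather than FlexFS code: the assumption that LSB pages occupy the even page offsets is hypothetical, since, as noted above, the actual LSB/MSB offsets are fixed by the flash memory specification.

```c
#include <stdint.h>

#define PAGES_PER_BLOCK 128U   /* 64 LSB + 64 MSB pages (Section 2.2) */

/* Number of usable pages in a block under each programming mode.
 * SLC programming uses only the LSB pages, halving the capacity. */
static uint32_t usable_pages(int slc_mode)
{
    return slc_mode ? PAGES_PER_BLOCK / 2 : PAGES_PER_BLOCK;
}

/* Map the i-th logical page of a block to a physical page offset.
 * The "LSB pages at even offsets" layout is a hypothetical example;
 * real chips define the LSB/MSB offsets in their specification. */
static uint32_t phys_page_offset(uint32_t i, int slc_mode)
{
    return slc_mode ? i * 2   /* assumed LSB positions: 0, 2, 4, ... */
                    : i;      /* MLC: every page (LSB and MSB) is used */
}
```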
Table 1: Performance comparison of different types of cell programming (µs)

  Operation       SLC    MLC_LSB    MLC_BOTH
  Read (page)     399    409        403
  Write (page)    417    431        994
  Erase (block)   860    872        872

Table 1 compares the performance of the three different cell programming methods. The SLC column shows the performance data for a pure SLC flash memory; the MLC_LSB column gives the performance data when only the LSB pages are used; and the MLC_BOTH column gives the data when both the LSB and MSB pages are used. The access times for page reads and writes, and for block erase operations, were measured using Samsung's KFXXGH6X4M flash memory [4] at the device driver interface level. As shown in Table 1, there are no significant performance differences between the page read and block erase operations of the three programming methods. However, the write performance is significantly improved with MLC_LSB, and approaches that of SLC.

This improvement in the write performance under MLC_LSB is the main motivation for FlexFS. Our primary goal is to improve the write performance of MLC flash memory using the MLC_LSB method, while maintaining the capacity of MLC flash memory using the MLC_BOTH method.

3 Overview of the FlexFS File System

We will now describe the overall architecture of the proposed FlexFS system. FlexFS is based on the JFFS2 file system [5], and hence the overall architecture is very similar to that of JFFS2, except for some features required to manage heterogeneous cells and to exploit flexible programming.
Therefore, in this section, we focus on how FlexFS deals with the different types of cell. We also introduce a baseline approach to exploiting flexible cell programming, in order to illustrate the need for the better policies that will be introduced in detail in the following section.

[Figure 3: The layout of flash blocks in FlexFS]
[Figure 4: Steps in data migration]

3.1 Design Overview

In order to manage heterogeneous cells efficiently, FlexFS logically divides the flash memory medium into an SLC region, composed of SLC blocks, and an MLC region, consisting of MLC blocks. If a block does not contain any data, it is called a free block. In FlexFS, a free block is neither an SLC block nor an MLC block; its type is only determined when data is written into it.

Figure 3 shows the layout of flash memory blocks in FlexFS. We assume that the number of pages in a block is 128, and that the page size is 4 KB. (These values will be used throughout the rest of this paper.) When a write request arrives, FlexFS determines the type of region to which the data is to be written, and then stores the data temporarily in an appropriate write buffer. This temporary buffering is necessary because the unit of I/O operations in flash memory is a single page. The write buffer therefore stores the incoming data until there is at least a page size of data (i.e., 4 KB), which can be transferred to flash memory. To ensure data reliability, if there is an explicit flush command from the operating system, all the pending data is immediately written to flash memory. In FlexFS, separate write buffers are used for the SLC and MLC regions.

FlexFS manages flash memory in a similar fashion to other log-structured file systems [5, 6, 7], except that two log blocks (one for the SLC region and another for the MLC region) are reserved for writing. When data is evicted from the write buffer to flash memory, FlexFS writes it sequentially, from the first page to the last page of the corresponding region's log block. MLC programming is used to write data to the MLC block, and SLC programming is used to write to the SLC block. If existing data is updated, the old version of the data is first invalidated, and the new data is appended to the free space of a log block. The space used by this invalid data is later reclaimed by the garbage collector (Section 4.3).

After all the free pages in the current log block have been exhausted, a new log block is allocated from the free blocks. However, if there is not enough free space to store the data, the data migrator triggers a data migration (Section 4.1.1) to create more free space. This expands the effective capacity of flash memory by moving data from the SLC region to the MLC region. Figure 4 illustrates the steps in a data migration. In this example, there are initially two SLC blocks and one free block, as shown in Figure 4(a). We assume that all the pages in the two SLC blocks contain valid data. During the data migration, the free block is converted into an MLC block, and the 128 pages in the two SLC blocks are copied to this MLC block. The two SLC blocks are then erased, making them free blocks. This migration frees up one block, doubling the remaining capacity of the flash memory, as shown in Figure 4(c).
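The migration step of Figure 4 amounts to a short copy-and-erase loop: program the 64 + 64 pages of two full SLC blocks into one free block using MLC programming, then erase both SLC blocks. The sketch below illustrates this under assumed driver hooks (flash_read_page, flash_program_page, and flash_erase_block are hypothetical names standing in for the device driver interface):

```c
#include <stdint.h>

#define PAGE_SIZE   4096U
#define MLC_PAGES   128U            /* pages in an MLC block        */
#define SLC_PAGES   (MLC_PAGES / 2) /* usable pages in an SLC block */

/* Hypothetical driver hooks standing in for the MTD-level interface. */
extern void flash_read_page(uint32_t blk, uint32_t page, uint8_t *buf);
extern void flash_program_page(uint32_t blk, uint32_t page,
                               const uint8_t *buf, int slc_mode);
extern void flash_erase_block(uint32_t blk);

/* One migration step (Figure 4): copy the 64 + 64 valid pages of two
 * full SLC blocks into a free block programmed in MLC mode, then erase
 * the SLC blocks so that both return to the free-block pool. */
static void migrate_two_slc_blocks(uint32_t slc_a, uint32_t slc_b,
                                   uint32_t free_blk)
{
    uint8_t buf[PAGE_SIZE];     /* stack buffer, for brevity only */
    uint32_t i;

    for (i = 0; i < SLC_PAGES; i++) {
        flash_read_page(slc_a, i, buf);
        flash_program_page(free_blk, i, buf, 0 /* MLC programming */);
    }
    for (i = 0; i < SLC_PAGES; i++) {
        flash_read_page(slc_b, i, buf);
        flash_program_page(free_blk, SLC_PAGES + i, buf, 0);
    }
    flash_erase_block(slc_a);
    flash_erase_block(slc_b);   /* net gain: one free block (Figure 4(c)) */
}
```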
When a read request arrives, FlexFS first checks whether the write buffers contain the requested data. If so, the data in the write buffer is transferred to the page cache. Otherwise, FlexFS searches an inode cache, which is kept in main memory, to find the physical address of the requested file data. The inode cache maintains the inode numbers and physical locations of the data that belong to each inode. If the physical address of the required data is found, FlexFS can read the data from that address, regardless of the type of block in which the data is stored.

3.2 Baseline Approach and Its Problems

The major objective of FlexFS is to support both high performance and high capacity in MLC flash memory. A simplistic solution, which we call the baseline approach, is first to write as much data as possible into SLC blocks to maximize the I/O performance. When there are no more SLC blocks available, the baseline approach initiates a data migration so that more space becomes available for subsequent write requests, thereby maximizing the capacity of the flash memory. This simple approach has two serious drawbacks.
First, if the amount of data stored on the flash memory approaches half of its maximum capacity, almost all the free blocks are exhausted. This is because the capacity of an SLC block is half that of an MLC block. At this point, a data migration has to be triggered to free some blocks before the requested data can be written, and this reduces the overall I/O performance significantly. To address this problem, we introduce techniques to reduce the migration penalty, or to hide it from users.

Second, the baseline approach seriously degrades the lifetime of MLC flash memory. Each block of NAND flash memory has a finite number of erase cycles before it becomes unusable. The baseline approach tends to increase the number of erase operations because of the excessive data migration. In the worst case, the number of erasures could be three times more than in conventional flash file systems. We solve this problem by controlling the degree of migration overhead, with the aim of meeting a given lifetime requirement.

4 Design and Implementation of FlexFS

4.1 Reducing the Migration Overhead

To reduce or hide the overhead associated with data migrations, we introduce three techniques: background migration, dynamic allocation, and locality-aware data management. The background migration technique exploits the times when the system is idle to hide the data migration overhead. This technique is effective for many mobile embedded systems (e.g., mobile phones) which have long idle times. The dynamic allocation technique, on the other hand, is aimed at systems with less idle time. By redirecting part of the incoming data into the MLC region, depending on the idleness of the system, it reduces the amount of data written into the SLC region, which in turn reduces the data migration overhead. The third technique, locality-aware data management, exploits the locality of I/O accesses to improve the efficiency of data migration. We will now look at these three techniques in more detail.

4.1.1 Background Migration Technique

[Figure 5: Overview of the background migration]

Figure 5 shows the overall process of the background migration. In this figure, the X-axis shows time and the Y-axis gives the type of job being performed by the file system. A foreground job represents I/O requests issued by applications or the operating system. Tbusy is a time interval during which the file system is busy processing foreground jobs, and Tidle is an idle interval. During this idle time the background migrator can move data from the SLC region to the MLC region, thus freeing many blocks. These free blocks can then be used as SLC blocks to store data, and so we can avoid a compulsory data migration if there is sufficient idle time.

In designing the background migration technique, there are two important issues. First, it is important to minimize the delay in response time, Tdelay, inflicted on foreground tasks by the background migration. For example, in Figure 5, an I/O request arrives at t1, but it cannot proceed until t2 because of interference from the background migration, so Tdelay is t2 - t1. To reduce this delay, the data migrator monitors the I/O subsystem, and suspends the background migration process if there is an I/O request. Since the unit of a data migration is a single page, the maximum delay in response time will theoretically be less than the time required to move a page from SLC to MLC (about 1,403 µs). In addition, we design the background migrator so that it does not utilize all the available idle time. Instead, it periodically invokes a data migration at a predefined triggering interval, Ttrig. If Ttrig is larger than the time required to move a single page, FlexFS reduces the probability that a foreground job will be issued while a data migration is running, thus further reducing Tdelay.

The second issue is when to initiate a background migration. Our approach is based on a threshold: if the duration of the idle period is longer than a specific threshold value, Twait, then the background migrator is triggered. This kind of problem has been extensively studied in the dynamic power management (DPM) of hard disk drives [8], which puts a disk into a low-power state after a certain idle time in order to save energy. However, the transition to a low-power state has to be made carefully, because it introduces a large performance penalty. Fortunately, because Tdelay is quite short, more aggressive transitioning is possible in our background migration technique, allowing Twait to be set to a small value.
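A minimal sketch of the background migrator's control loop is shown below, under hypothetical platform hooks (io_idle_duration_us, io_request_pending, migrate_one_page), since the paper does not spell out the implementation. It captures the two decisions described above: migration starts only once the idle period exceeds Twait, and pages are then moved one per Ttrig interval, so a foreground request that does arrive waits at most about one page-move time.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values used in the experiments (Section 5). */
static const uint64_t T_WAIT_US = 1000000;  /* Twait: 1 s idle threshold   */
static const uint64_t T_TRIG_US = 15000;    /* Ttrig: 15 ms trigger period */

/* Hypothetical platform hooks. */
extern uint64_t io_idle_duration_us(void);  /* length of current idle period */
extern bool     io_request_pending(void);   /* is foreground I/O waiting?    */
extern bool     slc_has_migratable_data(void);
extern void     migrate_one_page(void);     /* move one page, SLC to MLC     */
extern void     sleep_us(uint64_t us);

static void background_migrator(void)
{
    for (;;) {
        /* Trigger only after the system has been idle longer than Twait
         * and there is actually something to migrate. */
        if (io_idle_duration_us() < T_WAIT_US || !slc_has_migratable_data()) {
            sleep_us(T_TRIG_US);
            continue;
        }
        /* Suspend if a foreground request arrives: since the migration
         * unit is one page, the worst-case added delay stays bounded by
         * a single page move (about 1,403 us). */
        if (io_request_pending()) {
            sleep_us(T_TRIG_US);
            continue;
        }
        migrate_one_page();
        sleep_us(T_TRIG_US);   /* pace migrations at the Ttrig interval */
    }
}
```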
4.1.2 Dynamic Allocation Technique

The background migration technique works well when a system has sufficient idle time. Otherwise, the migration overhead cannot be avoided, but it can be ameliorated by writing part of the incoming data into the MLC region, so as to reduce the amount of data to be moved by the background migrator. Although this approach results in lower I/O performance than SLC flash memory, it can prevent the significant performance degradation caused by a compulsory data migration.

The dynamic allocator determines the amount of data that will be written into the SLC region. Intuitively, it is clear that this must depend on how much idle time there is in a given system. Since the amount of idle time changes dynamically with user activities, we need to predict it carefully. Figure 6 illustrates the basic idea of our idle time prediction approach, which is based on previous work [9]. In this figure, each time window represents the period during which Np pages are written into flash memory. The dynamic allocator stores the measured idle times for several previous time windows, and uses them to predict the idle time, $T^{pred}_{idle}$, for the next time window. The value of $T^{pred}_{idle}$ is a weighted average of the idle times for the latest 10 time windows; the three most recent windows are given a higher weight, to take the recency of the I/O pattern into account.

[Figure 6: Our approach to idle time prediction]

If we know the value of $T^{pred}_{idle}$, we can use it to calculate an allocation ratio, denoted by α, which determines how many pages will be written to the SLC region in the next time window. The value of α can be expressed as follows:

$$\alpha = \begin{cases} 1 & \text{if } T^{pred}_{idle} \ge T_{mig} \\ T^{pred}_{idle} / T_{mig} & \text{if } T^{pred}_{idle} < T_{mig}, \end{cases} \tag{1}$$

$$\text{where } T_{mig} = N_p \cdot (T_{trig} + T^{SLC}_{erase} / S^{SLC}_p), \tag{2}$$

and $T^{SLC}_{erase}$ is the time required to erase an SLC flash block, which contains $S^{SLC}_p$ pages. As mentioned in Section 4.1.1, Ttrig is the time interval required for one page to migrate from the SLC region to the MLC region. Therefore, Tmig is the migration time, which includes the time taken to move all Np pages to the MLC region and the time taken to erase all the used SLC blocks. If $T^{pred}_{idle} \ge T_{mig}$, there is sufficient idle time for data migrations, and thus α = 1. Otherwise, the value of α should be reduced so that less data is written into the SLC region, as expressed by Eq. (1).

Once the value of α has been determined, the dynamic allocator distributes the incoming data across the different flash regions depending on α. The number of pages to be written into the SLC region, $N^{SLC}_p$, and the number of pages destined for the MLC region, $N^{MLC}_p$, can therefore be expressed as follows:

$$N^{SLC}_p = N_p \cdot \alpha, \qquad N^{MLC}_p = N_p \cdot (1 - \alpha). \tag{3}$$

Finally, after writing all Np pages, the dynamic allocator calculates a new value of α for the next Np pages.
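Eqs. (1)-(3) translate almost directly into code. The sketch below computes Tmig, α, and the per-region page counts for one time window; the function names are illustrative, and the Np and SLC block-size values are those assumed in this paper.

```c
#include <stdint.h>

#define NP         1024U  /* Np: pages per time window (Section 5) */
#define SLC_PAGES  64U    /* pages usable in one SLC block         */

/* Eq. (2): time needed to migrate Np pages and erase the SLC blocks
 * they occupied; all times in microseconds. */
static double t_mig_us(double t_trig_us, double t_erase_slc_us)
{
    return (double)NP * (t_trig_us + t_erase_slc_us / (double)SLC_PAGES);
}

/* Eq. (1): allocation ratio for the next window, given the predicted
 * idle time (a weighted average over the last 10 windows). */
static double alloc_ratio(double t_idle_pred_us, double t_mig)
{
    return (t_idle_pred_us >= t_mig) ? 1.0 : t_idle_pred_us / t_mig;
}

/* Eq. (3): split the next Np pages between the SLC and MLC regions. */
static void split_pages(double alpha, uint32_t *n_slc, uint32_t *n_mlc)
{
    *n_slc = (uint32_t)((double)NP * alpha);
    *n_mlc = NP - *n_slc;
}
```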
4.1.3 Locality-aware Data Management Technique

FlexFS is based on a log-structured file system, and it therefore uses an out-of-place update policy. Under this policy, hot data with a high update frequency generates more outdated versions of itself than cold data, which is updated infrequently. Our locality-aware data management technique exploits this characteristic to increase the efficiency of data migration.

[Figure 7: A comparison of the locality-unaware and locality-aware approaches]

Figure 7 compares the locality-aware and the locality-unaware approaches. We assume that, at time t1, three cold pages, p0, p2, and p3, and one hot page, p1, exist in the SLC region. Between t1 and t2 there are some idle periods, and new pages p1, p4, p5, and p6 are written into the SLC region. Note that p1 is rewritten because it contains hot data. In the case of the locality-unaware approach shown in Figure 7(a), we assume that pages p0, p1, and p2 are moved to the MLC region during idle time, but that p3 cannot be moved because there is not enough idle time. Therefore, at time t2, there are five pages in the SLC region. If the value of Np is 4, the value of α should decrease so that data does not accumulate in the SLC region. However, if we consider the locality of the data, we can move p3 instead of p1 during the idle periods, as shown in Figure 7(b). Since p1 has high locality, it is highly likely to be invalidated by t2. Therefore, an unnecessary page migration for p1 can be avoided, and only four pages remain in the SLC region. In this case, we need not reduce the value of α, and more data can be written into the SLC region.

Using this observation, Eq. (2) can be rewritten as follows:

$$T_{mig} = (N_p - N^{hot}_p) \cdot (T_{trig} + T^{SLC}_{erase} / S^{SLC}_p), \tag{4}$$

where $N^{hot}_p$ is the number of page writes for hot pages stored in the SLC region.
For instance, in the above example, $N^{hot}_p$ is 1. Because we only need to move $N_p - N^{hot}_p$ pages into the MLC region, the value of Tmig can be reduced, allowing an increase in α for the same amount of idle time.

To exploit the locality of I/O references, there are two questions to answer. The first is how to determine the locality of a given piece of data. To know the hotness of data, FlexFS uses a 2Q-based locality detection technique [10], which is widely used in the Linux operating system. This technique maintains a hot queue and a cold queue, each containing a number of nodes. Each node contains the inode number of a file. Nodes corresponding to frequently accessed files are stored in the hot queue, and the cold queue contains nodes for infrequently accessed files. The locality of a given file can easily be determined from the queue in which the corresponding node is located.
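A minimal sketch of such a 2Q-style hotness test is given below. It is an assumption-level illustration only (fixed-size arrays instead of kernel list structures, and a simple promote-on-reaccess rule); the actual 2Q algorithm [10] and the queue sizes used by FlexFS are more involved.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_LEN 64  /* illustrative capacity; real sizes unspecified */

/* Two queues of inode numbers: frequently accessed files live on the
 * hot queue, infrequently accessed files on the cold queue. */
struct two_q {
    uint32_t hot[QUEUE_LEN], cold[QUEUE_LEN];
    int nhot, ncold;
};

static int find(const uint32_t *q, int n, uint32_t ino)
{
    for (int i = 0; i < n; i++)
        if (q[i] == ino)
            return i;
    return -1;
}

static void remove_at(uint32_t *q, int *n, int i)
{
    memmove(&q[i], &q[i + 1], (size_t)(*n - i - 1) * sizeof(*q));
    (*n)--;
}

static void push(uint32_t *q, int *n, uint32_t ino)
{
    if (*n == QUEUE_LEN)
        remove_at(q, n, 0);     /* evict the oldest entry */
    q[(*n)++] = ino;
}

/* A file is hot iff its inode number is on the hot queue. */
static bool is_hot(const struct two_q *q, uint32_t ino)
{
    return find(q->hot, q->nhot, ino) >= 0;
}

/* Simplified promote-on-reaccess: a first access puts an inode on the
 * cold queue; touching it again while cold promotes it to hot. */
static void on_access(struct two_q *q, uint32_t ino)
{
    int i;

    if (is_hot(q, ino))
        return;
    i = find(q->cold, q->ncold, ino);
    if (i >= 0) {
        remove_at(q->cold, &q->ncold, i);
        push(q->hot, &q->nhot, ino);
    } else {
        push(q->cold, &q->ncold, ino);
    }
}
```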
Second, the data migrator and the dynamic allocator should be modified so that they take the locality of data into account. The data migrator tries to select an SLC block containing cold data as a victim, and an SLC block containing hot data is not selected as a victim unless very few free blocks remain. Since a single block can contain multiple files of different hotness, FlexFS calculates the average hotness of each block as the criterion, and chooses a block whose hotness is lower than the middle value. It might seem better to choose a block containing only cold pages as a victim block, because even a few bytes of hot data in a victim result in useless data migrations for that hot data. However, this approach delays the reclaiming of free blocks, because a block will not be chosen as a victim even if it stores only a small amount of hot data.

The dynamic allocator tries to write as much hot data to the SLC region as possible, in order to increase the value of $N^{hot}_p$. The dynamic allocator also calculates a new value of α after Np pages have been written, and for this purpose the value of $N^{hot}_p$ for the next time window needs to be known. Similar to the approach used in our idle time prediction, we count how many hot pages were written into the SLC region during each of the previous 10 time windows, and use their average as $N^{hot}_p$ for the next time window. The value of $N^{hot}_p$ for each window can easily be measured using an update variable, which is incremented whenever a hot page is sent to the SLC region.

4.2 Improving the Endurance

To enhance the endurance of flash memory, many flash file systems adopt a special software technique called wear-leveling. In most existing wear-leveling techniques, the primary aim is to distribute the erase cycles evenly across the flash medium [11, 12]. FlexFS uses this approach, but it also needs more specialized wear management to cope with frequent data migrations.

With FlexFS, each block undergoes more erase cycles, because a lot of data is temporarily written to the SLC region, where it waits to be moved to the MLC region during idle time. To improve the endurance and prolong the lifetime, it would be better to write data to the MLC region directly, but this reduces the overall performance. There is therefore another important trade-off, between lifetime and performance. To deal with this trade-off efficiently, we propose a novel wear management technique which controls the amount of data written into the SLC region depending on a given storage lifetime.

4.2.1 Explicit Endurance Metric

We start by introducing a new endurance metric which is designed to express the trade-off between lifetime and performance. In general, the maximum lifetime, Lmax, of a flash memory depends on its capacity and the amount of data written to it, and is expressed as follows:

$$L_{max} = \frac{C_{total} \cdot E_{cycles}}{WR}, \tag{5}$$

where Ctotal is the size of the flash memory and Ecycles is the number of erase cycles allowed for each block. The writing rate WR indicates the amount of data written in unit time (e.g., per day). This formulation of Lmax is used by many flash memory manufacturers [13] because it clearly shows the lifetime of a given flash application under various environments.

Unfortunately, Lmax is not appropriate for handling the trade-off between lifetime and performance, because it expresses the expected lifetime, not a constraint to be met in order to improve the endurance of flash memory. Instead, we use an explicit minimum lifespan, Lmin, which represents the minimum guaranteed lifetime that is ensured by the file system. Since FlexFS can control the writing rate WR by adjusting the amount of data written into the SLC region, this new endurance metric can be expressed as follows:

$$\text{Control } WR \text{ by changing a wear index } \delta, \quad \text{subject to} \quad L_{min} \approx \frac{C_{total} \cdot E_{cycles}}{WR}, \tag{6}$$

where δ is called the wear index. In FlexFS, δ is proportional to WR, and therefore δ can be used to control the value of WR. If δ is high, FlexFS writes a lot of data to the SLC region, and this increases WR because of data migrations; if δ is low, the writing rate is reduced. Our wear management algorithm controls δ so that the lifetime specified by Lmin is satisfied.
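As a worked example of Eq. (5), using the parameters assumed later in Section 5.2.2 (a 512 MB flash memory, 10,000 erase cycles per block, and a 3-year minimum lifetime), the average writing rate that the wear manager must stay below is roughly

$$WR \le \frac{C_{total} \cdot E_{cycles}}{L_{min}} = \frac{512\ \text{MB} \times 10{,}000}{3 \times 365\ \text{days}} \approx 4.7\ \text{GB/day},$$

where the total includes the extra writes generated by SLC-to-MLC migrations.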
4.2.2 Assigning a Writing Budget

The proposed wear management algorithm divides the given lifetime Lmin into n time windows (w0, w1, ..., wn-2, wn-1), where the duration of each window is given as Ts. The writing rate WR(wi) for each time window wi can then be expressed as WB(wi)/Ts, where the writing budget WB(wi) is the amount of data assigned to the time window wi.

Since Ts is fixed, the assignment of a writing budget to each window significantly impacts the overall performance, as well as the rate at which the flash memory wears out. For example, if too large a writing budget is assigned to each window, the number of erase cycles for each block increases markedly; on the other hand, if too small a writing budget is allocated, the overall performance is lowered. Therefore, we determine the writing budget for the window wi as follows:

$$WB(t_i) = \frac{(C_{total} \cdot E_{cycles}) - W(t_i)}{n - (t_i / T_s)}, \tag{7}$$

where ti is the time at the start of window wi, and W(ti) indicates the amount of the writing budget that has actually been used by ti. The remaining writing budget is $(C_{total} \cdot E_{cycles}) - W(t_i)$, and the number of remaining windows is $n - (t_i / T_s)$. The remaining writing budget is therefore shared equally between the remaining windows. The writing budget is recalculated at the beginning of every time window, so as to take changes in the workload pattern into consideration.

4.2.3 Determining the Wear Index

Once the writing budget has been assigned to a time window, the wear manager adjusts the wear index, δ, so that the amount of the writing budget actually used approximates the assigned writing budget. The wear index is used by the dynamic allocator, in a similar way to Eq. (3), to distribute the incoming data across the two regions.

[Figure 8: How the number of blocks used depends on δ]

Figure 8 shows how the number of blocks used depends on the value of δ. The sizes of an SLC block and an MLC block are 256 KB and 512 KB, respectively. Suppose that 512 KB of data is written, and that the data migrator moves this data from the SLC region to the MLC region. If δ is 1.0, as shown in Figure 8(a), 512 KB is written to two SLC blocks, and the data migrator then requires one MLC block to store the data from the two SLC blocks. In this case, the total writing budget used is 1.5 MB, because three blocks have been used for writing. If δ is 0.5, as shown in Figure 8(b), 1 MB of the writing budget is used, requiring one SLC block and one MLC block. Figure 8(c) shows the case where δ is 0.0; only 512 KB is used, because there is no data to be moved.

This simple example suggests that we can generalize the relationship between the wear index, the amount of incoming data, and the amount of the writing budget actually used, as follows:

$$IW(w_i) \cdot (2 \cdot \delta + 1) = OW(w_i), \tag{8}$$

where IW(wi) is the amount of data that arrives during the window wi, and OW(wi) is the amount of the writing budget to be used, depending on δ. In the example of Figure 8(b), IW(wi) is 512 KB and δ is 0.5, and thus OW(wi) is 1 MB. Here, IW(wi) · (2 · δ) is the amount of the writing budget used by the SLC region, and IW(wi) is the amount of data to be written to the MLC region. The wear index should be chosen so that OW(wi) = WB(ti), and can therefore be calculated as follows:

$$\delta = \frac{WB(t_i) - IW(w_i)}{2 \cdot IW(w_i)}. \tag{9}$$

The value of δ is calculated at the beginning of wi, when the exact value of IW(wi) is still unknown; IW(wi) is therefore estimated as the average of the previous three time windows. If WB(ti) < IW(wi), then δ is set to 0, and all the data will be written to the MLC region. If IW(wi) is always larger than WB(ti), it may be hard to guarantee Lmin; however, by writing all the data to the MLC region, FlexFS can achieve a lifetime close to that of a pure MLC flash memory.

A newly determined value of δ is only used by the dynamic allocator if δ < α. Therefore, the wear management algorithm only takes effect when it seems that the specified lifetime will not otherwise be achieved.
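The computations of Eqs. (7) and (9) reduce to a few arithmetic steps with clamping. A sketch follows, with illustrative names; the caller is assumed to estimate IW(wi) as the mean of the previous three windows, as described above, and the cap at 1.0 reflects the fact that δ is used like the fraction α.

```c
#include <stdint.h>

/* Eq. (7): share the remaining writing budget equally among the
 * remaining windows. Sizes in bytes; n is the number of windows and
 * win the index of the window that is about to start. */
static double writing_budget(double c_total, double e_cycles,
                             double w_used, uint32_t n, uint32_t win)
{
    return (c_total * e_cycles - w_used) / (double)(n - win);
}

/* Eq. (9): wear index for the coming window. iw_est is the estimate
 * of the incoming data, the mean of the previous three windows. The
 * result is clamped to [0, 1]: below 0 everything goes to MLC, and
 * values above 1 are capped because delta is used as a fraction,
 * like alpha (an illustrative choice). */
static double wear_index(double wb, double iw_est)
{
    double delta = (wb - iw_est) / (2.0 * iw_est);

    if (delta < 0.0) delta = 0.0;
    if (delta > 1.0) delta = 1.0;
    return delta;
}
```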
4.3 Garbage Collection

The data migrator can create free blocks by moving data from the SLC region to the MLC region, but it cannot reclaim the space used by invalid pages in the MLC region. The garbage collector in FlexFS reclaims these invalid pages by selecting a victim block in the MLC region, and then copying the valid pages of the victim into a different MLC block. The garbage collector selects a block with many invalid pages as a victim, to reduce the number of additional I/O operations required, and it also utilizes idle time to hide this overhead from users. Note that it is never necessary to choose a victim in the SLC region: if cold data is stored in SLC blocks, it will be moved to the MLC region by the data migrator, and hot data need not be moved because it will soon be invalidated.
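A compact sketch of this victim-selection rule follows: scan the MLC blocks and choose the one with the most invalid pages, since the copying cost of a collection is proportional to the victim's valid pages. The block metadata structure is illustrative, not the FlexFS on-flash format.

```c
#include <stdint.h>

struct mlc_block_info {
    uint32_t valid_pages;    /* pages holding live data      */
    uint32_t invalid_pages;  /* pages invalidated by updates */
};

/* Return the index of the MLC block with the most invalid pages, or
 * -1 if nothing is reclaimable. Copying cost is proportional to the
 * victim's valid pages, so this choice minimizes the extra I/O. */
static int pick_gc_victim(const struct mlc_block_info *blk, int nblocks)
{
    int victim = -1;
    uint32_t best = 0;

    for (int i = 0; i < nblocks; i++) {
        if (blk[i].invalid_pages > best) {
            best = blk[i].invalid_pages;
            victim = i;
        }
    }
    return victim;
}
```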
5 Experimental Results

In order to evaluate the efficiency of the proposed techniques on a real platform, we implemented FlexFS on the Linux 2.6.25.14 kernel. Our hardware system was the custom flash development board shown in Figure 9, which is based on TI's OMAP2420 processor (running at 400 MHz) with 64 MB of SDRAM. The experiments were performed on Samsung's KFXXGH6X4M-series 1-GB flash memory [4], which is connected to one of the NAND sockets shown in Figure 9. The size of each page was 4 KB, and there were 128 pages in a block.

[Figure 9: A snapshot of the flash development board used for experiments]

To evaluate the FlexFS file system objectively, we used two types of workload. In Section 5.1, we present experimental results from synthetic workloads. In Section 5.2, we evaluate FlexFS using actual I/O traces collected from executions of real mobile applications.

5.1 Experiments with Synthetic Workloads

5.1.1 Overall Throughput

Table 2: Summary of the schemes used in the throughput evaluation

  Scheme                  Baseline    BM    DA    LA
  Background migration    ×           ○     ○     ○
  Dynamic allocation      ×           ×     ○     ○
  Locality-aware          ×           ×     ×     ○

Table 2 summarizes the configurations of the four schemes that we used for evaluating the throughput of FlexFS. In the Baseline scheme, all the data is first written into SLC blocks, and is then compulsorily moved to MLC blocks only when fewer than five free blocks remain. The three other schemes, BM, DA, and LA, use the proposed techniques to reduce the overhead of data migrations. For example, the BM scheme uses only the background migration technique, while the LA scheme uses all three proposed techniques. In all the experiments, Twait was set to 1 second, Np to 1024 pages, and Ttrig to 15 ms. To focus on the performance implications of each scheme, the wear management scheme was disabled.

All the schemes were evaluated on three synthetic benchmark programs: Idle, Busy, and Locality. These were designed to characterize several important properties, such as the idleness of the system and the locality of I/O references, which significantly affect the performance of FlexFS. The Idle benchmark mimics the I/O access patterns that occur when sufficient idle time is available in the system; for this purpose, it writes about 4 MB of data (including metadata) to flash memory every 25 seconds. The Busy benchmark writes 4 MB of data to flash memory every 10 seconds, which allows the I/O subsystem only short idle times. The Locality benchmark is similar to Busy, except that about 25% of the data is likely to be rewritten to the same locations, so as to simulate the locality of I/O references that occurs in many applications. All the benchmarks issued write requests until about 95% of the total MLC capacity had been used. To speed up the evaluation, we limited the capacity of the flash memory to 64 MB using the MTD partition manager [14].

[Figure 10: Performance comparison of Baseline and BM with the Idle benchmark]

Figure 10 compares the throughput of Baseline and BM under the Idle benchmark. The throughput of Baseline falls to close to 100 KB/s when the utilization approaches 50%, because before writing the incoming data, the data migrator must create enough free space in the SLC region, incurring a noticeable performance degradation. BM, in contrast, achieves the same performance as SLC flash memory until the utilization exceeds 94%. Since the Idle benchmark allows FlexFS a lot of idle time (about 93.6% of the total execution time), a sufficient number of free blocks can be reclaimed before new write requests arrive and require them. When the utilization reaches 94%, the performance of BM drops significantly, because almost all of the available blocks are occupied by valid data and fewer than five free blocks remain available.
[Figure 11: Performance comparison of BM and DA with the Busy benchmark]
[Figure 12: Performance comparison of DA and LA with the Locality benchmark]

Figure 11 compares the performance of BM and DA while running the Busy benchmark. In this evaluation, BM shows better throughput than DA when the utilization is less than 67%, but its performance then declines quickly, because the idle time is insufficient for BM to generate enough free blocks to write to the SLC region. DA exhibits a stable write performance, regardless of the utilization of the flash memory. At the beginning of the run, the value of α is initially set to 1.0, so that all the incoming data is written to the SLC region. However, since insufficient idle time is available, the dynamic allocator adjusts the value of α to 0.5. DA then writes some of the arriving data directly to the MLC region, avoiding a significant drop in performance.

Figure 12 shows the performance benefit of the locality-aware approach under the Locality benchmark. Note that Locality has the same amount of idle time as the Busy benchmark. LA achieves 7.9% more write performance than DA by exploiting the locality of I/O references: the overall write throughput of LA is 2.66 MB/s, while DA gives 2.45 MB/s. The LA scheme also starts with an α value of 1.0, which is reduced to 0.5 because the idle time is insufficient. However, after detecting a high degree of locality in the I/O references, α is partially increased to 0.7 by preventing useless migrations of hot data, and more data can then be written into the SLC region.

5.1.2 Response Time

Although the background migration improves the write throughput of FlexFS, it could incur a substantial increase in response time, because I/O requests can be issued while the background migrator is running. In this subsection, to investigate the impact of the background migration on the response time, we performed evaluations with the following scenario.

We first wrote 30 MB of bulk data in order to trigger the background migrator. FlexFS was modified so that all the incoming data was written into the SLC region, regardless of the amount of idle time. After writing this data, we made 10 page write requests. The idle time between two consecutive write requests was generated using a pseudo-random number generator, but was adjusted to be at least larger than Twait, so that all the write requests were issued after the background migrator had been initiated. To collect accurate and reliable results, we repeated this scenario more than 30 times.

We performed our evaluation on the following four configurations. In order to determine the effect of the idle time utilization, we measured the response time while varying the idle time utilization: the configurations U100, U50, and U10 represent FlexFS utilizing 100%, 50%, and 10% of the total idle time, respectively. The idle time utilization can easily be controlled through the value of Ttrig; for example, the time required to move a single page from SLC to MLC is about 1.5 ms, so a utilization of 10% corresponds to a Ttrig of 15 ms. To clearly show the performance penalty caused by the background migration, we also measured the response time with the background migration disabled, which is denoted as OPT. The migration suspension mentioned in Section 4.1.1 was enabled for all the configurations.

Figure 13 shows the cumulative distribution function of the response time for the four configurations. As expected, OPT shows the best response time of all the configurations. However, about 10% of the total I/O requests require more than 2,000 µs.
[Figure 13: A comparison of response time delays for the different system configurations]
[Figure 14: The changes in the size of the written data and the δ value]

This response time delay is caused by the writing of metadata information. Although we wrote 4 KB of data into flash memory, the amount of data actually written was slightly larger than 4 KB because of the metadata overhead. Consequently, this results in additional page writes, incurring the delay in response time.

U10 exhibits a longer response time than OPT for about 10% of the total I/O requests, but it still shows a fairly good response time. On the other hand, the performance of U50 and U100 deteriorates significantly, because they utilize a lot of idle time for data migrations, increasing the probability of I/O requests being issued while the background migrator is working. In particular, when the two tasks (the foreground task and the background migration task) compete for a single CPU resource, the performance penalty caused by the resource contention is more significant than we expected.

5.1.3 Endurance

We evaluated our wear management scheme using a workload scenario in which the write patterns change over a relatively long time. We set the size of the flash memory, Ctotal, to 120 MB, and the number of erase cycles allowed for each block, Ecycles, to 10, allowing a maximum of 1.2 GB to be written to the flash memory. We set the minimum lifetime, Lmin, to 4,000 seconds, and our wear management scheme was invoked every 400 seconds. There are thus 10 time windows, w0, ..., w9, and the duration of each, Ts, is 400 seconds. To focus our evaluation on the effect of the wear management scheme on performance, the system was given enough idle time to write all the data to the SLC region if the lifetime of the flash memory were not a consideration.

Table 3: The amount of data (MB) arriving in each window during the evaluation of the wear management policy

  Time window    w0    w1    w2    w3    w4    w5    w6    w7    w8    w9
  Size (MB)      40    40    40    80    80    20    20    40    40    40

Table 3 shows the amount of data (MB) arriving in each window, wi, and Figure 14 shows how the proposed wear management scheme adapts to the changing write sizes while satisfying the minimum lifetime. Initially, FlexFS allocates a writing budget of 120 MB (= 1.2 GB / 10) to each time window. This budget is large enough to allow all the incoming data to be written to the SLC region if no more than 40 MB of data arrives during each window. Therefore, during the first three windows, the value of δ is set to 1.0. During w3 and w4, however, about 160 MB of data arrives, and FlexFS reduces δ to cut the migration cost. Because only 40 MB of data arrives during w5 and w6, FlexFS can then increase δ, giving a larger writing budget to the remaining windows. We measured the amount of data written to flash memory, including the extra writes caused by migrations from the SLC region to the MLC region: FlexFS writes about 1.2 GB of data to flash memory, thus achieving the specified minimum lifespan of 4,000 seconds.

We also counted the number of erase operations performed on each block while running FlexFS with and without the wear management scheme under the same workload scenario. The wear-leveling policy was disabled when the wear management scheme was not used. Figure 15 shows the distributions of block erase cycles, and Table 4 summarizes the results relevant to wear-leveling.

[Figure 15: Distributions of block erase cycles, (a) without and (b) with wear management]
Table 4: Summary of results relevant to wear-leveling

                         Avg. erase cycles    Std. dev.
  w/ wear management     9.23                 1.20
  w/o wear management    10.73                2.43

These results clearly indicate that, with the wear management scheme, FlexFS achieves a good wear characteristic: the maximum erase cycle count of each block is effectively limited to at most 10, and the block erase operations are evenly distributed across the flash memory medium.

5.2 Experiments with Mobile Workloads

5.2.1 Generating Mobile Workloads

In addition to the synthetic workloads discussed in Section 5.1, which were designed to evaluate one aspect of FlexFS at a time, we evaluated FlexFS using I/O traces collected from a real-world mobile platform to assess the performance of FlexFS with mobile applications.

To collect and replay I/O traces from real applications, we developed a custom mobile workload generation environment based on the Qtopia Phone Edition [15], which includes representative mobile applications such as PIMS, SMS, and media players. This environment includes three tools: a usage pattern generator, an I/O tracer, and a replayer. The usage pattern generator automatically executes mobile applications as if a user were actually interacting with them at runtime. The I/O tracer captures I/O system calls (e.g., fopen, fread, and fwrite) while running the usage pattern generator on the Qtopia platform, and stores the collected traces in a log file. The replayer uses this log file to replay the I/O requests on our development board. Note that this log file allows us to repeat the same usage patterns for different system configurations.

Table 5: Applications used for the evaluations

  Application     Description
  SMS             Send short messages
  Address book    Register / modify / remove addresses
  Memo            Write a short memo
  Game            Play a puzzle game
  MP3 player      Download 6 MP3 files (total 18 MB)
  Camera          Take 9 pictures (total 18 MB)

For the evaluation, we executed the mobile applications shown in Table 5 in our workload generation environment for 30 minutes. We followed a representative usage profile of mobile users reported in [16], except that more multimedia data was written in order to simulate a data downloading scenario. The trace includes 43,000 read and write requests; about 5.7 MB was read from flash memory and about 39 MB was written.

5.2.2 Evaluation Results

In order to find out whether FlexFS can achieve SLC-like performance, we evaluated the performance of two FlexFS configurations, FlexFS_MLC and FlexFS_SLC. FlexFS_MLC is the proposed FlexFS configuration, using both SLC and MLC programming, while FlexFS_SLC mimics SLC flash memory by using only SLC programming. To determine the performance benefits of FlexFS_MLC, we also evaluated the JFFS2 file system on the same hardware. In this subsection, we focus on the performance aspect only, since the capacity benefit of FlexFS_MLC is clear. For FlexFS_MLC, Ttrig was set to 15 ms, Np to 1024 pages, and Twait to 1 second. We assumed a total capacity of 512 MB, a maximum of 10,000 erase cycles per block, and a minimum lifetime of 3 years. The wear management policy was invoked every 10 minutes.

Table 6: A performance comparison of FlexFS_MLC and FlexFS_SLC under mobile workloads

                 Response time (µs)       Throughput (MB/s)
                 Read       Write         Write
  FlexFS_SLC     34         334           3.02
  FlexFS_MLC     37         345           2.93
  JFFS2          36         473           2.12

Table 6 compares the response time and the throughput of FlexFS_MLC, FlexFS_SLC, and JFFS2. The response time is an average over all the I/O requests in the trace file, while the throughput was measured when writing a large amount of data, such as MP3 files. Compared with JFFS2, FlexFS_MLC achieves a 28% smaller I/O response time and 28% higher I/O throughput. However, the performance difference between FlexFS_MLC and JFFS2 is noticeably smaller than the difference shown in Table 1, because of the computational overheads introduced by each file system. JFFS2, like FlexFS_MLC, requires a lot of processing time for managing internal data structures, such as block lists, metadata, and error-detecting codes, which narrows the performance gap between the two file systems.

The performance of FlexFS_MLC is very close to that of FlexFS_SLC. The response times of FlexFS_MLC are 10% and 3.2% slower for reads and writes, respectively, compared with FlexFS_SLC, and the I/O throughput of FlexFS_MLC is 3.4% lower than that of FlexFS_SLC. This high I/O performance of FlexFS_MLC can be attributed to the sufficiency of idle time in the trace, which allows FlexFS_MLC to write most incoming data into the SLC region, improving the overall I/O performance.
