Performance of the Greedy Garbage-Collection Scheme in Flash-Based Solid-State Drives

In flash-based solid-state drives (SSDs) and log-structured file systems, new data is written out-of-place, which over time exhausts the available free space. New free space is created by the garbage-collection process, which reclaims the space occupied by invalidated data. The write amplification, incurred because of the additional write operations performed by the garbage-collection mechanism, is a critical factor that negatively affects the lifetime and endurance of SSDs. A theoretical model is developed to evaluate the impact of the greedy garbage-collection mechanism on the performance of large storage systems. The system operation and behavior are comprehensively characterized for uniformly distributed random small user writes. Results of theoretical and practical importance are analytically derived and confirmed by means of simulation. Closed-form expressions are derived for both the number of relocated pages and the write amplification. The write amplification is analytically assessed for the key system parameters, i.e., the total system memory space, the proportion of the memory space occupied by valid user data, and the block size in terms of number of pages. Our results demonstrate that as the system occupancy increases, the write amplification increases. Furthermore, as the number of pages contained in a block increases, the write amplification increases and approaches an upper bound. The results also show that the number of free pages reclaimed by the greedy garbage-collection mechanism after each block recycling takes one of two successive values, which provides a quasi-deterministic performance guarantee.
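To make the quantities in the abstract concrete, the following is a minimal simulation sketch of greedy garbage collection under uniformly distributed random single-page user writes. It is not the report's analytical model or its simulator. The function name, parameter names, default values, and the policy of triggering garbage collection only when no free block remains are all assumptions made for illustration. Write amplification is computed here as (user writes + relocated pages) / user writes.

```python
import random


def simulate_greedy_gc(num_blocks=64, pages_per_block=16, occupancy=0.7,
                       user_writes=20000, warmup=20000, seed=0):
    """Toy greedy-GC simulation (illustrative assumptions throughout)."""
    if not 0 < occupancy < 1:
        raise ValueError("occupancy must lie strictly between 0 and 1")
    rng = random.Random(seed)
    num_logical = int(occupancy * num_blocks * pages_per_block)
    location = [-1] * num_logical                # logical page -> block index
    valid = [set() for _ in range(num_blocks)]   # valid logical pages per block
    used = [0] * num_blocks                      # pages programmed since erase
    free_blocks = list(range(1, num_blocks))
    active = 0                                   # block currently being written
    relocated = 0
    reclaimed = []                               # free pages gained per recycling

    def program(lpn):
        # Out-of-place write: invalidate the old copy, append to active block.
        old = location[lpn]
        if old >= 0:
            valid[old].discard(lpn)
        valid[active].add(lpn)
        location[lpn] = active
        used[active] += 1

    def ensure_space():
        nonlocal active, relocated
        if used[active] < pages_per_block:
            return
        if free_blocks:
            active = free_blocks.pop()
            return
        # Greedy choice: recycle the block holding the fewest valid pages.
        victim = min(range(num_blocks), key=lambda b: len(valid[b]))
        movers = list(valid[victim])
        valid[victim].clear()
        used[victim] = 0                         # erase the victim block
        active = victim
        for lpn in movers:                       # relocate still-valid pages
            program(lpn)
        relocated += len(movers)
        reclaimed.append(pages_per_block - len(movers))

    for lpn in range(num_logical):               # initial sequential fill
        ensure_space()
        program(lpn)
    for _ in range(warmup):                      # reach steady state
        ensure_space()
        program(rng.randrange(num_logical))
    relocated = 0                                # measure from here on
    reclaimed.clear()
    for _ in range(user_writes):
        ensure_space()
        program(rng.randrange(num_logical))

    return {
        "write_amplification": (user_writes + relocated) / user_writes,
        "relocated_pages": relocated,
        "reclaimed_values": sorted(set(reclaimed)),
    }


if __name__ == "__main__":
    for rho in (0.5, 0.7, 0.9):
        print(rho, simulate_greedy_gc(occupancy=rho))
```

Running the script for several occupancies gives a rough, small-scale view of how write amplification varies with occupancy. The "reclaimed_values" entry lists the distinct numbers of free pages gained per recycling, which can be compared informally against the two-successive-values result stated for large systems; a system this small need not reproduce it exactly.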

By: Ilias Iliadis

Published in: RZ3769 in 2010

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
