Reliability in Large-Capacity Bandwidth-Limited Storage Systems

As time goes by, storage systems are required to reliably store increasing amounts of data. Some earlier solutions, designed for relatively small systems, are reaching their limits. The problem of storage reliability is aggravated by ongoing trends in (networked) storage systems, e.g., the growing gap between the capacity of each device and its bandwidth, and the advent of mass-produced cheap devices of decreasing reliability.
In this work we study bounds and solutions for very large systems of high-capacity devices. Using simple information-theoretic bounds, we show the eventual necessity of a scalable hierarchical system. We next describe and analyze such a system. The salient points of our system are protection of data according to their estimated access intensity, unequal error protection according to data importance, the assumption of a lenient and realistic disk-replacement policy, and competitive transition of data between different levels according to predicted access intensity.

By: Ami Tavory, Vladimir Dreizin, Shmuel Gal, Meir Feder

Published in: H-0211 in 2003

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
