Effect of Codeword Placement on the Reliability of Erasure Coded Data Storage Systems (Updated version: April 8, 2013)

As an alternative to replication, modern data storage systems employ advanced erasure codes to protect data from storage node failures because of their ability to provide high data reliability at high storage efficiency. In contrast to previous studies, we consider the practical case where the length of codewords in an erasure coded system is much smaller than the number of storage nodes in the system. In this case, there are many possible ways in which different codewords can be stored across the nodes of the system. For replication-based systems, it is well known that the mean time to data loss is significantly affected by the choice of placement of replicas. In particular, the declustered replica placement scheme provides significantly higher reliability than other placement schemes. In this paper, these results are extended to erasure coded systems and it is shown that a declustered placement of codewords can significantly improve system reliability. A detailed reliability analysis is presented that accounts for the rebuild times involved, the amounts of partially rebuilt data when additional nodes fail during rebuild, and the fact that modern systems utilize an intelligent rebuild process by rebuilding the most critical codewords first.
A shortened version of this report has been published in: "Quantitative Evaluation of Systems", Proc. 10th Int'l Conf. on Quantitative Evaluation of SysTems "QEST 2013," Buenos Aires, Argentina, Lecture Notes in Computer Science vol. 8054 (Springer 2013) 241-257.

By: V. Venkatesan, I. Iliadis

Published in: RZ3827 in 2012


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

