Using Machine Learning Techniques to Enhance the Performance of Automatic Backup and Recovery System

A typical disaster recovery system will have mirrored storage at a site that is geographically separate from the main operational site. In many cases, communication between the local site and the backup repository site is performed over a network which is inherently slow, such as a WAN, or is highly strained, for example due to a whole-site disaster recovery operation.

The goal of this work is to alleviate the performance impact of the network in such a scenario, and to do so using machine learning techniques. We focus on two main areas, prefetching and read-ahead size determination. In both cases we significantly improve the performance of the system.

Our main contributions are as follows: We introduce a theoretical model of the system and the problem we are trying to solve and bound the gain from prefetching techniques. We construct two frequent pattern mining algorithms and use them for prefetching. A framework for controlling and combining multiple prefetch algorithms is presented as well. These algorithms, as well as various simple prefetch algorithms, are compared on a simulation environment. We introduce a novel algorithm for determining the amount of read ahead on such a system that is based on intuition from online competitive analysis and on regression techniques. The significant positive impact of this algorithm is demonstrated on IBM’s FastBack system.

Much of our improvements have been applied with little or no modification of the current implementation’s internals. We therefore feel confident in stating that the techniques are general and are likely to have applications elsewhere.

By: Dan Pelleg; Eran Raichstein; Amir Ronen

Published in: H-0279 in 2010

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

h-0279.pdf

Questions about this service can be mailed to reports@us.ibm.com .