One of the principal problems in Information Lifecycle Management (ILM) is aligning the business value of data with the most cost-effective and appropriate storage infrastructure. In this paper, we introduce ACE, a framework of ILM tools that classifies data and storage resources and generates a data placement plan for informed utilization of the available storage resources in the system. The goal of ACE is to design a data placement plan that provides cost benefits to an organization while allowing efficient access to all important data. To achieve this goal, ACE uses a policy-based approach to classify data based on its metadata attributes and storage based on its capabilities. The main advantage of using ACE is that it enables appropriate usage of under-utilized storage systems without extensive human intervention. Another key characteristic of ACE is that its policy-based architecture automates the process of data valuation and storage classification.

We implement the ACE framework and evaluate its benefits on three real data sets. One data set consists of 1.28 million anonymous medical industry record files with a total size of 1461 GB, and we show that ACE provides a cost benefit of greater than 70% over the lifetime of the data. In addition to the novel valuation algorithms and overall architecture, we also demonstrate optimizations that reduce the total running time to 85% of the time taken without these optimizations, while still maintaining a classification accuracy of over 85%.
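
As a rough illustration of the policy-based idea described in the abstract (value data from its metadata, rank storage by its capabilities, then derive a placement plan), a minimal sketch might look like the following. It is not the ACE implementation: the attribute names, the scoring rules, the thresholds, and the tier figures are all hypothetical and chosen only to make the flow concrete.

```python
from dataclasses import dataclass


@dataclass
class FileMeta:
    # Hypothetical metadata attributes; the abstract does not list the real ones.
    name: str
    days_since_access: int
    owner_role: str


@dataclass
class StorageTier:
    # Hypothetical capability attributes for a storage resource.
    name: str
    cost_per_gb: float
    performance: int  # higher means faster access


def data_value(f: FileMeta) -> int:
    """Score a file with two illustrative policies (assumed, not from the paper)."""
    score = 0
    if f.days_since_access <= 30:   # recently used data is treated as more valuable
        score += 2
    if f.owner_role == "clinician":  # assumed business-critical owner
        score += 1
    return score


def placement_plan(files: list[FileMeta], tiers: list[StorageTier]) -> dict[str, str]:
    """Map each file to a tier: high value to the fastest, low value to the cheapest."""
    ranked = sorted(tiers, key=lambda t: t.performance, reverse=True)
    cheapest = min(tiers, key=lambda t: t.cost_per_gb)
    plan = {}
    for f in files:
        value = data_value(f)
        if value >= 2:
            tier = ranked[0]
        elif value == 1:
            tier = ranked[len(ranked) // 2]
        else:
            tier = cheapest
        plan[f.name] = tier.name
    return plan


tiers = [
    StorageTier("fast", cost_per_gb=1.0, performance=3),
    StorageTier("mid", cost_per_gb=0.4, performance=2),
    StorageTier("archive", cost_per_gb=0.1, performance=1),
]
files = [
    FileMeta("a.rec", days_since_access=5, owner_role="clinician"),
    FileMeta("b.rec", days_since_access=400, owner_role="clinician"),
    FileMeta("c.rec", days_since_access=400, owner_role="admin"),
]
plan = placement_plan(files, tiers)
```

In this toy setup, rarely accessed files drift toward cheaper tiers, which is the kind of effect a lifetime cost benefit would come from; the actual valuation algorithms in ACE are described in the paper itself.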

By: Gauri Shah; Kaladhar Voruganti; Piyush Shivam; Maria Alvarez

Published in: RJ10372 in 2006


