Bursting the Cloud Data Bubble: Towards Transparent Storage Elasticity in IaaS Clouds

Storage elasticity on IaaS clouds is a particularly important feature in the context of data-intensive workloads: due to exploding data sizes and increasing scale and complexity, storage requirements can vary greatly during application runtime, making worst-case over-provisioning a poor choice that leads to considerable waste and unnecessary extra costs. Thus, it is imperative to adapt dynamically to the storage requirements. However, how to leverage elasticity in this context is not well understood. Current approaches simply rely on users to attach and detach virtual disks to the VM instances and then manage them manually, which greatly increases application complexity. Unlike such approaches, this paper aims to provide a transparent solution that presents a unified storage space to the VM in the form of a regular POSIX file system that hides the details of attaching and detaching virtual disks. The main difficulty in this context is to understand the intent of the application in order to proactively attach and detach virtual disks, so as to avoid running out of space while minimizing the performance overhead of doing so. To this end, we propose a storage space prediction scheme that analyzes multiple system parameters and dynamically adapts monitoring based on the intensity of I/O in order to get as close as possible to the real usage. We show the value of our proposal over static worst-case over-provisioning and simpler elastic schemes that rely on a reactive model to attach and detach virtual disks, using both synthetic benchmarks and real-life data-intensive applications.

By: Bogdan Nicolae, Pierre Riteau, Kate Keahey

Published in: RC25433 in 2013


This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).


Questions about this service can be mailed to reports@us.ibm.com.