Analysis and Improvements of the Table Space Allocation Algorithm

Space allocation is a fundamental operation performed by a database management system (DBMS) when it inserts a record into a table. A good space allocation algorithm quickly locates and reserves enough space for a record, places it closer to its related records, and utilizes the available space. Satisfying these conflicting requirements is challenging and trade-offs are carefully balanced by well-chosen heuristics. As a DBMS evolves over time, especially a commercial DBMS, its space allocation algorithm gets more sophisticated and complex and relies on many heuristics. Technological changes, new applications, and greater data volumes render many legacy heuristics ineffective. These factors hinder understanding of space allocation behavior under many workload conditions and make it difficult to enhance the algorithm without causing performance regressions for some of the workloads.

To facilitate research and study the performance of a table space allocation algorithm of a modern DBMS in real-world workload scenarios, we build an extensible simulation framework. We analyze algorithm behavior and make surprising observations. We use the findings to further improve the existing algorithm by proposing algorithm enhancements and showing their benefits with respect to key performance metrics. In conclusion, the proposed framework has been effective in research to understand the performance, improve the space allocation algorithms, and to guide the developers of a commercial DBMS.

By: Shanchan Wu; Yefim Shuf; Hong Min; Hubertus Franke; Bala Iyer; Frances H. Villafuerte; Julie Watts

Published in: RC24631 in 2008

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

rc24631.pdf

Questions about this service can be mailed to reports@us.ibm.com .