Cost Models Do Matter: Providing Cost Information for Diverse Data Sources in a Federated System

        An important issue for federated systems of diverse data sources is how to optimize cross-source queries without building knowledge of individual sources into the optimizer. Garlic is a federated system with an emphasis on extensibility and diverse sources. To achieve these goals, data sources are attached to Garlic by means of a wrapper. Wrappers participate in query planning, telling Garlic what parts of a query a data source can do and how much it will cost. This paper describes a framework through which wrappers provide the necessary cost and cardinality information for optimization, and the facilities Garlic provides to make this task easier. Our framework makes it easy for wrappers to provide cost information, requires few changes to a conventional bottom-up optimizer, and is easily extensible to a broad range of sources. We believe that our framework for costing is the first to allow accurate cost estimates for diverse sources within the context of a traditional cost-based optimizer. We demonstrate the importance of cost information in choosing good plans, the flexibility of our framework, the accuracy it allows, and finally, that it works: the optimizer is able to choose good plans even for complex cross-source queries.

By: M. Tork Roth, L. M. Haas, F. Ozcan

Published in: RJ10141 in 1999

This Research Report is not available electronically. Please request a copy from the contact listed below. IBM employees should contact ITIRC for a copy.

Questions about this service can be mailed to reports@us.ibm.com.