Bringing Efficient Advanced Queries to Distributed Hash Tables

Interest in distributed storage is fueled by demand for reliability and resilience combined with ubiquitous availability. Peer-to-peer (P2P) storage networks are known for their decentralized control, self-organization, and adaptation. Advanced searching for documents and resources remains an open problem. The flooding approach favored by some P2P networks is inefficient in resource usage, but more scalable and resource-efficient solutions based on distributed hash tables (DHT) lack in query expressiveness and flexibility. In this paper, we address this issue and introduce new efficient, scalable, and completely distributed methods that strive to keep resource consumption by queries and index information as low as possible. We describe how to improve the handling of multiple subqueries combined through Boolean set operators. The need for these operators is intensified by applications to go beyond simple exact keyword matches. We discuss, optimize, and analyze appropriate extensions to support range and prefix matching in DHTs.

By: Daniel Bauer, Paul Hurley, Roman A. Pletka, Marcel Waldvogel

Published in: Proceedings of 29th Annual IEEE International Conference on Local Computer Networks. Los Alamitos, CA, , IEEE Computer Society., p.6-14 in 2004

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.

Questions about this service can be mailed to reports@us.ibm.com .