LeeWave: Level-Wise Distribution of Wavelet Coefficients for Processing kNN Queries over Distributed Streams

We present LEEWAVE − a bandwidth-efficient approach to searching range-specified k-nearest neighbors among distributed streams by level-wise distribution of wavelet coefficients. To find the top-k similar streams to a rangespecified reference one, the relevant wavelet coefficients of the reference stream can be sent to the peer sites to compute the similarities. However, bandwidth can be unnecessarily wasted if the entire relevant coefficients are sent simultaneously. Instead, we present a level-wise approach by leveraging the multi-resolution property of the wavelet coefficients. Starting from the top and moving down one level at a time, the query initiator sends only the single-level coefficients to a progressively shrinking set of candidates. However, there is one difficult challenge in LEEWAVE: how does the query initiator prune the candidates without knowing all the relevant coefficients? To overcome this challenge, we derive and maintain a similarity range for each candidate and gradually tighten the bounds of this range as we move from one level to the next. The increasingly tightened similarity ranges enable the query initiator to effectively prune the candidates without causing any false dismissal. Extensive experiments with real and synthetic data show that, when compared with a naive one, LEEWAVE uses significantly less bandwidth under a wide range of conditions.

By: Mi-Yen Yeh; Kun-Lung Wu; Philip S. Yu; Ming-Syan Chen

Published in: RC24450 in 2007

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

rc24450.pdf

Questions about this service can be mailed to reports@us.ibm.com .