Video Query: Beyond the Keywords

Digital video databases are becoming more and more pervasive, and finding video in large databases is rapidly becoming a problem. Because of the nature of video (streamed objects), accessing the content of such databases is inherently a time-consuming operation. Enabling intelligent means of video retrieval and rapid video viewing through the processing, analysis and interpretation of visual content are, therefore, important topics of research. In this paper, we propose a model of video retrieval based on an iterated sequence of navigating, searching, browsing, and viewing. We describe how video, armed with its rich information media in the forms of image, audio and text, can be appropriately used in each stage of the query process to retrieve relevant segments. In addition, we address the problem of automatic video annotation - that of attaching meanings to video segments to facilitate the query steps. Subsequently, we present a novel framework of structural video analysis which focuses on the processing of high-level features in addition to low-level visual cues, and describe several such techniques, to augment the semantic interpretation of a wide variety of long video segments and assist the different stages in the search, navigation and retrieval of video.

By: Ruud M. Bolle, Boon-Lock Yeo and Minerva M. Yeung

Published in: IBM Journal of Research and Development, volume 42, no. 2, pages 253-286, 1998

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.