Modeling Semantic Concepts to Support Query by Keywords in Video

Statistical modeling for content-based retrieval is examined in the context of the recent TREC Video benchmark exercise. The TREC Video exercise can be viewed as a test bed for evaluating and comparing a variety of algorithms on a set of high-level queries for multimedia retrieval. We report on the use of techniques adopted from statistical learning theory. Our methods depend on training models on large data sets. In particular, we use statistical models such as Gaussian mixture models to build computational representations for a variety of semantic concepts, including "rocket-launch", "outdoor", "greenery", and "sky". Training requires a large amount of annotated (labeled) data. We therefore explore the use of active learning in the annotation engine to minimize the number of training samples that must be labeled for satisfactory performance.
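
As a rough illustration of the ideas summarized above, the following minimal sketch scores a feature vector against a Gaussian mixture model for one concept and then picks the unlabeled samples whose scores are most ambiguous as candidates for annotation. It is not the authors' implementation. The log-likelihood ratio against a background model, the uncertainty-sampling selection rule, the diagonal covariances, the two-dimensional features, and all names and parameter values are assumptions made only for this example.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log density of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))

def concept_score(x, concept_gmm, background_gmm):
    """Assumed decision rule: log-likelihood ratio of concept vs. background."""
    return gmm_log_likelihood(x, *concept_gmm) - gmm_log_likelihood(x, *background_gmm)

def select_for_annotation(unlabeled, concept_gmm, background_gmm, k):
    """Assumed active-learning rule (uncertainty sampling): return the indices
    of the k samples whose scores lie closest to the decision boundary at 0."""
    ranked = sorted(
        range(len(unlabeled)),
        key=lambda i: abs(concept_score(unlabeled[i], concept_gmm, background_gmm)),
    )
    return ranked[:k]

# Hypothetical, hand-picked model parameters: (weights, means, variances).
sky_gmm = ([0.5, 0.5], [[2.0, 0.0], [2.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
background_gmm = ([0.5, 0.5], [[-2.0, 0.0], [-2.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])

# Hypothetical unlabeled feature vectors (e.g. two color features per shot).
unlabeled = [(2.0, 0.5), (0.1, 0.5), (-2.0, 0.5), (-0.2, 0.3)]

to_label = select_for_annotation(unlabeled, sky_gmm, background_gmm, k=2)
print(to_label)
```

In a full system the mixture parameters would be estimated from annotated data (for example with expectation-maximization) and re-estimated after each round of annotation. Here they are fixed by hand so that only the scoring and selection steps are shown.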

By: Milind R. Naphade, Sankar Basu, John R. Smith, Ching-Yung Lin, Belle L. Tseng

Published in: Proceedings of IEEE International Conference on Image Processing, IEEE, vol. 1, pp. I/145-8, 2002
