Semantic Annotation of Multimedia Using Maximum Entropy Models

In this paper we propose a Maximum Entropy based approach for automatic annotation of multimedia content. In our approach, we explicitly model the spatial location of the low-level features by means of specially designed predicates. In addition, the interaction between the low-level features is modeled using joint observation predicates. We evaluate the performance of semantic concept classifiers built with this approach on the TRECVID2003 corpus. Experiments indicate that our approach yields results on par with the best reported on this dataset, despite using only unimodal features and a single model-building approach. This compares favorably with state-of-the-art systems, which rely on multimodal features and classifier fusion to achieve similar results.
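The core idea of the Maximum Entropy framework is that each predicate is a binary-valued function over an (observation, label) pair, and the model combines predicates through learned weights to form a conditional label distribution. The sketch below illustrates this structure; the predicate names, feature keys, and weights are purely illustrative assumptions, not the paper's actual predicates or parameters.

```python
import math

# Hypothetical predicates: each maps (observation, label) -> {0, 1}.
# "obs" is a dict of low-level visual features; the keys used here
# ("blue_block_top", "sand_texture") are invented for illustration.

def spatial_predicate(obs, label):
    # A spatial-location predicate: fires when a blue color block
    # appears in the top region of the image and the label is "sky".
    return 1.0 if label == "sky" and obs.get("blue_block_top") else 0.0

def joint_predicate(obs, label):
    # A joint-observation predicate: fires on the co-occurrence of
    # two low-level features under the label "beach".
    fired = obs.get("blue_block_top") and obs.get("sand_texture")
    return 1.0 if label == "beach" and fired else 0.0

PREDICATES = [spatial_predicate, joint_predicate]
LABELS = ["sky", "beach", "other"]

def maxent_posterior(obs, weights):
    """p(label | obs) proportional to exp(sum_i w_i * f_i(obs, label))."""
    scores = {
        y: math.exp(sum(w * f(obs, y) for w, f in zip(weights, PREDICATES)))
        for y in LABELS
    }
    z = sum(scores.values())  # normalizing constant
    return {y: s / z for y, s in scores.items()}
```

In practice the weights would be estimated from training data (e.g. by iterative scaling or gradient methods); here they are fixed by hand only to show how predicate firings shift the posterior.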

By: Janne Argillander, Giridharan Iyengar, Harriet Nock

Published in: Proceedings of the 2005 IEEE International Conference on Acoustics, Speech, and Signal Processing, Piscataway, NJ: IEEE, 2005, vol. 2, pp. 153-156

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.

Questions about this service can be mailed to reports@us.ibm.com .