Automatic Text Extraction From Video For Content-Based Annotation and Retrieval

Efficient content-based retrieval of image and video databases is an important emerging application due to the rapid proliferation of image and digital video data on the Internet and corporate intranets and the exponential growth of video content in general. Text that is either embedded or superimposed within video frames is very useful for describing the semantic content of the frames, as it enables keyword and free-text based search, automatic video logging, and video cataloging. Extracting text directly from video data becomes especially important when closed captioning or speech recognition is not available to generate textual transcripts of audio, or when video footage that completely lacks audio needs to be automatically annotated and searched based on frame content. Towards building a video query system, we have developed a scheme for automatically extracting text from digital images and videos for content annotation and retrieval. In this paper, we present our approach to robust text extraction, which can handle complex backgrounds in video frames and deal with different font sizes, font styles, and font appearances such as normal and inverse video. Our algorithm produces segmented characters from video frames that can be directly processed by an OCR system to produce ASCII text. Results from our experiments with over 5,000 frames obtained from twelve MPEG-1 video streams demonstrate the good performance of our system in terms of text identification accuracy and computational efficiency.

By: Jae-Chang Shim, Chitra Dorai, Ruud Bolle

Published in: RC21087 in 1998

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).