Maximizing Information Content of Feature Extraction for Improved Classification with Applications in Speech Recognition

In this paper we consider the problem of extracting maximally informative feature vectors for classification, and explore the connection between the information content of the feature vector (quantified by the mutual information between the feature vector and the classes predicted by the classifier) and the classification accuracy. We explore these ideas in the context of a speech recognition system, where the classification problem is one of predicting the phonetic class given an observed acoustic feature vector. The connection between information content and classification accuracy is first explored by adding features to a baseline cepstral feature extraction scheme that is very commonly used in speech recognition applications. The features we chose to study relate to the spectral location and energy value of the spectral peaks in the speech signal (similar to formant frequencies). We first quantify the incremental information that the spectral peak features provide over and above the cepstral feature vectors, and then show a connection between this incremental information and speech recognition accuracy. The idea of optimizing mutual information to improve recognition accuracy is then generalized to a linear transformation of the underlying features. We show that several prior methods for computing linear transformations (such as linear/heteroscedastic discriminant analysis) can be interpreted in this general framework of maximizing the mutual information. Finally, experimental results are provided showing that designing the feature space to maximize the mutual information can lead to improvements in recognition accuracy.
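
As a rough illustration of the "incremental information" idea in the abstract, the sketch below computes a plug-in estimate of the mutual information between a feature and a class label, and then the gain from appending a second feature. It is only a toy under stated assumptions: the features are discrete, the data are made up, and the names `mutual_information`, `samples`, `base`, `both`, and `incremental` are hypothetical. The report itself deals with continuous acoustic features, and its estimator is not described here.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in estimate of I(F; C) in bits from a list of (feature, class) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    f_counts = Counter(f for f, _ in pairs)
    c_counts = Counter(c for _, c in pairs)
    mi = 0.0
    for (f, c), k in joint.items():
        # Adds p(f, c) * log2( p(f, c) / (p(f) * p(c)) ), written with raw counts.
        mi += (k / n) * math.log2(k * n / (f_counts[f] * c_counts[c]))
    return mi


# Hypothetical toy data: (baseline_feature, extra_feature, class).
# The class is "a" exactly when the two features agree, so neither
# feature alone says anything about the class, but the pair determines it.
samples = [(0, 0, "a"), (0, 1, "b"), (1, 0, "b"), (1, 1, "a")] * 25

# Information in the baseline feature alone.
base = mutual_information([(x, c) for x, _, c in samples])
# Information in the augmented feature vector (baseline + extra feature).
both = mutual_information([((x, y), c) for x, y, c in samples])
# Incremental information contributed by the extra feature.
incremental = both - base

print(f"baseline: {base:.3f} bits, augmented: {both:.3f} bits, gain: {incremental:.3f} bits")
```

In this contrived case the baseline feature carries 0 bits about the class and the augmented vector carries 1 bit, so the whole gain comes from the added feature. Real acoustic features would need a density model or binning before a similar comparison could be made.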

By: M. Padmanabhan

Published in: RC22212 in 2001

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

