Prosody modeling is critical in developing a Concept-to-Speech (CTS) system, where both Natural Language Generation (NLG) and Speech Synthesis are used to automatically generate natural, coherent speech. In this article, we empirically verify the usefulness of various natural language features in prosody modeling. Three groups of features are investigated: semantic, syntactic and surface features produced by SURGE, a general-purpose surface natural language generator for English; deep semantic and discourse features, which are available during domain modeling and content planning; and features statistically derived from text, which have previously been suggested to affect prosody. Our experiments identify which features of this large set of features are reactive in prosody model
By: Shimei Pan, Kathleen McKeown, Julia Hirschberg
Published in: Computer Speech and Language, volume 16 (no. 3-4), pages 457-490, 2002