Exploring Features from Natural Language Generation for Prosody Modeling

Prosody modeling is critical in developing a Concept-to-Speech (CTS) system where both Natural Language Generation (NLG) and Speech Synthesis are used to automatically generate natural, coherent speech. In this article, we empirically verify the usefulness of various natural language features in prosody modeling. Three groups of features are investigated: semantic, syntactic, and surface features produced by SURGE, a general-purpose surface natural language generator for English; deep semantic and discourse features that are available during domain modeling and content planning; and features statistically derived from text that have previously been suggested to affect prosody. Our experiments identify which features of this large set of features are effective in prosody model

By: Shimei Pan, Kathleen McKeown, Julia Hirschberg

Published in: Computer Speech and Language, volume 16 (no. 3-4), pages 457-490, 2002
