TimeML-Compliant Analysis of Text Documents

Reasoning with temporal information requires a representation of time considerably more involved than just a list of temporal expressions, which typically define the extent of current time extraction efforts. TimeML is an emerging standard for temporal annotation, defining a language for expressing properties and relationships among time-denoting expressions and events in free text. This paper takes the position that TimeML is a good starting point for bridging the gap between temporal analysis of documents and reasoning with information derived from these documents. TimeML-compliant analysis is hard, and the task is made even harder by the small size of the only annotated corpus available to date. To address this and related challenges, we have developed and implemented a hybrid TimeML annotator, which uses cascaded finite-state grammars (for temporal expression analysis, shallow syntactic parsing, and feature generation) together with a machine learning component capable of effectively using large amounts of unannotated data. We motivate our mixed strategy and, as this is work in progress, report interim results on the first effort to use the TIMEBANK corpus for building an operational TimeML analyser.

By: Branimir K. Boguraev; Rie K. Ando

Published in: RC23455 in 2004


