TimeML-Compliant Analysis of Text Documents

Reasoning with temporal information requires a representation of time considerably more involved than just a list of temporal expressions, which typically define the extent of current time extraction efforts. TimeML is an emerging standard for temporal annotation, defining a language for expressing properties and relationships among time-denoting expressions and events in free text. This paper takes the position that TimeML is a good starting point for bridging the gap between temporal analysis of documents and reasoning with information derived from these documents. TimeML-compliant analysis is hard, and the task is made even harder by the small size of the only annotated corpus available to date. To address this and related challenges, we have developed and implemented a hybrid TimeML annotator, which uses cascaded finite-state grammars (for temporal expression analysis, shallow syntactic parsing, and feature generation) together with a machine learning component capable of effectively using large amounts of unannotated data. We motivate our mixed strategy. This is work in progress, and we report interim results on the first effort to use the TIMEBANK corpus for building an operational TimeML analyser.

By: Branimir K. Boguraev; Rie K. Ando

Published in: RC23455 in 2004

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).
