Extraction of Temporal Information from Text Documents

Detailed analysis of temporal information in documents is a complex problem, but the payoff for advanced applications capable of temporal reasoning is substantial. This brief note argues that the graph-like representation typically maintained by temporal reasoners can be derived from an emerging standard for rich and robust annotation of temporal information in text.

We highlight some of the main features of TimeML, a temporal annotation language, and outline a mapping process which derives, from a TimeML-compliant representation, an isomorphic set of time-points and intervals. The problem of automatically analysing a document into TimeML is still too complex to tackle fully; however, a non-trivial fragment of TimeML analysis can be carried out by a finite-state temporal expression recogniser running concurrently with a shallow syntactic parser. Broadly, we focus on strategies for identifying events and anchoring them in time. We also present an evaluation of some of the recognition capabilities as they apply to the identification of temporal information fragments. The results are encouraging: an independent evaluation shows that a temporal parser can be grounded in high-accuracy recognition of key TimeML components. This, in turn, points to the viability of practical end-to-end natural language analysis and reasoning systems for advanced information management.
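As a rough illustration of what a mapping from TimeML-style annotation to time-points and intervals might look like, the sketch below treats each event or time expression as an interval with a start point and an end point, and rewrites a few TLINK relation types as ordering constraints over those points. It is not the mapping described in the report: the function names, the choice of strict inequalities, and the restriction to five relation types are all assumptions made for the example.

```python
from typing import List, Tuple

# A point is named (interval_id, "start" | "end").
Point = Tuple[str, str]
# A constraint (p, op, q) reads "p op q", where op is "<" or "=".
Constraint = Tuple[Point, str, Point]


def start(x: str) -> Point:
    return (x, "start")


def end(x: str) -> Point:
    return (x, "end")


def tlink_to_constraints(rel: str, a: str, b: str) -> List[Constraint]:
    """Rewrite one TLINK-style relation between intervals a and b as
    endpoint constraints. Hypothetical helper covering a small subset
    of relation types; strictness of "<" is an assumption."""
    if rel == "BEFORE":
        return [(end(a), "<", start(b))]
    if rel == "AFTER":
        return tlink_to_constraints("BEFORE", b, a)
    if rel == "INCLUDES":
        return [(start(a), "<", start(b)), (end(b), "<", end(a))]
    if rel == "IS_INCLUDED":
        return tlink_to_constraints("INCLUDES", b, a)
    if rel == "SIMULTANEOUS":
        return [(start(a), "=", start(b)), (end(a), "=", end(b))]
    raise ValueError(f"relation not covered by this sketch: {rel}")


def build_constraints(
    intervals: List[str], tlinks: List[Tuple[str, str, str]]
) -> List[Constraint]:
    """Collect the edges of a point graph: one 'start < end' edge per
    interval, plus the edges contributed by each (rel, a, b) link."""
    constraints = [(start(i), "<", end(i)) for i in intervals]
    for rel, a, b in tlinks:
        constraints.extend(tlink_to_constraints(rel, a, b))
    return constraints
```

For instance, `build_constraints(["e1", "t1"], [("IS_INCLUDED", "e1", "t1")])` would anchor a hypothetical event `e1` inside a hypothetical time expression `t1`, yielding four ordering edges that a temporal reasoner could load as a graph.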

By: Branimir K. Boguraev

Published in: RC22974 in 2003


