An Auxiliary Phrase Table Approach to Closed-Loop Multi-Pass SMT

We describe a new approach to statistical machine translation (SMT) based on closed-loop multi-pass decoding and on the use of auxiliary phrase tables. In our approach, portions of an initial translation output are automatically selected, matched with their input segments, modified under specific criteria, and reintroduced into subsequent translation passes in the form of phrase tables. The motivation behind this approach is that generating morphologically rich output (e.g., gender, person, tense) is a problem not easily resolved within a single decoding iteration; rather, it is better addressed after the output of an initial translation has been established. Our SMT experiments show consistent BLEU score improvements under several configurations in an eSupport domain translation test.
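The closed-loop idea described above can be illustrated with a small sketch. This is a toy illustration only, not the system evaluated in the report: the word-by-word `decode` function, the `select` and `modify` callbacks, the `FEMININE_NOUNS` set, and the tiny English-to-Spanish table are all hypothetical names and data introduced here. A real phrase-based decoder would use scored, multi-word phrase pairs, reordering, and a language model.

```python
from typing import Callable, Dict, List, Tuple

PhraseTable = Dict[str, str]
Aligned = List[Tuple[str, str]]  # (source word, target word) pairs


def decode(source: List[str], tables: List[PhraseTable]) -> Aligned:
    """Toy monotone decoder: tables earlier in the list take priority."""
    aligned = []
    for word in source:
        target = word  # unknown words pass through unchanged
        for table in tables:
            if word in table:
                target = table[word]
                break
        aligned.append((word, target))
    return aligned


def build_auxiliary_table(
    aligned: Aligned,
    select: Callable[[int, Aligned], bool],
    modify: Callable[[int, Aligned], str],
) -> PhraseTable:
    """Select output portions, pair them with their input, and modify them."""
    return {
        aligned[i][0]: modify(i, aligned)
        for i in range(len(aligned))
        if select(i, aligned)
    }


def multi_pass_translate(source, base_table, select, modify, passes=2):
    """First pass with the base table; later passes add an auxiliary table."""
    aligned = decode(source, [base_table])
    for _ in range(passes - 1):
        aux = build_auxiliary_table(aligned, select, modify)
        if not aux:  # nothing selected, so the loop has converged
            break
        aligned = decode(source, [aux, base_table])
    return [target for _, target in aligned]


# Hypothetical example data: gender agreement for a Spanish determiner.
BASE_TABLE = {"the": "el", "house": "casa"}
FEMININE_NOUNS = {"casa"}


def select(i: int, aligned: Aligned) -> bool:
    # Select a masculine determiner that precedes a feminine noun.
    return (
        aligned[i][1] == "el"
        and i + 1 < len(aligned)
        and aligned[i + 1][1] in FEMININE_NOUNS
    )


def modify(i: int, aligned: Aligned) -> str:
    # Replace the selected determiner with its feminine form.
    return "la"


result = multi_pass_translate(["the", "house"], BASE_TABLE, select, modify)
```

In this sketch the auxiliary table is built per sentence and simply overrides the base table in the next pass, which is one simple way to reintroduce modified output; the report's abstract does not specify how the tables are combined or scored.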

By: Juan M. Huerta

Published in: RC25274 in 2012

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

