Labeled Multi-Lingual Data for Cognate Detection

The attachments to this document contain data sets that may be used to develop, evaluate, and compare methods for automatic detection of cognates. Two language pairs are included: English-French (filenames containing “FR-EN”) and English-Russian (filenames containing “RU-EN”).

By: Jiri Navratil, Nanyun Peng

Published in: RC25527 in 2015

rc25527.pdf
cognate_FR-EN.data
cognate_FR-EN.dev.tsv
cognate_RU-EN.data
cognate_RU-EN.dev.tsv

Questions about this service can be mailed to reports@us.ibm.com.