Cyril Goutte, Nicola Cancedda, Marc Dymetman, and George Foster (eds)
- Published in print: 2008
- Published Online: August 2013
- ISBN: 9780262072977
- eISBN: 9780262255097
- Item type: book
- Publisher: The MIT Press
- DOI: 10.7551/mitpress/9780262072977.001.0001
- Subject: Computer Science, Machine Learning
The Internet gives us access to a wealth of information in languages we don’t understand. The investigation of automated or semi-automated approaches to translation has become a thriving research field with enormous commercial potential. This book investigates how Machine Learning techniques can improve Statistical Machine Translation, currently at the forefront of research in the field. It looks first at enabling technologies: technologies that solve problems which are not Machine Translation proper but are closely linked to the development of a Machine Translation system. These include the acquisition of bilingual sentence-aligned data from comparable corpora, the automatic construction of multilingual name dictionaries, and word alignment. The book then presents new or improved Statistical Machine Translation techniques, including a discriminative training framework for leveraging syntactic information, the use of semi-supervised and kernel-based learning methods, and the combination of multiple Machine Translation outputs to improve overall translation quality.