Vol. 1, Issue 2, 2015, pp. 115-132

Author: Maria Stambolieva

Affiliation: New Bulgarian University, Sofia, Bulgaria

The paper presents ongoing research in contrastive corpus linguistics with envisaged applications in machine translation (MT) and with a focus on Google Translate (GT) performance in English-Bulgarian translation. Structural patterns, forms and expressions where automatic translation fails are identified and analysed with a view to creating a GT-editing tool that provides improved target-language output. The paper presents the corpus and the method of corpus analysis applied, including the identification of unacceptable string types, their structural analysis and their categorization. For each failure type, pre- or post-GT editing transformations are proposed. A first outline is given of a GT-editing tool consisting of a pre-GT editor performing string identification, substitution or deletion operations, a post-GT editor with a set of more complex string transformation rules, and an additional module transferring structural information.

Key words: machine translation, pre-editing, post-editing, Google Translate, bitext, computational linguistics, corpus linguistics

Article history:
Received: 8 November 2015;
Reviewed: 20 November 2015;
Revised: 29 November 2015;
Accepted: 30 November 2015;
Published: 31 December 2015

Citation (APA6):
Stambolieva, Maria. (2015). Where Google fails. English Studies at NBU, 1(2), 115-132. Retrieved from http://esnbu.org/data/files/2015/2015-2-8-stambolieva-pp115-132.pdf

Copyright © 2015 Maria Stambolieva

This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0), which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited. If you want to use the work commercially, you must first get the author's permission.


