Title: Intelligent Hybrid Man-Machine Translation Evaluation
Author : Ibrahim Ahmed Ibrahim Saleh Sabek
Collection : M.Sc. Computer
Abstract:
Machine Translation (MT) has attracted considerable attention in translation communities in recent years and has become a crucial part of almost all search engines. However, the widespread adoption of MT technology depends on the trust placed in its outputs.
Different approaches have been introduced to address the issues of evaluating translations from one natural language to another.
Automatic metrics have been developed to predict the quality of MT outputs. Although these metrics are fast, they assume that reference translations exist. Another research direction, known as Quality Estimation (QE), was proposed to exploit human assessments through machine learning techniques, evaluating translations without reference translations.
Both automatic metrics and QE approaches have drawbacks. Automatic metrics pay little attention to linguistic information beyond the lexical level and are therefore considered superficial. QE approaches, on the other hand, rely only on human assessments, which are much more expensive to obtain. Moreover, human assessments of the same translated sentence can vary.
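To illustrate why purely lexical, reference-based metrics are considered superficial, the following is a minimal sketch of a clipped n-gram precision score, the kind of word-overlap count that underlies metrics such as BLEU. It is not the system proposed in this thesis; the function names and example sentences are assumptions made for illustration only.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_ngram_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each candidate n-gram is credited at most as many times as it
    appears in the reference (clipping). Tokenization is a plain
    lowercase whitespace split, which is an illustrative simplification.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical example sentences, not taken from the thesis data.
reference = "the cat sat on the mat"
paraphrase = "a feline rested upon the rug"   # acceptable meaning, different words
scrambled = "mat the on sat cat the"          # same words, no sensible structure

print(clipped_ngram_precision(paraphrase, reference))  # 1 of 6 unigrams match
print(clipped_ngram_precision(scrambled, reference))   # all 6 unigrams match
```

With unigrams, the scrambled sentence receives a perfect score while the reasonable paraphrase scores only one sixth, because the count sees surface words and nothing about syntax or meaning. Longer n-grams penalize the scrambled word order but still cannot credit the paraphrase.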
In this thesis, the drawbacks of these two directions are addressed. We extracted a set of linguistic and data-driven features from parallel corpora to evaluate MT outputs. The advantages of these features are twofold. First, they provide deeper linguistic insight, which addresses a key weakness of automatic metrics. Second, they are extracted from parallel corpora without the need for expensive human assessments. The experimental evaluation shows that our proposed system outperforms state-of-the-art automatic metrics in terms of accuracy.