Machine Translation


Machine Translation (MT) is the task of automatically converting text in one natural language into another, preserving the meaning of the input and producing fluent text in the output language. While machine translation is one of the oldest subfields of artificial intelligence research, the recent shift towards large-scale empirical techniques has led to significant improvements in translation quality. The Stanford Machine Translation group's research interests lie in techniques that utilize both statistical methods and deep linguistic analyses.

Research in our group currently focuses on the following topics:

Better Training in MT

Determining the appropriate weights for a translation system’s decoding model is usually performed using Minimum Error Rate Training (MERT), a procedure that optimizes the system’s performance on an automated measure of translation quality. In our lab, we have developed improved algorithms for performing MERT (Cer et al. 2008). We have also studied the consequences of training to different automated translation evaluation metrics. Surprisingly, we found that the choice among popular evaluation metrics based on word sequence matching, such as BLEU, TER, and METEOR, did not seem to have a reliable impact on human preferences for the resulting translations (Cer et al. 2010). However, preliminary results suggest that training to our textual entailment based evaluation metric, which performs a deep semantic analysis of the translations being evaluated, may in fact produce better translation performance (Pado et al. 2009). We are continuing to investigate the feasibility and effectiveness of training to evaluation metrics that perform a deeper semantic and syntactic analysis of the translations being evaluated.
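The idea of tuning decoder weights against an automated metric can be illustrated with a toy sketch. It is not the algorithm of Cer et al. (2008): it assumes each sentence comes with a fixed n-best list whose hypotheses carry a feature vector and a per-sentence error count, and it replaces the exact line search of standard MERT with a coarse coordinate-wise grid search. All names (`tune_weights`, `corpus_error`, `toy_nbest`) and all numbers are hypothetical.

```python
from typing import List, Tuple

# One hypothesis: (feature vector, error count against the reference).
# Assumption: errors are per-sentence and simply summed; real metrics such
# as BLEU are computed from corpus-level statistics instead.
Hypothesis = Tuple[List[float], float]
NBest = List[Hypothesis]


def model_score(weights: List[float], features: List[float]) -> float:
    """Linear model score: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def corpus_error(weights: List[float], nbest_lists: List[NBest]) -> float:
    """Total error of the hypotheses the model ranks first under `weights`."""
    total = 0.0
    for nbest in nbest_lists:
        best = max(nbest, key=lambda h: model_score(weights, h[0]))
        total += best[1]
    return total


def tune_weights(nbest_lists: List[NBest], num_features: int,
                 grid: List[float], max_rounds: int = 10):
    """Coordinate-wise grid search (a stand-in for MERT's exact line search)."""
    weights = [1.0] * num_features
    best_err = corpus_error(weights, nbest_lists)
    for _ in range(max_rounds):
        improved = False
        for i in range(num_features):
            for value in grid:
                candidate = weights[:i] + [value] + weights[i + 1:]
                err = corpus_error(candidate, nbest_lists)
                if err < best_err:
                    weights, best_err, improved = candidate, err, True
        if not improved:
            break  # no single-weight change lowers the error any further
    return weights, best_err


# Hypothetical two-sentence tuning set with two features per hypothesis.
toy_nbest: List[NBest] = [
    [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.0)],
    [([2.0, 0.0], 1.0), ([0.0, 1.0], 0.0)],
]
tuned_weights, tuned_error = tune_weights(toy_nbest, 2, [-1.0, 0.0, 1.0])
```

Swapping the error function for a different metric changes which weights are selected, which is the sense in which a system is "trained to" a particular metric.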

Chinese MT

Our work also focuses on improving Chinese-to-English translation using deep source-side linguistic analysis. In our Chinese-English system, we train a classifier to categorize each occurrence of 的 (DE) according to its syntactic and semantic context. We use this classifier to preprocess MT data by explicitly labeling 的 constructions, as well as reordering phrases. Our Chinese-English system also uses typed dependencies identified in the source sentence to improve a lexicalized phrase reordering model. Finally, we have worked to improve the segmentation consistency of our Chinese word segmenter, a characteristic that is often desirable in MT. These three components all show significant gains in translation performance, and are described in (Chang et al., 2009a), (Chang et al., 2009b), and (Chang et al., 2008), respectively.
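For readers unfamiliar with lexicalized phrase reordering models, the sketch below shows the orientation labels such models commonly predict. It assumes the widely used monotone/swap/discontinuous scheme; the text above does not say which orientation set our system uses, and the typed-dependency features are not shown. The function names and the example spans are hypothetical.

```python
from typing import List, Tuple

# Inclusive (start, end) indices of the source words covered by one phrase.
Span = Tuple[int, int]


def orientation(prev: Span, cur: Span) -> str:
    """Label how `cur` is placed on the source side relative to `prev`."""
    if cur[0] == prev[1] + 1:
        return "monotone"       # cur starts right after prev ends
    if cur[1] == prev[0] - 1:
        return "swap"           # cur ends right before prev starts
    return "discontinuous"      # anything else


def orientations(spans_in_target_order: List[Span]) -> List[str]:
    """Label every phrase given source spans listed in target-side order."""
    prev: Span = (-1, -1)  # virtual sentence-start phrase ending before word 0
    labels = []
    for span in spans_in_target_order:
        labels.append(orientation(prev, span))
        prev = span
    return labels


# Hypothetical derivation: four phrases covering source words 0..6.
example_labels = orientations([(0, 1), (4, 5), (2, 3), (6, 6)])
```

A reordering model of this kind estimates the probability of each label for a given phrase pair; source-side syntactic information would enter as additional evidence for that prediction.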

Arabic MT

Although Arabic-to-English translation quality has improved significantly in recent years, pervasive problems remain. One of them is the re-ordering of verb-initial clauses (especially matrix clauses) during translation. We have recently developed a high-precision Arabic subject detector that can be integrated into phrase-based translation pipelines (Green et al., 2009). A characteristic feature of our work is the decision to influence decoding directly instead of re-ordering the Arabic input prior to translation. We have also created a state-of-the-art Arabic parser that can be used for a variety of MT tasks.


NIST Evaluations

Our group has participated in two NIST Open MT evaluations. We submitted one Chinese-English system in 2008, which was ranked as the 8th best system (out of 20 institutions), and submitted one Arabic-English system in 2009, which was ranked as the 2nd best system (out of 13 institutions).

Descriptions of our NIST systems:


We have released Phrasal, the state-of-the-art phrase-based decoder developed by our group, as open source.